I managed to write this regular expression for getting the inner HTML of a td tag:
<td[^>]*>(.*?)<\/td>
It works fine, except that it does not exclude the td tag itself from the match. I just want the innerHTML, not the outerHTML. You can find a demo of my problem here.
Can anyone help me get the text between the td tags?
PS: I am manipulating a string here, not an HTML element.
Use the DOM, even for parsing HTML strings. HTML can be too tricky for a regex to stay efficient.
var s = 'this is a nice day<table><tr><td>aaaa <b>bold</b></td></tr><tr><td>bbbb</td></tr></table> here.';
// Parse the string inside a detached wrapper element.
var doc = document.createDocumentFragment();
var wrapper = document.createElement('myelt');
wrapper.innerHTML = s;
doc.appendChild(wrapper);
var arr = [];
// Walk every node and collect the innerHTML of each td.
var n, walk = document.createTreeWalker(doc, NodeFilter.SHOW_ALL, null, false);
while (n = walk.nextNode()) {
    if (n.nodeName.toUpperCase() === "TD") {
        arr.push(n.innerHTML);
    }
}
// See it works:
console.log(arr);
// or...
for (var r = 0; r < arr.length; r++) {
    document.getElementById("r").innerHTML += arr[r] + "<br/>";
}
<div id="r"></div>
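To illustrate why HTML can be too tricky for a regex, here is a minimal sketch that runs the question's pattern against two inputs. Both input strings and the helper name captureCells are assumptions made up for illustration; they are not from the original post. The first cell contains a nested table, the second contains a line break.

```javascript
// Pattern from the question, unchanged.
const cellPattern = /<td[^>]*>(.*?)<\/td>/g;

// Hypothetical helper: collect capture group 1 of every match.
function captureCells(html) {
  return Array.from(html.matchAll(cellPattern), (m) => m[1]);
}

// Assumed input 1: a cell that itself contains a table.
const nested = '<td>outer <table><tr><td>inner</td></tr></table> tail</td>';
// Assumed input 2: a cell whose content spans two lines.
const multiline = '<td>line one\nline two</td>';

// The lazy (.*?) stops at the first </td>, so the outer cell is cut short.
const nestedResult = captureCells(nested);
// Without the s flag, . does not match a newline, so nothing is found.
const multilineResult = captureCells(multiline);

console.log(nestedResult);    // [ 'outer <table><tr><td>inner' ]
console.log(multilineResult); // []
```

The newline case could be worked around with [\s\S]*? in place of .*?, but the nested case has no simple regex fix, which is where walking the DOM pays off.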
You actually already have the regex you need. You're just confusing matches with captures. Your regex matches the outer HTML, but it captures the inner HTML. Just run the match and read the first capture group. Check it out in this fiddle.
Here's the code:
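var s = '<table cellspacing="0px;" cellpadding="8px;"><tr><td align="right" style="padding-right:8px;line-height:18px;vertical-align:top;"><b>Import job summary</b></td><td align="left" style="max-width:300px;line-height:18px;vertical-align:top;"> 5 entries were imported successfully. 0 entries failed to import. </td></tr></table>',
    re = /<td[^>]*>(.*?)<\/td>/g,
    inner = [],
    m;
// With the g flag, s.match(re) returns only the whole matches (the outer HTML)
// and returns null when nothing matches, so loop with exec() instead.
while ((m = re.exec(s)) !== null) {
    // m[0] is the whole match; m[1] is the first capture group (the inner HTML).
    inner.push(m[1]);
}
if (inner.length === 0) {
    inner = ['No match'];
}
document.write('Inner is:<br>' + inner.join('<br>'));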
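The difference between a match and a capture can be seen side by side in a small sketch. It assumes an environment that supports String.prototype.matchAll (ES2020), and the shortened input string is made up for illustration. For each hit, index 0 holds the whole match (the outer HTML) and index 1 holds the first capture group (the inner HTML).

```javascript
// Assumed, shortened input for illustration only.
const html = '<tr><td align="right"><b>Import job summary</b></td><td>5 entries were imported.</td></tr>';
const re = /<td[^>]*>(.*?)<\/td>/g;

const outer = [];
const inner = [];
for (const m of html.matchAll(re)) {
  outer.push(m[0]); // whole match, including the td tags
  inner.push(m[1]); // capture group 1, only what is between the tags
}

console.log(outer);
// [ '<td align="right"><b>Import job summary</b></td>', '<td>5 entries were imported.</td>' ]
console.log(inner);
// [ '<b>Import job summary</b>', '5 entries were imported.' ]
```

So the regex itself does not need to change; only the part of each result that gets read does.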
Regards