
How to get content between <td></td> only if no nested tags are present?

I realize there are loads of questions on getting something between something, even specifically HTML tags. But my requirement differs because I want to ignore <td></td> content if nested tags are present. If there's still a duplicate, flag this and point me to that one.

Sample input: <td><p>column1</p></td><td>column2</td>
Expected output: column2 (awesome!) or >column2<

As per this question, I tried <td>(.*?)<\/td> and got 2 matches:

<td><p>column1</p></td>
<td>column2</td>

As per the marked answer, I tried >[^<]*< and got this:

[screenshot of the regex matches; image not available]

That's close. I am OK with getting > and <, but I want the regex to ignore the 1st <td> because it has a <p> nested inside it.

Assumption: <p> will always be the innermost tag in case of nesting. If the input is <td><p>column1</p>postfix</td>, ignore such a <td>.

You should not use a regular expression to parse HTML, as HTML is not a regular language. It's too complex to be parsed by regular expressions.
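As a rough illustration of that fragility, here is a minimal sketch. The input variants are made up for this example and do not come from the question; each one is a cell with no nested element, yet a pattern tuned to the sample input matches only the first.

```javascript
// Pattern tuned to the sample input: a literal <td>, tag-free text, a literal </td>.
var pattern = /<td>([^<]*)<\/td>/i;

// Hypothetical inputs (assumptions for illustration only).
var variants = [
  '<td>column2</td>',              // plain cell
  '<td class="x">column2</td>',    // attribute on the opening tag
  '<td>column2</td >',             // whitespace before '>' in the closing tag
  '<td><!-- note -->column2</td>'  // a comment is not an element, but it contains '<'
];

// No g flag, so test() keeps no state between calls.
var results = variants.map(function (html) {
  return pattern.test(html);
});

console.log(results); // [ true, false, false, false ]
```

Each miss can be patched with a more elaborate pattern, but every patch adds another case to maintain, which is the practical argument for using a parser.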

What you can do is use the browser's built-in parser instead, then use DOM methods to get what you want.

    var s = '<td><p>column1</p></td><td>column2</td>';
    var content = [];

    // Create a row to insert the markup into
    var tr = document.createElement('tr');
    tr.innerHTML = s;

    // Get the cells
    var tds = tr.cells;

    // If a cell doesn't have any element content, put its
    // textContent into the array
    for (var i = 0, iLen = tds.length; i < iLen; i++) {
      if (tds[i].children.length == 0) {
        content.push(tds[i].textContent);
      }
    }

    console.log(content);

    var html = '<td><p>column1</p></td><td>column2</td>';
    var regex = /<td>([^<]*)<\/td>/ig;
    var result = regex.exec(html);
    console.info(result);
    console.info(result[1]);

You can try this; result[1] is what you want. If you want to replace the content between the tags, you can also write it like this:

    var html = '<td><p>column1</p></td><td>column2</td>';
    var regex = /<td>([^<]*)<\/td>/ig;
    var newHtml = html.replace(regex, function () {
      return '<td>' + 'replacement' + '</td>';
    });
    console.info(newHtml);
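A single exec call returns only the first match, so input with several tag-free cells needs a loop. The following is a minimal sketch of that idea, not part of the original answer: textOnlyCells is a hypothetical helper name, the third cell in the sample is added for illustration, and the code assumes cells are written exactly as <td>...</td> with no attributes.

```javascript
// Hypothetical helper: returns the text of every <td> whose content
// contains no '<', i.e. no nested tag.
// Assumption: cells are written exactly as <td>...</td>, without attributes.
function textOnlyCells(html) {
  var regex = /<td>([^<]*)<\/td>/ig;
  var cells = [];
  var match;
  // With the g flag, each exec call resumes from regex.lastIndex,
  // so the loop walks through all matches and stops on null.
  while ((match = regex.exec(html)) !== null) {
    cells.push(match[1]);
  }
  return cells;
}

var sample = '<td><p>column1</p></td><td>column2</td><td>column3</td>';
console.log(textOnlyCells(sample)); // [ 'column2', 'column3' ]

// The "postfix" case from the question is skipped as well.
console.log(textOnlyCells('<td><p>column1</p>postfix</td>')); // []
```

Because [^<]* cannot cross a '<', any cell that contains a tag anywhere in its content fails to match, which covers both the nested <p> case and the postfix case.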
