
Select single word between any HTML tag?

I am looking to build a regular expression that will select a single word out of all the text between HTML tags. I want to find the word anywhere except inside the tags themselves. The issue is that the word I want to match may also occur in the class or id of a tag, and I only want to match it when it is between the tags.

Here is further clarification from my comment: I am looking for a regex to use in a loop that will find a string in another string that contains HTML. The large string will contain something like this:

<div class="a-class"><span class="some-class" data-content="some words containing target">some other text containing target</span>

I want the regex to match the word "target" only between the tags, not within the tag in the data-content attribute. I can use:

/(\btarget)\b/ig

to find every instance of target.

If the word can be present anywhere, i.e. even as a class name or id, then here is what you can do.

Take <html> as the parent element and access all the content within it using innerHTML. You can then find any word as follows:

<html id="main">
    <div>
        <p class="yourword">
        </p>
    </div>
</html>

var str = document.getElementById("main").innerHTML;
var res = str.match(/yourword/gi);
alert(res);

The pattern above matches every occurrence of "yourword" in the document's markup, including class names and ids.
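For the narrower case in the question, where the word should match only between tags, one possible sketch uses a negative lookahead that rejects any match whose next angle bracket is ">". The helper name findOutsideTags is hypothetical, and the approach assumes attribute values never contain a raw "<" or ">"; it is an illustration, not a full HTML parser.

```javascript
// Hypothetical helper: finds `word` only where it is not inside a tag.
// Assumption: attribute values never contain a raw "<" or ">".
function findOutsideTags(html, word) {
  // Escape regex metacharacters so the word is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // The negative lookahead rejects a match when the next angle bracket
  // after it is ">", which means the match sits inside a tag.
  var pattern = new RegExp("\\b" + escaped + "\\b(?![^<>]*>)", "gi");
  return html.match(pattern) || [];
}

var sample = '<div class="a-class"><span class="some-class" ' +
  'data-content="some words containing target">' +
  'some other text containing target</span></div>';

var found = findOutsideTags(sample, "target");
console.log(found.length);
```

The occurrence inside data-content is skipped because it is followed by `">`, while the occurrence in the span's text is followed by "<" and is therefore kept.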

Here is a demo which selects the string "sub".

http://jsfiddle.net/techsin/xt1j2cj8/3/

Here is one way to do it.

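var cont = $(".cont"),
    html = cont.html(),
    word = "Lorem";

// Allow tags to sit on either side of whitespace inside a multi-word phrase.
word = word.replace(/(\s+)/g, "(<[^>]+>)*$1(<[^>]+>)*");

var pattern = new RegExp("(" + word + ")", "gi");

// Wrap every match in <mark>.
html = html.replace(pattern, "<mark>$1</mark>");
// If a match spans tags, close and reopen <mark> around them so the markup stays valid.
html = html.replace(/(<mark>[^<>]*)((<[^>]+>)+)([^<>]*<\/mark>)/g, "$1</mark>$2<mark>$4");

cont.html(html);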
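A variant that leaves attributes untouched can be sketched by splitting the markup on its tags and replacing only in the text pieces. The helper name highlightText is hypothetical, and the sketch assumes well-formed markup whose attribute values contain no raw "<" or ">".

```javascript
// Hypothetical helper: wraps `word` in <mark> only in text between tags.
// Assumption: attribute values contain no raw "<" or ">".
function highlightText(html, word) {
  // Escape regex metacharacters so the word is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var pattern = new RegExp("\\b(" + escaped + ")\\b", "gi");
  // Splitting on a capturing group keeps the tags in the result array:
  // even indexes hold text, odd indexes hold tags.
  return html
    .split(/(<[^>]*>)/)
    .map(function (part, i) {
      return i % 2 === 1 ? part : part.replace(pattern, "<mark>$1</mark>");
    })
    .join("");
}

var input = '<span data-content="target here">some target text</span>';
var output = highlightText(input, "target");
console.log(output);
```

With jQuery on the page, the result could be written back with something like `$(".cont").html(highlightText($(".cont").html(), "Lorem"))`.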
