Get an element by ID with a regular expression

Question

I have a quick question regarding regular expressions...

I have a string that looks something like the following:

"This was written by <p id="auth">John Doe</p> today!"

What I want to do (with JavaScript) is basically extract 'John Doe' from any tag with the ID of "auth".

Could anyone shed some light? I'm sorry to ask.

Full story: I am using an XML parser to pass data from a feed into variables. However, there is one tag in the XML document that contains HTML passed in as a string. It looks something like this:

 <item>
  <title>This is a title</title>
  <description>
  "By <p id="auth">John Doe</p> text text text... so on"
  </description>
 </item>

So as you can see, I can't use an HTML/XML parser for that p tag, because it's in a string, not a document.

Answer 1

There's no need for regular expressions here. Use the DOM instead.

var obj = document.getElementById('auth');
if (obj)
{
    alert(obj.innerHTML);
}

By the way, having more than one element with the same id on a page is invalid, and will likely result in odd JavaScript behavior.

If you want several auth elements on the same page, use class instead of id. Then you can use something like:

// IIRC getElementsByClassName is new in FF3; you might consider using jQuery
// to do this in a more "portable" way, but you get the idea...
var objs = document.getElementsByClassName('auth');
if (objs)
{
    for (var i = 0; i < objs.length; i++)
        alert(objs[i].innerHTML);
}

EDIT: Since you want to parse a string that contains some HTML, you won't be able to use my answer as-is. Will your HTML string contain a whole HTML document, or only part of one? Will it be valid HTML, or partial (broken) HTML?

Answer 2

Here's a way to get the browser to do the HTML parsing for you:

var string = "This was written by <p id=\"auth\">John Doe</p> today!";

var div = document.createElement("div");

div.innerHTML = string; // get the browser to parse the html

var children = div.getElementsByTagName("*");

for (var i = 0; i < children.length; i++)
{
    if (children[i].id == "auth")
    {
        alert(children[i].textContent);
    }
}

If you use a library like jQuery, you could hide the for loop and replace textContent with something cross-browser.

Answer 3

Perhaps something like

document.getElementById("auth").innerHTML.replace(/<[^>]+>/g, '')

might work. innerHTML is supported in all modern browsers. (You may omit the replace if you don't care about removing HTML tags from the inner content.)

If you have jQuery at your disposal, just do

$("#auth").text()
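The replace step in this answer can also be tried on its own against a plain string, which is closer to the asker's situation where the HTML is not part of the page. The sketch below assumes the markup is simple enough that no attribute value contains a ">" character; stripTags is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: remove anything that looks like an HTML tag.
// Assumes no ">" appears inside an attribute value.
function stripTags(html) {
  return html.replace(/<[^>]+>/g, '');
}

var inner = 'John <b>Doe</b>';
console.log(stripTags(inner)); // prints "John Doe"
```

This only strips tags. It does not decode entities such as &amp;amp;, so it is a rough approximation of what textContent or jQuery's text() would return.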

Answer 4

What I want to do (with JavaScript) is basically extract 'John Doe' from any tag with the ID of "auth".

You can't use the same id (auth) for more than one element. An id must be unique within a page.

If, however, you assign a class of auth to the elements, you can do something like this, assuming we are dealing with paragraph elements:

// find all paragraphs
var elms = document.getElementsByTagName('p');

for(var i = 0; i < elms.length; i++)
{
  // find elements with class auth
  if (elms[i].getAttribute('class') === 'auth') {
    var el = elms[i];

    // see if any paragraph contains the string
    if (el.innerHTML.indexOf('John Doe') != -1) {
      alert('Found ' + el.innerHTML);
    }
  }
}
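One caveat with the strict comparison above: it only matches when auth is the element's sole class. A minimal sketch of a token-based check follows, assuming class names are separated by whitespace; hasClass is a hypothetical helper name introduced here for illustration.

```javascript
// Hypothetical helper: true when classAttr (the raw value of a class
// attribute, or null if the attribute is absent) contains `name` as one
// of its whitespace-separated tokens.
function hasClass(classAttr, name) {
  if (!classAttr) return false;
  return classAttr.split(/\s+/).indexOf(name) !== -1;
}

console.log(hasClass('note auth', 'auth')); // prints true
console.log(hasClass('author', 'auth'));    // prints false
```

In the loop above, the condition could then read hasClass(elms[i].getAttribute('class'), 'auth').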

Answer 5

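If the content of the tag contains only text, you could use this:

// Returns the text of the first element whose id attribute equals id.
// Throws if nothing matches, because exec() returns null in that case.
function getText(htmlStr, id) {
  return new RegExp("<[^>]+\\sid\\s*=\\s*([\"'])"
    + id
    + "\\1[^>]*>([^<]*)<"
  ).exec(htmlStr)[2];
}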


var htmlStr = "This was written by <p id=\"auth\">John Doe</p> today!";
var id = "auth";
var text = getText (htmlStr, id);
alert (text === "John Doe");
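A slightly more defensive variant of the same pattern might look like the sketch below. It escapes the id before building the expression and returns null when nothing matches instead of throwing. The names escapeRegExp and getTextSafe are hypothetical and not part of the original answer.

```javascript
// Escape characters that have special meaning in a regular expression,
// so an id such as "a.b" is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical variant of getText: same pattern, but returns null
// when no element with the given id is found.
function getTextSafe(htmlStr, id) {
  var pattern = new RegExp(
    "<[^>]+\\sid\\s*=\\s*([\"'])" + escapeRegExp(id) + "\\1[^>]*>([^<]*)<"
  );
  var match = pattern.exec(htmlStr);
  return match ? match[2] : null;
}

var sample = "This was written by <p id=\"auth\">John Doe</p> today!";
console.log(getTextSafe(sample, "auth"));    // prints "John Doe"
console.log(getTextSafe(sample, "missing")); // prints null
```

Like the original, this still assumes the element contains only text and no nested tags.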

Answer 6

Assuming you only have one auth per string, you might go with something like this:

var str = "This was written by <p id=\"auth\">John Doe</p> today!",
    p = str.split('<p id="auth">'),
    q = p[1].split('</p>'),
    a = q[0];
alert(a);

Simple enough. Split your string on the paragraph's opening tag, then split the second part on the closing tag; the first piece of that result is your value. Every time.
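The split version throws if the opening tag is missing, because p[1] is then undefined. As a sketch of the same idea using indexOf and slice instead of split, the hypothetical helper below returns null when either marker is absent. It still assumes the opening tag appears exactly as written, with the same attribute order and quoting.

```javascript
// Hypothetical helper: text between the first `open` marker and the next
// `close` marker after it, or null when either marker is missing.
function between(str, open, close) {
  var start = str.indexOf(open);
  if (start === -1) return null;
  start += open.length;
  var end = str.indexOf(close, start);
  return end === -1 ? null : str.slice(start, end);
}

var str = "This was written by <p id=\"auth\">John Doe</p> today!";
console.log(between(str, '<p id="auth">', '</p>')); // prints "John Doe"
```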

6 solutions

Solution 1
2 2010-08-04 19:52:25

Solution 2
2 accepted 2010-08-04 20:26:02

Solution 3
0 2010-08-04 19:51:18

Solution 4
0 2010-08-04 19:55:59

Solution 5
0 2010-08-04 20:16:25

Solution 6
0 2010-08-04 20:28:27
