
Extracting HTML string within XML tag with jQuery

I've been working at this for a week and I'm stumped.

I'm trying to parse an RSS feed from SharePoint using jQuery. Using $.find works great for extracting the data between valid XML tags in the feed, but unfortunately one of the tags stores several HTML tags instead of a nice, clean string like the others.

I have the tag extracted and stored as a string using the following:

$(xml).find("item").each(function () {
    var description = $(this).find('description').text();
});

Which gives me the contents of the description tag:

<![CDATA[<div><b>Title:</b> Welcome!</div>
<div><b>Modified:</b> 6/10/2014 7:58 AM</div>
<div><b>Created:</b> 6/3/2014 2:55 PM</div>
<div><b>Created By:</b> John Smith</div>
<div><b>Modified By:</b> Samuel Smith</div>
<div><b>Version:</b> 1.0</div>
<div><b>AlertContent:</b> Stop the presses.</div>
<div><b>Team:</b> USA.</div>]]>

Now my problem is extracting and storing the useful bits. Is there a way to extract only the text following AlertContent:</b>? It seems this might be possible using regular expressions, but I don't know how to make a filter that would start at the end of the bold tag and extend all the way to the start of the closing div tag. Or is there a better way through jQuery's methods?

Sure, you're quite right; regular expressions can help you do that. Here is how you can do it:

var alertContent = description.replace(/^[\s\S]*AlertContent:<\/b>([^<]*)[\s\S]*$/i, '$1');
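
As a quick check of this one-liner approach, here is a minimal sketch that runs it against a sample modeled on the description from the question. The `sample` string and the `extractAlert` name are made up for illustration. Two details are worth noting: in JavaScript `.` does not match line breaks, so the sketch uses `[\s\S]` to span the multi-line input, and `replace` returns the whole input unchanged when nothing matches, so the sketch tests for a match first.

```javascript
// Hypothetical sample modeled on the description text from the question.
var sample = [
  '<div><b>Version:</b> 1.0</div>',
  '<div><b>AlertContent:</b> Stop the presses.</div>',
  '<div><b>Team:</b> USA.</div>'
].join('\n');

// extractAlert is an illustrative helper name, not part of jQuery.
function extractAlert(description) {
  // [\s\S] matches any character including line breaks; "." would not.
  var pattern = /^[\s\S]*AlertContent:<\/b>([^<]*)[\s\S]*$/i;
  // replace() returns the input untouched when there is no match,
  // so check first and return null in that case.
  if (!pattern.test(description)) {
    return null;
  }
  return description.replace(pattern, '$1').trim();
}

console.log(extractAlert(sample));
console.log(extractAlert('<div><b>Team:</b> USA.</div>'));
```

The `trim()` call is an optional extra that drops the space left after the closing bold tag.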


I'm sure you've heard the warnings about parsing XML with regex. Nevertheless, in case you'd like to know how to do it with regex, this simple pattern will do it:

AlertContent:<\/b>([^<]*)
  • We start by matching AlertContent:</b>
  • Then the negated character class [^<]* matches all characters that are not a < and the parentheses capture them to Group 1

All we need to do is read Group 1. Here is sample code to do it:

var regex = /AlertContent:<\/b>([^<]*)/;
var match = regex.exec(description);
if (match !== null) {
    // Group 1 holds the text between </b> and the next tag
    var alertContent = match[1].trim();
}
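
If more than one field is needed, the same idea extends to the whole description. The following is a rough sketch, not a robust HTML parser: it assumes every field follows the `<b>Label:</b> value` shape seen in the question and that values contain no nested tags. The `parseFields` name and the `sample` string are hypothetical.

```javascript
// parseFields is a hypothetical helper; it assumes each field looks like
// <b>Label:</b> value, with no nested tags inside the value.
function parseFields(description) {
  var fieldPattern = /<b>([^<:]+):<\/b>([^<]*)/g;
  var fields = {};
  var match;
  // With the g flag, exec() resumes after the previous match on each call.
  while ((match = fieldPattern.exec(description)) !== null) {
    fields[match[1].trim()] = match[2].trim();
  }
  return fields;
}

// Sample modeled on the description text from the question.
var sample =
  '<div><b>Title:</b> Welcome!</div>\n' +
  '<div><b>Modified:</b> 6/10/2014 7:58 AM</div>\n' +
  '<div><b>AlertContent:</b> Stop the presses.</div>';

var fields = parseFields(sample);
console.log(fields.AlertContent);
console.log(fields.Modified);
```

Collecting everything into one object means the regex runs once per item, and any field can then be read by its label.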
