JavaScript: regex with variables to match part of a string containing HTML code
I'm trying to match a regex (containing one variable) against a page of HTML code stored as a string.
The HTML has been split on a certain tag into an array, and each element contains something like the markup shown below. Each element holds the data for one house (name, area in square meters, etc.). Fictional, of course.
I need to match only one of these houses by the text between the first TD tags, and the part I need is the VALUE (digits) of the houseid INPUT tag in the form.
<TR BGCOLOR=#D4C0A1>
<TD WIDTH=40%><NOBR>Luminous Arc 2</NOBR></TD>
<TD WIDTH=10%><NOBR>154 sqm</NOBR></TD>
<TD WIDTH=10%><NOBR>6460 gold</NOBR></TD>
<TD WIDTH=40%><NOBR>rented</NOBR></TD>
<TD><TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
<FORM ACTION= METHOD=post><TR><TD>
<INPUT TYPE=hidden NAME=world VALUE=Olympa>
<INPUT TYPE=hidden NAME=town VALUE="Yalahar">
<INPUT TYPE=hidden NAME=state VALUE=>
<INPUT TYPE=hidden NAME=type VALUE=houses>
<INPUT TYPE=hidden NAME=order VALUE=>
<INPUT TYPE=hidden NAME=houseid VALUE=37010>
<INPUT TYPE=image NAME="View" ALT="View" SRC="" BORDER=0 WIDTH=120 HEIGHT=18>
</TD></TR></FORM></TABLE></TD></TR>
I constructed the following RegEx:
var regex = new RegExp(house + "[\\s\\S]+name=houseid value=([0-9]+)>", "i");
where the variable house holds the name of the house (in this example, Luminous Arc 2) and the part I need is the houseid, 37010.
I figured this regex should work and give me the match I need, but houses[i].match(regex) returns null every time: there is no match in the string.
I have tried several approaches so far, including converting the string to a DOM object so that I could split on the TR tags (the conversion failed). I feel that I am close, but I am stuck.
Does anyone see why my regex might fail to work?
Kenneth
You could add the string to your HTML (in a display:none div or something like that), and then just access the DOM like you would anywhere else.
For example:
<div id="stringContainer"></div>
var searchstring = "Luminous Arc 2";
searchstring = searchstring.replace(/ /g, '&nbsp;'); // Convert spaces to &nbsp;, assuming the names in the markup use &#160; for spaces.
var c = document.getElementById("stringContainer");
c.innerHTML = '<table>' + houses + '</table>';
var h = c.getElementsByTagName('tr');
for (var i = 0, l = h.length; i < l; i++) { // Loop through the found rows.
    var name = h[i].firstChild.nextSibling.getElementsByTagName('nobr')[0]; // Get the house's name.
    if (name && name.innerHTML == searchstring) { // If the name matches the search string. (innerHTML returns &nbsp; instead of &#160;, hence the replace earlier.)
        console.log(h[i].getElementsByTagName('input')[5].value); // Log the value of the sixth input, houseid.
    }
}
Assuming the variable houses is:
var houses = '<TR BGCOLOR=#D4C0A1>\n\
<TD WIDTH=40%><NOBR>Luminous Arc 2</NOBR></TD>\n\
<TD WIDTH=10%><NOBR>154 sqm</NOBR></TD>\n\
<TD WIDTH=10%><NOBR>6460 gold</NOBR></TD>\n\
<TD WIDTH=40%><NOBR>rented</NOBR></TD>\n\
<TD>\n\
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>\n\
<FORM ACTION= METHOD=post>\n\
<TR>\n\
<TD>\n\
<INPUT TYPE=hidden NAME=world VALUE=Olympa>\n\
<INPUT TYPE=hidden NAME=town VALUE="Yalahar">\n\
<INPUT TYPE=hidden NAME=state VALUE=>\n\
<INPUT TYPE=hidden NAME=type VALUE=houses>\n\
<INPUT TYPE=hidden NAME=order VALUE=>\n\
<INPUT TYPE=hidden NAME=houseid VALUE=37010>\n\
<INPUT TYPE=image NAME="View" ALT="View" SRC="" BORDER=0 WIDTH=120 HEIGHT=18>\n\
</TD>\n\
</TR>\n\
</FORM>\n\
</TABLE>\n\
</TD>\n\
</TR>\n\
<TR BGCOLOR=#D4C0A1>\n\
<TD WIDTH=40%><NOBR>Dark Arc 2</NOBR></TD>\n\
<TD WIDTH=10%><NOBR>154 sqm</NOBR></TD>\n\
<TD WIDTH=10%><NOBR>6460 gold</NOBR></TD>\n\
<TD WIDTH=40%><NOBR>rented</NOBR></TD>\n\
<TD>\n\
<TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>\n\
<FORM ACTION= METHOD=post>\n\
<TR>\n\
<TD>\n\
<INPUT TYPE=hidden NAME=world VALUE=Olympa>\n\
<INPUT TYPE=hidden NAME=town VALUE="Yalahar">\n\
<INPUT TYPE=hidden NAME=state VALUE=>\n\
<INPUT TYPE=hidden NAME=type VALUE=houses>\n\
<INPUT TYPE=hidden NAME=order VALUE=>\n\
<INPUT TYPE=hidden NAME=houseid VALUE=37010>\n\
<INPUT TYPE=image NAME="View" ALT="View" SRC="" BORDER=0 WIDTH=120 HEIGHT=18>\n\
</TD>\n\
</TR>\n\
</FORM>\n\
</TABLE>\n\
</TD>\n\
</TR>';
I tried your regex with Cerbrus's houses variable and it works fine. (I added the lazy quantifier ? to [\\s\\S]+, but it works fine without it as well.)
var house = "Luminous Arc 2";
var regex = new RegExp( house + "[\\s\\S]+?name=houseid value=([0-9]+)>", "i" );
houses.match( regex )[1]; // "37010"
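The lazy quantifier only makes a visible difference when the searched string holds more than one row and the rows have different ids. A minimal sketch of that case, assuming a shortened two-row string (twoRows and the id 37011 are made up for illustration and are not part of the real page data):

```javascript
// Hypothetical, shortened data: two rows with different house ids.
var twoRows =
  "<TD><NOBR>Luminous Arc 2</NOBR></TD>\n" +
  "<INPUT TYPE=hidden NAME=houseid VALUE=37010>\n" +
  "<TD><NOBR>Dark Arc 2</NOBR></TD>\n" +
  "<INPUT TYPE=hidden NAME=houseid VALUE=37011>\n";

var house = "Luminous Arc 2";

// Greedy: [\s\S]+ runs to the end of the string and backtracks to the LAST houseid.
var greedy = new RegExp(house + "[\\s\\S]+name=houseid value=([0-9]+)>", "i");
// Lazy: [\s\S]+? stops at the FIRST houseid that follows the house name.
var lazy = new RegExp(house + "[\\s\\S]+?name=houseid value=([0-9]+)>", "i");

var greedyId = twoRows.match(greedy)[1];
var lazyId = twoRows.match(lazy)[1];

console.log(greedyId); // id taken from the last row in the string
console.log(lazyId);   // id taken from the row that follows the house name
```

When each array element holds a single row, both forms give the same result, which would explain why the pattern also works without the ?.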
Presumably, then, either your house variable has the wrong value or houses[i] is not accessing the right string.
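Two things that can put the wrong value into such a pattern are regex metacharacters in the name and spaces that are really non-breaking spaces (&#160; or &nbsp;) in the HTML. The sketch below is one way to rule both out while checking every array element. The helper names escapeRegExp, normalizeSpaces and findHouseId are made up for illustration, and the sample array is a shortened stand-in for the real data, not the actual page.

```javascript
// Hypothetical helper: escape regex metacharacters so the name is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: turn non-breaking spaces (entity or character) into plain spaces.
function normalizeSpaces(html) {
  return html.replace(/&nbsp;|&#160;|\u00A0/g, " ");
}

// Return the houseid for the named house, or null if no array element matches.
function findHouseId(houses, house) {
  var regex = new RegExp(
    escapeRegExp(house) + "[\\s\\S]+?name=houseid value=([0-9]+)>", "i");
  for (var i = 0; i < houses.length; i++) {
    var m = normalizeSpaces(houses[i]).match(regex);
    if (m) {
      return m[1];
    }
  }
  return null;
}

// Shortened stand-in data; the first name uses &#160; instead of plain spaces.
var sample = [
  "<TD><NOBR>Luminous&#160;Arc&#160;2</NOBR></TD><INPUT TYPE=hidden NAME=houseid VALUE=37010>",
  "<TD><NOBR>Dark Arc 2</NOBR></TD><INPUT TYPE=hidden NAME=houseid VALUE=37011>"
];

console.log(findHouseId(sample, "Luminous Arc 2"));
console.log(findHouseId(sample, "No Such House"));
```

If the lookup succeeds only after normalizeSpaces is applied, the spaces in the markup were the cause; if it still returns null, logging houses[i] for each element is the next thing to check.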