Regex does not return the captured content but the whole matched string
I'm using this regex to match the "href" attribute in an <a> tag:
var href_matches = postRep.match(/href="(.*?)"/g);
The regex matches the href correctly, except that it returns the whole "href=http:example.com" string.
How do I get only the href value (e.g. "example.com")?
You can either run exec() on the regex:
var url_match = /href="(.*?)"/g.exec(postRep);
or remove the global flag:
var url_match = postRep.match(/href="(.*?)"/);
String's match() function does not return captured groups when the global modifier is set. In both versions above, the captured href value is in url_match[1].
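To make the difference concrete, here is a small sketch. The sample value of postRep is made up for illustration and is not taken from the question.

```javascript
// Hypothetical sample input, for illustration only.
const postRep = '<a href="http://example.com">one</a> <a href="http://example.org">two</a>';

// With the global flag, match() returns every full match and drops the groups.
const withGlobal = postRep.match(/href="(.*?)"/g);
console.log(withGlobal); // [ 'href="http://example.com"', 'href="http://example.org"' ]

// Without the global flag, match() returns the first match plus its groups.
const withoutGlobal = postRep.match(/href="(.*?)"/);
console.log(withoutGlobal[1]); // http://example.com

// exec() exposes the groups as well, even on a global regex.
const viaExec = /href="(.*?)"/g.exec(postRep);
console.log(viaExec[1]); // http://example.com
```

Index 0 of the non-global result still holds the full match; the capture groups start at index 1.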
Just another idea.
You can try something like this function:
function getHrefs(inputString) {
  var out = [];
  // replace() is used only to visit every match; its return value is discarded.
  inputString.replace(/\bhref\b=['"]([^'"]+)['"]/gi, function(result, backreference) {
    out.push(backreference);
    return '';
  });
  return out;
}
Improved solution (much shorter):
function getHrefs(inputString) {
  // The "i" flag on the second regex keeps it consistent with the case-insensitive match.
  return (inputString.match(/\bhref\b=['"][^'"]+(?=['"])/gi) || []).map(s => s.replace(/^href=["']/i, ""));
}
Edit:
Another option is exec(). With exec(), however, you need a loop to get all matches (if you need them all).
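A minimal sketch of such a loop follows. The function name getHrefsWithExec and the sample input are hypothetical, chosen only for this illustration.

```javascript
// Collect every href value by calling exec() repeatedly on a global regex.
function getHrefsWithExec(inputString) {
  const pattern = /\bhref=['"]([^'"]+)['"]/gi;
  const out = [];
  let match;
  // exec() advances pattern.lastIndex on each call and returns null when done.
  while ((match = pattern.exec(inputString)) !== null) {
    out.push(match[1]); // group 1 holds the value between the quotes
  }
  return out;
}

console.log(getHrefsWithExec('<a href="a.html">A</a> <a HREF=\'b.html\'>B</a>'));
// [ 'a.html', 'b.html' ]
```

The regex must carry the global flag here; without it, lastIndex never advances and the loop would keep returning the first match. In newer runtimes, String.prototype.matchAll (ES2020) gives the same result without an explicit loop.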
You can use a regex lookbehind to check that "href=" is there without actually including it in the match. For example, the regex (?<=href=)example\.com applied to href=example.com should only match example.com.
EDIT: This method only works in languages that support regex lookbehinds. JavaScript did not support this feature when this answer was written; lookbehind assertions were added in ES2018, so it works only in engines that implement them. (Thanks to Georgi Naumov for pointing this out.)
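For runtimes that do implement lookbehind assertions (ES2018 or later, which is an assumption about your environment), the idea can be sketched like this. The variable names are illustrative only.

```javascript
// Assumes a JavaScript engine with lookbehind support (ES2018+);
// older engines throw a SyntaxError on this regex literal.
const lookbehindInput = 'href=example.com';
const lookbehindMatch = lookbehindInput.match(/(?<=href=)example\.com/);

console.log(lookbehindMatch[0]);    // example.com
console.log(lookbehindMatch.index); // 5
```

The match starts right after "href=", so the prefix is checked but not returned.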