Regex php find a character within html tag
I'm stuck on a stubborn problem I can't seem to solve.
I'm trying to find a specific character only when it is inside an HTML tag itself (not in the text between tags).
To test this I have 2 test strings:
this is <a href="www.somesite.com">sentence</a>
I'd like to find all the period characters within the < > of HTML tags, so the match should be the 2 periods in www.somesite.com, but I cannot get the match right. Can someone please take a look at my regex and see what I am missing?
(<[^>]*>?(\.))>?
Try this:
// The part before | matches the text between tags and discards it via (*SKIP)(*F);
// the part after | then finds searchText only where it is left, i.e. inside a tag.
$re = "/>[^<]*<(*SKIP)(*F)|searchText/mi";
$str = "<div><a href=\"www.searchText.com\">This is <a href=\"www.searchText.com\">sentence</a> I want to test.</a></div>";
preg_match_all($re, $str, $matches);
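To apply the same idea to the periods in the question, the pattern needs a small change, because text before the first tag or after the last one is not enclosed by > and <. The following is only a sketch under that assumption: the ^ and $ alternatives and the variable names are additions for illustration, not part of the answer above.

```php
<?php
// Hypothetical adaptation of the (*SKIP)(*F) pattern for the question's input.
$subject = 'This is <a href="www.somesite.com">sentence</a> I want to test.';

// Left of |: a run of text outside any tag (from start-of-string or ">" up to
// the next "<" or end-of-string) is consumed and then discarded.
// Right of |: a literal period, which can now only be found inside a tag.
$pattern = '/(?:^|>)[^<]*(?:<|$)(*SKIP)(*F)|\./';

$count = preg_match_all($pattern, $subject, $matches);

echo $count, "\n";
```

On this input the only periods left to match are the two in www.somesite.com. Like any regex over HTML, it assumes that < and > never appear inside attribute values.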
Given the string:
This is <a href="www.somesite.com">sentence</a> I want to test.
the regex:
\.(?=\w)
will match the periods in the URL but not the one at the end of the sentence. Note that the regex is not URL-specific: it just finds a period followed immediately by a word character, using a positive lookahead.
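A quick way to see what "not URL-specific" means in practice is to run the pattern on a second input. The second string below is a made-up example, not one from the question; it has a period followed by a word character in plain text.

```php
<?php
$pattern = '/\.(?=\w)/';

// The string from the question: periods followed by a word character occur only in the URL.
$original = 'This is <a href="www.somesite.com">sentence</a> I want to test.';
$inTagOnly = preg_match_all($pattern, $original);

// Hypothetical input: "1.2" sits outside any tag but still satisfies the lookahead.
$versioned = 'Version 1.2 is <a href="www.somesite.com">here</a>.';
$withDecimal = preg_match_all($pattern, $versioned);

echo $inTagOnly, "\n";
echo $withDecimal, "\n";
```

The first count covers only the URL's periods, while the second also picks up the period in "1.2", so this pattern suits inputs where such periods cannot occur outside tags.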
Having said that, you should really be parsing HTML with something like PHP's DOMDocument.
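As a rough sketch of what that could look like, the snippet below loads the string with DOMDocument and counts periods in attribute values. It assumes the DOM extension is available and that "inside a tag" effectively means "inside an attribute value"; the variable names are illustrative.

```php
<?php
$html = 'This is <a href="www.somesite.com">sentence</a> I want to test.';

$doc = new DOMDocument();
// Suppress libxml notices about the input being a fragment rather than a full page.
$doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

$periods = 0;
// Walk every element and add up the periods found in its attribute values.
foreach ($doc->getElementsByTagName('*') as $element) {
    foreach ($element->attributes as $attribute) {
        $periods += substr_count($attribute->value, '.');
    }
}

echo $periods, "\n";
```

Because the parser separates markup from text, the period at the end of the sentence is never considered.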