
Extract text between HTML tags and count them

So let's say I have written an article with many tags like [code]this is a code[/code], and I would like to know how many code tags are in the article and what the text inside each one is.

I tried preg_match and preg_replace, but nothing has worked so far. What would be the appropriate way to do it?

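$pattern = '/\[code\](.*?)\[\/code\]/s';

// $code holds the article text
preg_match_all($pattern, $code, $matches);

// $matches[0] holds the full matches, $matches[1] the text inside the tags,
// so count($matches) would always be 2; count one of the inner arrays instead
echo count($matches[1])."\n";

var_dump($matches);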
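
To make the shape of the result concrete, here is a small self-contained sketch of the same approach. The sample text and the variable name $article are made up for illustration; the pattern is the one from the answer above.

```php
<?php
// Hypothetical sample input; any string containing [code]...[/code] pairs works.
$article = "Intro [code]echo 1;[/code] middle [code]echo 2;\necho 3;[/code] end";

// Non-greedy (.*?) stops at the nearest [/code]; the s flag lets . match newlines.
$pattern = '/\[code\](.*?)\[\/code\]/s';

// preg_match_all returns the number of full matches (or false on error).
$count = preg_match_all($pattern, $article, $matches);

// $matches[0] holds the full matches, $matches[1] the text inside the tags.
echo "Found $count code blocks\n";
foreach ($matches[1] as $i => $inner) {
    echo ($i + 1) . ": " . $inner . "\n";
}
```

The return value of preg_match_all is already the number of matches, so a separate count() call is optional.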

This pattern should be interesting for you:

/\[code\]([^]]+)\[\/code\]/ 

You need to use preg_match_all to get all the values. BTW, there is a flaw with nested input like this:

[code]blabla [code]bleh bleh[/code][/code]

A regex like this cannot parse multiple levels of nesting, at least when the depth is unknown.
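
If nested tags do need to be handled, one possible workaround (not something proposed in this thread) is to use the regex only to locate the tags and track the nesting with a stack. The sketch below assumes PHP 7.1 or later; the function name extractCodeBlocks and the 'depth'/'text' keys are hypothetical, and unbalanced closing tags are simply ignored.

```php
<?php
// Hypothetical helper: returns every [code] block, including nested ones,
// as arrays with 'depth' and 'text' keys. Stray closing tags are skipped.
function extractCodeBlocks(string $text): array
{
    $blocks = [];
    $stack = [];   // byte offsets where the content of each open block starts

    // Find every opening or closing tag together with its byte offset.
    preg_match_all('/\[\/?code\]/', $text, $tags, PREG_OFFSET_CAPTURE);

    foreach ($tags[0] as [$tag, $offset]) {
        if ($tag === '[code]') {
            // Content begins right after the opening tag.
            $stack[] = $offset + strlen($tag);
        } elseif ($stack) {
            // A closing tag ends the most recently opened block.
            $start = array_pop($stack);
            $blocks[] = [
                'depth' => count($stack) + 1,
                'text'  => substr($text, $start, $offset - $start),
            ];
        }
    }
    return $blocks;
}

$blocks = extractCodeBlocks('[code]blabla [code]bleh bleh[/code][/code]');
echo count($blocks) . "\n";
print_r($blocks);
```

Blocks are reported in the order their closing tags appear, so the inner block comes before the outer one that contains it.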

Edit

The greedy pattern /\[code\](.*)\[\/code\]/ can be useful too: on the nested example it captures the outer block with the inner tags still inside it, but it will not catch the inner block on its own. The first pattern matches only the inner one.
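
The difference between the patterns discussed on this page is easiest to see by running them side by side on the nested example. This is only a quick comparison sketch; the labels and the variable names $nested and $results are made up for illustration.

```php
<?php
$nested = '[code]blabla [code]bleh bleh[/code][/code]';

$patterns = [
    // Content may not contain "]", so only the innermost block can match.
    'negated class' => '/\[code\]([^]]+)\[\/code\]/',
    // Greedy: runs from the first [code] to the last [/code].
    'greedy'        => '/\[code\](.*)\[\/code\]/',
    // Non-greedy: runs from the first [code] to the nearest [/code].
    'non-greedy'    => '/\[code\](.*?)\[\/code\]/s',
];

$results = [];
foreach ($patterns as $name => $pattern) {
    preg_match_all($pattern, $nested, $m);
    $results[$name] = $m[1];
    echo $name . ': ' . implode(' | ', $m[1]) . "\n";
}
```

On this input each pattern finds exactly one match: the negated class captures "bleh bleh", the greedy pattern captures "blabla [code]bleh bleh[/code]", and the non-greedy pattern captures "blabla [code]bleh bleh", which is neither the inner nor the outer block.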
