Matching sets of tags in PHP with Regular Expression

I am currently working on protecting my AJAX chat against exploits by checking all text in PHP before it is passed to the client. So far I have been successful, except for one part where I need to match sets of image tags.

Overall, I want it to pick up any instance of a newline character between a set of tags. I have sort of managed this, but my solution is greedy: if there are multiple sets of tags, it also matches newline characters outside the tags.

At the moment I have the following, which works if I only want to match [img]{newline}[/img]:

if (preg_match('/\[\bimg\].*\x0A.*\[\/\bimg\]/', $text)) {
    // code
}

But if I have [img]image.jpg[/img]{newline}[img]image.jpg[/img], it only sees the very first opening tag and the very last closing tag, which is not what I want.

So now I ask: how do you make it match each set of tags properly?

Edit: For clarification, any newline characters inside tags are bad, so I want to detect them. Any newline characters outside tags are good, and I want to ignore them. The reason is that the client crashes if it processes a newline character inside a tag.

Just make it ungreedy by putting a ? after each of the two .* quantifiers.

But note that your current solution will not match this:

[img]
look, two newlines!
[/img]
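A quick way to check the lazy version against both inputs is sketched below. The variable names (`$lazy`, `$twoPairs`, `$twoNewlines`) are illustrative and not from the question. In this sketch the lazy quantifiers do not seem to be enough on their own: without the s modifier, . cannot cross a newline, so the single newline between the two pairs is the only place \x0A can match and the pattern still spans from the first [img] to the last [/img].

```php
<?php
// Lazy variant of the question's pattern (no s modifier, so . stops at newlines).
$lazy = '/\[\bimg\].*?\x0A.*?\[\/\bimg\]/';

// Two well-formed pairs separated by a newline that should be ignored.
$twoPairs = "[img]image.jpg[/img]\n[img]image.jpg[/img]";

// One pair with two newlines inside it, which should be detected.
$twoNewlines = "[img]\nlook, two newlines!\n[/img]";

$hitTwoPairs = preg_match($lazy, $twoPairs);       // 1: the outside newline is still flagged
$hitTwoNewlines = preg_match($lazy, $twoNewlines); // 0: the bad input is missed

var_dump($hitTwoPairs, $hitTwoNewlines);
```

So the lazy quantifiers change how far each .* reaches, but the pattern itself has nothing that stops a match from running across a closing tag.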

I'm not sure why you want to do this, but you can make . match newlines by adding the s modifier to your regex. Then it's just "(\\[img\\](.*?)\\[/img\\])is" to match each pair, and you can even capture that group and check it individually for newlines if you want.
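The capture-then-check idea could look roughly like this. It is a minimal sketch, not a drop-in filter: `hasNewlineInsideImgTags` is a hypothetical helper name, and the pattern is the same one as above, written with / delimiters to match the question's style.

```php
<?php
// Hypothetical helper: returns true if any [img]...[/img] pair contains a newline.
function hasNewlineInsideImgTags(string $text): bool
{
    // s lets . match newlines, i ignores case, and .*? stops at the nearest [/img].
    $count = preg_match_all('/\[img\](.*?)\[\/img\]/is', $text, $matches);
    if ($count === false || $count === 0) {
        return false; // regex error or no complete tag pair found
    }
    foreach ($matches[1] as $inner) {
        // Only the text captured between the tags is checked.
        if (strpos($inner, "\n") !== false) {
            return true;
        }
    }
    return false;
}

var_dump(hasNewlineInsideImgTags("[img]image.jpg[/img]\n[img]image.jpg[/img]")); // bool(false)
var_dump(hasNewlineInsideImgTags("[img]\nlook, two newlines!\n[/img]"));         // bool(true)
```

Because each pair is captured separately, newlines between pairs never reach the check. An [img] with no closing tag is not matched by this sketch, so it would need separate handling if the client also chokes on that.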

Try setting the s modifier, like this:

if (preg_match('/\[\bimg\].*\x0A.*\[\/\bimg\]/s', $text)) {
    // code
}

See also the PHP documentation for regex pattern modifiers.
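For comparison, here is a sketch of how the s version behaves on the two inputs from the question, next to a single-pattern alternative. The alternative uses a negative lookahead to keep a match inside one tag pair. That technique is swapped in here as an assumption and was not proposed in this thread. The names `$withS` and `$onePair` are illustrative.

```php
<?php
$twoPairs = "[img]image.jpg[/img]\n[img]image.jpg[/img]";
$twoNewlines = "[img]\nlook, two newlines!\n[/img]";

// Pattern with the s modifier: . can now cross newlines, but .* is still greedy.
$withS = '/\[\bimg\].*\x0A.*\[\/\bimg\]/s';

// Assumed alternative (not from the thread): (?!\[\/img\]). consumes one character
// only where no closing tag starts, so a match cannot run past the first [/img].
$onePair = '/\[img\](?:(?!\[\/img\]).)*?\x0A(?:(?!\[\/img\]).)*?\[\/img\]/s';

$withSTwoPairs = preg_match($withS, $twoPairs);          // 1: newline between pairs is flagged
$withSTwoNewlines = preg_match($withS, $twoNewlines);    // 1
$onePairTwoPairs = preg_match($onePair, $twoPairs);      // 0: outside newline is ignored
$onePairTwoNewlines = preg_match($onePair, $twoNewlines); // 1: inside newlines are detected

var_dump($withSTwoPairs, $withSTwoNewlines, $onePairTwoPairs, $onePairTwoNewlines);
```

The s modifier fixes the multi-newline case, while the lookahead is what confines the match to a single pair. The lookahead version is case-sensitive as written; adding the i modifier would also cover [IMG].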
