
Matching sets of tags in PHP with Regular Expression

I am currently working on protecting my AJAX Chat against exploits by checking all text in PHP before it is passed to the client. So far I have been successful, except for one part where I need to match sets of image tags.

Overall, I want it to pick up any newline character between a set of tags. I have sort of managed this, but my solution is greedy and also matches newline characters outside of tags when there are multiple sets of tags.

At the moment I have the following, which works if I only want to match [img]{newline}[/img]:

if (preg_match('/\[\bimg\].*\x0A.*\[\/\bimg\]/', $text)) { /* code */ }

But with [img]image.jpg[/img]{newline}[img]image.jpg[/img], it only sees the very first opening tag and the very last closing tag, which I do not want.

So now I ask, how do you make it match each set of tags properly?

Edit: For clarification. Any newline characters inside tags are bad, so I want to detect them. Any newline characters outside tags are good and I want to ignore them. The reason being, if the client processes a newline character inside of a tag, it crashes.

Just make it ungreedy by putting a ? after each of the two .* quantifiers.

But note that your current solution will not match this:

[img]
look, two newlines!
[/img]
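
As a quick check of the patterns discussed so far, here is a minimal sketch that runs the asker's greedy pattern and a lazy variant against the inputs from the question. The variable names and sample file names are placeholders of mine, not from the original post.

```php
<?php
// Patterns from the thread: the asker's greedy original and a lazy variant.
$greedy = '/\[\bimg\].*\x0A.*\[\/\bimg\]/';
$lazy   = '/\[\bimg\].*?\x0A.*?\[\/\bimg\]/';

// Sample inputs (placeholder content, chosen to mirror the question).
$inputs = [
    'insidePair'   => "[img]\n[/img]",
    'betweenPairs' => "[img]image.jpg[/img]\n[img]image.jpg[/img]",
    'twoNewlines'  => "[img]\nlook, two newlines!\n[/img]",
];

// preg_match() returns 1 on a match and 0 on no match.
$results = [];
foreach (['greedy' => $greedy, 'lazy' => $lazy] as $name => $pattern) {
    foreach ($inputs as $label => $input) {
        $results[$name][$label] = preg_match($pattern, $input);
    }
}

print_r($results);
// greedy: insidePair 1, betweenPairs 1, twoNewlines 0
// lazy:   insidePair 1, betweenPairs 1, twoNewlines 0
```

In this sketch the lazy variant still reports a match for the newline between two separate pairs: .*? is allowed to step over the first [/img] to reach the line feed, so laziness alone may not be enough to separate the pairs. Neither pattern matches the two-newline case, because without the s modifier . does not match a newline.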

I'm not sure why you want to do this, but you can make . match newlines by adding the s modifier to your regex. Then it's just "(\\[img\\](.*?)\\[/img\\])is" to match it, and you can even capture that group and individually check it for newlines if you want.
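
A minimal sketch of that capture-then-check idea, assuming the tags are always written as [img]...[/img] and that only line feeds (\x0A) matter. The function name hasNewlineInsideImgTags is hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper: returns true when any complete [img]...[/img] pair
// contains a line feed between its opening and closing tags.
function hasNewlineInsideImgTags(string $text): bool
{
    // s: "." also matches newlines; i: tag names are matched case-insensitively.
    // The lazy (.*?) stops at the first closing tag, so each pair is captured on its own.
    if (!preg_match_all('/\[img\](.*?)\[\/img\]/is', $text, $matches)) {
        return false; // no complete tag pair found (or the regex failed)
    }

    // $matches[1] holds the text between the tags of every pair.
    foreach ($matches[1] as $inner) {
        if (strpos($inner, "\n") !== false) {
            return true;
        }
    }
    return false;
}

// Newline between two pairs: ignored.
var_dump(hasNewlineInsideImgTags("[img]image.jpg[/img]\n[img]image.jpg[/img]")); // bool(false)
// Newlines inside a pair: detected.
var_dump(hasNewlineInsideImgTags("[img]\nlook, two newlines!\n[/img]")); // bool(true)
```

Checking the captured text in a second step keeps the regex simple and avoids the match running from one pair into the next. This sketch does not cover an opening [img] that is never closed, or carriage returns; whether those also need handling depends on how the client parses tags.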

Try setting the s modifier, like this:

if (preg_match('/\[\bimg\].*\x0A.*\[\/\bimg\]/s', $text)) { code }

See also the PHP Documentation for Regex modifiers


 