I need to detect whether a string contains HTML tags.
if(!preg_match('(?<=<)\w+(?=[^<]*?>)', $string)){
return $string;
}
The above regex gives me an error:
preg_match() [function.preg-match]: Unknown modifier '\'
I'm not very experienced with regex, so I'm not sure what the problem is. I tried escaping the backslash as \\ and it made no difference.
Is there a better solution than regex? If not, what would be the correct regex to work with the preg_match?
A simple solution is:
if($string != strip_tags($string)) {
// contains HTML
}
The benefit of this over a regex is that it is easier to understand; however, I can't comment on the speed of execution of either solution.
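As a rough sketch, the comparison could be wrapped in a small helper for reuse. The function name contains_html is made up for this example and is not part of PHP.

```php
<?php
// Hypothetical helper: reports whether strip_tags() would change the string.
function contains_html(string $string): bool
{
    // strip_tags() removes HTML and PHP tags (and HTML comments);
    // if the result differs, the input contained at least one of those.
    return $string !== strip_tags($string);
}

var_dump(contains_html('<b>bold</b> text')); // bool(true)
var_dump(contains_html('plain text'));       // bool(false)
```

Keep in mind that this reports anything strip_tags() would remove, so tag-like input that is not really HTML may also count. It is worth trying it against your own data.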
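You need to delimit the regex with some character. PHP's preg functions expect the pattern to be wrapped in delimiters; without them, the opening ( is taken as the delimiter, the pattern ends at the first ), and the \ that follows is read as a modifier, hence the "Unknown modifier" error. Try this: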
if(!preg_match('#(?<=<)\w+(?=[^<]*?>)#', $string)){
return $string;
}
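To illustrate that the choice of delimiter is free, here is a minimal sketch that wraps the same pattern in two different delimiters. The helper name has_tag_name is hypothetical and only exists for this example.

```php
<?php
// The same lookaround pattern, wrapped in two different delimiters.
// A delimiter can be any character that is not alphanumeric, a backslash or whitespace.
$withHash  = '#(?<=<)\w+(?=[^<]*?>)#';
$withTilde = '~(?<=<)\w+(?=[^<]*?>)~';

// Hypothetical helper for illustration only.
function has_tag_name(string $pattern, string $string): bool
{
    // preg_match() returns 1 on a match, 0 on no match and false on error.
    return preg_match($pattern, $string) === 1;
}

var_dump(has_tag_name($withHash, '<p>hello</p>'));  // bool(true)
var_dump(has_tag_name($withTilde, 'hello'));        // bool(false)
```

Picking a delimiter that does not occur in the pattern (here # or ~ rather than /) avoids having to escape it inside the regex.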
If you just want to detect or replace certain tags: this function searches for specific HTML tags and wraps them in square brackets. That is pretty pointless in itself; just modify it to do whatever you want with the tags.
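$html = preg_replace_callback(
    '|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|',
    function ($found) {
        // $found[0] is the whole tag, $found[1] is the tag name
        if (isset($found[1]) && in_array(
            $found[1],
            array('div','p','span','b','a','strong','center','br','h1','h2','h3','h4','h5','h6','hr'))
        ) {
            return '[' . $found[0] . ']';
        }
        // Leave every other tag unchanged; returning nothing would delete it
        return $found[0];
    },
    $html
);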
Explanation of the regex:
\< ... \> //start and ends with tag brackets
\</? //can start with a slash for closing tags
([a-zA-Z]+[1-6]?) //the tag itself (for example "h1")
(\s[^>]*)? //anything such as class=... style=... etc.
(\s?/)? //allow self-closing tags such as <br />
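If you only need to detect which tags occur, the same pattern can be reused with preg_match_all(). This is a small sketch; the sample string and the variable names are invented for the example.

```php
<?php
// Same pattern as above. "|" is the delimiter here, so an unescaped "|"
// inside the pattern would end it and cannot be used for alternation.
$pattern = '|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|';

$html = '<div class="x"><p>Hi</p><br /></div>';

// $matches[1] collects capture group 1 (the tag name) for every match.
preg_match_all($pattern, $html, $matches);

// Lower-case the names and de-duplicate them, keeping first-seen order.
$tagNames = array_values(array_unique(array_map('strtolower', $matches[1])));

print_r($tagNames); // contains: div, p, br
```

Opening and closing tags both end up in the list, which is why the names are de-duplicated before use.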
If the purpose is just to check whether a string contains an HTML tag, regardless of whether the tag is valid, you can try this:
function is_html($string) {
    // Check if the string contains anything that looks like an HTML tag.
    // preg_match() returns 1 on a match, so compare to get a real boolean.
    return preg_match('/<\s?[^\>]*\/?\s?>/i', $string) === 1;
}
This works for both valid and invalid HTML tags. You can confirm it here: https://regex101.com/r/2g7Fx4/3
I would recommend allowing only a defined set of tags! You don't want users to be able to enter a <script> tag, which could cause an XSS vulnerability.
Try it with:
$string = '<strong>hello</strong>';
// Allowed tags are: <p>, <span>, <b>, <strong>, <i> and <u>.
// \b stops <p from matching <pre>, and \1 requires the closing tag to match the opening one.
$pattern = '/<(p|span|b|strong|i|u)\b[^>]*>(.*?)<\/\1>/s';
preg_match($pattern, $string, $matches);
if (!empty($matches)) {
    echo 'Good, you have used an HTML tag.';
}
else {
    echo 'You didn\'t use an HTML tag or it is not allowed.';
}
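Note that a check like the one above only tells you that one allowed element is present; it does not reject other tags elsewhere in the string. A minimal sketch of a stricter allow-list check might look like this. The function name only_allowed_tags and its regex are assumptions for illustration, not a complete sanitizer.

```php
<?php
// Hypothetical helper: true only if every tag found in $string is on the allow-list.
function only_allowed_tags(string $string, array $allowed): bool
{
    // Capture the name of every opening or closing tag.
    preg_match_all('~</?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>~', $string, $matches);

    foreach ($matches[1] as $tagName) {
        if (!in_array(strtolower($tagName), $allowed, true)) {
            return false; // found a tag that is not on the list
        }
    }
    return true; // no tags at all, or only allowed ones
}

$allowed = ['p', 'span', 'b', 'strong', 'i', 'u'];

var_dump(only_allowed_tags('<strong>hello</strong>', $allowed));            // bool(true)
var_dump(only_allowed_tags('<script>alert(1)</script><b>x</b>', $allowed)); // bool(false)
```

This only looks at tag names. Attributes such as onclick are not inspected, so for real XSS protection you would still want proper output escaping or a dedicated sanitizer.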
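I would use strlen(),
because otherwise a character-by-character comparison is performed, which might be slow, although I would expect the comparison to exit as soon as a difference is found.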
Parsing HTML in general is a hard problem; there is some good material here:
But regarding your question (a 'better' solution): can you be more specific about what you are trying to achieve and what tools are available to you?
If you're not good at regular expressions (like me), there are lots of regex libraries out there that usually help me accomplish my task.
Here is a little tutorial that will explain what you're trying to do in PHP.
Here is one of those libraries I was referring to.
If this function returns true, the string contains HTML tags; if it returns false, the string does not contain HTML.
<?php
/**
 * Check whether a string contains HTML tags.
 *
 * @param string $htmlStr
 *
 * @return bool true if the string contains HTML, false otherwise
 */
function isHTML_OR_Not_Validate(string $htmlStr): bool {
    // strip_tags() changes the string only if it contained tags
    return $htmlStr != strip_tags($htmlStr);
}
?>