Remove space in string in PHP with regular expression

I would like to remove the whitespace between HTML tags using a regular expression in PHP, without removing the spaces inside the text. What pattern should I use?

For example, I would like to remove the whitespace between the <tr> and <td> tags in particular.

From:

<tr>
    <td>Hello there</td>
</tr>

to:

<tr><td>Hello there</td></tr>

Thanks.

First off: markup (HTML) and regex don't mix well. Be that as it may, you can remove whitespace between tags quite easily with the following regex:

$clean = preg_replace('/>\s+</', '><', $string);

This removes whitespace found between two tags when there is nothing else between them:

<p>Foobar <b>is</b> not a word <i>as such</i>    </p>

will be "translated" into:

<p>Foobar <b>is</b> not a word <i>as such</i></p>
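
As a quick check, here is a minimal sketch that applies the same pattern to the markup from the question. The input string is an assumption: it is the question's example with the row closed by </tr>.

```php
<?php
// Assumed input: the question's example, with the row properly closed.
$string = "<tr>\n    <td>Hello there</td>\n</tr>";

// Collapse any run of whitespace that sits directly between '>' and '<'.
// The space inside "Hello there" is untouched because it is not between two tags.
$clean = preg_replace('/>\s+</', '><', $string);

echo $clean, PHP_EOL; // <tr><td>Hello there</td></tr>
```

Note that \s also matches line breaks and tabs, which is why the indentation and the newlines both disappear.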

That's fine, but still, it'd be better (and safer) to parse, sanitize and then echo the markup using the DOMDocument class. But before you start hacking away and write thousands of lines of code to ensure you're processing valid markup, ask yourself this simple question:

How can I make sure that the markup I'm processing is well-formed, and valid to begin with?

Instead of writing code that works around bad markup, look into ways of making sure the data you're processing is of good quality to begin with.
Anyway, here's a simple example of how to use the DOMDocument class:

$dom = new DOMDocument;
$dom->loadHTML($string);
echo $dom->saveHTML();//echoes sanitized markup

This assumes $string is a full document (including <html>, the doctype and all other tags that implies). If you don't have such a string, you'll have to use saveXML and pass it the node you want to output:

echo $dom->saveXML($dom->getElementsByTagName('body')->item(0));//saveXML is a DOMDocument method; the node is passed as its argument

Where body is the root node of your markup. See the docs for examples and details.
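
If you want the fragment back without the surrounding <body> element, one option is to serialize each child of body in turn. This is only a sketch: the sample $string is made up for illustration, and it relies on the parser wrapping a bare fragment in <html><body>, which is how loadHTML normally behaves.

```php
<?php
// Hypothetical fragment, not a full document.
$string = '<p>Foobar <b>is</b> not a word</p>';

// Collect parser warnings internally instead of printing them.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($string);

libxml_clear_errors();
libxml_use_internal_errors($previous);

$body = $dom->getElementsByTagName('body')->item(0);

// Serialize each child of <body>, so the wrapper elements are left out.
$fragment = '';
foreach ($body->childNodes as $child) {
    $fragment .= $dom->saveHTML($child);
}

echo $fragment, PHP_EOL;
```

saveHTML accepts a node argument in the same way saveXML does, and it writes HTML rather than XML syntax, which tends to be the safer choice for void elements such as <br>.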

If the string you have is exactly what you've included in your question, then all whitespace needs to be removed. In that case, regex is just not necessary:

$string = '<tr>
     <td>';
echo str_replace([' ', "\n", "\r", "\t"], '', $string);//removes spaces, line breaks and tabs

Ah well, browse through the documentation of the DOMDocument class, it's worth the effort. Honest :)

This question is more complicated than it looks. It's easy to remove all whitespace between all tags, like

<tr>  <td>   -> <tr><td>

but this naive approach produces wrong results wherever the whitespace is significant, for example between inline elements:

<i>hi</i> <b>there</b>  -> <i>hi</i><b>there</b>

To remove whitespace correctly, you have to look at the type of its parent node and remove it only when that node doesn't allow text content (http://www.w3.org/TR/html4/sgml/dtd.html might be helpful).

Definitely not something you can achieve with a regular expression!
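
The parent-aware approach could be sketched with DOMDocument and DOMXPath roughly as follows. The function name stripStructuralWhitespace and the list of parent elements are assumptions made for illustration; the list is deliberately partial and is not a complete reading of the DTD.

```php
<?php
// Hypothetical helper: removes whitespace-only text nodes, but only inside
// elements that do not allow text content (illustrative, incomplete list).
function stripStructuralWhitespace(string $html): string
{
    $noTextParents = ['table', 'thead', 'tbody', 'tfoot', 'tr', 'ul', 'ol', 'dl'];

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select every text node that consists of whitespace only.
    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//text()[normalize-space(.) = ""]');

    // Take a snapshot of the matches before changing the tree.
    foreach (iterator_to_array($nodes) as $text) {
        if (in_array($text->parentNode->nodeName, $noTextParents, true)) {
            $text->parentNode->removeChild($text);
        }
    }

    // Return the children of <body>, without the wrapper the parser added.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Whitespace inside <tr> is structural and goes away.
echo stripStructuralWhitespace("<table><tr>\n    <td>Hello there</td>\n</tr></table>"), PHP_EOL;

// Whitespace between inline elements inside <p> is significant and stays.
echo stripStructuralWhitespace('<p><i>hi</i> <b>there</b></p>'), PHP_EOL;
```

Depending on the libxml2 version, the parser may already discard some of this whitespace on its own, so the explicit removal acts as a safety net rather than the only mechanism.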

$str = "<td> </td>";
$str2 = "<td></td>";

var_dump(preg_match('/\s/',$str));
var_dump(preg_match('/\s/',$str2));

The first call prints int(1), because the string contains whitespace.

The second call prints int(0), because it does not.
