How to convert HTML-ENTITIES and use preg_replace in PHP

Question

I'm trying to convert &nbsp; to whitespace and then run some regex replacements with preg_replace, like this:

$title = "&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2";
$title = mb_convert_encoding($title, 'UTF-8', 'HTML-ENTITIES');
//$title = html_entity_decode($title, ENT_NOQUOTES, 'UTF-8');
// (Either mb_convert_encoding() or html_entity_decode() can be used;
// both give the same output: TEST < Ok.2-2)

// So now I have: TEST < Ok.2-2
// I want a space after "Ok.", so I use preg_replace()
$replace = "~\s+(ok[.]?)~i";
$title = preg_replace($replace, ' OK. ', $title, -1);
$title = preg_replace('/\s+/', ' ', $title);
$title = trim($title);

// The result is still TEST < Ok.2-2 (does not work!)
echo($title);

With this code, mb_convert_encoding and html_entity_decode work fine, but when I then use preg_replace to match whitespace, the regex does not seem to find the whitespace that was converted from &nbsp;.

Current output: TEST < Ok.2-2

Expected output: TEST < OK. 2-2

My current solution

I added a str_replace call that hard-codes the replacement of &nbsp; with a space, and then use mb_convert_encoding or html_entity_decode to convert the remaining HTML entities.

$title = '&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2';
$title = str_replace('&nbsp;', ' ', $title);
$title = mb_convert_encoding($title, 'UTF-8', 'HTML-ENTITIES');
//$title = html_entity_decode($title, ENT_NOQUOTES, 'UTF-8');
// (Either mb_convert_encoding() or html_entity_decode() can be used;
// both give the same output: TEST < Ok.2-2)

// So now I have: TEST < Ok.2-2
// I want a space after "Ok.", so I use preg_replace()
$replace = '~\s+(ok[.]?)~i';
$title = preg_replace($replace, ' OK. ', $title, -1);
$title = preg_replace('/\s+/', ' ', $title);
$title = trim($title);

// The result is TEST < OK. 2-2 (works!)
echo($title);

Output now: TEST < OK. 2-2

Expected: TEST < OK. 2-2

Any suggestions for a better solution?

Answer 1

I think this will give you what you are after.

$title = trim(
    preg_replace('~\s+~', ' ',
        str_ireplace(array('&nbsp;', ' ok.'), array(' ', ' OK. '),
            "&nbsp;TEST&nbsp;Ok.2-2")
    )
);

This will:

Strip leading and trailing whitespace (trim)
Replace runs of whitespace with a single space (preg_replace('~\s+~', ' ', ...))
Replace &nbsp; with a single space (str_ireplace)
Replace ok. with OK. case-insensitively (str_ireplace)

Output:

TEST OK. 2-2

Your HTML entity decode example is correct: http://sandbox.onlinephpfunctions.com/code/eed7e30d507f7197585f29c1fdde9e7744fc572d

$title = html_entity_decode("&nbsp;TEST&nbsp;Ok.2-2", ENT_NOQUOTES, 'UTF-8');
echo $title;

Output:

TEST Ok.2-2
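
A likely reason the regex in the question never matches, offered as a sketch rather than a confirmed diagnosis: html_entity_decode() turns &nbsp; into the no-break space U+00A0 (bytes C2 A0 in UTF-8), not an ASCII space, and without the u modifier \s in preg_replace is not guaranteed to match it. The snippet below makes the character visible with bin2hex(); the variable names are illustrative.

```php
<?php
$decoded = html_entity_decode('&nbsp;Ok.2-2', ENT_NOQUOTES, 'UTF-8');

// The decoded &nbsp; is U+00A0, stored as the two bytes C2 A0 in UTF-8.
$hex = bin2hex(substr($decoded, 0, 2)); // "c2a0"

// Pattern from the question: in byte mode, \s usually does not treat
// those bytes as whitespace, so the string tends to come back unchanged.
$byteMode = preg_replace('~\s+(ok[.]?)~i', ' OK. ', $decoded);

// Listing U+00A0 explicitly under the u (UTF-8) modifier matches it.
$unicodeMode = preg_replace('~[\s\x{00A0}]+(ok[.]?)~iu', ' OK. ', $decoded);
```

The u modifier assumes both the pattern and the subject are valid UTF-8; preg_replace returns null if the subject is not.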

Edit:

<?php
$title = '&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2';
$title = trim(preg_replace('~\s+~', ' ', str_ireplace(array('&nbsp;', '&lt;', 'Ok.'), array(' ', '', ' OK. '), $title)));
echo $title;

It's probably safer to remove just those two entities with str_replace. If your string were <h1>&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2</h1> and you decoded it and then removed every <, the markup would no longer work as it did before.

Output:

TEST OK. 2-2
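
If the input is plain text rather than markup, so that decoding every entity is acceptable, an alternative to hard-coding each entity is to decode first and then normalize whitespace in UTF-8 mode. This is a minimal sketch under that assumption; normalizeTitle is a hypothetical helper name, and the patterns only mirror the ones used in the question.

```php
<?php
// Hypothetical helper; assumes $raw is valid UTF-8 plain text.
function normalizeTitle(string $raw): string
{
    // Decode every entity; &nbsp; becomes U+00A0 and &lt; becomes "<".
    $text = html_entity_decode($raw, ENT_NOQUOTES, 'UTF-8');

    // Collapse ASCII whitespace and U+00A0 into single spaces (u = UTF-8 mode).
    $text = preg_replace('~[\s\x{00A0}]+~u', ' ', $text);

    // Uppercase "ok." after a space and make sure a single space follows it.
    $text = preg_replace('~ ok[.]?\s*~i', ' OK. ', $text);

    return trim($text);
}

echo normalizeTitle('&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2'); // TEST < OK. 2-2
```

Unlike the edit above, this keeps the decoded < in the result, which matches the expected output in the question.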

Answer 1
0 2015-06-12 15:15:55
