How to convert HTML entities and use preg_replace in PHP
I'm trying to convert &nbsp; to whitespace and then use preg_replace to run a regex on the result, like this:
$title = "&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2";
$title = mb_convert_encoding($title, 'UTF-8', 'HTML-ENTITIES');
// Alternative: $title = html_entity_decode($title, ENT_NOQUOTES, 'UTF-8');
// Either call gives the same output: TEST < Ok.2-2
// Now I want a space after "Ok.", so I use preg_replace()
$replace = "~\s+(ok[.]?)~i";
$title = preg_replace($replace, ' OK. ', $title, -1);
$title = preg_replace('/\s+/', ' ', $title);
$title = trim($title);
// Result: TEST < Ok.2-2 (does not work!)
echo($title);
With this code, mb_convert_encoding and html_entity_decode both work well, but when I use preg_replace to match the whitespace, the regex does not seem to find the whitespace that was produced by the conversion.
Current output: TEST < Ok.2-2
Expected output: TEST < OK. 2-2
MY SOLUTION SO FAR
I added str_replace to hard-code the replacement of &nbsp; with a space, and I still use mb_convert_encoding or html_entity_decode to convert the other HTML entities.
$title = '&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2';
// Replace the &nbsp; entity with an ordinary space before decoding
$title = str_replace('&nbsp;', ' ', $title);
$title = mb_convert_encoding($title, 'UTF-8', 'HTML-ENTITIES');
// Alternative: $title = html_entity_decode($title, ENT_NOQUOTES, 'UTF-8');
// Either call gives the same output: TEST < Ok.2-2
// Now I want a space after "Ok.", so I use preg_replace()
$replace = '~\s+(ok[.]?)~i';
$title = preg_replace($replace, ' OK. ', $title, -1);
$title = preg_replace('/\s+/', ' ', $title);
$title = trim($title);
// Result: TEST < OK. 2-2 (works!)
echo($title);
My output now: TEST < OK. 2-2
Expected output: TEST < OK. 2-2
Any suggestions for a better solution?
I think this will give you what you are after.
$title = trim(
    preg_replace('~\s+~', ' ',
        str_ireplace(array('&nbsp;', ' ok.'), array(' ', ' OK. '),
            "&nbsp;TEST&nbsp;Ok.2-2")
    )
);
This will:
- remove leading and trailing whitespace (trim)
- collapse multiple whitespace characters into a single space (preg_replace('~\s+~', ' '))
- replace &nbsp; with a single space (str_ireplace)
- replace ok., matched case-insensitively, with OK. (str_ireplace)
Output:
TEST OK. 2-2
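The nested call can be easier to follow when unrolled into one step per line. The sketch below is only an illustration: it assumes the input holds the literal entity text &nbsp; (not an already-decoded string), the variable names $input, $step1 and $step2 are hypothetical, and it relies on str_ireplace applying its search/replace pairs in order when given arrays.

```php
<?php
// Assumed input: the literal entity text, not an already-decoded string.
$input = '&nbsp;TEST&nbsp;Ok.2-2';

// 1. Replace each &nbsp; entity with an ordinary space, then " ok." (any case)
//    with " OK. ". With array arguments the pairs are applied in order.
$step1 = str_ireplace(array('&nbsp;', ' ok.'), array(' ', ' OK. '), $input);

// 2. Collapse runs of whitespace into a single space.
$step2 = preg_replace('~\s+~', ' ', $step1);

// 3. Strip leading and trailing whitespace.
$title = trim($step2);

echo $title, "\n"; // TEST OK. 2-2
```

Because the entity is replaced before any decoding happens, the regex only ever sees ordinary ASCII spaces.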
Your HTML entity decode example is correct: http://sandbox.onlinephpfunctions.com/code/eed7e30d507f7197585f29c1fdde9e7744fc572d
$title = html_entity_decode("&nbsp;TEST&nbsp;Ok.2-2", ENT_NOQUOTES, 'UTF-8');
echo $title;
Output:
TEST Ok.2-2
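A likely reason the whitespace regex misses the decoded character: html_entity_decode turns &nbsp; into U+00A0 (no-break space), which in UTF-8 is the two bytes C2 A0 rather than an ordinary space, and \s in a pattern without the u modifier typically does not match it. The following is a minimal sketch under those assumptions, not a definitive fix. normalize_title is a hypothetical helper name, and the character class names U+00A0 explicitly so the result does not depend on how \s treats that character.

```php
<?php
// Decoding &nbsp; yields U+00A0, encoded in UTF-8 as the bytes C2 A0.
$decoded = html_entity_decode('&nbsp;TEST&nbsp;Ok.2-2', ENT_NOQUOTES, 'UTF-8');
echo bin2hex(substr($decoded, 0, 2)), "\n"; // c2a0

// Hypothetical helper: decode every entity, then treat U+00A0 as whitespace.
function normalize_title(string $raw): string
{
    $text = html_entity_decode($raw, ENT_NOQUOTES, 'UTF-8');
    // The u modifier makes PCRE read the subject as UTF-8;
    // \x{00A0} matches the no-break space explicitly.
    $text = preg_replace('~[\s\x{00A0}]+(ok[.]?)~iu', ' OK. ', $text);
    $text = preg_replace('~[\s\x{00A0}]+~u', ' ', $text);
    return trim($text);
}

echo normalize_title('&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2'), "\n"; // TEST < OK. 2-2
```

This keeps a single decoding step for all entities instead of hard-coding a replacement per entity, at the cost of requiring valid UTF-8 input for the u modifier.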
Edit:
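<?php
$title = '&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2';
// Replace &nbsp; with a space, drop &lt;, normalize "Ok.", then tidy the whitespace
$title = trim(preg_replace('~\s+~', ' ', str_ireplace(array('&nbsp;', '&lt;', 'Ok.'), array(' ', '', ' OK. '), $title)));
echo $title;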
It's probably safer to just remove the two entities with str_replace. If your string were
<h1>&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2</h1>
and you decoded it and then removed every <, the markup would no longer work as it did before.
Output:
TEST OK. 2-2
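To illustrate the point about markup, here is a small sketch comparing the two approaches on the <h1> example. It assumes the string contains the literal entities &nbsp; and &lt;; the variable names are hypothetical.

```php
<?php
$html = '<h1>&nbsp;TEST&nbsp;&lt;&nbsp;Ok.2-2</h1>';

// Decode first, then strip every "<": the tag delimiters are removed too,
// so the result starts with "h1>" and is no longer valid markup.
$decodedThenStripped = str_replace(
    '<',
    '',
    html_entity_decode($html, ENT_NOQUOTES, 'UTF-8')
);

// Replace only the two entities: the surrounding tags stay intact.
$targeted = str_replace(array('&nbsp;', '&lt;'), array(' ', ''), $html);

echo $targeted, "\n"; // <h1> TEST  Ok.2-2</h1>
```

The targeted version leaves two adjacent spaces where &lt; was removed; a later whitespace-collapsing step can clean those up.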