
I have a string with "\u00a0", and I need to replace it with "". str_replace fails

I need to clean a string that comes (copy/pasted) from various Microsoft Office suite applications (Excel, Access, and Word), each with its own encoding.

I'm using json_encode for debugging purposes so that I can see every single encoded character.

I'm able to clean everything I've found so far (\r, \n) with str_replace, but with \u00a0 I have no luck.

$string = 'mail@mail.com\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;mail@mail.com'; //this is the output from json_encode

$clean = str_replace("\u00a0", "",$string);

returns:

mail@mail.com\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;mail@mail.com

That is exactly the same; it completely ignores \u00a0.

Is there a way around this? Also, I feel like I'm reinventing the wheel: is there a function/class that completely strips EVERY possible char of EVERY possible encoding?

____EDIT____

After the first two replies I need to clarify that my example DOES work, because it's the output from json_encode, not the actual string!

By combining ord() with substr() on my string containing \u00a0, I found the following curse to work:

$text = str_replace( chr( 194 ) . chr( 160 ), ' ', $text );
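To see why the two-byte sequence is needed, it can help to look at the raw bytes. A minimal sketch, assuming the input is UTF-8 encoded (the sample address and variable names are illustrative only):

```php
<?php
// Assumption: $text is UTF-8. "\xC2\xA0" is the UTF-8 encoding of U+00A0,
// i.e. the same two bytes as chr(194) . chr(160).
$text = "mail@mail.com\xC2\xA0;mail@mail.com";

// json_encode shows the character as an escape sequence, not as raw bytes.
echo json_encode($text), "\n";

// bin2hex shows what is really stored in the string.
echo bin2hex("\xC2\xA0"), "\n"; // c2a0

// Replace the actual bytes, not the six-character escape sequence.
$clean = str_replace("\xC2\xA0", '', $text);
echo $clean, "\n";
```

The escape sequence printed by json_encode is only a display form; the string itself never contains a backslash.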

I just had the same problem. Apparently PHP's json_encode will return null for any string with a 'non-breaking space' in it.

The solution is to replace it with a regular space:

$string = str_replace(chr(160), ' ', $string);
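Note that chr(160) on its own matches the non-breaking space only in single-byte encodings such as ISO-8859-1 or Windows-1252. In UTF-8 the byte 0xA0 is also the second byte of other characters (for example "à" is 0xC3 0xA0), so removing it blindly can corrupt the text. Here is a hedged sketch that picks the byte sequence depending on whether the input is valid UTF-8. The function name replace_nbsp is hypothetical, the check is only a heuristic, and it needs the mbstring extension:

```php
<?php
// Hypothetical helper: replaces non-breaking spaces with regular spaces.
function replace_nbsp(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        // Valid UTF-8: U+00A0 is stored as the two bytes 0xC2 0xA0.
        return str_replace("\xC2\xA0", ' ', $s);
    }
    // Assumption: otherwise the input is a single-byte encoding
    // (ISO-8859-1 / Windows-1252), where NBSP is the single byte 0xA0.
    return str_replace("\xA0", ' ', $s);
}

$utf8   = "caf\xC3\xA0\xC2\xA0bar"; // "cafà", NBSP, "bar" in UTF-8
$latin1 = "foo\xA0bar";             // "foo", NBSP, "bar" in ISO-8859-1

var_dump(replace_nbsp($utf8) === "caf\xC3\xA0 bar");
var_dump(replace_nbsp($latin1) === 'foo bar');
```

In the UTF-8 case the "à" survives because only the complete 0xC2 0xA0 pair is replaced.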

I hope this helps somebody - it took me an hour to figure out.

Works for me when I copy/paste your code. Try replacing the double quotes in your str_replace() with single quotes, or escaping the backslash ("\\u00a0").

Try this:

$str = str_replace("\u{00a0}", ' ', $str);

This one works too; I found it somewhere:

$str = trim($str, chr(0xC2).chr(0xA0));
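One caveat, as far as I can tell: trim() treats its second argument as a set of individual bytes and only strips them from the two ends of the string. Non-breaking spaces in the middle stay, and a string that ends in another character containing the byte 0xA0 or 0xC2 can be cut in half. A sketch of a pattern-based alternative, assuming UTF-8 input (the function name trim_nbsp is hypothetical):

```php
<?php
// Hypothetical helper: strips whitespace and U+00A0 from both ends only.
function trim_nbsp(string $s): string
{
    // /u makes PCRE treat the pattern and the subject as UTF-8.
    $out = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $s);
    return $out ?? $s; // null means $s was not valid UTF-8; leave it alone
}

// Byte-set trim cuts the UTF-8 encoding of "à" (0xC3 0xA0) in half:
echo bin2hex(trim("voil\xC3\xA0", "\xC2\xA0")), "\n";  // 766f696cc3
// The pattern-based version removes the NBSP and keeps the "à" intact:
echo bin2hex(trim_nbsp("voil\xC3\xA0\xC2\xA0")), "\n"; // 766f696cc3a0
```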

A minor point: \u00a0 is actually a non-breaking space character, cf. http://www.fileformat.info/info/unicode/char/a0/index.htm

So it might be more correct to replace it with " ".

This works for me:

$str = preg_replace( "~\x{00a0}~siu", " ", $str );
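On the "strip every possible char" part of the question: the same approach can be widened from U+00A0 to every Unicode space separator with the \p{Zs} property. This is only a sketch, assuming UTF-8 input and PHP 7 or later (for the "\u{...}" escapes in the example); the function name normalize_spaces is made up for illustration:

```php
<?php
// Hypothetical helper: turns every run of Unicode space separators
// (category Zs: U+0020, U+00A0, U+2003 EM SPACE, U+3000, ...) into a
// single plain ASCII space.
function normalize_spaces(string $s): string
{
    $out = preg_replace('/\p{Zs}+/u', ' ', $s);
    return $out ?? $s; // preg_replace returns null if $s is not valid UTF-8
}

echo normalize_spaces("a\u{00A0}\u{00A0}b\u{2003}c"), "\n"; // prints "a b c"
```

Because ordinary spaces are in category Zs as well, runs of regular spaces get collapsed too; drop the + quantifier if that is not wanted.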

You have to do this with single quotes, like this:

str_replace('\u00a0', "",$string);

Or, if you'd rather use double quotes, you have to escape the backslash - which would look like this:

str_replace("\\u00a0", "",$string);
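The quoting rules can be checked directly. A minimal sketch, assuming PHP 7 or later (where "\u{...}" is a recognized escape): it shows that '\u00a0' and "\\u00a0" are the same six-character literal, which matches json_encode output but not the real character.

```php
<?php
$single  = '\u00a0';   // backslash, u, 0, 0, a, 0
$escaped = "\\u00a0";  // the same six characters
$real    = "\u{00a0}"; // the actual character, two bytes in UTF-8

var_dump($single === $escaped); // bool(true)
var_dump(strlen($single));      // int(6)
var_dump(strlen($real));        // int(2)

// Round trip: json_encode turns the real character into the escape text,
// and json_decode turns it back.
$json = json_encode("a{$real}b");
var_dump(json_decode($json) === "a{$real}b"); // bool(true)

// Replacing the escape text only has an effect on the JSON output.
var_dump(str_replace($single, '', $json));        // string(4) ""ab""
var_dump(str_replace($single, '', "a{$real}b") === "a{$real}b"); // bool(true)
```

So these two variants clean the debug output from json_encode, while the original string still needs one of the byte-level or "\u{00a0}" replacements.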
