
Remove any character in PHP except letters, numbers, and symbols

How can I use str_ireplace or other functions to remove all characters except letters, numbers, and the symbols commonly used in HTML, such as " ' ; : . - + = and so on? I also want to remove \n, white space, tabs, and similar characters.

I need the text that comes from ("textContent").innerHTML in IE10 and in Chrome to have the same size once it is stored in a PHP variable, regardless of which browser produced it. Therefore I need both texts to use the same encoding, with rare or differing characters removed.

I tried this, but it doesn't work for me:

        $textForMatch = iconv(mb_detect_encoding($text, mb_detect_order(), true), "UTF-8", $text);
        $textForMatch = str_replace(array('\s', "\n", "\t", "\r"), '', $textForMatch);

$text contains the result of ("textContent").innerHTML. I want to delete characters such as é³..

The easiest option is to use preg_replace with a whitelist, i.e. use a pattern listing the things you want to keep, and replace anything not in that list:

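$input = 'The quick brown 123 fox said "�é³". Man was I surprised';
// Replace every character that is NOT in the whitelist with an empty string
$stripped = preg_replace('/[^-\w:";:+=\.\']/', '', $input);
// Expected value of $stripped (the digits stay, because \w matches them)
$output = 'Thequickbrown123foxsaid"".ManwasIsurprised';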

Regex explanation:

/       - start of regex
[^      - begin negated character class: match any character NOT listed inside
-       - literal hyphen (literal because it comes first in the class)
\w      - word characters, equivalent to A-Za-z0-9_
:";:+=  - literal characters
\.      - escaped period (the escape is optional inside a character class)
\'      - escaped quote (because the PHP string is in single quotes)
]       - end of character class
/       - end of regex

This will therefore remove anything that isn't a word character (letter, digit, or underscore) or one of the specific characters listed in the regex. That includes spaces, tabs, and newlines, since none of them are in the list.
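As a rough sketch of how this could be applied to the original problem, the whitelist can be wrapped in a small helper and run on both browser outputs before comparing them. The function name stripToWhitelist and the two sample strings are made up for illustration. The sketch also spells out \w as explicit ASCII ranges, which is an assumption on my part: it keeps the result independent of locale and of the /u modifier, so characters like é are always dropped. Note that the str_replace attempt in the question cannot do the same job on its own, because str_replace does no pattern matching and '\s' there only matches a literal backslash followed by s.

```php
<?php
// Hypothetical helper (not part of the original answer): keeps ASCII letters,
// digits, underscore and the punctuation - : " ; + = . ' and removes
// everything else, including whitespace and all non-ASCII bytes.
function stripToWhitelist(string $text): string
{
    // Explicit ASCII ranges instead of \w (an assumption, see the text above).
    $result = preg_replace('/[^-A-Za-z0-9_:";+=.\']/', '', $text);

    // preg_replace returns null on a PCRE error; fall back to an empty string.
    return $result ?? '';
}

// Made-up samples that differ only in whitespace and one non-ASCII character.
$fromBrowserA = "price = 10;\n\tname: \"caf\u{00E9}\"";
$fromBrowserB = "price=10; name:\"caf\"\r\n";

$a = stripToWhitelist($fromBrowserA);
$b = stripToWhitelist($fromBrowserB);

// Both are now 'price=10;name:"caf"', so their lengths match as well.
var_dump($a === $b, strlen($a) === strlen($b));
```

Characters such as < > / are not in this list, so any markup in the innerHTML string is stripped too; add them to the character class if they need to survive.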

