PHP regex solution - Remove some special characters and replace some with text
I have a PHP variable, say:
$myvariable = "te xt!@ na#@)(me+=&t^*ext?>;.'na^%me";
I want to replace special characters and groups of special characters, including blank spaces, with a single underscore (_). The string may contain &, which should be replaced with the word "and".
The result for the variable above should be:
te_xt_na_me_andt_ext_na_me
How can I do this in PHP?
This assumes that anything but letters is regarded as disposable.
$patterns = array(
'/&/' => 'and', // Ampersand to "and"
'/[^[:alpha:]]+/' => '_' // Anything *but* a character to underscore
);
$result = preg_replace(array_keys($patterns), array_values($patterns), $input);
The last pattern replaces each run of one or more non-letter characters, as determined by the current locale (see note 1), with a single underscore; whitespace is included.
1 Side note (might be irrelevant): if the server the script runs on has en_US as its locale, the following replacements occur:
$input = 'app!le___s &! orän=%ges';
$result = 'app_le_s_and_or_n_ges';
If the locale is de_DE, this would be the result:
$result = 'app_le_s_and_orän_ges';
This is because ä is part of [[:alpha:]] in this particular locale. The obvious way to circumvent this is to replace the [:alpha:] class with an explicit a-zA-Z range, i.e. [^a-zA-Z].
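As a rough sketch of that locale-independent variant, the two patterns can be wrapped in a small function. The name slugify_ascii is hypothetical (it does not come from the answer), and the sketch assumes that only ASCII letters should survive. preg_replace applies array patterns in order, so & becomes "and" before the second pattern runs.

```php
<?php
// Hypothetical helper for illustration; the name is not from the original answer.
function slugify_ascii(string $input): string
{
    $patterns = [
        '/&/'          => 'and', // ampersand to "and" (applied first)
        '/[^a-zA-Z]+/' => '_',   // each run of non-ASCII-letters to one underscore
    ];

    return preg_replace(array_keys($patterns), array_values($patterns), $input);
}

echo slugify_ascii("te xt!@ na#@)(me+=&t^*ext?>;.'na^%me"), "\n";
// te_xt_na_me_andt_ext_na_me

echo slugify_ascii('app!le___s &! orän=%ges'), "\n";
// app_le_s_and_or_n_ges (regardless of locale)
```

Note that a string that starts or ends with a special character will get a leading or trailing underscore with this approach.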
This should do it:
$myvariable = str_replace('&', 'and', $myvariable);
$myvariable = preg_replace('/[^a-z]+/i', '_', $myvariable);
See: http://php.net/manual/de/function.preg-replace.php
The caret (^) inside the square brackets negates the class: it matches every character that is not listed in the brackets. So every special character counts as not being in "a-z". The plus sign means that a run of one or more such characters is matched as a single group. The 'i' after the closing delimiter slash makes the match case-insensitive.
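To see what each part of the pattern contributes, here is a small sketch that runs three variants against the same sample string. The sample string and the variable names are made up for illustration.

```php
<?php
$sample = 'A b!!c';

// With "+" and "i": each run of non-letters becomes one underscore.
$collapsed = preg_replace('/[^a-z]+/i', '_', $sample);

// Without "+": every single non-letter gets its own underscore.
$perCharacter = preg_replace('/[^a-z]/i', '_', $sample);

// Without "i": uppercase letters count as "not a-z" and are replaced too.
$caseSensitive = preg_replace('/[^a-z]+/', '_', $sample);

echo $collapsed, "\n";     // A_b_c
echo $perCharacter, "\n";  // A_b__c
echo $caseSensitive, "\n"; // _b_c
```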