
How to strip punctuation in PHP

How can I strip all punctuation except for these characters: . = $ ' - %

Here is a neat way to do it:

preg_replace("#[[:punct:]]#", "", $target);
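Note that `[[:punct:]]` on its own also removes the characters the question wants to keep. As a minimal sketch, assuming ASCII input, a negative lookahead can exempt them. The function name `strip_ascii_punct` and the sample string are made up for illustration and are not part of the answer above.

```php
<?php
// Hypothetical helper: strip ASCII punctuation except the characters
// listed inside the lookahead.
function strip_ascii_punct(string $target): string
{
    // (?![.=$'%-]) fails on a kept character, so [[:punct:]] never consumes it.
    return preg_replace("#(?![.=$'%-])[[:punct:]]#", "", $target);
}

$sample = "It's 50% off: \$5.99!";
echo preg_replace("#[[:punct:]]#", "", $sample), "\n"; // Its 50 off 599
echo strip_ascii_punct($sample), "\n";                  // It's 50% off $5.99
```

The first call shows the plain pattern removing everything, including the apostrophe, percent sign, dollar sign and dot; the second keeps them.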

Since you need to match some Unicode characters (€), it would be sensible to use a regular expression. The pattern \p{P} matches any known punctuation, and the negative lookahead assertion keeps your desired special characters from being removed:

$text = preg_replace("/(?![.=$'€%-])\p{P}/u", "", $text);
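To see what this pattern does in practice, here is a small sketch with made-up sample strings. The pattern is the one from the answer; the helper names are hypothetical. One thing worth knowing: `$`, `=` and `€` belong to the Unicode symbol categories (`\p{S}`), not punctuation, so `\p{P}` leaves them alone even without the lookahead. The second helper is an assumed variant for the case where symbols such as `+` or `<` should go as well.

```php
<?php
// Pattern copied from the answer; helper names and samples are illustrative.
function strip_unicode_punct(string $text): string
{
    return preg_replace("/(?![.=$'€%-])\p{P}/u", "", $text);
}

// Hypothetical variant that also removes symbols (\p{S}) such as + or <.
function strip_punct_and_symbols(string $text): string
{
    return preg_replace("/(?![.=$'€%-])[\p{P}\p{S}]/u", "", $text);
}

echo strip_unicode_punct("Wait... it's 50% off (today-only)! Price: €5,99 = \$6.50?"), "\n";
// Wait... it's 50% off today-only Price €599 = $6.50
echo strip_unicode_punct("a+b<c"), "\n";            // a+b<c (+ and < are symbols)
echo strip_punct_and_symbols("a+b<c = \$5"), "\n";  // abc = $5
```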
<?php
$whatToStrip = array("?", "!", ",", ";"); // Add whatever you want to strip to this array
$test = "Hi! Am I here?";
echo $test . "\n\n";
echo str_replace($whatToStrip, "", $test);

Demo here

Or, of course, shorter:

$test = str_replace(array("?","!",",",";"), "", $test);

Source: the first example in the str_replace manual.
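For reference, a short sketch of what the str_replace approach produces. The wrapper name `strip_chars` and the second sample are assumptions for illustration. Although str_replace works on bytes, it is safe with UTF-8 text as long as each needle is a complete character.

```php
<?php
// Hypothetical wrapper around the str_replace call shown above.
function strip_chars(string $text, array $chars): string
{
    return str_replace($chars, '', $text);
}

echo strip_chars("Hi! Am I here?", ['?', '!', ',', ';']), "\n"; // Hi Am I here
echo strip_chars("“Quoted”… text", ['“', '”', '…']), "\n";      // Quoted text
```

This only removes what is listed, so any punctuation you forget to list stays in the string.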

preg_replace("/[^-\w\d\s.=$'€%]/u", '', $subject);

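That said, it would be more correct and easier to specify the characters you want stripped rather than the ones you don't want stripped (from an unknown set).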

Try:

preg_replace("/[^\w\p{L}\p{N}\p{Pd}\$.€%'-]/u", "", 'YOUR DATA');

You didn't mention whether you wanted to keep spaces, so this will strip them too.
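If spaces should survive, one option is to add `\s` to the allowed set. The following is only a sketch under that assumption; the function name `keep_allowed` and the sample string are made up.

```php
<?php
// Hypothetical helper: keep letters, digits, whitespace and the listed
// characters; remove everything else.
function keep_allowed(string $data): string
{
    // The hyphen sits last in the class so it is a literal, not a range operator.
    return preg_replace('/[^\p{L}\p{N}\s.=$\'€%-]/u', '', $data);
}

echo keep_allowed("Héllo, wörld! 100% = \$5 (approx.)"), "\n";
// Héllo wörld 100% = $5 approx.
```

The `u` modifier makes the pattern treat the input as UTF-8, so accented letters are matched by `\p{L}` as whole characters.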

The problem:

I need to save a string as alphanumeric text with specific punctuation, and I don't want to completely discard characters with diacritics.

The solution:

class ClassName {

  protected static $cleanChars = array(
    '&lt;' => '', '&gt;' => '', '&#039;' => '', '&amp;' => '',
    '&quot;' => '', 'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'Ae',
    '&Auml;' => 'A', 'Å' => 'A', 'Ā' => 'A', 'Ą' => 'A', 'Ă' => 'A', 'Æ' => 'Ae',
    'Ç' => 'C', 'Ć' => 'C', 'Č' => 'C', 'Ĉ' => 'C', 'Ċ' => 'C', 'Ď' => 'D', 'Đ' => 'D',
    'Ð' => 'D', 'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E', 'Ē' => 'E',
    'Ę' => 'E', 'Ě' => 'E', 'Ĕ' => 'E', 'Ė' => 'E', 'Ĝ' => 'G', 'Ğ' => 'G',
    'Ġ' => 'G', 'Ģ' => 'G', 'Ĥ' => 'H', 'Ħ' => 'H', 'Ì' => 'I', 'Í' => 'I',
    'Î' => 'I', 'Ï' => 'I', 'Ī' => 'I', 'Ĩ' => 'I', 'Ĭ' => 'I', 'Į' => 'I',
    'İ' => 'I', 'IJ' => 'IJ', 'Ĵ' => 'J', 'Ķ' => 'K', 'Ł' => 'L', 'Ľ' => 'L',
    'Ĺ' => 'L', 'Ļ' => 'L', 'Ŀ' => 'L', 'Ñ' => 'N', 'Ń' => 'N', 'Ň' => 'N',
    'Ņ' => 'N', 'Ŋ' => 'N', 'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O',
    'Ö' => 'Oe', '&Ouml;' => 'Oe', 'Ø' => 'O', 'Ō' => 'O', 'Ő' => 'O', 'Ŏ' => 'O',
    'Œ' => 'OE', 'Ŕ' => 'R', 'Ř' => 'R', 'Ŗ' => 'R', 'Ś' => 'S', 'Š' => 'S',
    'Ş' => 'S', 'Ŝ' => 'S', 'Ș' => 'S', 'Ť' => 'T', 'Ţ' => 'T', 'Ŧ' => 'T',
    'Ț' => 'T', 'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'Ue', 'Ū' => 'U',
    '&Uuml;' => 'Ue', 'Ů' => 'U', 'Ű' => 'U', 'Ŭ' => 'U', 'Ũ' => 'U', 'Ų' => 'U',
    'Ŵ' => 'W', 'Ý' => 'Y', 'Ŷ' => 'Y', 'Ÿ' => 'Y', 'Ź' => 'Z', 'Ž' => 'Z',
    'Ż' => 'Z', 'Þ' => 'T', 'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a',
    'ä' => 'ae', '&auml;' => 'ae', 'å' => 'a', 'ā' => 'a', 'ą' => 'a', 'ă' => 'a',
    'æ' => 'ae', 'ç' => 'c', 'ć' => 'c', 'č' => 'c', 'ĉ' => 'c', 'ċ' => 'c',
    'ď' => 'd', 'đ' => 'd', 'ð' => 'd', 'è' => 'e', 'é' => 'e', 'ê' => 'e',
    'ë' => 'e', 'ē' => 'e', 'ę' => 'e', 'ě' => 'e', 'ĕ' => 'e', 'ė' => 'e',
    'ƒ' => 'f', 'ĝ' => 'g', 'ğ' => 'g', 'ġ' => 'g', 'ģ' => 'g', 'ĥ' => 'h',
    'ħ' => 'h', 'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i', 'ī' => 'i',
    'ĩ' => 'i', 'ĭ' => 'i', 'į' => 'i', 'ı' => 'i', 'ij' => 'ij', 'ĵ' => 'j',
    'ķ' => 'k', 'ĸ' => 'k', 'ł' => 'l', 'ľ' => 'l', 'ĺ' => 'l', 'ļ' => 'l',
    'ŀ' => 'l', 'ñ' => 'n', 'ń' => 'n', 'ň' => 'n', 'ņ' => 'n', 'ʼn' => 'n',
    'ŋ' => 'n', 'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'oe',
    '&ouml;' => 'oe', 'ø' => 'o', 'ō' => 'o', 'ő' => 'o', 'ŏ' => 'o', 'œ' => 'oe',
    'ŕ' => 'r', 'ř' => 'r', 'ŗ' => 'r', 'š' => 's', 'ù' => 'u', 'ú' => 'u',
    'û' => 'u', 'ü' => 'ue', 'ū' => 'u', '&uuml;' => 'ue', 'ů' => 'u', 'ű' => 'u',
    'ŭ' => 'u', 'ũ' => 'u', 'ų' => 'u', 'ŵ' => 'w', 'ý' => 'y', 'ÿ' => 'y',
    'ŷ' => 'y', 'ž' => 'z', 'ż' => 'z', 'ź' => 'z', 'þ' => 't', 'ß' => 'ss',
    'ſ' => 'ss', 'ый' => 'iy', 'А' => 'A', 'Б' => 'B', 'В' => 'V', 'Г' => 'G',
    'Д' => 'D', 'Е' => 'E', 'Ё' => 'YO', 'Ж' => 'ZH', 'З' => 'Z', 'И' => 'I',
    'Й' => 'Y', 'К' => 'K', 'Л' => 'L', 'М' => 'M', 'Н' => 'N', 'О' => 'O',
    'П' => 'P', 'Р' => 'R', 'С' => 'S', 'Т' => 'T', 'У' => 'U', 'Ф' => 'F',
    'Х' => 'H', 'Ц' => 'C', 'Ч' => 'CH', 'Ш' => 'SH', 'Щ' => 'SCH', 'Ъ' => '',
    'Ы' => 'Y', 'Ь' => '', 'Э' => 'E', 'Ю' => 'YU', 'Я' => 'YA', 'а' => 'a',
    'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd', 'е' => 'e', 'ё' => 'yo',
    'ж' => 'zh', 'з' => 'z', 'и' => 'i', 'й' => 'y', 'к' => 'k', 'л' => 'l',
    'м' => 'm', 'н' => 'n', 'о' => 'o', 'п' => 'p', 'р' => 'r', 'с' => 's',
    'т' => 't', 'у' => 'u', 'ф' => 'f', 'х' => 'h', 'ц' => 'c', 'ч' => 'ch',
    'ш' => 'sh', 'щ' => 'sch', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'e',
    'ю' => 'yu', 'я' => 'ya'
  );

  public static function clean($string, $allowed=array(), $base="a-zA-Z0-9 "){
    if(empty($allowed) && !$base){ return false; }
    $ignore = "";
    if(is_array($allowed)){
      foreach($allowed as $a){
        $ignore .= preg_quote($a, '/'); // also escape the pattern delimiter
      }
    }
    return preg_replace( "/[^{$base}{$ignore}\s]/", "", $string );
  }

  public static function alphaNum($string, $allowed=array(), $convert=false){
    if($convert){
      $string = strtr($string, self::$cleanChars);
    }
    return self::clean($string, $allowed, 'a-zA-Z0-9 ');
  }

}

Examples:

Strip all punctuation:

ClassName::alphaNum($string);

Strip all punctuation but convert special chars:

ClassName::alphaNum($string, null, true);

Alphanumeric plus additional punctuation:

ClassName::alphaNum($string, array('_', '-', ',', '.'));

Alphanumeric plus additional punctuation, with conversion:

ClassName::alphaNum($string, array('_', '-', ',', '.'), true);

Conclusion: If you expect special characters and don't want to discard them completely, you can convert them before applying alphaNum (e.g. when sanitizing file names).

If discarding the special characters has no real impact and they are not really expected on the system, you can call it without conversion to save processing power (e.g. when setting keys for large arrays from strings).
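To make the difference between the two modes concrete, here is a condensed, standalone sketch. It is not the class above: `TRANSLIT` is a hypothetical, heavily shortened stand-in for the `$cleanChars` table, and `alpha_num` is an assumed function name that mirrors the logic of `alphaNum`.

```php
<?php
// Hypothetical, heavily shortened stand-in for the $cleanChars table.
const TRANSLIT = ['ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss', 'é' => 'e'];

function alpha_num(string $string, array $allowed = [], bool $convert = false): string
{
    if ($convert) {
        // Transliterate first so accented letters survive as ASCII.
        $string = strtr($string, TRANSLIT);
    }
    $ignore = '';
    foreach ($allowed as $a) {
        $ignore .= preg_quote($a, '/');
    }
    // preg_replace() returns null on failure (e.g. invalid UTF-8 with /u).
    return preg_replace('/[^a-zA-Z0-9\s' . $ignore . ']/u', '', $string) ?? '';
}

$name = 'Größe_für-Café.txt';
echo alpha_num($name, ['_', '-', '.'], true), "\n"; // Groesse_fuer-Cafe.txt
echo alpha_num($name, ['_', '-', '.']), "\n";       // Gre_fr-Caf.txt
```

With conversion the file name stays readable; without it the accented letters are simply dropped.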

I got the cleanChars variable from here (slightly modified): https://github.com/vanillaforums/Garden/blob/master/library/core/class.format.php
