Removing special keyboard characters/shapes with regex or?

I am using YQL to scrape some data and then parsing it into Amazon's SimpleDB. I get errors when inserting certain titles into the DB, because some titles in the XML file I am parsing contain characters like the ones below.

◆ ▒ ♠ ✖ ¸ . ´ ¨

I am sure those are not all the possible special characters; they are just the ones I have noticed so far that cause the errors.

These are not standard keyboard characters. Is there a simple way to remove or disallow these types of characters (regex, etc.) without finding every one of them and including them in a regex?

Thanks

$text = preg_replace('/[^a-zA-Z0-9_ -]/s', '', $text);

This strips your text so that it contains only ASCII letters, digits, spaces, underscores, and hyphens. Note that it also removes accented and other non-ASCII letters.

Reference: http://www.phpfreaks.com/forums/index.php?topic=223131.0
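
If titles may legitimately contain non-ASCII letters, a Unicode-aware whitelist may be a better fit than the ASCII-only pattern above. The following is only a sketch, assuming the input is valid UTF-8; the function names `strip_to_ascii` and `strip_symbols` are hypothetical and not part of the original answer. It uses PCRE's `\p{L}` (letters) and `\p{N}` (numbers) property classes together with the `u` modifier, so symbols are removed without listing each one.

```php
<?php
// Hypothetical helpers for illustration; neither name comes from the original answer.

// ASCII whitelist, same character class as the pattern in the answer:
// keeps only ASCII letters, digits, underscore, space, and hyphen.
function strip_to_ascii(string $text): string
{
    return preg_replace('/[^a-zA-Z0-9_ -]/', '', $text) ?? '';
}

// Unicode-aware variant: keeps letters and digits from any script and
// removes symbols such as the shapes in the question.
// Assumes $text is valid UTF-8; preg_replace() returns null on invalid
// input when the u modifier is used, in which case '' is returned here.
function strip_symbols(string $text): string
{
    return preg_replace('/[^\p{L}\p{N}_ -]/u', '', $text) ?? '';
}

echo strip_to_ascii("◆Café Deals✖"), "\n"; // prints "Caf Deals"
echo strip_symbols("◆Café Deals✖"), "\n";  // prints "Café Deals"
```

Which variant to use depends on what the database accepts: the ASCII version is the most conservative, while the Unicode version preserves readable titles in other languages.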
