How to strip Ruby string of all characters that are not word characters?
How do I strip a string in Ruby of all characters that aren't word characters (a-z, any digit), replacing them with a blank?
For instance, for the string "not-using-social-media" I want to strip this to "not using social media".
For the string "16 Surprising Small Business Statistics (Infographic)", I want to strip this to "16 Surprising Small Business Statistics Infographic".
This does not use a regex. It replaces every character that is not in "a-zA-Z0-9 " with a space, then squeezes runs of spaces down to one space and removes leading and trailing whitespace.
str = "not-using-social-media 16 Surprising Small Business Statistics (Infographic)"
p str.tr("^a-zA-Z0-9 ", " ").squeeze(" ").strip
#=>"not using social media 16 Surprising Small Business Statistics Infographic"
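To make the chain easier to follow, here is a minimal sketch that runs the same three calls one step at a time on the second example string from the question. The intermediate variable names (replaced, squeezed, cleaned) are only illustrative.

```ruby
str = "16 Surprising Small Business Statistics (Infographic)"

# 1. Replace every character outside a-z, A-Z, 0-9 and space with a space.
#    "(" and ")" each become a space, leaving a double space and a trailing one.
replaced = str.tr("^a-zA-Z0-9 ", " ")

# 2. Collapse each run of spaces into a single space.
squeezed = replaced.squeeze(" ")

# 3. Drop leading and trailing whitespace.
cleaned = squeezed.strip

p replaced #=> "16 Surprising Small Business Statistics  Infographic "
p squeezed #=> "16 Surprising Small Business Statistics Infographic "
p cleaned  #=> "16 Surprising Small Business Statistics Infographic"
```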
I would do either:
phrase = '16 Surprising Small Business Statistics (Infographic)'
p phrase.gsub(/[^a-zA-Z0-9]+/, ' ').strip
#=> "16 Surprising Small Business Statistics Infographic"
p phrase.gsub(/[^[:alnum:]]+/, ' ').strip
#=> "16 Surprising Small Business Statistics Infographic"
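The two patterns give the same result for the ASCII example above, but they can differ once the input contains accented letters. The sketch below assumes a UTF-8 string; the sample phrase is made up for illustration and is not from the question.

```ruby
# Hypothetical sample with non-ASCII letters (UTF-8 source assumed).
phrase = "Café-Menü (2024)"

# Explicit ASCII ranges treat "é" and "ü" as non-alphanumeric.
ascii_only = phrase.gsub(/[^a-zA-Z0-9]+/, ' ').strip

# The POSIX bracket class is Unicode-aware for UTF-8 strings in Ruby.
unicode_aware = phrase.gsub(/[^[:alnum:]]+/, ' ').strip

p ascii_only    #=> "Caf Men 2024"
p unicode_aware #=> "Café Menü 2024"
```

Which one is preferable depends on whether accented letters should count as word characters for your data.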
A couple of notes:
+ is added so that consecutive non-alphanumeric characters are replaced with a single space.
.strip is added on the assumption that you do not want the leading/trailing spaces this creates.
\w is not used in the regex, since that would also include underscores.
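As a small illustration of the underscore point, the sketch below compares \W (the negation of \w) with the explicit character class. The sample string is hypothetical.

```ruby
# Hypothetical sample containing an underscore.
phrase = "snake_case-name (v2)"

# \W matches anything that is not a letter, digit, or underscore,
# so the underscore survives.
with_w = phrase.gsub(/\W+/, ' ').strip

# The explicit class treats the underscore like any other punctuation.
with_class = phrase.gsub(/[^a-zA-Z0-9]+/, ' ').strip

p with_w     #=> "snake_case name v2"
p with_class #=> "snake case name v2"
```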
Regex is your friend - http://www.ruby-doc.org/core-1.9.3/Regexp.html
This is the bracket expression you'll want - /[[:alpha:]]/
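One thing to watch when applying this: [[:alpha:]] matches letters only, so digits are removed as well. If digits should be kept, as in the question, [[:alnum:]] may be the closer fit. A minimal sketch, using the example string from the question:

```ruby
phrase = "16 Surprising Small Business Statistics (Infographic)"

# [[:alpha:]] covers letters only, so the leading "16" is replaced too.
letters_only = phrase.gsub(/[^[:alpha:]]+/, ' ').strip

# [[:alnum:]] covers letters and digits.
letters_and_digits = phrase.gsub(/[^[:alnum:]]+/, ' ').strip

p letters_only       #=> "Surprising Small Business Statistics Infographic"
p letters_and_digits #=> "16 Surprising Small Business Statistics Infographic"
```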
The easiest solution is simply using delete with a leading ^ in its argument. It deletes everything except the characters that come after the ^.
a='hello-world+'
a.delete('^A-Za-z') #=> 'helloworld'
a='Hello +World'
a.delete('^A-Za-z ') #=> 'Hello World'
a='01234 ABC'
a.delete('^0-9') #=> '01234'
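Note that delete removes the unwanted characters outright rather than replacing them with a blank, so words separated only by punctuation get joined together. A short sketch of the difference, using the first example string from the question and tr for comparison:

```ruby
str = "not-using-social-media"

# delete drops the hyphens, so the words run together.
deleted = str.delete("^a-zA-Z0-9")

# tr swaps each hyphen for a space, which matches the output asked for.
replaced = str.tr("^a-zA-Z0-9", " ")

p deleted  #=> "notusingsocialmedia"
p replaced #=> "not using social media"
```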