How do I strip a Ruby string of all characters that aren't word characters?

Question

How do I strip a string in Ruby of all characters that aren't word characters (a-z, any digit), replacing them with a space?

For instance, for the string "not-using-social-media" I want to end up with "not using social media".

For the string "16 Surprising Small Business Statistics (Infographic)", I want to end up with "16 Surprising Small Business Statistics Infographic".

Answer 1

This does not use a regex. It replaces every character that is not in "a-zA-Z0-9 " with a space, then squeezes runs of spaces into a single space and removes leading and trailing whitespace.

str = "not-using-social-media 16 Surprising Small Business Statistics (Infographic)"
p str.tr("^a-zA-Z0-9 ", " ").squeeze(" ").strip
#=>"not using social media 16 Surprising Small Business Statistics Infographic"
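To see what each call in the chain contributes, the steps can be run one at a time. This is only a sketch; the variable names replaced, squeezed and cleaned are made up for illustration, and the input is a shortened version of the string above.

```ruby
str = "not-using-social-media (Infographic)"

# A leading ^ negates the set: every character that is NOT a letter,
# digit or space is turned into a space.
replaced = str.tr("^a-zA-Z0-9 ", " ")

# squeeze(" ") collapses each run of spaces into a single space.
squeezed = replaced.squeeze(" ")

# strip removes leading and trailing whitespace.
cleaned = squeezed.strip

p replaced
p squeezed
p cleaned
#=> "not using social media Infographic"
```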

Answer 2

I would do either of the following:

phrase = '16 Surprising Small Business Statistics (Infographic)'

p phrase.gsub(/[^a-zA-Z0-9]+/, ' ').strip
#=> "16 Surprising Small Business Statistics Infographic"

p phrase.gsub(/[^[:alnum:]]+/, ' ').strip
#=> "16 Surprising Small Business Statistics Infographic"

A couple of notes:

The + is added so that consecutive non-alphanumeric characters are replaced with a single space.
The .strip is added on the assumption that you do not want the leading/trailing spaces this creates.
The regex does not use \w, since that would also match underscores.
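The notes above can be checked with a small comparison. This is a sketch with an invented input string; the variable names are illustrative only.

```ruby
phrase = "snake_case--and (parens)"

# \W is the negation of \w, and \w includes the underscore,
# so the underscore survives here.
with_w = phrase.gsub(/\W+/, ' ').strip

# The explicit class treats the underscore as a separator as well.
explicit = phrase.gsub(/[^a-zA-Z0-9]+/, ' ').strip

# Without the +, each non-alphanumeric character becomes its own space.
no_plus = phrase.gsub(/[^a-zA-Z0-9]/, ' ').strip

p with_w   #=> "snake_case and parens"
p explicit #=> "snake case and parens"
p no_plus  #=> "snake case  and  parens"
```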

Answer 3

Regex is your friend: http://www.ruby-doc.org/core-1.9.3/Regexp.html

This is the bracket expression you'll want: /[[:alpha:]]/ (note that it matches letters only; /[[:alnum:]]/ matches digits as well)
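The answer names only the character class, so here is one way it might be applied, assuming it is negated inside gsub as in Answer 2. The variable names are illustrative. The last line also shows that POSIX bracket classes are Unicode-aware in Ruby, which an explicit a-zA-Z0-9 range is not.

```ruby
phrase = "16 Surprising Small Business Statistics (Infographic)"

# Negating [[:alpha:]] removes everything that is not a letter,
# which also drops the leading "16".
letters_only = phrase.gsub(/[^[:alpha:]]+/, ' ').strip

# Negating [[:alnum:]] keeps both letters and digits.
letters_and_digits = phrase.gsub(/[^[:alnum:]]+/, ' ').strip

# [[:alnum:]] also matches non-ASCII letters such as é (written as \u00e9).
accented = "caf\u00e9-au-lait".gsub(/[^[:alnum:]]+/, ' ')

p letters_only       #=> "Surprising Small Business Statistics Infographic"
p letters_and_digits #=> "16 Surprising Small Business Statistics Infographic"
p accented           #=> "café au lait"
```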

Answer 4

The easiest solution is simply to use delete with a negated character set, as in delete('^...'). It deletes every character except those listed after the ^.

a='hello-world+'

a.delete('^A-Za-z')  #=> 'helloworld'

a='Hello +World'

a.delete('^A-Za-z ') #=> 'Hello World'

a='01234 ABC'
a.delete('^0-9') #=> '01234'
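One caveat when applying this to the question: delete removes characters rather than replacing them with a space. A short sketch of the difference, using the question's first example (the variable names are illustrative):

```ruby
str = "not-using-social-media"

# delete drops the unwanted characters outright, so the words run together.
deleted = str.delete("^a-zA-Z0-9 ")

# tr accepts the same negated set but substitutes a space for each match.
translated = str.tr("^a-zA-Z0-9 ", " ")

p deleted    #=> "notusingsocialmedia"
p translated #=> "not using social media"
```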

4 answers

Answer 1
4 2013-11-04 22:34:51

Answer 2
1 2013-11-04 22:28:00

Answer 3
0 2013-11-04 22:20:18

Answer 4
0 2013-11-05 04:05:11
