
Regex in Ruby to match first and second space of strings

I'm trying to write a regex that matches the first and second space, plus the ":" in the second column, of strings like the ones below, so that I can replace them with "|". The regex I built below does the opposite: it matches the words rather than the first and second space or the ":". Could someone give me a hand with this?

(\S+\s*) (\S+\s*) (\S+\s*)  # My current regex

The strings look like this:

Usw 12:12 Desktop
Usw 1:2 Netbooks
Usw 1:345 Servers, mainframes and supercomputers

I'd like to convert the strings above to this:

Usw|12|12|Desktop
Usw|1|2|Netbooks
Usw|1|345|Servers, mainframes and supercomputers

We can use gsub here:

input = "Usw 1:345 Servers, mainframes and supercomputers"
puts input.gsub(/(\S+)\s*(\d+):(\d+)\s*(.*)/, '\1|\2|\3|\4')

Usw|1|345|Servers, mainframes and supercomputers

The basic idea is to capture the non-space content before the first space, the two numbers on either side of the colon, and the remainder of the string after the second space. Then we build the desired output string from those capture groups.
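As a rough sketch of the same capture-group idea, the pattern can be written with named groups so that each captured column is labeled. The constant `LINE`, the method `to_pipes`, and the group names are made up for illustration and are not part of the answer above. This variant also anchors the pattern and requires at least one whitespace character between columns, which is stricter than the original `\s*`.

```ruby
# Hypothetical names (LINE, to_pipes, group names): assumptions for this sketch.
# Stricter than the pattern above: anchored, and \s+ instead of \s*.
LINE = /\A(?<name>\S+)\s+(?<left>\d+):(?<right>\d+)\s+(?<rest>.*)\z/

def to_pipes(line)
  m = LINE.match(line)
  return line unless m # leave lines that do not fit the layout untouched

  [m[:name], m[:left], m[:right], m[:rest]].join('|')
end

puts to_pipes("Usw 12:12 Desktop")
# prints Usw|12|12|Desktop
puts to_pipes("Usw 1:345 Servers, mainframes and supercomputers")
# prints Usw|1|345|Servers, mainframes and supercomputers
```

Returning the line unchanged when nothing matches is a design choice of this sketch; raising an error would be equally reasonable if malformed input should not pass silently.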

I believe the solution might be even simpler:

'Usw 12:12 Desktop'.split(/\s|:/, 4).join('|')
#⇒ "Usw|12|12|Desktop"

The above will fail if the first column contains colons, but I am pretty sure that is not the case here.
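To make that caveat concrete, here is a small sketch using an invented input whose first column contains a colon. The sample string and the helper names `split_join` and `anchored` are assumptions for illustration only. The anchored `sub` is one possible workaround, not something the answer above proposes.

```ruby
# Hypothetical helpers and input, used only to illustrate the caveat.
def split_join(line)
  # Cuts at the first three separators, wherever they occur.
  line.split(/\s|:/, 4).join('|')
end

def anchored(line)
  # Only treats the colon between two digit runs in column two as a separator.
  line.sub(/\A(\S+)\s+(\d+):(\d+)\s+/, '\1|\2|\3|')
end

tricky = 'U:sw 12:12 Desktop'

puts split_join(tricky)
# prints U|sw|12|12 Desktop
puts anchored(tricky)
# prints U:sw|12|12|Desktop
```

With `split`, the colon inside the first column uses up one of the three cuts, so the last space is left unreplaced.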

You can use String#sub to replace the characters step by step:

str.sub(/\s/,'|').sub(/:/,'|').sub(/\s/,'|')

Tests

test = [ "Usw 12:12 Desktop",
         "Usw 1:2 Netbooks",
         "Usw 1:345 Servers, mainframes and supercomputers" ]

test.map { |str| str.sub(/\s/,'|').sub(/:/,'|').sub(/\s/,'|') }
 #=> [ "Usw|12|12|Desktop",
 #     "Usw|1|2|Netbooks",
 #     "Usw|1|345|Servers, mainframes and supercomputers" ]
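The chained approach relies on String#sub replacing only the first match, whereas String#gsub replaces every match. The sketch below contrasts the two on one of the sample lines; the method names `first_only` and `every_match` are invented for this comparison.

```ruby
# Hypothetical method names, used only to contrast sub with gsub.
def first_only(str)
  # Each sub call replaces just the first remaining match.
  str.sub(/\s/, '|').sub(/:/, '|').sub(/\s/, '|')
end

def every_match(str)
  # gsub replaces all whitespace and colons, including those in the last column.
  str.gsub(/[\s:]/, '|')
end

line = "Usw 1:345 Servers, mainframes and supercomputers"

puts first_only(line)
# prints Usw|1|345|Servers, mainframes and supercomputers
puts every_match(line)
# prints Usw|1|345|Servers,|mainframes|and|supercomputers
```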
