
Splitting a string into words and punctuation with Ruby

I'm working in Ruby and I want to split a string and its punctuation into an array, but I want to treat apostrophes and hyphens as parts of words. For example,

s = "here...is a     happy-go-lucky string that I'm writing"

should become

["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"].

My best attempt is still inadequate because it doesn't treat hyphens and apostrophes as part of the word.

This is the closest I've gotten so far:

s.scan(/\w+|\W+/).select {|x| x.match(/\S/)}

which yields

["here", "...", "is", "a", "happy", "-", "go", "-", "lucky", "string", "that", "I", "'", "m", "writing"]

You can try the following:

s.scan(/[\w'-]+|[[:punct:]]+/)
#=> ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]

You were close:

s.scan(/[\w'-]+|[.,!?]+/)

The idea is that we match either words, possibly containing ' or -, or runs of punctuation characters.
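As a rough sketch of how the two patterns above differ, the snippet below runs both against the question's string and against a second string containing a semicolon. The helper name tokenize, the constant names, and the second test string are made up for illustration and are not part of the original answers.

```ruby
# Hypothetical helper for illustration; it simply wraps String#scan.
def tokenize(str, pattern)
  str.scan(pattern)
end

# Words (with ' and -) or runs of any POSIX punctuation character.
POSIX_PUNCT = /[\w'-]+|[[:punct:]]+/
# Words (with ' and -) or runs of . , ! ? only.
BASIC_PUNCT = /[\w'-]+|[.,!?]+/

s = "here...is a     happy-go-lucky string that I'm writing"
p tokenize(s, POSIX_PUNCT)
p tokenize(s, BASIC_PUNCT)

# Assumed extra input: the semicolon is kept by POSIX_PUNCT
# but silently skipped by BASIC_PUNCT.
t = "wait; what?!"
p tokenize(t, POSIX_PUNCT)
p tokenize(t, BASIC_PUNCT)
```

Both patterns give the same result for the question's string. They diverge only when the input contains punctuation outside . , ! ?, so the choice depends on whether such characters should be kept as tokens or dropped.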

After nearly giving up and then tinkering some more, I appear to have solved the puzzle. This seems to work: s.scan(/[\w'-]+|\W+/).select {|x| x.match(/\S/)} It yields ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"].

Is there an even cleaner way to do it though, without having to use #select?
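One possible way to see why #select is needed at all: \W+ also matches runs of spaces, so whitespace-only tokens end up in the result and have to be filtered out. The sketch below assumes that swapping \W+ for [^\w\s]+ (non-word, non-space characters) is acceptable. The method names are hypothetical.

```ruby
# Hypothetical names; the first method is the approach from the comment above.
def split_with_select(str)
  # \W+ also captures whitespace runs, so drop tokens with no non-space character.
  str.scan(/[\w'-]+|\W+/).select { |x| x.match(/\S/) }
end

def split_without_select(str)
  # [^\w\s]+ never matches whitespace, so no filtering step is needed.
  str.scan(/[\w'-]+|[^\w\s]+/)
end

s = "here...is a     happy-go-lucky string that I'm writing"
p split_with_select(s)
p split_without_select(s)
```

The two versions agree on the question's string. They can differ when punctuation is directly followed by a space (for example ", "), because \W+ keeps the trailing space inside the token while [^\w\s]+ does not.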

Use the split method.

Example:

str = "word, anotherWord, foo"
# Split on comma plus space so the pieces carry no leading whitespace.
puts str.split(", ")

It prints

word
anotherWord
foo

Hope it works for you!

You can also check this: http://ruby.about.com/od/advancedruby/a/split.htm
