I'm working in Ruby and I want to split a string and its punctuation into an array, but I want to consider apostrophes and hyphens as parts of words. For example,
s = "here...is a happy-go-lucky string that I'm writing"
should become
["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"].
The closest I've gotten so far is the following, which is still inadequate because it doesn't treat hyphens and apostrophes as part of the word:
s.scan(/\w+|\W+/).select {|x| x.match(/\S/)}
which yields
["here", "...", "is", "a", "happy", "-", "go", "-", "lucky", "string", "that", "I", "'", "m", "writing"]
You can try the following:
s.scan(/[\w'-]+|[[:punct:]]+/)
#=> ["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]
You were close:
s.scan(/[\w'-]+|[.,!?]+/)
The idea is to match either words, possibly containing ' or -, or runs of punctuation characters.
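To make the difference between the two patterns above concrete, here is a small sketch. The variable names and the second input string ("wait; what?") are made up for illustration and are not from the question. It suggests that [[:punct:]] keeps any punctuation as a token, while [.,!?] only keeps those four characters and skips everything else.

```ruby
s = "here...is a happy-go-lucky string that I'm writing"

# Word characters plus ' and -, or a run of any punctuation.
broad  = /[\w'-]+|[[:punct:]]+/
# Same word part, but only . , ! ? count as punctuation.
narrow = /[\w'-]+|[.,!?]+/

broad_tokens  = s.scan(broad)
narrow_tokens = s.scan(narrow)

# Hypothetical second input with a semicolon, to show where the two differ.
t = "wait; what?"
broad_semicolon  = t.scan(broad)   # keeps ";" as its own token
narrow_semicolon = t.scan(narrow)  # skips ";" because it is not in [.,!?]

p broad_tokens
p broad_semicolon
p narrow_semicolon
```

For the string in the question both patterns give the same array. Which one to prefer depends on whether other punctuation should be kept or dropped.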
After nearly giving up and then tinkering some more, I appear to have solved the puzzle. This seems to work:
s.scan(/[\w'-]+|\W+/).select {|x| x.match(/\S/)}
It yields
["here", "...", "is", "a", "happy-go-lucky", "string", "that", "I'm", "writing"]
Is there an even cleaner way to do it though, without having to use #select?
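One possible way to drop the #select step, as a sketch: the select only exists to throw away whitespace-only matches produced by \W+, so matching "neither word character nor whitespace" with [^\w\s]+ should avoid producing them in the first place. The variable name tokens is just for illustration.

```ruby
s = "here...is a happy-go-lucky string that I'm writing"

# First alternative: words that may contain ' or -.
# Second alternative: runs of characters that are neither word characters
# nor whitespace, so whitespace is never captured and no filtering is needed.
tokens = s.scan(/[\w'-]+|[^\w\s]+/)

p tokens
```

One difference to be aware of: with \W+ a token such as "; " keeps its trailing space, whereas here punctuation tokens never contain whitespace.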
Use the split method.
Example:
str = "word, anotherWord, foo"
puts str.split(", ")
It returns
word
anotherWord
foo
Hope it works for you!
You can also check this: http://ruby.about.com/od/advancedruby/a/split.htm