
Ruby: Extracting Words From String

I'm trying to parse words out of a string and put them into an array. I've tried the following:

@string1 = "oriented design, decomposition, encapsulation, and testing. Uses "
puts @string1.scan(/\s([^\,\.\s]*)/)

It seems to do the trick, but it's a bit shaky (for example, I would need to include more special characters). Is there a better way to do this in Ruby?

Optional: I have a CS course description. I intend to extract all the words from it into an array of strings, remove the most common English words from that array, and then use the remaining words as tags that users can use to search for CS courses.

The split command.

   words = @string1.split(/\W+/)

will split the string into an array based on a regular expression. \W means any "non-word" character, and the "+" means a run of consecutive delimiters is treated as a single delimiter.
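
As a quick check of the behaviour described above, here is a minimal sketch that runs the split on the string from the question. It also shows `scan(/\w+/)`, an alternative that is not part of the answer, because `split` returns a leading empty string when the input starts with a delimiter. The variable names are only illustrative.

```ruby
text = "oriented design, decomposition, encapsulation, and testing. Uses "

# split treats each run of non-word characters as one delimiter;
# trailing empty strings are dropped automatically.
words = text.split(/\W+/)
# => ["oriented", "design", "decomposition", "encapsulation", "and", "testing", "Uses"]

# scan collects the runs of word characters directly.
scanned = text.scan(/\w+/)

# A string that starts with a delimiter gives split a leading "",
# while scan never returns empty strings.
leading_split = ", design".split(/\W+/)  # => ["", "design"]
leading_scan  = ", design".scan(/\w+/)   # => ["design"]
```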

For me, the best way to split sentences is:

line.split(/[^[[:word:]]]+/)

It works perfectly even with multilingual words and punctuation marks:

line = 'English words, Polski Żurek!!! crème fraîche...'
line.split(/[^[[:word:]]]+/)
=> ["English", "words", "Polski", "Żurek", "crème", "fraîche"] 
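
To see why the POSIX class matters, the sketch below compares it with `\W` on the same line. In Ruby 1.9 and later, `\w` and `\W` only consider ASCII letters, digits and underscore, so accented letters are treated as delimiters. The simpler spelling `[^[:word:]]` and the `scan` variant are my own additions, not part of the answer above.

```ruby
line = 'English words, Polski Żurek!!! crème fraîche...'

# \W is ASCII-only, so "Ż", "è" and "î" act as delimiters.
ascii_split = line.split(/\W+/)
# => ["English", "words", "Polski", "urek", "cr", "me", "fra", "che"]

# [[:word:]] is Unicode-aware on UTF-8 strings.
unicode_split = line.split(/[^[:word:]]+/)
# => ["English", "words", "Polski", "Żurek", "crème", "fraîche"]

# Equivalent approach: collect the word runs instead of splitting on the gaps.
unicode_scan = line.scan(/[[:word:]]+/)
```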

Well, you could split the string on spaces if that's your delimiter of interest

@string1.split(' ')

Or split on non-word characters or word boundaries

\W  # Any non-word character

\b  # Any word boundary (a zero-width position, not a character)

Or on whitespace

\s  # Any whitespace character

Hint: try testing each of these on http://rubular.com
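
If you would rather try them in irb, here is a small sketch of how the delimiters listed above behave on one sample string. The sample string and variable names are made up for illustration.

```ruby
sample = "design,  testing. Uses"   # note the two spaces after the comma

# A single-space string splits on runs of whitespace.
by_space = sample.split(' ')        # => ["design,", "testing.", "Uses"]

# /\s/ splits on every single whitespace character, leaving an empty string.
by_whitespace = sample.split(/\s/)  # => ["design,", "", "testing.", "Uses"]

# /\W/ splits on every single non-word character.
by_non_word = sample.split(/\W/)    # => ["design", "", "", "testing", "", "Uses"]

# /\b/ is zero-width, so nothing is consumed: the punctuation and spaces
# stay in the result as separate elements.
by_boundary = sample.split(/\b/)
```

Because `/\s/` and `/\W/` without a `+` leave empty strings behind, you either add the `+` or filter the empties out afterwards.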

Also note that Ruby 1.9 differs from 1.8 in some respects, for example in its regular expression engine and in how strings handle encodings.

For Rails, you can use the following:

@string1.split(/\s/).delete_if(&:blank?)
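
For the optional goal in the question (turning a course description into search tags), the pieces above could be combined roughly as follows. This is only a sketch in plain Ruby: `STOP_WORDS` is a deliberately tiny, made-up list and `extract_tags` is a hypothetical helper name, so treat both as placeholders for whatever you actually use.

```ruby
require 'set'

# Hypothetical stop-word list; a real list of common English words
# would be much longer.
STOP_WORDS = Set.new(%w[a an and the of to in is])

# Hypothetical helper: returns unique, lower-cased tags from a description.
def extract_tags(description, stop_words = STOP_WORDS)
  description
    .downcase
    .scan(/[[:word:]]+/)                          # Unicode-aware word runs
    .reject { |word| stop_words.include?(word) }  # drop common words
    .uniq                                         # one tag per distinct word
end

tags = extract_tags("oriented design, decomposition, encapsulation, and testing. Uses ")
# => ["oriented", "design", "decomposition", "encapsulation", "testing", "uses"]
```

A `Set` keeps the stop-word lookup cheap as the list grows, and lower-casing first means "Uses" and "uses" end up as the same tag.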


 