Regex to find an uppercase substring in Groovy

Question

I'm using regex with Groovy (Grails) to find a substring that consists only of uppercase letters, underscores, and digits.

The regex

"THIS_WORD" ==~ /([A-Z_0-9]*)/

returns true, but the following statement

def str = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll{([A-Z_0-9]*)/}
println str

returns [W, W, T, H, I, S, _, W, O, R, D]

I need only the word THIS_WORD, not the letter W, which appears twice. What am I missing here?

Answer 1

Maybe you can use {2,} instead of * to get all matches with more than one character:

def str = "Wlkjjf als Wk;lfs fk THIS_WORD dsjf kjd".findAll(/[A-Z_0-9]{2,}/)

Answer 2

A * means 0 or more, whereas a + means 1 or more. To require 2 or more, use the {MIN,MAX} syntax after the [] character class, leaving MAX empty:

([A-Z0-9_]{2,})

After learning a bit about Groovy and testing on the Groovy console at http://groovyconsole.appspot.com/, I found that this works:

def str = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll(/([A-Z_0-9]{2,})/)
println str
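
To make the difference between the three quantifiers concrete, here is a small sketch that runs each of them over the string from the question. The variable names (text, star, plus, twoOrMore) are only illustrative.

```groovy
// Sample input taken from the question.
def text = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd"

// '*' allows zero characters, so besides the real matches it also yields
// an empty string at positions where no character of the class is found.
def star = text.findAll(/[A-Z_0-9]*/)

// '+' requires at least one character, so each lone capital W still matches.
def plus = text.findAll(/[A-Z_0-9]+/)

// '{2,}' requires at least two characters, which drops the single W's.
def twoOrMore = text.findAll(/[A-Z_0-9]{2,}/)

println plus       // [W, W, THIS_WORD]
println twoOrMore  // [THIS_WORD]
```

Note that {2,} filters by length only, so it would also skip a legitimate one-character token if the input contained one.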

Answer 3

def str = "Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll{([A-Z_0-9]*)/}

This doesn't compile. Perhaps you meant this:

"Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll(/[A-Z_0-9]*/)

which gives

[W, , , , , , , , , , , , , , , , W, , , , , , , , , , , , THIS_WORD, , , , , , , , , , , , , , ]

If you are looking for all upper-case words, a regex like this will work better:

"Wlkjjf alkjdfas Wk;ljdfs fk THIS_WORD dsklafjf kjd".findAll(/\b[A-Z_0-9]+\b/)
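
The word-boundary version and the {2,} version can behave differently on other input. The sketch below uses a made-up sample string (not from the question) to show where they diverge, plus one possible way to combine the two ideas.

```groovy
// Hypothetical sample text, chosen to show where the two patterns differ.
def sample = "Parse HTTPServer with MAX_SIZE or I"

// Length-only filter: also picks up the uppercase run inside "HTTPServer".
def atLeastTwo = sample.findAll(/[A-Z_0-9]{2,}/)

// Word boundaries: only whole words made of the class, including one-letter words.
def wholeWords = sample.findAll(/\b[A-Z_0-9]+\b/)

// Both constraints together: whole words with at least two characters.
def both = sample.findAll(/\b[A-Z_0-9]{2,}\b/)

println atLeastTwo  // [HTTPS, MAX_SIZE]
println wholeWords  // [MAX_SIZE, I]
println both        // [MAX_SIZE]
```

Which variant fits best depends on whether single-character words and uppercase runs inside mixed-case words should count as matches.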