
Regex capturing from a non-capturing group in Ruby

I am trying to fix a regex I use in a Lita chatops bot. I have the following regex:

/^(?:how\s+do\s+I\s+you\s+get\s+far\s+is\s+it\s+from\s+)?(.+)\s+to\s+(.+)/i

This is supposed to capture the words before and after 'to', with optional words in front that can form questions like: How do I get from x to y, how far from x to y, how far is it from x to y.

expected output:

match 1 : "x"
match 2 : "y"

For the most part my optional words work as expected. But when I pull my response matches, I get the words leading up to the first capture group included.

So, "how far is it from sfo to lax" should return:

sfo and lax

But it instead returns:

how far is it from sfo and lax

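The problem is that the first part of your regex doesn't do what you intend. As written, the optional group matches only the exact sequence "how do I you get far is it from", which no real question contains. The group is therefore skipped, and the first (.+) captures everything before "to", including the question words.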
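To see why the prefix ends up in the first capture, here is a small Ruby sketch that runs the pattern from the question unchanged. The constant name ORIGINAL_PATTERN and the helper original_captures are made up for this illustration and are not part of Lita.

```ruby
# The pattern from the question, copied as-is.
ORIGINAL_PATTERN = /^(?:how\s+do\s+I\s+you\s+get\s+far\s+is\s+it\s+from\s+)?(.+)\s+to\s+(.+)/i

# Hypothetical helper: returns the two captures, or nil when nothing matches.
def original_captures(text)
  match = ORIGINAL_PATTERN.match(text)
  match && match.captures
end

# The optional group needs every word in sequence, so it is skipped here
# and the first (.+) swallows the whole prefix.
p original_captures("how far is it from sfo to lax")
# => ["how far is it from sfo", "lax"]

# Only the literal word sequence is actually stripped.
p original_captures("how do I you get far is it from sfo to lax")
# => ["sfo", "lax"]
```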

To choose from multiple options, use this syntax:

(a|b|c)

What I think you're trying to do is this:

/^(?:(?:how|do|I|you|get|far|is|it|from)\s+)*(.+)\s+to\s+(.+)/i

This regexp skips any leading run of the listed words, in any order and any number of times, before capturing the text on either side of "to".
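A quick way to check the alternation-based pattern is to run it against the three question forms from the original post. This is a minimal sketch; ROUTE_PATTERN and parse_route are illustrative names, not anything Lita provides.

```ruby
# The alternation-based pattern suggested above.
ROUTE_PATTERN = /^(?:(?:how|do|I|you|get|far|is|it|from)\s+)*(.+)\s+to\s+(.+)/i

# Hypothetical helper: returns [origin, destination], or nil when nothing matches.
def parse_route(text)
  match = ROUTE_PATTERN.match(text)
  match && match.captures
end

p parse_route("how far is it from sfo to lax")  # => ["sfo", "lax"]
p parse_route("How do I get from sfo to lax")   # => ["sfo", "lax"]
p parse_route("how far from sfo to lax")        # => ["sfo", "lax"]
p parse_route("sfo to lax")                     # => ["sfo", "lax"]
```

Because the inner group is non-capturing, the filler words never show up in the captures; only the two (.+) groups are returned.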

If you want to preserve word order, you can use regexps such as this pseudocode:

… how (can|do|will) (I|you|we) (get|go|travel) from …
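One possible way to turn that pseudocode into a working Ruby pattern is sketched below. It assumes the question always starts with "how" and that the word choices are exactly the ones listed in the pseudocode; ORDERED_PATTERN and parse_ordered_route are hypothetical names chosen for this example.

```ruby
# Order-preserving variant: each slot accepts one of a few words,
# and the slots must appear in this sequence.
ORDERED_PATTERN = /\Ahow\s+(?:can|do|will)\s+(?:I|you|we)\s+(?:get|go|travel)\s+from\s+(.+)\s+to\s+(.+)/i

# Hypothetical helper: returns [origin, destination], or nil when nothing matches.
def parse_ordered_route(text)
  match = ORDERED_PATTERN.match(text)
  match && match.captures
end

p parse_ordered_route("How do I get from sfo to lax")      # => ["sfo", "lax"]
p parse_ordered_route("how will we travel from sfo to lax") # => ["sfo", "lax"]

# Same words in a different order are rejected.
p parse_ordered_route("get how I do from sfo to lax")       # => nil
```

The trade-off is that this version is stricter: phrasings such as "how far is it from x to y" would need their own alternatives added.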

When you want to match words, \w is the most natural pattern to use (e.g., it is used in word-count tools).

To capture a single word before and after a "to", together with the "to" itself, you can use the regex (\w+\sto\s+\w*).

To return the two words as separate groups, you can use (\w+)\s+to\s+(\w+).
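As a rough illustration of the two-group version in Ruby (written with single backslashes, as a regex literal requires), the sketch below uses the made-up names WORD_PAIR and word_pair. Note that \w+ captures only one word on each side, so multi-word place names are cut short.

```ruby
# One word on each side of "to", returned as two groups.
WORD_PAIR = /(\w+)\s+to\s+(\w+)/

# Hypothetical helper: returns the two captured words, or nil when nothing matches.
def word_pair(text)
  match = WORD_PAIR.match(text)
  match && match.captures
end

# The pattern is unanchored, so the leading question words are simply ignored.
p word_pair("how far is it from sfo to lax")
# => ["sfo", "lax"]

# Limitation: only the words directly adjacent to "to" are captured.
p word_pair("from san francisco to los angeles")
# => ["francisco", "los"]
```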

Have a look at the demo .
