
Regex: matching up to the first occurrence of word with character 'a' in it

I need a regular expression to match the first word with the character 'a' in it on each line. For example, my test string is:

bbsc abcd aaaagdhskss
dsaa asdd aaaagdfhdghd
wwer wwww awww wwwd

Only the first word containing an 'a' on each line (abcd, dsaa, and awww) should be matched. How can I do that? I can match all the words with 'a' in them, but can't figure out how to match only the first occurrence.

Assuming the only characters in use are word characters (i.e. \w characters) and white space, use:

/^(?:[^a ]+ +)*([^a ]*a\w*)\b/gm
  1. ^ Matches the start of the line
  2. (?:[^a ]+ +)* is a non-capturing group that matches zero or more words containing no a, each followed by one or more spaces.
  3. ([^a ]*a\w*)\b Matches a word that contains an a and ends on a word boundary (it is already guaranteed to begin on a word boundary). The word-boundary constraint allows the word to be at the end of the line.

The first word with an a in it will be in group #1.
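
As a quick check, here is a minimal Python sketch that applies this pattern to the test string from the question. Python is an assumption, since the question does not name a language. re.MULTILINE plays the role of the m flag, and the names PATTERN, TEXT and first_words are illustrative.

```python
import re

# Pattern from the answer above; re.MULTILINE makes ^ match at every line start.
PATTERN = re.compile(r"^(?:[^a ]+ +)*([^a ]*a\w*)\b", re.MULTILINE)

# Test string from the question.
TEXT = "bbsc abcd aaaagdhskss\ndsaa asdd aaaagdfhdghd\nwwer wwww awww wwwd"

# With a single capture group, findall returns the group 1 text of each match.
first_words = PATTERN.findall(TEXT)
print(first_words)
```

One thing to keep in mind: [^a ] also matches a newline, so if some line contains no a, the match can run on into the following line. Matching line by line avoids that.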


If we cannot assume that only word (\w) and white space characters are present, then use:

^(?:[^a ]+ +)*(\w*a\w*)\b

The difference is in how the first word with an a in it is scanned: (\w*a\w*) guarantees that the captured string is composed only of word characters.
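
To illustrate the difference, the sketch below runs this second pattern on a made-up line that contains punctuation. It is again in Python, which is an assumption, and the helper first_word_with_a and the sample line are hypothetical.

```python
import re

# Second pattern from the answer: the captured word is restricted to \w characters.
PATTERN = re.compile(r"^(?:[^a ]+ +)*(\w*a\w*)\b", re.MULTILINE)

def first_word_with_a(line):
    """Return the first word containing 'a', or None if the pattern does not match."""
    m = PATTERN.search(line)
    return m.group(1) if m else None

# Made-up input with non-word characters; the trailing "!" is left out of the capture.
sample = "b-c x.y cat! dog"
result = first_word_with_a(sample)
print(result)
```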

What are you using? Many programs let you set a limit on the number of matches. If that is possible, use \b[b-z]*a[a-z]* with a limit of 1.

If it is not possible, use a group to capture the word and match the rest of the line after it: ([b-z]*a[a-z]*).*
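
Both variants can be sketched in Python, assuming the character classes are the ranges [b-z] and [a-z] and that the input is lowercase letters and spaces. Here re.search stands in for a limit of 1, since it returns only the first match. The names LINES, FIRST and WHOLE are illustrative.

```python
import re

# Lines from the question.
LINES = [
    "bbsc abcd aaaagdhskss",
    "dsaa asdd aaaagdfhdghd",
    "wwer wwww awww wwwd",
]

# Variant 1: stop after the first match on each line (the "limit 1" idea).
FIRST = re.compile(r"\b[b-z]*a[a-z]*")
limited = [FIRST.search(line).group(0) for line in LINES]

# Variant 2: capture the word, then let .* consume the rest of the line so
# that a global search yields exactly one match per line.
WHOLE = re.compile(r"([b-z]*a[a-z]*).*")
captured = WHOLE.findall("\n".join(LINES))

print(limited)
print(captured)
```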

Try:

^(?:[^a ]+ )*(\w*a\w*) .*$

Basically what it says is: skip a bunch of words that are composed of anything but the letter a (or a space), then capture a word that must include the letter a, and consume the rest of the line.

The first capture group should hold the first word with an a.
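
A minimal Python sketch of this last pattern follows. Python and the multiline flag are assumptions, and the names PATTERN, TEXT, matches and at_line_end are illustrative.

```python
import re

# Pattern from the answer above; with re.MULTILINE, ^ and $ anchor to each line.
PATTERN = re.compile(r"^(?:[^a ]+ )*(\w*a\w*) .*$", re.MULTILINE)

# Test string from the question.
TEXT = "bbsc abcd aaaagdhskss\ndsaa asdd aaaagdfhdghd\nwwer wwww awww wwwd"

# findall returns the content of the single capture group for each line.
matches = PATTERN.findall(TEXT)
print(matches)

# The pattern requires a space after the captured word, so a word that ends
# the line is not matched.
at_line_end = PATTERN.search("xx ab")
print(at_line_end)
```

Because of the literal space after the capture group, this version only finds the word when something follows it on the same line.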

