
Regex to match a list of one or more comma-separated words, unless the string ends in a comma

A long time ago I wrote the following regex, which must match at least 3 words and works with both Latin and Cyrillic characters: regex = '([^,;\d]{2,}[,;]{1,}){2,}[^,;\d]{2,}'

I would like to rewrite it so that it matches "hello" but fails to match "hello," because of the trailing comma. However, it should still match "hello, and, more, words".

Example matches: "hello", "hello, test69", "hello, test69, matches"

Example non-matches: "hello,", "hello test69", "hello test69 matches"

You can use

^\w+(?:, *\w+)*$

In Python, you can use a shorter version if you use re.fullmatch:

re.fullmatch(r'\w+(?:, *\w+)*', text)

See the regex demo .

Note that if your spaces can be any whitespace, replace the space with \s in the regex. If your words can only contain letters, replace each \w with [^\W\d_]. If your words can only contain letters and digits, replace each \w with [^\W_].
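The substitutions described above can be sketched as a small pattern builder. The helper name build_pattern and its parameters are hypothetical and only serve to show the three variants side by side:

```python
import re

def build_pattern(word=r'\w', space=' '):
    # Hypothetical helper: plugs a "word character" class and a "space"
    # token into the same overall shape, word+(?:,space*word+)*
    return re.compile(rf'{word}+(?:,{space}*{word}+)*')

letters_only = build_pattern(word=r'[^\W\d_]')  # letters only
alnum_only = build_pattern(word=r'[^\W_]')      # letters and digits, no underscore
any_whitespace = build_pattern(space=r'\s')     # any whitespace after the comma

print(bool(letters_only.fullmatch('hello, world')))        # True
print(bool(letters_only.fullmatch('hello, test69')))       # False: digits rejected
print(bool(alnum_only.fullmatch('hello, test69')))         # True
print(bool(any_whitespace.fullmatch('hello,\ttest69')))    # True: tab allowed
```

Because the patterns are used with fullmatch, no ^ or $ anchors are needed here.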

Details :

  • ^ - start of string
  • \w+ - one or more word chars
  • (?:, *\w+)* - zero or more repetitions of a comma, zero or more spaces, and then one or more word chars
  • $ - end of string.
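As a quick check, here is a minimal sketch that runs the pattern against the examples from the question. The function name is_word_list is made up for illustration, and the Cyrillic sample string is an added assumption to show that \w is Unicode-aware for str patterns in Python 3:

```python
import re

# Same pattern as above; fullmatch makes the ^ and $ anchors unnecessary.
PATTERN = re.compile(r'\w+(?:, *\w+)*')

def is_word_list(text):
    # True only if the whole string is a comma-separated list of words.
    return PATTERN.fullmatch(text) is not None

should_match = ['hello', 'hello, test69', 'hello, test69, matches', 'привет, мир']
should_fail = ['hello,', 'hello test69', 'hello test69 matches']

for s in should_match + should_fail:
    print(repr(s), is_word_list(s))
```

One difference worth knowing: in Python, $ also matches just before a trailing newline, so the anchored version used with re.match accepts 'hello\n', while re.fullmatch does not.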
