
Extract a string using a regular expression

import logging
import re

logger = logging.getLogger(__name__)

fix_release = 'Ubuntu 16.04 LTS'

p = re.compile(r'(Ubuntu)\b(\d+[.]\d+)\b')
fix_release = p.search(fix_release)
logger.info(fix_release)  # fix_release is None

I want to extract the string 'Ubuntu 16.04', but the result is None.

How can I extract the correct substring?

You have confused the word boundary \b with whitespace. \b matches the position between a word character and a non-word character and consumes no characters, so nothing in your pattern matches the space between 'Ubuntu' and '16.04'. For your case you can simply use r'Ubuntu \d+\.\d+':

import re

fix_release = 'Ubuntu 16.04 LTS'
p = re.compile(r'Ubuntu \d+\.\d+')
p.search(fix_release).group(0)
# 'Ubuntu 16.04'
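To make the zero-width point concrete, here is a minimal sketch that runs the pattern from the question next to a copy in which the first \b is replaced by a literal space. The variable names `broken` and `fixed` are only illustrative and do not come from the question.

```python
import re

text = 'Ubuntu 16.04 LTS'

# Pattern from the question: \b matches the position after "Ubuntu" but
# consumes nothing, so \d+ is then tried against the space and fails.
broken = re.search(r'(Ubuntu)\b(\d+[.]\d+)\b', text)

# Same pattern with a literal space in place of the first \b.
fixed = re.search(r'(Ubuntu) (\d+[.]\d+)\b', text)

print(broken)          # None
print(fixed.group(0))  # Ubuntu 16.04
print(fixed.groups())  # ('Ubuntu', '16.04')
```

The trailing \b can stay: it only asserts that the version number ends at a word boundary and does not need to consume anything.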

Try this regex:

Ubuntu\s*\d+(?:\.\d+)?

Explanation:

  • Ubuntu - matches Ubuntu literally
  • \s* - matches 0+ whitespace characters, as many as possible
  • \d+ - matches 1+ digits, as many as possible
  • (?:\.\d+)? - a non-capturing group that matches a literal . followed by 1+ digits, as many as possible. The ? at the end makes the whole group optional.

Note: In your regex, you are using \b where the space is. \b produces zero-length matches at the position between a word character and a non-word character, so it never consumes the space itself. Use \s instead.
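As a rough illustration of what the optional parts of this pattern allow, the sketch below applies it to a few made-up inputs. The helper `extract_release` and the sample strings other than 'Ubuntu 16.04 LTS' are assumptions for demonstration and are not part of the question. The helper also returns None instead of raising when nothing matches, since re.search returns None in that case.

```python
import re

# Pattern from the answer above.
pattern = re.compile(r'Ubuntu\s*\d+(?:\.\d+)?')


def extract_release(text):
    """Return the matched release string, or None if there is no match."""
    match = pattern.search(text)
    return match.group(0) if match else None


print(extract_release('Ubuntu 16.04 LTS'))  # Ubuntu 16.04
print(extract_release('Ubuntu16.04'))       # Ubuntu16.04 (\s* allows no space)
print(extract_release('Ubuntu 18'))         # Ubuntu 18 (the .NN part is optional)
print(extract_release('Ubuntu 16.04.3'))    # Ubuntu 16.04 (only one .NN group)
print(extract_release('Debian 9'))          # None
```

Whether accepting inputs such as 'Ubuntu16.04' is desirable depends on the data. Using \s+ or a literal space instead of \s* would require a separator.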
