I want to get the text "My Text Content" that immediately follows AB.00.000.
I am able to match AB.00.000 itself with the regular expression below:
([A-Z]{2,3}\.[0-9]{2}\.[0-9]{3})
How do I get the text that comes right after AB.00.000 in Python?
Here is the input string:
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard
AB.00.000 My Text Content
$!#"!
23:50
My Phone
It seems you want to get the whole rest of the line after your pattern is found.
You may use
r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)'
Note that \b is a word boundary: it requires a char other than a letter, digit or _ before or after a word char (or the start/end of the string). The \s*(.*) part is what your pattern is missing:
\s* - 0+ whitespace chars
(.*) - Capturing group #1: any 0 or more chars other than line break chars, as many as possible, i.e. the rest of the line.
If the pattern must appear at the beginning of a line, a regex to extract the text you need will look like
r'(?m)^[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)'
Here, (?m) (the inline form of the re.M option) makes ^ match the start of each line, not only the start of the whole string.
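To illustrate the difference the line anchor makes, here is a small sketch that runs both patterns over a shortened copy of the sample input. The variable names and the extra "see XYZ.12.345 Other Content" line are made up for this example; they are not part of the original question.

```python
import re

# Shortened copy of the input from the question, plus one hypothetical extra
# line where a code appears mid-line rather than at the start of the line.
sample = (
    "Lorem Ipsum has been the industry's standard\n"
    "AB.00.000 My Text Content\n"
    "see XYZ.12.345 Other Content\n"
    "23:50\n"
)

# (?m) makes ^ match at the start of every line, so only codes that
# begin a line are matched.
anchored = re.compile(r'(?m)^[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)')

# With a single capturing group, findall returns the list of group 1 values.
line_start_matches = anchored.findall(sample)
print(line_start_matches)

# Without the anchor, the mid-line code is picked up as well.
unanchored = re.compile(r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)')
any_position_matches = unanchored.findall(sample)
print(any_position_matches)
```

One thing to keep in mind: \s* also matches line breaks, so if a code sits at the very end of a line, group 1 will capture the following line instead. Replacing \s* with [^\S\r\n]* (whitespace other than line breaks) avoids that if it matters for your data.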
Python:
import re
# text is assumed to hold the input string from the question
m = re.search(r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)', text)
if m:
    print(m.group(1))
Note that to access the first (and here the only) parenthesized part of the match, you need to read the match group via .group(1).
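For completeness, here is a self-contained sketch that applies the pattern to the input string from the question. The helper name text_after_code and the constant CODE_THEN_REST are hypothetical, chosen only for this example.

```python
import re

# Input string from the question, reproduced as a multi-line literal.
text = """Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard
AB.00.000 My Text Content
$!#"!
23:50
My Phone"""

# The code pattern followed by optional whitespace and the rest of the line.
CODE_THEN_REST = re.compile(r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)')


def text_after_code(s):
    """Return the rest of the line after the first code, or None if no code is found."""
    m = CODE_THEN_REST.search(s)
    return m.group(1) if m else None


result = text_after_code(text)
print(result)

# re.search returns None when nothing matches, so the helper does too.
missing = text_after_code("no code here")
print(missing)
```

Checking the match object before calling .group(1) matters because re.search returns None when the pattern is not found, and calling .group on None raises an AttributeError.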