Regular expression to find the text

Question

I want to get the text (My Text Content) that immediately follows AB.00.000.

I am able to match AB.00.000 itself using the regular expression below:

([A-Z]{2,3}\.[0-9]{2}\.[0-9]{3})

How do I get the text next to the AB.00.000 in Python?

Here is the input string:

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard 

AB.00.000 My Text Content

$!#"!

23:50

My Phone

Answer 1

It seems you want to get the whole rest of the line after your pattern is found.

You may use

r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)'

Note that \b is a word boundary: it requires a char other than a letter/digit/_ before or after a word char (or the start/end of the string). The \s*(.*) part is what your pattern is missing:

\s* - 0+ whitespaces
(.*) - Capturing group #1: any 0 or more chars other than line break chars, as many as possible, i.e. the rest of the line.
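
As a quick check of the pieces described above, here is a minimal sketch that runs the pattern against a few made-up strings. The sample strings and the name PATTERN are illustrative only and do not come from the question.

```python
import re

# Same pattern as above; PATTERN is just an illustrative name.
PATTERN = re.compile(r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)')

# Group 1 holds the rest of the line; .* stops at the line break.
m = PATTERN.search("AB.00.000 My Text Content\nnext line")
rest = m.group(1) if m else None
print(rest)  # My Text Content

# The leading \b rejects a code glued to extra letters ...
glued_front = PATTERN.search("XXAB.00.000 text")
# ... and the trailing \b rejects a code followed by another digit.
glued_back = PATTERN.search("AB.00.0001 text")
print(glued_front, glued_back)  # None None
```

One thing to watch for: \s also matches line breaks, so if nothing follows the code on its own line, \s* steps over the line break and group 1 captures the following line instead.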

If the pattern must be at the beginning of a line, the regex to extract the text you need will look like

r'(?m)^[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)'

(?m) (equivalent to the re.M option) makes ^ match at the start of every line, not only at the start of the whole string.
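
To illustrate the difference, the sketch below compares the unanchored and the line-anchored patterns on a made-up two-line string. The text and the variable names are assumptions chosen for the demonstration.

```python
import re

# Hypothetical input: one code in the middle of a line, one at a line start.
text = "Ref AB.11.111 inline mention\nCD.22.222 Line start content"

anywhere = re.compile(r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)')
line_start = re.compile(r'(?m)^[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)')
# Equivalent spelling with the flag passed separately instead of inline.
line_start_flag = re.compile(r'^[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)', re.M)

first_anywhere = anywhere.search(text).group(1)
first_anchored = line_start.search(text).group(1)
first_flagged = line_start_flag.search(text).group(1)

print(first_anywhere)  # inline mention
print(first_anchored)  # Line start content
print(first_flagged)   # Line start content
```

Without (?m) or re.M, ^ would only match at the very start of the string, so the anchored pattern would find nothing in this input.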

Python:

import re

# s holds the input text
m = re.search(r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)', s)
if m:
    print(m.group(1))

Note that to access the first (and here the only) parenthesized part of the match, you need to use .group(1).
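
Putting it together, here is a self-contained sketch that applies the pattern to the input string from the question. The helper name text_after_code and the constant CODE_THEN_TEXT are hypothetical, introduced only for this example.

```python
import re

CODE_THEN_TEXT = re.compile(r'\b[A-Z]{2,3}\.[0-9]{2}\.[0-9]{3}\b\s*(.*)')

def text_after_code(text):
    """Return the rest of the line after the first code, or None."""
    m = CODE_THEN_TEXT.search(text)
    return m.group(1) if m else None

# The input string from the question, with its blank lines preserved.
sample = (
    "Lorem Ipsum is simply dummy text of the printing and typesetting "
    "industry. Lorem Ipsum has been the industry's standard \n"
    "\n"
    "AB.00.000 My Text Content\n"
    "\n"
    '$!#"!\n'
    "\n"
    "23:50\n"
    "\n"
    "My Phone"
)

print(text_after_code(sample))          # My Text Content
print(text_after_code("no code here"))  # None
```

Returning None when nothing matches keeps the "if m:" check from the snippet above in one place, so callers do not hit an AttributeError on a failed search.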
