Python using re to match string in a specific pattern

Question

I am trying to use Python's re module to match a string against a specific pattern. The expected sentence is:

"It is X. not X"

X can be anything: a word, several words, an integer, or a decimal number.

The pattern I built is:

It is \w+. not \w+

which I generate with:

string.replace("X", "\w+")

It works if X is a word, several words, or an integer, but not for decimal numbers. How can I build the pattern so that it matches any X?

Answer 1

The . is a special character in a regular expression: it matches any single character (except a newline, by default). So .+ matches one or more characters.

r"It is .+\. not .+"

Note that the period is escaped as \. because at that position you want to match a literal period.
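To see how this pattern behaves, here is a minimal sketch using re.fullmatch. The sample sentences are made up for illustration and are not from the question.

```python
import re

# Pattern from the answer above: ".+" stands in for each X,
# and "\." matches the literal period after the first X.
PATTERN = re.compile(r"It is .+\. not .+")

# Hypothetical sample sentences, chosen only for illustration.
samples = [
    "It is a dog. not a cat",
    "It is 3.14. not 42",
    "It is a dog, not a cat",  # comma instead of a period
]

# fullmatch requires the whole sentence to fit the pattern.
results = [bool(PATTERN.fullmatch(s)) for s in samples]
for sentence, matched in zip(samples, results):
    print(f"{sentence!r}: {matched}")
```

The first two sentences match, including the one with a decimal number; the third does not, because it has no period followed by " not ".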

Answer 2

Plain .+ won't work in some cases, for example:

It is quote. not a double-quote

It is a dog. not a cat

so I would use this pattern instead:

(?<=It is ).+(?=\.)|(?<=not ).+$

Explanation

(?<=It is ).+(?=\.) Any consecutive characters preceded by "It is " and followed by a period

| OR

(?<=not ).+$ Any consecutive characters preceded by "not " and followed by the end-of-line anchor

(?<=It is ).+(?=\.)|(?<=not ).+$
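As a rough illustration of the lookaround pattern above, the sketch below uses re.findall to pull out both X values. The helper name extract_parts is hypothetical, and the period is written as \. (a single backslash) on the assumption that a literal period is intended.

```python
import re

# Lookbehind/lookahead pattern from the answer above. The matches are
# only the X parts, because lookarounds do not consume any text.
LOOKAROUND = re.compile(r"(?<=It is ).+(?=\.)|(?<=not ).+$")

def extract_parts(sentence):
    """Return the list of X values found in the sentence (hypothetical helper)."""
    return LOOKAROUND.findall(sentence)

parts = extract_parts("It is a dog. not a cat")
print(parts)

decimal_parts = extract_parts("It is 3.14. not 42")
print(decimal_parts)
```

Because .+ is greedy, the first alternative runs up to the last period in the sentence. A period inside the second X would therefore be swallowed by the first match, so this sketch suits inputs where only the first X may contain periods.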


Answer 3

I figured it out: str.replace("X", "(\\w+|\\d+\\.\\d+)") solves the problem. I hope this helps others with the same issue.
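A minimal sketch of this approach follows. The names SLOT, build_pattern and slots are hypothetical, and the re.escape step is an addition not present in the answer: it makes the template's own period match only a literal period.

```python
import re

# Sub-pattern from the answer above: a run of word characters,
# or digits with a decimal point.
SLOT = r"(\w+|\d+\.\d+)"

def build_pattern(template):
    """Turn a template such as "It is X. not X" into a compiled regex."""
    # Escaping first is an addition of this sketch, not part of the answer.
    escaped = re.escape(template)
    # str.replace is a literal substitution, so the backslashes in SLOT
    # are inserted unchanged.
    return re.compile(escaped.replace("X", SLOT))

pattern = build_pattern("It is X. not X")

def slots(sentence):
    """Return the two X values, or None if the sentence does not fit."""
    match = pattern.fullmatch(sentence)
    return match.groups() if match else None

print(slots("It is 3.14. not 2.5"))
print(slots("It is dog. not cat"))
print(slots("It is a dog. not a cat"))
```

With this sub-pattern a multi-word X such as "a dog" is not matched, because \w does not match spaces.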

3 answers

solution1
0 2017-05-19 03:33:04

solution2
0 2017-05-19 05:46:30

solution3
0 ACCEPTED 2017-05-19 09:07:57
