Regular expression substitution in Python

Question

I have a string

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

From this I want to remove all instances of "(as...)" using some regular expression. I want the output to look like

line = "haha avsrv arv afneoifew"

I tried:

line = re.sub(r'\(+as .*\)','',line)

But this yields:

line = "haha afneoifew"

Answer 1

To get non-greedy behaviour, use *? instead of *, i.e. re.sub(r'\(+as .*?\)','',line). To get the desired string, you also have to add a trailing space to the pattern, i.e. re.sub(r'\(+as .*?\) ','',line).
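To see the difference, here is a small sketch using the string from the question. It uses re.findall only to show what each pattern matches; the variable names are just for illustration.

```python
import re

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

# Greedy: .* runs to the last ")" in the string, so there is a single match.
greedy = re.findall(r'\(+as .*\)', line)

# Non-greedy: .*? stops at the first ")" it can reach, so each group matches separately.
lazy = re.findall(r'\(+as .*?\)', line)

print(greedy)  # ['(as jfeoiwf) avsrv arv (as qwefo)']
print(lazy)    # ['(as jfeoiwf)', '(as qwefo)']

# Including the trailing space in the pattern avoids leaving double spaces behind.
result = re.sub(r'\(+as .*?\) ', '', line)
print(result)  # haha avsrv arv afneoifew
```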

Answer 2

The problem is that .* is greedy, so your regexp matches this whole span: (as jfeoiwf) avsrv arv (as qwefo), hence your result.

You can use:

>>> import re
>>> line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"
>>> line = re.sub(r'\(+as [a-zA-Z]*\)','',line)
>>> line
'haha  avsrv arv  afneoifew'

Hope it'll be helpful.
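Note that the output above still contains double spaces where the groups were removed. One possible way to tidy that up, as a sketch rather than the only option, is to collapse whitespace afterwards with split() and join(). This assumes you do not need to preserve any other runs of whitespace in the string.

```python
import re

line = "haha (as jfeoiwf) avsrv arv (as qwefo) afneoifew"

# Same substitution as above; it leaves two spaces where each group used to be.
stripped = re.sub(r'\(+as [a-zA-Z]*\)', '', line)

# split() with no arguments splits on any run of whitespace and drops empty
# pieces, so joining with a single space normalizes the spacing.
cleaned = " ".join(stripped.split())

print(repr(stripped))  # 'haha  avsrv arv  afneoifew'
print(repr(cleaned))   # 'haha avsrv arv afneoifew'
```

Keep in mind that [a-zA-Z]* only matches letters, so a group such as (as foo bar) or (as x1) would not be removed by this pattern.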

Answer 3

You were very close. You need to add the lazy quantifier '?' after .*. By default, .* is greedy and matches as much as it possibly can. With the lazy quantifier, it matches as little as possible.

line = re.sub(r'\(+as .*?\) ','',line)
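One caveat with the trailing space: a group at the very end of the string has no space after it, so it would be left in place. A possible variation, shown here as a sketch on a made-up input, is to consume optional whitespace before the group instead.

```python
import re

# Hypothetical input where the last group ends the string.
tail = "haha (as jfeoiwf) avsrv arv (as qwefo)"

# The trailing-space pattern misses the final group.
with_trailing_space = re.sub(r'\(+as .*?\) ', '', tail)

# Matching optional leading whitespace removes both groups.
with_leading_ws = re.sub(r'\s*\(+as .*?\)', '', tail)

print(with_trailing_space)  # haha avsrv arv (as qwefo)
print(with_leading_ws)      # haha avsrv arv
```

The trade-off is that a group at the very start of the string would then leave a leading space behind, so pick whichever fits your data, or call strip() on the result.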

Answer 4

Try:

re.sub(r".\(as \w+\).", ' ', line)

Answer 5

Try:

re.sub(r'\(as[^\)]*\)', '', line)
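The negated character class behaves a little differently from .*? in two cases, sketched below on made-up inputs: [^\)]* can span a newline (a plain . cannot without re.DOTALL), and because there is no space after "as" the pattern also matches words that merely start with "as". The stricter variant at the end is one possible adjustment, not part of the original answer.

```python
import re

pattern = re.compile(r'\(as[^\)]*\)')

# Hypothetical input with a line break inside the parentheses.
multiline = "haha (as jfe\noiwf) avsrv"
neg = pattern.sub('', multiline)              # group removed
lazy = re.sub(r'\(as .*?\)', '', multiline)   # unchanged: . does not match "\n"

# Without a space after "as", "(assorted)" is matched too.
word = pattern.sub('', "see (assorted) items")

# Requiring whitespace after "as" leaves "(assorted)" alone.
strict = re.sub(r'\(as\s[^)]*\)', '', "see (assorted) items")

print(repr(neg))     # 'haha  avsrv'
print(repr(lazy))    # 'haha (as jfe\noiwf) avsrv'
print(repr(word))    # 'see  items'
print(repr(strict))  # 'see (assorted) items'
```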

5 answers

solution1
4 ACCEPTED 2016-06-02 06:52:44

solution2
2 2016-06-02 06:52:52

solution3
2 2016-06-02 06:55:23

solution4
2 2016-06-02 06:57:37

solution5
1 2016-06-02 06:52:45
