Regex - Python matching between string and first occurrence

Question

I'm having a hard time grasping regex no matter how much documentation I read. I'm trying to match everything between a string and the first occurrence of &. This is what I have:

link =  "group.do?sys_id=69adb887157e450051e85118b6ff533c&amp;&"
rex = re.compile("group\.do\?sys_id=(.?)&")
sysid = rex.search(link).groups()[0]

I'm using https://regex101.com/#python to help me validate my regex, and I can more or less get rex = re.compile("user_group.do?sys_id=(.*)&") to work, but the .* is greedy and matches up to the last &, whereas I want to match up to the first &.

I thought .? matched zero or one time.

Answer 1

You don't necessarily need regular expressions here. On Python 2, use urlparse instead:

>>> from urlparse import urlparse, parse_qs 
>>> parse_qs(urlparse(link).query)['sys_id'][0]
'69adb887157e450051e85118b6ff533c'

On Python 3, change the import to:

from urllib.parse import urlparse, parse_qs
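
For reference, a self-contained Python 3 version of the same approach might look like the sketch below. The helper name extract_sys_id is made up for illustration, and the link is the one from the question.

```python
from urllib.parse import urlparse, parse_qs

link = "group.do?sys_id=69adb887157e450051e85118b6ff533c&amp;&"

def extract_sys_id(url):
    """Return the first sys_id value in the URL's query string, or None."""
    query = urlparse(url).query              # the part after the '?'
    values = parse_qs(query).get("sys_id")   # list of values, or None if absent
    return values[0] if values else None

sysid = extract_sys_id(link)
print(sysid)
```

Returning None when the parameter is missing avoids the KeyError that indexing the parse_qs result directly would raise.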

Answer 2

You can simply match up to the literal &amp;& at the end of the link instead of a bare &, like so:

import re
link =  "user_group.do?sys_id=69adb887157e450051e85118b6ff533c&amp;&"
# Raw string so that \. and \? reach the regex engine unchanged
rex = re.compile(r"user_group\.do\?sys_id=(.*)&amp;&")
sysid = rex.search(link).groups()[0]

print(sysid)

Answer 3

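.*

is greedy: it matches as many characters as it can, so it runs up to the last &. Adding a ? after the quantifier makes it lazy, so

.*?

matches as few characters as possible.

.?

would only match any character zero or one time, which is too short for your sys_id, while

.*?

will match up to the earliest occurrence of whatever follows it in the pattern, here the first &. I hope that explains it.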
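
To make the difference concrete, here is a small sketch that runs the patterns discussed above against the link from the question. The variable names are only illustrative, and the negated character class [^&]* is an extra alternative that is not part of the answer itself.

```python
import re

link = "group.do?sys_id=69adb887157e450051e85118b6ff533c&amp;&"

# Greedy: .* grabs as much as possible, then backs off to the LAST '&'.
greedy = re.search(r"group\.do\?sys_id=(.*)&", link).group(1)

# Lazy: .*? grows one character at a time and stops at the FIRST '&'.
lazy = re.search(r"group\.do\?sys_id=(.*?)&", link).group(1)

# Alternative: match anything that is not '&', which also stops at the first '&'.
negated = re.search(r"group\.do\?sys_id=([^&]*)&", link).group(1)

# .? allows at most one character before '&', so nothing matches here.
optional = re.search(r"group\.do\?sys_id=(.?)&", link)

print(greedy)
print(lazy)
print(negated)
print(optional)
```

Since the .? pattern finds no match, re.search returns None for it, and calling .groups() on that result, as in the question, raises an AttributeError.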
