Python RegEx matching substrings on various conditions

Question

Been struggling with this one for a while now - I simply can't wrap my brain around it.

Given the following string variations:

some text
some text http://a.link.to/something
some text - http://a.link.to/something
some text: http://a.link.to/something
http://a.link.to/something

I am looking for a RegEx that would produce the following:

{'text': 'some text',
 'link': ''}

{'text': 'some text',
 'link': 'http://a.link.to/something'}

{'text': '',
 'link': 'http://a.link.to/something'}

Cheers!

Answer 1

Use named capturing groups with re.match, so that groupdict() gives you a dictionary with keys of your own choosing.

>>> import re
>>> s = '''some text
some text http://a.link.to/something
some text - http://a.link.to/something
some text: http://a.link.to/something
http://a.link.to/something'''
>>> for i in s.split('\n'):
        re.match(r'(?P<text>(?:(?!http://).)*?)\W*\b(?P<link>http://.*)?$', i).groupdict()


{'link': None, 'text': 'some text'}
{'link': 'http://a.link.to/something', 'text': 'some text'}
{'link': 'http://a.link.to/something', 'text': 'some text'}
{'link': 'http://a.link.to/something', 'text': 'some text'}
{'link': 'http://a.link.to/something', 'text': ''}
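
Note that the first result has 'link': None, while the question asks for an empty string. A minimal sketch of one way to close that gap, assuming the same pattern as above: groupdict() accepts a default value for groups that did not take part in the match. The names PATTERN and split_text_and_link are made up for this example.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
PATTERN = re.compile(r'(?P<text>(?:(?!http://).)*?)\W*\b(?P<link>http://.*)?$')

def split_text_and_link(line):
    """Return {'text': ..., 'link': ...}, with '' for a group that did not match."""
    match = PATTERN.match(line)
    if match is None:
        # The \b in the pattern needs at least one word character, so a line
        # such as '---' or '' does not match at all. Falling back to the raw
        # line as text is an assumption of this sketch.
        return {'text': line, 'link': ''}
    # default='' replaces the None that groupdict() would otherwise return
    # for the unmatched optional link group.
    return match.groupdict(default='')

for line in ['some text',
             'some text - http://a.link.to/something',
             'http://a.link.to/something']:
    print(split_text_and_link(line))
```

The pattern only looks for http://, so an https:// link would end up in the text part; that would need a change to both the lookahead and the link group.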

Answer 2

You can use a regex like this:

(.+?)(http.*)?$

This does not fully achieve what you want for the case of:

some text - http://a.link.to/something

Since it generates:

{'text': 'some text - ',  'link': 'http://a.link.to/something'}
                    ^--- Dash here

But you can clean up the text before or after matching.

I'm posting the answer since it might help you.
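
A rough sketch of the post-clean idea, under a couple of assumptions. With (.+?) the first group must contain at least one character, so a line that is only a link ends up entirely in the text group; the sketch therefore uses (.*?) instead, which is a variation on the regex above and not the original. The function name parse_line and the set of stripped characters are also just choices for this example.

```python
import re

# Variation on the regex above: (.*?) instead of (.+?), so that a line
# starting with the link can leave the text group empty.
PATTERN = re.compile(r'(.*?)(http.*)?$')

def parse_line(line):
    """Split a single line into text and link, then tidy up the text."""
    # Assumes `line` contains no newline, so the pattern always matches.
    # groups(default='') yields '' rather than None when no link is present.
    text, link = PATTERN.match(line).groups(default='')
    # Post-clean: drop trailing spaces and the separators that appear in
    # the question's samples ('-' and ':').
    return {'text': text.rstrip(' -:'), 'link': link}

print(parse_line('some text - http://a.link.to/something'))
```

Since rstrip removes every trailing character from the given set, text that legitimately ends in a dash or colon would lose it too.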
