Suppose I've got these two strings:
s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'
I want to use regex to match the pattern (number-words)
and then split the strings to get a list of tuples:
final = [('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday'),
('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]
I tried with \([0-9]+-(.*?)\)
but without success.
What am I doing wrong? Any ideas for a workaround?
Thank you in advance!!
This might nudge you in the right direction:
>>> re.findall(r'^(.*) \((.+?)\)$', s1)
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
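Here is a minimal sketch of how that pattern could be applied to both strings to build the list of tuples from the question. It assumes the bracketed part always sits at the very end of the string, and it does not check for the number-words shape inside the brackets. `split_trailing_parens` is a hypothetical helper name, not something from the `re` module.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# The greedy (.*) backtracks to the LAST " (" in the string, so earlier
# brackets such as "(bit-more complicated)" stay in the first group.
TRAILING = re.compile(r'^(.*) \((.+?)\)$')

def split_trailing_parens(text):
    """Return (prefix, inside_brackets), or None if there is no trailing (...) part."""
    m = TRAILING.match(text)
    return m.groups() if m else None

final = [split_trailing_parens(s) for s in (s1, s2)]
print(final)
```

Because `findall` returns one list per string, calling the helper once per string and collecting the tuples is one way to end up with the single flat list asked for.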
You may use this regex in findall:
>>> import re
>>> regx = re.compile(r'^(.*?)\s*\((\d+\s*-\s*\w+[^)]*)\)')
>>> arr = ['hello 4, this is stackoverflow, looking for help (1345-today is wednesday)', 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)']
>>> for el in arr:
...     regx.findall(el)
...
[('hello 4, this is stackoverflow, looking for help', '1345-today is wednesday')]
[('hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday')]
RegEx Details:
^(.*?) : Match 0 or more characters at the start in group #1
\s* : Match 0 or more whitespaces
\((\d+\s*-\s*\w+[^)]*)\) : Match a (<number>-word ..) string and capture what is inside the brackets in capture group #2
Alternatively, you may use this regex in split:
>>> import re
>>> reg = re.compile(r'(?<!\s)\s*(?=\((\d+\s*-\s*\w+[^)]*)\))')
>>> for el in arr:
...     reg.split(el)[:-1]
...
['hello 4, this is stackoverflow, looking for help', '1345-today is wednesday']
['hello again, this is a (bit-more complicated), string', '67890123 - tomorrow is thursday']
RegEx Details:
(?<!\s) : Assert that there is no whitespace at the previous position
\s* : Match 0+ whitespaces
(?=\((\d+\s*-\s*\w+[^)]*)\)) : Lookahead to assert that a (<number>-word ..) string is ahead of us. Note that we are using a capture group to get the string inside (...) in the result of split.
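As for what goes wrong with the attempt in the question, here is a short check, assuming the pattern was meant as \([0-9]+-(.*?)\) with single backslashes. It suggests two issues: the pattern allows no whitespace around the hyphen, so it misses the second string, and its capture group only holds the text after the hyphen. The `relaxed` pattern below is just an illustrative variant, not taken from the answers above.

```python
import re

s1 = 'hello 4, this is stackoverflow, looking for help (1345-today is wednesday)'
s2 = 'hello again, this is a (bit-more complicated), string (67890123 - tomorrow is thursday)'

# The pattern from the question (assumed to use single backslashes)
attempt = re.compile(r'\([0-9]+-(.*?)\)')

print(attempt.search(s1).group(1))  # 'today is wednesday' (the number is not captured)
print(attempt.search(s2))           # None: "67890123 - " has a space before the hyphen

# Illustrative variant: optional whitespace around the hyphen, and the
# capture group widened to cover everything inside the brackets
relaxed = re.compile(r'\(([0-9]+\s*-\s*.*?)\)')

print(relaxed.search(s1).group(1))  # '1345-today is wednesday'
print(relaxed.search(s2).group(1))  # '67890123 - tomorrow is thursday'
```

Even the relaxed variant only finds the bracketed part; getting the text before it still needs a leading group or a split, as in the patterns above.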