In the following text, I want to extract the keys with their values. I've written the following regex, but it does not match values that span multiple lines.
Regex: --(.*)=.*(?=(.|--|\n|\Z)*)
--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched
So, after matching, the output should be:
--key1=this is a
multiline statement
statement
--key2=val2
--key3=val3
You can try this:
import re
s = """
--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched
"""
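# [\w\s] also accepts digits, so values such as val2 are kept whole
new_data = re.findall(r'--\w+=[\w\s]+', s)
for i in new_data:
    print(i.strip())  # strip the trailing newline captured by \s
Output:
--key1=this is a
multiline statement
statement
--key2=val2
--key3=val3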
Ajax's answer will fail if any of the values contain -. Instead, use a negative lookahead to ensure that the values do not contain --.
This regex will do that: --.+=((?!--)[\S\s])+
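As a rough sketch of how that pattern could be applied in Python (the variable names here are illustrative, and the sample text is copied from the question), note that finditer is used rather than findall, because findall would return only the contents of the repeated group:

```python
import re

# Sample text copied from the question.
text = """--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""

# (?!--)[\S\s] consumes one character at a time, newlines included,
# but only while the next two characters are not "--".
pattern = re.compile(r'--.+=((?!--)[\S\s])+')

# group(0) is the whole match; strip() drops the newline left before the next "--".
matches = [m.group(0).strip() for m in pattern.finditer(text)]

for match in matches:
    print(match)
```

One caveat under these assumptions: a junk line that happens to contain = would also be matched, and a value that itself contains -- would be cut short.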
Perhaps the OP provided a simplified example and the actual code will require regex, but the example above can be filtered without it.
The central insight in this method of filtering out the junk lines is to remove all lines that start with -- but don't contain =.
text = """--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""
valid_lines = [l for l in text.split('\n') if not (l.startswith('--') and '=' not in l)]
result = '\n'.join(valid_lines)
print(result)
# output
--key1=this is a
multiline statement
statement
--key2=val2
--key3=val3
To construct a dictionary out of the result text:
mydata = {data.split('=')[0]:data.split('=')[1].strip('\n') for data in result.strip('-').split('--')}
print(mydata)
#outputs:
{'key1': 'this is a\nmultiline statement\nstatement', 'key2': 'val2', 'key3': 'val3'}
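If the values might themselves contain =, a single pass over the lines may be safer than splitting on every =. The following is only a sketch: parse_pairs is a hypothetical helper name, and it assumes that a junk line (one starting with -- but lacking =) ends the value before it.

```python
def parse_pairs(text):
    """Hypothetical helper: collect --key=value entries, keeping continuation lines."""
    pairs = {}
    current_key = None
    for line in text.split('\n'):
        if line.startswith('--'):
            # partition splits on the first "=" only, so values may contain "=".
            key, sep, value = line[2:].partition('=')
            if sep:
                current_key = key
                pairs[key] = value
            else:
                current_key = None  # junk line: stop extending the previous value
        elif current_key is not None:
            # Continuation line of a multiline value.
            pairs[current_key] += '\n' + line
    return pairs


text = """--some text here not to be matched
--key1=this is a
multiline statement
statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""

mydata = parse_pairs(text)
print(mydata)
```

Unlike the filter-then-split approach, this version drops any plain lines that follow a junk line instead of attaching them to the previous key; whether that is the desired behaviour depends on the real data.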