
Match key-value pairs

In the following text, I want to extract the keys with their values. I've written the following regex, but it does not match values that span multiple lines. regex: --(.*)=.*(?=(.|--|\n|\Z)*)

--some text here not to be matched
--key1=this is a
 multiline statement
 statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched

So, after matching, the output should be:

--key1=this is a
 multiline statement
 statement
--key2=val2
--key3=val3

You can try this:

import re
s = """
 --some text here not to be matched
 --key1=this is a
 multiline statement
 statement
 --random text not to be matched
 --key2=val2
 --key3=val3
 --random text here not to be matched
"""
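# The value class includes digits so that values such as "val2" are matched in full.
new_data = re.findall(r'--\w+=[a-zA-Z0-9\s]+', s)
for i in new_data:
    # strip() drops the trailing whitespace that the character class also consumes
    print(i.strip())

Output:

--key1=this is a
 multiline statement
 statement
--key2=val2
--key3=val3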

Ajax's answer will fail if any of the values contain a - character. Instead, use a negative lookahead so that a value runs until the next -- and never contains it.

This regex will do that: --.+=((?!--)[\S\s])+
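As a rough sketch of how that pattern could be used from Python, the snippet below runs it over the sample text from the question. It uses re.finditer and group(0) rather than re.findall, because with a capturing group in the pattern re.findall returns the group contents instead of the whole match. The rstrip() call and the extra "--key=a-b" sample are assumptions added for illustration and are not part of the original answer.

```python
import re

text = """--some text here not to be matched
--key1=this is a
 multiline statement
 statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""

# ((?!--)[\S\s])+ consumes any character, newlines included,
# and stops right before the next "--".
pattern = re.compile(r'--.+=((?!--)[\S\s])+')

# group(0) is the whole match; rstrip() removes the newline left before the next "--".
matches = [m.group(0).rstrip() for m in pattern.finditer(text)]
for match in matches:
    print(match)

# Hypothetical extra sample: a single "-" inside a value does not end the match.
hyphenated = pattern.search("--key=a-b\n--junk").group(0).rstrip()
print(hyphenated)
```

Note that .+ is greedy, so a line with more than one = would be split at the last one; that may or may not be what you want.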


Perhaps the OP provided a simplified example and the actual code does require regex, but the example above can be filtered without it.

The central idea in this method of filtering out the junk lines is to remove every line that starts with -- but does not contain = .

text = """--some text here not to be matched
--key1=this is a
 multiline statement
 statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""

valid_lines = [l for l in text.split('\n') if not (l.startswith('--') and '=' not in l)]

result = '\n'.join(valid_lines)

print(result)
# output
--key1=this is a
 multiline statement
 statement
--key2=val2
--key3=val3

To construct a dictionary from the result text:

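# split('=', 1) splits only at the first '=', so a value may itself contain '='
pairs = (data.split('=', 1) for data in result.strip('-').split('--'))
mydata = {key: value.strip('\n') for key, value in pairs}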
print(mydata)
#outputs:
{'key1': 'this is a\n multiline statement\n statement', 'key2': 'val2', 'key3': 'val3'}
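The filtering and the dictionary construction can also be done in a single pass over the lines. The following is only a sketch under the assumption that continuation lines never start with -- ; the function name parse_pairs is made up for this example. Unlike the filter above, it also drops any continuation lines that follow a junk line instead of attaching them to the previous key.

```python
def parse_pairs(text):
    """Collect --key=value blocks; continuation lines extend the current value."""
    pairs = {}
    current = None  # key currently being filled, or None while skipping junk
    for line in text.split('\n'):
        if line.startswith('--'):
            # partition splits at the first '=' only; sep is '' when there is none
            key, sep, value = line[2:].partition('=')
            if sep:
                current = key
                pairs[current] = value
            else:
                current = None  # a "--" line without '=' is junk
        elif current is not None:
            pairs[current] += '\n' + line
    return pairs


text = """--some text here not to be matched
--key1=this is a
 multiline statement
 statement
--random text not to be matched
--key2=val2
--key3=val3
--random text here not to be matched"""

parsed = parse_pairs(text)
print(parsed)
```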


 