I'm new to Python and am trying to find a way to take rules like these:
Rule number: 23 [conversion_flag=1 cover=48 (1%) prob=0.71]
X1Sbmor=1
X1SMFBP=1
Rule number: 14 [conversion_flag=0 cover=186 (5%) prob=0.45]
X1Sbmor=0
X1S3IwL=1
Rule number: 22 [conversion_flag=0 cover=15 (0%) prob=0.33]
X1Sbmor=1
X1SO4PP=0
...
...
and parse them into a dictionary like this:
# if prob> 0.4 then key = 'group_1', and values are lists of tuples like this:
{'group_1':[(X1Sbmor=1 & X1SMFBP=1),(X1Sbmor=0 & X1S3IwL=1)]}
# if prob< 0.4 then key= 'group_2', and values are list of tuples like this:
{'group_2':[(X1Sbmor=1 & X1SO4PP=0)]}
I'm sure there's a way to parse the rules out automatically and write them into a dictionary as described above, but I can't figure it out.
Hope this helps :)
import re

# One rule = a header line followed by two "name=value" condition lines.
# Groups: 1 rule number, 2 conversion_flag, 3 cover, 4 cover %, 5 prob,
# 6-9 the names and values of the two conditions.
regex = r'Rule number: (\d+) \[conversion_flag=(\d+) cover=(\d+) (\(\d+%\)) prob=(\d+\.\d+)\]\s+(\w+)=(\d+)\s+(\w+)=(\d+)'
text = """Rule number: 23 [conversion_flag=1 cover=48 (1%) prob=0.71]
X1Sbmor=1
X1SMFBP=1
Rule number: 14 [conversion_flag=0 cover=186 (5%) prob=0.45]
X1Sbmor=0
X1S3IwL=1
Rule number: 22 [conversion_flag=0 cover=15 (0%) prob=0.33]
X1Sbmor=1
X1SO4PP=0"""
matches = re.findall(regex, text)
final = {'group_1': [], 'group_2': []}
for match in matches:
    # findall returns 0-indexed tuples: match[4] is prob, match[5:9] the conditions.
    condition = match[5] + '=' + match[6] + ' & ' + match[7] + '=' + match[8]
    # Compare prob as a number, not as a string.
    if float(match[4]) > 0.4:
        final['group_1'].append((condition,))
    else:
        final['group_2'].append((condition,))
print(final)
Output
python test.py
{'group_2': [('X1Sbmor=1 & X1SO4PP=0',)], 'group_1': [('X1Sbmor=1 & X1SMFBP=1',), ('X1Sbmor=0 & X1S3IwL=1',)]}
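If a rule can have more or fewer than two conditions, a single regex gets awkward. Here is a rough line-by-line sketch instead. The names parse_rules, HEADER and threshold are made up for this example, and it assumes every rule starts with a "Rule number: ..." header line and that every condition line contains an "=". It stores each rule as a tuple of its condition strings, which is one possible reading of the format asked for.

```python
import re

# Hypothetical helper names; only the "Rule number" header layout comes from the question.
HEADER = re.compile(r'Rule number: (\d+) \[.*?prob=(\d+(?:\.\d+)?)\]')


def parse_rules(text, threshold=0.4):
    """Group each rule's conditions by whether its prob exceeds threshold."""
    groups = {'group_1': [], 'group_2': []}
    current = None  # condition list of the rule currently being read
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            # Compare prob numerically to pick the group for this rule.
            key = 'group_1' if float(header.group(2)) > threshold else 'group_2'
            current = []
            groups[key].append(current)
        elif '=' in line and current is not None:
            # Any "name=value" line belongs to the most recent header.
            current.append(line)
    # Freeze each rule's condition list into a tuple.
    return {key: [tuple(conds) for conds in rules] for key, rules in groups.items()}


sample = """Rule number: 23 [conversion_flag=1 cover=48 (1%) prob=0.71]
X1Sbmor=1
X1SMFBP=1
Rule number: 22 [conversion_flag=0 cover=15 (0%) prob=0.33]
X1Sbmor=1
X1SO4PP=0"""
print(parse_rules(sample))
```

If you prefer the joined form from the question, ' & '.join(conds) on each tuple should give strings like 'X1Sbmor=1 & X1SMFBP=1'.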