Python regex search and grouping
04:38:06.151 [http-bio-5443-exec-8] INFO A.g.r.q.r.flex.RubixService - [NO-ID] [Mar 26, 2014 04:38:06 UTC] - [5a-y2-24C363B223F1CB534B8DCA5693ADF2E9] --0->servAggregateMultiple:61224380032:[[{"responseMeasures": ["SubscriberCnt","UpBytes","DownBytes","SessionCount","SubscriberPercentage","UpBytesPercentage","DownBytesPercentage","SessionCountPercentage"]
"sortProperty":"SubscriberCnt"
I have this data in a text file. I want to search this block of data for responseMeasures. If it is found, I want to store everything that follows it:
["SubscriberCnt","UpBytes","DownBytes","SessionCount","SubscriberPercentage","UpBytesPercentage","DownBytesPercentage","SessionCountPercentage"]
I want to store this as a group in a variable. Similarly, I want to find sortProperty and store "SubscriberCnt" as a group in a variable.
Here is my code snippet:
import re
fo = open("data.txt", "r")
data = fo.readlines()
for n in data:
    matchMeasure = re.search("responseMeasures:", n)
    if matchMeasure:
        measureData = matchMeasure.groups()
        print("Measures are" + str(measureData))
    matchSort = re.search("sortProperty", n)
    if matchSort:
        sortData = matchSort.groups()
        print("Group by" + str(sortData))
This code may not be the best approach; it only shows what I am looking for.
Expected output:
Measures are:-
SubscriberCnt
UpBytes
DownBytes
SessionCount
SubscriberPercentage
UpBytesPercentage
DownBytesPercentage
SessionCountPercentage
Group by SubscriberCnt
You can use re.findall to achieve this:
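import re

with open('data.txt', 'r') as f:
    # Split the whole file into word tokens (letters, digits, underscores, apostrophes).
    first_filter = re.findall(r"[\w']+", f.read())

# Tokens between 'responseMeasures' and 'sortProperty' are the measures.
measures = first_filter[first_filter.index('responseMeasures') + 1:first_filter.index('sortProperty')]
# Everything after 'sortProperty' is treated as the sort key(s).
sortby = first_filter[first_filter.index('sortProperty') + 1:]

for measure in measures:
    print(measure)
print("Group by:")
for sort_type in sortby:
    print(sort_type)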
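Since the question asks for groups specifically, here is an alternative sketch that uses re.search with capture groups instead of tokenizing the whole file. It is only an illustration under a few assumptions: the sample string is a shortened stand-in for the log lines in the question, the names MEASURES_RE, SORT_RE and extract are made up for this example, and the code is written for Python 3. Note that match.groups() returns an empty tuple when the pattern contains no parentheses, which is why the snippet in the question never yields any data.

```python
import re

# Shortened stand-in for the two log lines in the question (assumption).
sample = (
    '--0->servAggregateMultiple:61224380032:[[{"responseMeasures": '
    '["SubscriberCnt","UpBytes","DownBytes"]\n'
    '"sortProperty":"SubscriberCnt"\n'
)

# Parentheses create a capture group; without them, groups() returns ().
MEASURES_RE = re.compile(r'"responseMeasures"\s*:\s*\[([^\]]*)\]')
SORT_RE = re.compile(r'"sortProperty"\s*:\s*"([^"]*)"')


def extract(text):
    """Return (measures, sort_property); ([], None) if nothing is found."""
    measures = []
    sort_property = None
    m = MEASURES_RE.search(text)
    if m:
        # group(1) is the text between the brackets; pull out each quoted name.
        measures = re.findall(r'"([^"]*)"', m.group(1))
    s = SORT_RE.search(text)
    if s:
        sort_property = s.group(1)
    return measures, sort_property


measures, sort_property = extract(sample)
print("Measures are:")
for name in measures:
    print(name)
print("Group by " + str(sort_property))
```

To run this against the real file, the text could be read with open('data.txt').read() and passed to extract. Compared with the token-slicing approach, this does not depend on sortProperty appearing right after the measures list.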