Python re.findall organize list
I have a text file containing entries like the following:
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"> <soap:Body> <Applications_GetResponse xmlns="http://www.country.com"> <Applications> <CS_Application> <Name>Spain</Name> <Key>2345364564</Key> <Status>NORMAL</Status> <Modules> <CS_Module> <Name>zaragoza</Name> <Key>8743249725</Key> <DevelopmentEffort>0</DevelopmentEffort> <LogicalDBConnections/> </CS_Module> <CS_Module> <Name>malaga</Name> <Key>8743249725</Key> <DevelopmentEffort>0</DevelopmentEffort> <LogicalDBConnections/> </CS_Module> </Modules> <CreatedBy>7</CreatedBy> </CS_Application> <CS_Application> <Name>UK</Name> <Key>2345364564</Key> <Status>NORMAL</Status> <Modules> <CS_Module> <Name>london</Name> <Key>8743249725</Key> <DevelopmentEffort>0</DevelopmentEffort> <LogicalDBConnections/> </CS_Module> <CS_Module> <Name>liverpool</Name> <Key>8743249725</Key> <DevelopmentEffort>0</DevelopmentEffort> <LogicalDBConnections/> </CS_Module> </Modules> <CreatedBy>7</CreatedBy> </CS_Application> </Applications> </Applications_GetResponse> </soap:Body> </soap:Envelope>
I want to parse it and get each country's name followed by its cities.
I tried a few things with Python's re.findall, but I did not get anything like that:
print("HERE APPLICATIONS")
applications = re.findall('<CS_Application><Name>(.*?)</Name>', response_apply.text)
print(applications)
print("HERE MODULES")
modules = re.findall('<CS_Module><Name>(.*?)</Name>', response_apply.text)
print(modules)
This returns:
host-10$ sudo python3 capture.py
HERE APPLICATIONS
['Spain', 'UK']
HERE MODULES
['zaragoza', 'malaga', 'london', 'liverpool']
The result I would like to get is this:
HERE
The Country: Spain - Cities: zaragoza,malaga
The Country: UK - Cities: london,liverpool
Regular expressions are a poor fit for parsing XML; an XML parser is the better choice. If you still want a regex solution, I hope the code below helps.
import re
s = """\n<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">\n <soap:Body>\n <Applications_GetResponse xmlns="http://www.country.com">\n <Applications>\n <CS_Application>\n <Name>Spain</Name>\n <Key>2345364564</Key>\n <Status>NORMAL</Status>\n <Modules>\n <CS_Module>\n <Name>zaragoza</Name>\n <Key>8743249725</Key>\n <DevelopmentEffort>0</DevelopmentEffort>\n <LogicalDBConnections/>\n </CS_Module>\n <CS_Module>\n <Name>malaga</Name>\n <Key>8743249725</Key>\n <DevelopmentEffort>0</DevelopmentEffort>\n <LogicalDBConnections/>\n </CS_Module>\n </Modules>\n <CreatedBy>7</CreatedBy>\n </CS_Application>\n <CS_Application>\n <Name>UK</Name>\n <Key>2345364564</Key>\n <Status>NORMAL</Status>\n <Modules>\n <CS_Module>\n <Name>london</Name>\n <Key>8743249725</Key>\n <DevelopmentEffort>0</DevelopmentEffort>\n <LogicalDBConnections/>\n </CS_Module>\n <CS_Module>\n <Name>liverpool</Name>\n <Key>8743249725</Key>\n <DevelopmentEffort>0</DevelopmentEffort>\n <LogicalDBConnections/>\n </CS_Module>\n </Modules>\n <CreatedBy>7</CreatedBy>\n </CS_Application>\n </Applications>\n </Applications_GetResponse>\n </soap:Body>\n</soap:Envelope>\n"""
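# Capture the body of each <CS_Application> block ([\s\S] also matches newlines).
pattern1 = re.compile(r'<CS_Application>([\s\S]*?)</CS_Application>')
# Non-greedy, so each <Name> is captured on its own even if the XML is on one line.
pattern2 = re.compile(r'<Name>(.*?)</Name>')
for m in pattern1.finditer(s):
    ss = m.group(1)
    res = []
    for mm in pattern2.finditer(ss):
        res.append(mm.group(1))
    # The first <Name> in a block is the country; the rest are its cities.
    print("The Country: " + res[0] + " - Cities: " + ",".join(res[1:]))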
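For comparison, here is a minimal sketch of the XML-parser route mentioned above, using the standard library's xml.etree.ElementTree. It assumes the response is well-formed XML and that the payload elements sit in the http://www.country.com default namespace, as in the sample. The names xml_text, NS and countries_with_cities are made up for illustration, and the sample is trimmed to the elements that matter.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the response from the question (assumed to be well-formed XML).
xml_text = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
<soap:Body>
<Applications_GetResponse xmlns="http://www.country.com">
<Applications>
<CS_Application><Name>Spain</Name><Modules>
<CS_Module><Name>zaragoza</Name></CS_Module>
<CS_Module><Name>malaga</Name></CS_Module>
</Modules></CS_Application>
<CS_Application><Name>UK</Name><Modules>
<CS_Module><Name>london</Name></CS_Module>
<CS_Module><Name>liverpool</Name></CS_Module>
</Modules></CS_Application>
</Applications>
</Applications_GetResponse>
</soap:Body>
</soap:Envelope>"""

# Prefix "c" is an arbitrary label for the default namespace used by the payload.
NS = {"c": "http://www.country.com"}


def countries_with_cities(text):
    """Return a list of (country, [cities]) tuples in document order."""
    root = ET.fromstring(text)
    result = []
    # Walk every CS_Application element, wherever it sits in the tree.
    for app in root.iter("{http://www.country.com}CS_Application"):
        # The direct <Name> child is the country name.
        country = app.findtext("c:Name", namespaces=NS)
        # Each CS_Module under <Modules> contributes one city name.
        cities = [module.findtext("c:Name", namespaces=NS)
                  for module in app.findall("c:Modules/c:CS_Module", NS)]
        result.append((country, cities))
    return result


for country, cities in countries_with_cities(xml_text):
    print("The Country: " + country + " - Cities: " + ",".join(cities))
```

In the script from the question, response_apply.text would presumably be passed in place of xml_text. Because the parser follows the element tree, the cities stay grouped under their country without relying on how the whitespace between tags happens to look.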