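Parse multiple XML files to one list of dictionaries in Python
I have a case where I parse multiple XML files, and I want the parsed result to be a single list of dictionaries rather than several separate lists of dictionaries.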
import glob
from bs4 import BeautifulSoup

def open_xml(filenames):
    for filename in filenames:
        with open(filename) as fp:
            soup = BeautifulSoup(fp, 'html.parser')
            parse_xml_files(soup)

def parse_xml_files(soup):
    stringToListOfDict = []
    .
    .
    .
    for info in infos:
        dict = {}
        types = info.find_all('type')
        values = info.find_all('value')
        for type in types:
            dict[type.attrs['p']] = type.text
        stringToListOfDict.append({'Date': Date, 'Time': Time, 'NodeName': node})
        for value in values:
            for result in value.find_all('x'):
                label = dict[result.attrs['y']]
                value = result.text
                if label:
                    stringToListOfDict[-1][label] = value
    print(stringToListOfDict)

def main():
    open_xml(filenames = glob.glob("*.xml"))

if __name__ == '__main__':
    main()
With the code above, I always get a separate list of dictionaries per file, as shown below (for example, for two XML files):
[{'Date': '2020-11-19', 'Time': '18:15', 'NodeName': 'LinuxSuSe','Speed': '16'}]
[{'Date': '2020-11-19', 'Time': '18:30', 'NodeName': 'LinuxRedhat','Speed': '16'}]
The desired output is a single list containing both dictionaries:
[{'Date': '2020-11-19', 'Time': '18:15', 'NodeName': 'LinuxSuSe','Speed': '16'},{'Date': '2020-11-19', 'Time': '18:30', 'NodeName':'LinuxRedhat','Speed': '16'}]
Thank you very much for your feedback.
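print() only sends information to the screen; it does not merge all the results into one list.
The name parse_xml_files is misleading, because the function parses a single file, not all of them. That function should use return to send back the result for a single file. In open_xml you should take that result and add it to a list, so that the results of all files end up in one list.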
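Since each file already yields a list, how that list is added to the combined one matters. A minimal sketch with made-up data (the dictionaries below are illustrative, not parsed from real files) shows the difference between appending each per-file list and extending with it:

```python
# Hypothetical per-file results: each parsed file returns its own list of dicts.
per_file_results = [
    [{'NodeName': 'LinuxSuSe', 'Speed': '16'}],
    [{'NodeName': 'LinuxRedhat', 'Speed': '16'}],
]

nested = []  # built with append(): a list of lists
flat = []    # built with +=: one flat list of dicts

for result in per_file_results:
    nested.append(result)  # adds the whole list as a single element
    flat += result         # adds each dict from the list (same as flat.extend(result))

print(nested)
print(flat)
```

Only the flat variant matches the desired output of one list holding both dictionaries.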
Not tested:
def open_xml(filenames):
    all_files = []
    for filename in filenames:
        with open(filename) as fp:
            soup = BeautifulSoup(fp, 'html.parser')
            result = parse_xml_file(soup)  # <-- get result from parse_xml_file
            all_files += result  # <-- append result to list
    print(all_files)  # <-- display all results

def parse_xml_file(soup):
    stringToListOfDict = []
    # ... code ...
    for info in infos:
        dict = {}
        types = info.find_all('type')
        values = info.find_all('value')
        for type in types:
            dict[type.attrs['p']] = type.text
        stringToListOfDict.append({'Date': Date, 'Time': Time, 'NodeName': node})
        for value in values:
            for result in value.find_all('x'):
                label = dict[result.attrs['y']]
                value = result.text
                if label:
                    stringToListOfDict[-1][label] = value
    #print(stringToListOfDict)
    return stringToListOfDict  # <-- send to open_xml
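Because the code above depends on parts the question leaves out, here is a self-contained sketch of the same return-and-collect pattern that can be run as is. It makes several assumptions: it swaps BeautifulSoup for the standard library's xml.etree.ElementTree, it parses XML strings instead of files, and the XML layout (a node element with name, date and time attributes) as well as the names SAMPLE_DOCS, parse_xml_text and parse_all are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents; the real file layout is not shown in the question.
SAMPLE_DOCS = [
    "<node name='LinuxSuSe' date='2020-11-19' time='18:15'>"
    "<info><type p='1'>Speed</type><value><x y='1'>16</x></value></info>"
    "</node>",
    "<node name='LinuxRedhat' date='2020-11-19' time='18:30'>"
    "<info><type p='1'>Speed</type><value><x y='1'>16</x></value></info>"
    "</node>",
]

def parse_xml_text(xml_text):
    """Parse ONE document and return its rows as a list of dicts."""
    root = ET.fromstring(xml_text)
    rows = []
    for info in root.iter('info'):
        # Map each type's 'p' attribute to its label text, e.g. {'1': 'Speed'}.
        labels = {t.get('p'): t.text for t in info.findall('type')}
        row = {
            'Date': root.get('date'),
            'Time': root.get('time'),
            'NodeName': root.get('name'),
        }
        for value in info.findall('value'):
            for x in value.findall('x'):
                label = labels.get(x.get('y'))
                if label:
                    row[label] = x.text
        rows.append(row)
    return rows  # returned to the caller instead of printed

def parse_all(xml_texts):
    """Collect the rows of every document into one flat list."""
    all_rows = []
    for text in xml_texts:
        all_rows.extend(parse_xml_text(text))
    return all_rows

combined = parse_all(SAMPLE_DOCS)
print(combined)
```

With real files, the same structure would apply: read each path returned by glob.glob("*.xml"), pass its content to the per-file parser, and extend the combined list with what it returns.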