Parsing every file in a directory in Python?
So I have this code:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
for segment in root.iter("s"):
    for word in segment.iter("w"):
        print word.text,
    print "\n"
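This parses the XML file test.xml and prints the parsed output. However, I have a large number of these XML files in a directory that all need to be parsed. How can I modify the code so that it loops over every file in the directory and applies this to each one?
Thanks!
This should work: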
import xml.etree.ElementTree as ET

def printParsed(filename):
    tree = ET.parse(filename)
    root = tree.getroot()
    for segment in root.iter("s"):
        for word in segment.iter("w"):
            print word.text,
        print "\n"

if __name__ == "__main__":
    from os import listdir
    from os.path import isfile, join
    mypath = 'path/to/your/xml/files'
    # keep regular files only, skipping subdirectories
    onlyfiles = [f for f in listdir(mypath) if isfile(join(mypath, f))]
    for f in onlyfiles:
        # only does stuff if the file name ends in .xml
        if f.endswith('.xml'):
            # listdir returns bare names, so join them with the directory
            printParsed(join(mypath, f))
You can save the file as parser.py and then run it with python parser.py. If you like, you can also remove the if __name__ == "__main__" part.
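If you are on Python 3, where print is a function, the same loop could be sketched with pathlib instead of listdir. This is only a rough sketch: the function names words_by_segment and print_directory are made up for illustration, and it assumes the files sit directly in the directory (no recursion) and use the same s and w tags as in the question.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def words_by_segment(xml_path):
    """Return one list of <w> texts for each <s> element in the file."""
    root = ET.parse(str(xml_path)).getroot()
    return [
        # word.text is None for an empty <w/>, so fall back to ""
        [word.text or "" for word in segment.iter("w")]
        for segment in root.iter("s")
    ]


def print_directory(directory):
    """Print the words of every *.xml file directly inside `directory`."""
    # glob("*.xml") only matches names ending in .xml;
    # sorted() makes the processing order predictable.
    for xml_path in sorted(Path(directory).glob("*.xml")):
        for words in words_by_segment(xml_path):
            print(" ".join(words))
```

Calling print_directory('path/to/your/xml/files') would then print one line per segment. Returning the words from a separate function, rather than printing inside the parsing loop, makes it easier to reuse the parsed data later.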
os.listdir returns a list of all the entries in the directory.
Code:
import xml.etree.ElementTree as ET
import os
listofxml = os.listdir("./")
for xml in listofxml:
    tree = ET.parse(xml)
    root = tree.getroot()
    for segment in root.iter("s"):
        for word in segment.iter("w"):
            print word.text,
        print "\n"
If not all of the files are XML, you can split the file name and check the extension:
import xml.etree.ElementTree as ET
import os
listofxml = os.listdir("./")
for xml in listofxml:
    format = xml.split('.')
    if format[-1] == 'xml':
        tree = ET.parse(xml)
        root = tree.getroot()
        for segment in root.iter("s"):
            for word in segment.iter("w"):
                print word.text,
            print "\n"
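One thing to watch with split('.'): a file that is literally named xml, with no extension at all, also passes the check, and XML in upper case is missed. A possibly more robust variant, written for Python 3, uses os.path.splitext. The names has_xml_extension and parse_xml_files are hypothetical, and skipping malformed files instead of stopping is my assumption, not something the question asked for.

```python
import os
import xml.etree.ElementTree as ET


def has_xml_extension(filename):
    """True only when the final extension is .xml (case-insensitive)."""
    return os.path.splitext(filename)[1].lower() == ".xml"


def parse_xml_files(directory):
    """Yield (filename, root) for each parseable .xml file in `directory`."""
    for name in sorted(os.listdir(directory)):
        # listdir gives bare names, so build the full path before opening
        path = os.path.join(directory, name)
        if not (os.path.isfile(path) and has_xml_extension(name)):
            continue
        try:
            yield name, ET.parse(path).getroot()
        except ET.ParseError as err:
            # Assumption: report and skip malformed files rather than abort.
            print("skipping %s: %s" % (name, err))
```

You could then loop with for name, root in parse_xml_files("./") and run the same root.iter("s") logic on each root.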