Read xml file with pandas
I want to read an XML file into pandas. Here is a sample XML file:
<author lang="en" class="1">
<documents>
<document><![CDATA[Hellow how are you]]></document>
<document><![CDATA[I am good]]></document>
<document><![CDATA[What about you]]></document>
</documents>
</author>
This is what I have tried:
from xml.dom import minidom
xmldoc = minidom.parse('text.xml')
itemlist = xmldoc.getElementsByTagName('document')
But I don't know how to move ahead and get the values from itemlist. When I print it, I get the following output:
[<DOM Element: document at 0x170c9b229d0>,
<DOM Element: document at 0x170c9b22a60>,
<DOM Element: document at 0x170c9b22af0>]
How can I get the strings out of it?
I believe you need to read each element's text through firstChild.nodeValue and pass the results to the DataFrame constructor:
import pandas as pd
from xml.dom import minidom
xmldoc = minidom.parse('text.xml')
itemlist = xmldoc.getElementsByTagName('document')
df = pd.DataFrame({"document": (i.firstChild.nodeValue for i in itemlist)})
print(df)
Output:
             document
0  Hellow how are you
1           I am good
2      What about you
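As an alternative to minidom, the same strings can be pulled out with the standard library's xml.etree.ElementTree, where the CDATA content is exposed directly as each element's text. The following is only a minimal sketch: the helper names extract_documents and documents_to_frame are made up for illustration, and the sample XML from the question is inlined as a string so the example runs without text.xml. With a real file, ET.parse('text.xml').getroot() could be used in place of ET.fromstring.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the question, inlined so the sketch runs without text.xml.
SAMPLE_XML = """<author lang="en" class="1">
<documents>
<document><![CDATA[Hellow how are you]]></document>
<document><![CDATA[I am good]]></document>
<document><![CDATA[What about you]]></document>
</documents>
</author>"""


def extract_documents(xml_text):
    """Return the text of every <document> element, in document order.

    Hypothetical helper, not part of pandas or the original answer.
    """
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree; el.text holds the CDATA content.
    # Empty elements have text None, so fall back to an empty string.
    return [el.text or "" for el in root.iter("document")]


def documents_to_frame(xml_text):
    """Build a one-column DataFrame from the extracted strings (hypothetical helper)."""
    import pandas as pd  # imported here so extract_documents works without pandas

    return pd.DataFrame({"document": extract_documents(xml_text)})


texts = extract_documents(SAMPLE_XML)
print(texts)  # ['Hellow how are you', 'I am good', 'What about you']
```

Calling documents_to_frame(SAMPLE_XML) should then give a DataFrame with a single "document" column, like the one in the answer above. Collecting the strings into a plain list first keeps the parsing step easy to inspect before anything is handed to pandas.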