Reading an XML file using ElementTree
I have an XML file. It looks like this:
<root>
<Group>
<ChapterNo>1</ChapterNo>
<ChapterName>A</ChapterName>
<Line>1</Line>
<Content>zfsdfsdf</Content>
<Synonyms>fdgd</Synonyms>
<Translation>assdfsdfsdf</Translation>
</Group>
<Group>
<ChapterNo>1</ChapterNo>
<ChapterName>A</ChapterName>
<Line>2</Line>
<Content>ertreter</Content>
<Synonyms>retreter</Synonyms>
<Translation>erterte</Translation>
</Group>
<Group>
<ChapterNo>2</ChapterNo>
<ChapterName>B</ChapterName>
<Line>1</Line>
<Content>sadsafs</Content>
<Synonyms>sdfsdfsd</Synonyms>
<Translation>sdfsdfsd</Translation>
</Group>
<Group>
<ChapterNo>2</ChapterNo>
<ChapterName>B</ChapterName>
<Line>2</Line>
<Content>retete</Content>
<Synonyms>retertret</Synonyms>
<Translation>retertert</Translation>
</Group>
</root>
I tried it this way:
from xml.etree import ElementTree
root = ElementTree.parse('data.xml').getroot()
ChapterNo = root.find('ChapterNo').text
ChapterName = root.find('ChapterName').text
GitaLine = root.find('Line').text
Content = root.find('Content').text
Synonyms = root.find('Synonyms').text
Translation = root.find('Translation').text
But it raises an error:
ChapterNo=root.find('ChapterNo').text
AttributeError: 'NoneType' object has no attribute 'text'
Now I want to get every ChapterNo, ChapterName, etc. separately using ElementTree, and then insert this data into a database. Can anyone help me?
Regards,
Nimmy
To parse your simple two-level data structure and assemble a dict for each group, all you need to do is this:
>>> # what you did to get `root`
>>> from pprint import pprint as pp
>>> for group in root:
...     d = {}
...     for elem in group:
...         d[elem.tag] = elem.text
...     pp(d)  # or whack it into a database
...
{'ChapterName': 'A',
'ChapterNo': '1',
'Content': 'zfsdfsdf',
'Line': '1',
'Synonyms': 'fdgd',
'Translation': 'assdfsdfsdf'}
{'ChapterName': 'A',
'ChapterNo': '1',
'Content': 'ertreter',
'Line': '2',
'Synonyms': 'retreter',
'Translation': 'erterte'}
{'ChapterName': 'B',
'ChapterNo': '2',
'Content': 'sadsafs',
'Line': '1',
'Synonyms': 'sdfsdfsd',
'Translation': 'sdfsdfsd'}
{'ChapterName': 'B',
'ChapterNo': '2',
'Content': 'retete',
'Line': '2',
'Synonyms': 'retertret',
'Translation': 'retertert'}
>>>
Look, Ma, no XPath!
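If you want to see the "whack it into a database" step spelled out, here is a minimal sketch using the standard-library sqlite3 module. The table name `chapter_lines`, the in-memory database, and the shortened inline XML are my own assumptions for illustration; they are not from the question.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A shortened copy of the question's XML so the sketch runs on its own.
XML = """<root>
<Group>
<ChapterNo>1</ChapterNo>
<ChapterName>A</ChapterName>
<Line>1</Line>
<Content>zfsdfsdf</Content>
<Synonyms>fdgd</Synonyms>
<Translation>assdfsdfsdf</Translation>
</Group>
<Group>
<ChapterNo>1</ChapterNo>
<ChapterName>A</ChapterName>
<Line>2</Line>
<Content>ertreter</Content>
<Synonyms>retreter</Synonyms>
<Translation>erterte</Translation>
</Group>
</root>"""

root = ET.fromstring(XML)

# One dict per <Group>, built the same way as in the loop above.
rows = [{elem.tag: elem.text for elem in group} for group in root]

# Assumption: an in-memory database; pass a file path for real use.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE chapter_lines (
           ChapterNo INTEGER, ChapterName TEXT, Line INTEGER,
           Content TEXT, Synonyms TEXT, Translation TEXT)"""
)

# Named placeholders (:ChapterNo, ...) are filled from each dict's keys.
conn.executemany(
    """INSERT INTO chapter_lines
       VALUES (:ChapterNo, :ChapterName, :Line,
               :Content, :Synonyms, :Translation)""",
    rows,
)
conn.commit()

stored = conn.execute(
    "SELECT ChapterNo, Line, Content FROM chapter_lines ORDER BY ChapterNo, Line"
).fetchall()
print(stored)
```

The named placeholders only work if every group contains all six tags; a group with a missing tag would need a default filled in before the insert.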
ChapterNo is not a direct child of root, so root.find('ChapterNo') won't work. You'll need to use XPath syntax to find the data.
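To make the difference concrete, here is a small sketch on a cut-down document (the inline XML is a reduced stand-in for the question's file):

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<root><Group><ChapterNo>1</ChapterNo></Group>"
    "<Group><ChapterNo>2</ChapterNo></Group></root>"
)

# find() with a bare tag name only looks at direct children of root,
# and the direct children here are <Group> elements.
direct = root.find("ChapterNo")

# A path through the parent, or the './/' descendant prefix, reaches it.
via_path = root.find("Group/ChapterNo")
via_descendant = root.find(".//ChapterNo")

print(direct)               # None, which is why .text raised AttributeError
print(via_path.text)
print(via_descendant.text)
```

Note that find() returns only the first match, so both successful lookups give the ChapterNo of the first group.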
Also, there are multiple occurrences of ChapterNo, ChapterName, etc., so you should use findall and iterate through the results to get the text for each one.
chapter_nos = [e.text for e in root.findall('.//ChapterNo')]
and so on.
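Spelled out for several fields, the "and so on" might look like the sketch below. The `FIELDS` list and the zip step that stitches the columns back into rows are my additions, and the inline XML is a reduced stand-in for the question's file.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<root>
<Group><ChapterNo>1</ChapterNo><ChapterName>A</ChapterName><Line>1</Line></Group>
<Group><ChapterNo>1</ChapterNo><ChapterName>A</ChapterName><Line>2</Line></Group>
<Group><ChapterNo>2</ChapterNo><ChapterName>B</ChapterName><Line>1</Line></Group>
</root>""")

FIELDS = ["ChapterNo", "ChapterName", "Line"]  # hypothetical subset of the tags

# One list of text values per tag, in document order.
columns = {
    name: [e.text for e in root.findall(f".//{name}")]
    for name in FIELDS
}

# Recombine the parallel lists into one tuple per <Group>.
rows = list(zip(*(columns[name] for name in FIELDS)))
print(rows)
```

The zip only lines up correctly when every Group contains every tag. If a tag can be missing, iterating over the Group elements and reading each one's children is the safer route.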
Here's a small example using sqlalchemy to define an object that will extract and store the data in a SQLite database.
from sqlalchemy import create_engine, Unicode, Integer, Column, UnicodeText
from sqlalchemy.orm import create_session
from sqlalchemy.ext.declarative import declarative_base

engine = create_engine('sqlite:///chapters.sqlite', echo=True)
# Binding the engine here lets create_all() below run without arguments
Base = declarative_base(bind=engine)

class ChapterLine(Base):
    __tablename__ = 'chapterlines'

    # chapter_no and line together form a composite primary key
    chapter_no = Column(Integer, primary_key=True)
    chapter_name = Column(Unicode(200))
    line = Column(Integer, primary_key=True)
    content = Column(UnicodeText)
    synonyms = Column(UnicodeText)
    translation = Column(UnicodeText)

    @classmethod
    def from_xmlgroup(cls, element):
        """Build a ChapterLine from one <Group> element."""
        l = cls()
        l.chapter_no = int(element.find('ChapterNo').text)
        l.chapter_name = element.find('ChapterName').text
        l.line = int(element.find('Line').text)
        l.content = element.find('Content').text
        l.synonyms = element.find('Synonyms').text
        l.translation = element.find('Translation').text
        return l

Base.metadata.create_all()  # creates the table
Here's how to use it:
from xml.etree import ElementTree as etree

session = create_session(bind=engine, autocommit=False)
doc = etree.parse('myfile.xml').getroot()

# Group is a direct child of the root, so a plain tag name is enough here
for group in doc.findall('Group'):
    l = ChapterLine.from_xmlgroup(group)
    session.add(l)

session.commit()
I have tested this code on your XML data and it works fine, inserting everything into the database.
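One thing to watch: from_xmlgroup calls element.find(...).text directly, so a Group that lacks one of the tags will fail with the same AttributeError as in the question. Below is a rough, database-free sketch of a more defensive variant built on findtext, which returns None for a missing tag. The `ChapterLineRecord` dataclass and the `required_int` helper are hypothetical names of mine, not part of the code above.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChapterLineRecord:  # hypothetical stand-in for the mapped class
    chapter_no: int
    chapter_name: Optional[str]
    line: int
    content: Optional[str]
    synonyms: Optional[str]
    translation: Optional[str]

    @classmethod
    def from_xmlgroup(cls, element):
        def required_int(tag):
            # findtext() returns None when the child element is absent.
            text = element.findtext(tag)
            if text is None:
                raise ValueError(f"<Group> is missing <{tag}>")
            return int(text)

        return cls(
            chapter_no=required_int("ChapterNo"),
            chapter_name=element.findtext("ChapterName"),
            line=required_int("Line"),
            content=element.findtext("Content"),
            synonyms=element.findtext("Synonyms"),
            translation=element.findtext("Translation"),
        )


complete = ET.fromstring(
    "<Group><ChapterNo>1</ChapterNo><ChapterName>A</ChapterName>"
    "<Line>2</Line><Content>ertreter</Content></Group>"
)
record = ChapterLineRecord.from_xmlgroup(complete)
print(record)

broken = ET.fromstring("<Group><ChapterNo>1</ChapterNo></Group>")
try:
    ChapterLineRecord.from_xmlgroup(broken)
except ValueError as exc:
    print(exc)
```

Here the key fields are treated as mandatory and the text fields as optional; which tags deserve which treatment depends on your data.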