AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'children'
I have this XML file:
<population>
<person id="101">
<attributes>
<attribute name="age" class="java.lang.Integer" >53</attribute>
</attributes>
<plan score="-0.38" selected="yes">
<activity type="outside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
</activity>
<leg mode="car" dep_time="08:22:00" trav_time="00:10:13">
<route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
</leg>
<activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
<attributes>
<attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
</attributes>
</activity>
<leg mode="car" dep_time="17:15:22" trav_time="00:07:05">
<route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
</leg>
<activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
</activity>
</plan>
<plan score="-0.38" selected="no">
<activity type="inside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
</activity>
<leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
<route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
</leg>
<activity type="shopping" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
<attributes>
<attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
</attributes>
</activity>
<leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
<route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
</leg>
<activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
<attributes>
<attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
</attributes>
</activity>
<leg mode="pt" dep_time="17:15:22" trav_time="00:07:05">
<route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
</leg>
<activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
</activity>
</plan>
</person>
<person id="102">
<attributes>
<attribute name="age" class="java.lang.Integer" >53</attribute>
</attributes>
<plan score="-0.38" selected="yes">
<activity type="inside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
</activity>
<leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
<route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
</leg>
<activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
<attributes>
<attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
</attributes>
</activity>
<leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
<route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
</leg>
<activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
<attributes>
<attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
</attributes>
</activity>
<leg mode="pt" dep_time="17:15:22" trav_time="00:07:05">
<route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
</leg>
<activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
</activity>
</plan>
</person>
<person id="103">
<attributes>
<attribute name="age" class="java.lang.Integer" >53</attribute>
</attributes>
<plan score="-0.38" selected="yes">
<activity type="inside" link="81312" facility="outside_208" x="649324.9906891582" y="6866581.699995641" end_time="08:22:00" >
</activity>
<leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
<route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
</leg>
<activity type="shopping" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
<attributes>
<attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
</attributes>
</activity>
<leg mode="bike" dep_time="08:22:00" trav_time="00:10:13">
<route type="links" start_link="81312" end_link="138852" trav_time="00:10:13" distance="6046.54932060571" vehicleRefId="7262234">81312</route>
</leg>
<activity type="work" link="138852" facility="38407" x="651680.6" y="6863892.5" start_time="08:45:22" end_time="17:15:22" >
<attributes>
<attribute name="innerParis" class="java.lang.Boolean" >true</attribute>
</attributes>
</activity>
<leg mode="pt" dep_time="17:15:22" trav_time="00:07:05">
<route type="links" start_link="138852" end_link="189898" trav_time="00:07:05" distance="4604.544053407517" vehicleRefId="7262234">138852</route>
</leg>
<activity type="outside" link="189898" facility="outside_249" x="648729.9598002436" y="6866057.250182923" end_time="17:20:35" >
</activity>
</plan>
</person>
</population>
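My intention is to create a DataFrame with three columns: activity type, leg mode, and route distance. They should be populated from the selected plans in the file above.
I tried to do this with the following code, but I get the error message shown after it: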
import gzip
import xml.etree.ElementTree as ET
import pandas as pd
data = gzip.open('file.xml.gz', 'r')
root = ET.parse(data).getroot()
from collections import defaultdict
d = defaultdict(list)
for ent in root.findall('./person/plan[@selected="yes"]'):
    if ent.name == 'activity':
        d['type'].append(ent.get('type'))
    elif ent.name == 'leg':
        d['mode'].append(ent.get('mode'))
        for place in ent.children:
            if place.name=='route':
                d['distance'].append(place.get('distance'))
coords=pd.DataFrame(d)
AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'children'
I have read this and this, but I really don't know how to apply them to my problem.
Thank you very much for your help!
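The solution below may help. I noticed that each plan has one more activity element than it has leg elements, so an adjustment is needed to keep the extracted values aligned: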
import xml.etree.ElementTree as ET
from itertools import zip_longest, chain
from collections import defaultdict
import pandas as pd

root = ET.parse('test.xml').getroot()

# key elements and the attribute to extract from each
elements = ['activity', 'leg', 'route']
tags = ['type', 'mode', 'distance']

box = []
for entry in root.findall(".//plan[@selected='yes']"):
    # keeping the defaultdict within the for loop ensures
    # there is a new dictionary for every iteration;
    # it also lets us align each extraction per ``plan`` element
    d = defaultdict(list)
    for element, tag in zip(elements, tags):
        for ent in entry.findall(f".//{element}"):
            d[f"{element}_{tag}"].append(ent.attrib.get(tag))
    box.append(d)

flatten = chain.from_iterable
# each plan has more activity results than leg mode and route results;
# zip_longest pairs them up without excluding any entry
flat_data = flatten(zip_longest(*ent.values()) for ent in box)
# the keys of the last dictionary double as the column names
outcome = pd.DataFrame(flat_data, columns=list(d))
outcome
    activity_type  leg_mode     route_distance
0         outside       car   6046.54932060571
1            work       car  4604.544053407517
2         outside      None               None
3          inside      bike   6046.54932060571
4            work      bike   6046.54932060571
5            work        pt  4604.544053407517
6         outside      None               None
7          inside      bike   6046.54932060571
8        shopping      bike   6046.54932060571
9            work        pt  4604.544053407517
10        outside      None               None
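As for the error itself: ElementTree's Element has no `name` or `children` attributes (those look like BeautifulSoup conventions). The element name is in `.tag`, and iterating over an element yields its direct children. The sketch below is only one possible way to rewrite the loop from the question along those lines; it is not the approach used in the answer above. The inline `SAMPLE` string and the helper name `extract_rows` are made up for illustration, and the sample is a trimmed-down version of the file in the question.

```python
import xml.etree.ElementTree as ET

# Trimmed-down sample in the same shape as the question's file (illustrative only).
SAMPLE = """
<population>
  <person id="101">
    <plan score="-0.38" selected="yes">
      <activity type="outside" end_time="08:22:00"/>
      <leg mode="car" dep_time="08:22:00">
        <route type="links" distance="6046.54932060571">81312</route>
      </leg>
      <activity type="work" end_time="17:15:22"/>
      <leg mode="car" dep_time="17:15:22">
        <route type="links" distance="4604.544053407517">138852</route>
      </leg>
      <activity type="outside" end_time="17:20:35"/>
    </plan>
    <plan score="-0.38" selected="no">
      <activity type="inside" end_time="08:22:00"/>
    </plan>
  </person>
</population>
"""


def extract_rows(root):
    """Return one dict per activity of each selected plan, paired with the leg that follows it."""
    rows = []
    for plan in root.findall("./person/plan[@selected='yes']"):
        current = None  # the most recent activity row within this plan
        for child in plan:  # iterating an Element yields its direct children
            if child.tag == "activity":  # Element exposes .tag, not .name
                current = {
                    "activity_type": child.get("type"),
                    "leg_mode": None,
                    "route_distance": None,
                }
                rows.append(current)
            elif child.tag == "leg" and current is not None:
                current["leg_mode"] = child.get("mode")
                route = child.find("route")  # first direct <route> child, or None
                if route is not None:
                    current["route_distance"] = route.get("distance")
    return rows


rows = extract_rows(ET.fromstring(SAMPLE))
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame(rows)` should give the same three columns as above, with `None` for the last activity of each plan. The distances are still strings at this point, so something like `pd.to_numeric` would be needed before doing arithmetic on them.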