How to extract XML attributes from a file using Python ElementTree in a Glue job
Given the following XML:
<POR Cli="1" Name="Paul Smith" Street="SN" >
<Sal Val="1000" Gan="M">
<Fam dep="1" dog="2" />
</Sal>
</POR>
<POR Cli="2" Name="Mary Smith" Street="SN" >
<Sal Val="2000" Gan="S">
<Fam dep="0" dog="1" />
</Sal>
</POR>
I want to extract the attributes from the XML into these columns:
cli;name;Street;val;gan;dep;dog
so that, after writing to AWS S3, the file looks like this:
cli;name;Street;val;gan;dep;dog
1;Paul Smith;SN;1000;M;1;2
2;Mary Smith;SN;2000;S;0;1
You can use BeautifulSoup and the csv module:
from bs4 import BeautifulSoup
import csv, sys
data = '''\
<POR Cli="1" Name="Paul Smith" Street="SN" >
<Sal Val="1000" Gan="M">
<Fam dep="1" dog="2" />
</Sal>
</POR>
<POR Cli="2" Name="Mary Smith" Street="SN" >
<Sal Val="2000" Gan="S">
<Fam dep="0" dog="1" />
</Sal>
</POR>
'''
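# html.parser lowercases tag and attribute names, so they match the fieldnames below
soup = BeautifulSoup(data, 'html.parser')
writer = csv.DictWriter(
    sys.stdout,
    fieldnames=['cli', 'name', 'street', 'val', 'gan', 'dep', 'dog'])
writer.writeheader()
for por in soup.find_all('por'):
    # Merge the attributes of <POR>, <Sal> and <Fam> into one row
    d = dict(por.attrs)
    d.update(por.sal.attrs)
    d.update(por.sal.fam.attrs)
    writer.writerow(d)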
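Since the question asks about ElementTree specifically, here is a minimal sketch of the same extraction using only the standard library. It assumes the file contains bare `<POR>` elements with no XML declaration and no common parent, so the text is wrapped in a hypothetical `<root>` element before parsing. The helper name `por_to_csv` and the `;` delimiter are illustrative choices based on the desired output in the question, not part of the original answer.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample data from the question; the <POR> elements have no common parent.
data = '''\
<POR Cli="1" Name="Paul Smith" Street="SN" >
<Sal Val="1000" Gan="M">
<Fam dep="1" dog="2" />
</Sal>
</POR>
<POR Cli="2" Name="Mary Smith" Street="SN" >
<Sal Val="2000" Gan="S">
<Fam dep="0" dog="1" />
</Sal>
</POR>
'''


def por_to_csv(xml_text):
    """Return semicolon-separated CSV text built from <POR> attributes."""
    # Wrap in a hypothetical <root> so ElementTree sees a single document element.
    root = ET.fromstring('<root>' + xml_text + '</root>')
    out = io.StringIO()
    writer = csv.writer(out, delimiter=';', lineterminator='\n')
    writer.writerow(['cli', 'name', 'street', 'val', 'gan', 'dep', 'dog'])
    for por in root.iter('POR'):
        # ElementTree is case-sensitive: use the names exactly as in the XML.
        sal = por.find('Sal')
        fam = sal.find('Fam')
        writer.writerow([
            por.get('Cli'), por.get('Name'), por.get('Street'),
            sal.get('Val'), sal.get('Gan'),
            fam.get('dep'), fam.get('dog'),
        ])
    return out.getvalue()


csv_text = por_to_csv(data)
print(csv_text)
```

In a Glue job, the returned string could then be uploaded to S3, for example with boto3's `put_object` and your own bucket and key; that step is left out here because it depends on the job's configuration.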