How can I extract specific XML tags from a local XML file using Python?
I'm pretty new to working with XML, Python, and scraping data, so bear with me please: I've got an XML file with my notes saved from Evernote. I have been able to load BeautifulSoup and lxml into my Python environment. I have also been able to load the XML file and print it.
Here's my code up until the print:
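from bs4 import BeautifulSoup
from xml.dom.minidom import parseString

# Read the whole file, then parse the string into a DOM document
with open('myNotes.xml', 'r') as file:
    data = file.read()
dom = parseString(data)
# toxml() is a method of the parsed document, not of the raw string
print(dom.toxml())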
I didn't include the actual printed file, as it contains lots of Base64-encoded data.
What I am trying to accomplish is to extract selected XML tags and print them to a new file... help!
This is how to use BeautifulSoup to print the XML:
from bs4 import BeautifulSoup
# The 'xml' parser is provided by lxml, which must be installed
with open('myNotes.xml', 'r') as f:
    soup = BeautifulSoup(f, 'xml')
print(soup.prettify())
And to write it to a file:
with open("file.txt", "w") as f:
    f.write(soup.prettify())
Now, to extract all tags of a certain type into a list:
# Extract all of the <a> tags (substitute the tag name you need):
tags = soup.find_all('a')
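To get from that list to the new file the question asks for, here is a minimal sketch. It swaps in the standard library's xml.etree.ElementTree instead of BeautifulSoup so that it runs without extra packages; the same idea applies to the list returned by find_all. The helper name extract_tags, the sample document, the tag names title and content, and the output file name extracted_tags.xml are all assumptions made for illustration, not taken from the actual notes file.

```python
import xml.etree.ElementTree as ET


def extract_tags(xml_text, tag_names):
    """Return the serialized form of every element whose tag is in tag_names."""
    root = ET.fromstring(xml_text)
    wanted = set(tag_names)
    # root.iter() walks the root and all of its descendants in document order
    return [
        ET.tostring(elem, encoding="unicode").strip()
        for elem in root.iter()
        if elem.tag in wanted
    ]


# Hypothetical sample standing in for the contents of myNotes.xml
SAMPLE = """<en-export>
  <note><title>First note</title><content>hello</content><resource>QUJD</resource></note>
  <note><title>Second note</title><content>world</content></note>
</en-export>"""

# Keep only the tags of interest; the Base64 <resource> data is skipped
extracted = extract_tags(SAMPLE, ["title", "content"])

# Write one extracted element per line to a new file
with open("extracted_tags.xml", "w", encoding="utf-8") as out:
    out.write("\n".join(extracted))
```

For a real file, read its contents first (for example with open('myNotes.xml', encoding='utf-8') and read()) and pass the resulting string in place of SAMPLE. Note that the output is a sequence of XML fragments with no single root element, so wrap them in a root tag of your own if the new file has to be parsed as XML again.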