
How can I extract specific XML tags from a local XML file using Python?

I'm pretty new to working with XML, Python, and scraping data, so please bear with me. I've got an XML file with my notes saved from Evernote. I have been able to load BeautifulSoup and lxml into my Python environment. I have also been able to load the XML file and print it.

Here's my code up to the print step:

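from bs4 import BeautifulSoup
from xml.dom.minidom import parseString

# Read the whole file into a string, then parse it into a DOM
with open('myNotes.xml', 'r') as file:
    data = file.read()
dom = parseString(data)
# toxml() is a method of the parsed DOM, not of the raw string
print(dom.toxml())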

I didn't include the actual printed output because it contains a lot of base64-encoded data.

What I am trying to accomplish is to extract selected XML tags and write them to a new file... help!

This is how to use BeautifulSoup to print the XML:

from bs4 import BeautifulSoup

# Pass 'xml' so the lxml-backed XML parser is used rather than an HTML parser
with open('myNotes.xml', 'r') as f:
    soup = BeautifulSoup(f, 'xml')
print(soup.prettify())

And to write it to a file:

with open("file.txt", "w") as f:
    f.write(soup.prettify())

Now, to extract all tags of a certain type into a list:

# Extract all of the <a> tags:    
tags = soup.find_all('a')
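
Since the goal in the question is to copy only selected tags into a new file, here is a minimal sketch of that last step. It swaps in the standard library's xml.etree.ElementTree in place of BeautifulSoup so that it runs without third-party packages; with BeautifulSoup the equivalent lookup would be soup.find_all('title'). The sample document, the tag names (notes, note, title, content), and the helper names extract_tags and write_tags are all made up for illustration; the real export may use different tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for myNotes.xml; the real tag names may differ.
SAMPLE_XML = """<notes>
  <note><title>First</title><content>alpha</content></note>
  <note><title>Second</title><content>beta</content></note>
</notes>"""


def extract_tags(xml_text, tag_name):
    """Return every element named tag_name, at any depth, serialized as XML."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree (root included) and yields matching elements
    return [
        ET.tostring(el, encoding="unicode").strip()
        for el in root.iter(tag_name)
    ]


def write_tags(xml_text, tag_name, out_path):
    """Write each matching element on its own line; return how many were written."""
    tags = extract_tags(xml_text, tag_name)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(tags))
    return len(tags)


titles = extract_tags(SAMPLE_XML, "title")
print(titles)  # ['<title>First</title>', '<title>Second</title>']
```

For a real file, one would read it first (for example with open('myNotes.xml', encoding='utf-8')) and pass the text to write_tags along with the wanted tag name and an output path. Skipping the tags that hold the base64 data should keep the new file small.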
