
How to extract text from an XML error message in Python

I want to determine whether the response I parsed with BeautifulSoup looks like this:

Out[32]: 
<?xml version="1.0" encoding="utf-8"?>
<boardgames termsofuse="https://boardgamegeek.com/xmlapi/termsofuse">
<boardgame>
<error message="Item not found"/>
</boardgame>
</boardgames>

I can extract the inner part of the previous output using:

soup.find_all('boardgame')[0], which produces the following:

Out[24]: 
<boardgame>
<error message="Item not found"/>
</boardgame>

I feel like this should be easy. I've tried the following, but I still can't determine whether error message="Item not found" is in there. What am I missing?

soup.findAll('boardgame')[0].getText()
Out[26]: '\n\n'

getText() returns only the text between tags, and the error tag contains none, so all you get back is whitespace. The value you want is stored in the message attribute: find the error tag first, then read that attribute.

from bs4 import BeautifulSoup

data='''<?xml version="1.0" encoding="utf-8"?>
<boardgames termsofuse="https://boardgamegeek.com/xmlapi/termsofuse">
<boardgame>
<error message="Item not found"/>
</boardgame>
</boardgames>'''

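# html.parser treats the self-closing <error/> tag as an empty element
soup = BeautifulSoup(data, 'html.parser')
# The text lives in the "message" attribute, not in the tag body
message = soup.find('boardgame').find('error')['message']
print(message)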

Output:

Item not found
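Since the question is about detecting whether the error is present at all, here is a minimal sketch that returns None when there is no error tag instead of raising. The helper name get_error_message and the second sample document (found) are made up for illustration; a real successful response will look different.

```python
from bs4 import BeautifulSoup


def get_error_message(xml_text):
    """Return the error message in the response, or None if there is no error tag."""
    soup = BeautifulSoup(xml_text, 'html.parser')
    error = soup.find('error')
    if error is None:
        return None
    # .get() returns None instead of raising if the attribute is missing
    return error.get('message')


not_found = '''<?xml version="1.0" encoding="utf-8"?>
<boardgames termsofuse="https://boardgamegeek.com/xmlapi/termsofuse">
<boardgame>
<error message="Item not found"/>
</boardgame>
</boardgames>'''

# Hypothetical non-error document, used only to exercise the None branch
found = '<boardgames><boardgame><name>Example</name></boardgame></boardgames>'

print(get_error_message(not_found))  # Item not found
print(get_error_message(found))      # None
```

With this, the check becomes a plain `if get_error_message(xml_text) is not None:`.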


Alternatively, you can use a CSS selector:

from bs4 import BeautifulSoup

data='''<?xml version="1.0" encoding="utf-8"?>
<boardgames termsofuse="https://boardgamegeek.com/xmlapi/termsofuse">
<boardgame>
<error message="Item not found"/>
</boardgame>
</boardgames>'''

soup = BeautifulSoup(data, 'html.parser')

# 'boardgame error' matches an <error> tag nested anywhere inside <boardgame>
message = soup.select_one('boardgame error')['message']
print(message)

Output:

Item not found
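If a response could hold several boardgame entries, the selector approach extends naturally. The sketch below is only an illustration: the two-entry document and the helper name collect_error_messages are assumptions, not something taken from the API. It uses select() with an attribute selector so that only error tags that actually carry a message attribute are matched.

```python
from bs4 import BeautifulSoup


def collect_error_messages(xml_text):
    """Return the message of every <error> that is a direct child of a <boardgame>."""
    soup = BeautifulSoup(xml_text, 'html.parser')
    # 'error[message]' only matches error tags that have a message attribute
    return [tag['message'] for tag in soup.select('boardgame > error[message]')]


# Hypothetical response with one failed and one successful entry
mixed = '''<boardgames>
<boardgame>
<error message="Item not found"/>
</boardgame>
<boardgame>
<name>Example</name>
</boardgame>
</boardgames>'''

print(collect_error_messages(mixed))  # ['Item not found']
```

An empty list means no error was reported.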
