Beautiful Soup does not return HTML

Question

I use the script below to collect all tags from an HTML page, but instead of an HTML response I get something else:

import urllib.request
from bs4 import BeautifulSoup
loginurl= 'https://172.56.66.77'
fhand = urllib.request.urlopen(loginurl).read()
soup = BeautifulSoup(fhand,'html.parser')
print(soup)

I am trying to extract one particular value from the page, but Beautiful Soup does not give me HTML. Instead I get the following response:

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="xslt.cgi"?>
<iconmenu>
<title>Geräteinformationen</title><prompt>Geräteinformationen anzhhas</prompt>
<menuitem/><iconindex>-1</iconindex><name>MAC-Adresse :  76238823354</name><url></url>
<menuitem/><iconindex>-1</iconindex><name>Host-Name : SEP76238823354</name><url></url>
</iconmenu>

I cannot filter the data because the response contains no HTML tags.

Please help me get the second value, SEP76238823354, from the response.

Answer 1

It turns out that you just need to remove the second argument, 'html.parser', from the constructor call:

from bs4 import BeautifulSoup
xml_doc = """<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="xslt.cgi"?>
<iconmenu>
<title>Geräteinformationen</title><prompt>Geräteinformationen anzhhas</prompt>
<menuitem/><iconindex>-1</iconindex><name>MAC-Adresse :  76238823354</name><url></url>
<menuitem/><iconindex>-1</iconindex><name>Host-Name : SEP76238823354</name><url></url>
</iconmenu>"""
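# With no parser argument, Beautiful Soup picks a parser itself
soup = BeautifulSoup(xml_doc)
# find_all returns the <name> elements in document order; index 1 is the second one
print(soup.find_all("name")[1])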
# -> <name>Host-Name : SEP76238823354</name>

Answer 2

Select the element you need, here the one whose text contains Host-Name, split() its text on the delimiter, and take the last part:

...
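# The 'xml' parser requires the lxml package to be installed
soup = BeautifulSoup(fhand, 'xml')
# Pick the <name> whose text contains "Host-Name" and keep what follows ': '
soup.select_one('name:-soup-contains("Host-Name")').text.split(': ')[-1]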

Output:

SEP76238823354
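
Since the response shown in the question is XML rather than HTML, the same value can probably also be pulled out with the standard library alone. The following is only a minimal sketch using xml.etree.ElementTree instead of Beautiful Soup; the helper name extract_field and the XML_DOC constant are made up for illustration, and the sample document is copied from the question rather than fetched from the device.

```python
import xml.etree.ElementTree as ET

# Sample response copied from the question. Against the real device you
# would pass the bytes returned by urllib.request.urlopen(loginurl).read().
XML_DOC = """<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="xslt.cgi"?>
<iconmenu>
<title>Geräteinformationen</title><prompt>Geräteinformationen anzhhas</prompt>
<menuitem/><iconindex>-1</iconindex><name>MAC-Adresse :  76238823354</name><url></url>
<menuitem/><iconindex>-1</iconindex><name>Host-Name : SEP76238823354</name><url></url>
</iconmenu>""".encode("utf-8")


def extract_field(xml_bytes, label):
    """Hypothetical helper: return the value after 'label :' in the
    first matching <name> element, or None if no element matches."""
    root = ET.fromstring(xml_bytes)
    for name in root.iter("name"):
        # Split "Host-Name : SEP..." at the first colon only.
        key, sep, value = (name.text or "").partition(":")
        if sep and key.strip() == label:
            return value.strip()
    return None


host_name = extract_field(XML_DOC, "Host-Name")
print(host_name)
```

Matching on the label rather than on position (index 1) should keep working if the device adds or reorders entries, at the cost of depending on the exact label text.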

Answer 1: 2022-01-05 08:04:49
Answer 2: 2022-01-05 08:23:39
