
python beautifulsoup: lxml html.parser

Original post: 2016-06-20 23:34:46 · Tags: python, beautifulsoup, lxml, html-parser

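I have to use BeautifulSoup, but I don't know which parser I should use. I'm hesitating between lxml and html.parser, or why not both. How can I tell whether a web page is suitable for lxml? How can I tell whether it is suitable for html.parser? Thanks a lot.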

2 Answers

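There is no silver bullet. Different HTML parsers behave differently, and you should choose the one that works for your particular page. "Works" here basically means that you can get the data you need.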
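One way to apply that definition of "works" is to try each parser in turn and keep the first one whose tree yields the data you are after. The following is only a minimal sketch: `first_working_parser`, `get_price`, and the `SAMPLE` markup are hypothetical names invented for illustration, and the parser order is an arbitrary assumption.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def first_working_parser(markup, extract, parsers=("lxml", "html.parser", "html5lib")):
    """Return (parser_name, data) for the first parser whose tree yields data.

    `extract` is a caller-supplied function that takes a BeautifulSoup object
    and returns the wanted data, or a falsy value when it cannot be found.
    Returns (None, None) if no installed parser produced the data.
    """
    for name in parsers:
        try:
            soup = BeautifulSoup(markup, name)
        except FeatureNotFound:
            continue  # this parser is not installed; try the next one
        data = extract(soup)
        if data:
            return name, data
    return None, None


# Hypothetical sample page and extraction function, for illustration only.
SAMPLE = "<html><body><div id='price'>42</div></body></html>"


def get_price(soup):
    tag = soup.find(id="price")
    return tag.get_text(strip=True) if tag else None


if __name__ == "__main__":
    print(first_working_parser(SAMPLE, get_price))
```

On a real page, `extract` would hold whatever lookup you actually need; a parser "works" for that page only if the lookup returns the right data, so it is worth checking the result by eye rather than just checking that it is non-empty.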

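The lxml parser is generally faster, and html5lib is the most lenient parser. This difference matters if you are parsing broken or malformed HTML. html.parser is built in, which helps you avoid an extra dependency if that is a concern. The Beautiful Soup documentation has a relevant table highlighting the differences.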
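To see how the parsers differ on malformed input, you can feed the same broken markup to each one and compare the resulting trees. This is a rough sketch, not a definitive test: `parse_with_each` is a hypothetical helper name, and the snippet `<a></p>` is just a small invalid fragment chosen for illustration. Parsers that are not installed are reported as `None`.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# A deliberately invalid fragment: a closing </p> with no opening <p>.
BROKEN_HTML = "<a></p>"


def parse_with_each(markup, parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser_name: serialized tree}, with None for missing parsers."""
    results = {}
    for name in parsers:
        try:
            results[name] = str(BeautifulSoup(markup, name))
        except FeatureNotFound:
            results[name] = None  # parser library is not installed
    return results


if __name__ == "__main__":
    for name, tree in parse_with_each(BROKEN_HTML).items():
        shown = tree if tree is not None else "(not installed)"
        print(f"{name:12} {shown}")
```

If the trees differ for a snippet taken from your page, the markup is probably malformed there, and the choice of parser will affect what your lookups return.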

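I learned this the hard way. It was killing me. I just couldn't figure out why the tag I wanted contained things that were not actually in that tag. It turned out that html.parser did not work properly on that site. After hours of headache, I tried switching to the lxml parser on a whim, and lo and behold... the stuff that didn't belong was gone!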


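Notice: Technical posts on this site are published under the CC BY-SA 4.0 license. If you republish them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.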


粤ICP备18138465号 © 2020-2024 STACKOOM.COM