
Python html missing when scraping website

Original post 2020-07-23 17:15:43 7 2 python/ html/ web

I am trying to scrape a website using code like this:

import requests
# requests needs a full URL including the scheme; "myurl.com" is a placeholder
requests.get("https://myurl.com").content

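However, some important elements of the page are missing from the result. How can I get the entire page content with Python 3, the same content I see in the inspector in Firefox or another browser?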

2 answers

Why not try Scrapy, Selenium, or even Splash? They are powerful scraping tools.
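A minimal sketch of the Selenium route, assuming the missing elements are added by JavaScript after the page loads (the question does not confirm this). The helper names `missing_ids` and `fetch_rendered_html` are made up for illustration, and the sketch assumes Selenium 4 with Firefox installed. `missing_ids` checks whether the HTML returned by requests already contains the elements you expect. If it does not, `fetch_rendered_html` loads the page in a headless browser and returns the DOM as the inspector would show it.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects every id attribute that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def missing_ids(html, expected_ids):
    """Return the expected ids that are absent from the given HTML text."""
    collector = IdCollector()
    collector.feed(html)
    return [i for i in expected_ids if i not in collector.ids]


def fetch_rendered_html(url):
    """Load url in headless Firefox and return the DOM after scripts have run."""
    # Imported here so the rest of the module works without Selenium installed.
    from selenium import webdriver  # third-party: pip install selenium

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading fails
```

For example, `missing_ids(requests.get(url).text, ["prices"])` returning `["prices"]` suggests that the element is created in the browser, so `fetch_rendered_html(url)` is worth trying. Pages that load data some time after the initial load may also need an explicit wait before reading `page_source`.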

For this you can use Beautiful Soup, a Python library for scraping. Just import it at the top:

from bs4 import BeautifulSoup

Then add these lines to your code:

# requests needs the scheme in the URL; "myurl.com" is a placeholder
data = requests.get("https://myurl.com").text
soup = BeautifulSoup(data, 'html.parser')
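Once `soup` exists, elements can be looked up by id or tag name. The sketch below uses an inline HTML string as a stand-in for the downloaded text so it runs offline. The markup and the ids `title` and `prices` are invented for illustration. Note that Beautiful Soup only parses what the server sent, so a container that a script would fill in the browser stays empty here.

```python
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

# Inline stand-in for requests.get(...).text (hypothetical markup).
data = """
<html><body>
  <h1 id="title">Products</h1>
  <a href="/item/1">First</a>
  <a href="/item/2">Second</a>
  <div id="prices"></div>
</body></html>
"""

soup = BeautifulSoup(data, "html.parser")

# Look up a single element by its id and read its text.
title = soup.find(id="title").get_text(strip=True)

# Collect the href attribute of every link on the page.
links = [a["href"] for a in soup.find_all("a")]

# This container has no content in the raw HTML, so its text is empty.
prices = soup.find(id="prices").get_text(strip=True)

print(title, links, repr(prices))
```

If a lookup like the last one comes back empty while the browser inspector shows content, parsing alone will not recover it, and a tool that executes JavaScript is needed.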

Related questions

Missing HTML Elements when scraping website. Python
Missing data when scraping website using loop
Missing values in certain cells when scraping data from website using Beautifulsoup in Python and placing it in Pandas DataFrame
Scraping HTML data from website in Python
Getting max pagenumber when scraping website with python
Parsing HTML on Website for Scraping
Scraping text from HTML5 website using Python
Website scraping with Python, how do I know where to reference in the html?
Scraping data from multiple html tables within one website in python
html scraping using python topboxoffice list from imdb website


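Notice: The technical posts on this site follow the CC BY-SA 4.0 license. If you repost them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.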
