Unable to scrape data from a website chart using Python

Question

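I am trying to scrape (parse) the data shown in the chart at https://www.poder360.com.br/agregador-de-pesquisas/.

I have already tried requests, requests-html, and BeautifulSoup, but I cannot parse the whole site. Even when I right-click and view the page source, it does not show the table containing the data, whose id is "方法表".

Code from my latest attempt: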

from requests_html import HTMLSession

def get_data(url_path):
    # Open a session and fetch the page
    session = HTMLSession()
    r = session.get(url_path)

    # Render the page's JavaScript in headless Chromium, allowing time for the chart to load
    r.html.render(wait=8, sleep=8)

    return r.html

url_path = 'https://www.poder360.com.br/agregador-de-pesquisas'
content = get_data(url_path)
print(content.html)

I also tried the following code:

import requests
from bs4 import BeautifulSoup

url = 'https://www.poder360.com.br/agregador-de-pesquisas'

# Fetch the static HTML only; no JavaScript is executed here
r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

print(soup)

Answer 1

I think this is because the page needs JavaScript to run in order to render fully and draw the chart, which does not work with HTMLSession or requests.
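One way to get the JavaScript to run is to load the page in a real browser and read the DOM once the chart exists. The sketch below uses Selenium with headless Chrome; it is only a starting point, not a tested solution for this site. The function name fetch_rendered_html and the default selector "svg circle" are assumptions (the selector presumes the chart is drawn as SVG in the main document rather than inside an iframe, which I have not verified).

```python
def fetch_rendered_html(url, css_selector="svg circle", timeout=30):
    """Load `url` in headless Chrome and return the HTML after JavaScript has run.

    `css_selector` is an assumption about what the finished chart contains;
    adjust it to whatever you see in the browser's inspector.
    """
    # Imported here so the function can be defined even if Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run Chrome without opening a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until at least one matching element is present in the live DOM,
        # or raise TimeoutException after `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        # page_source reflects the DOM after scripts have run, unlike "view source".
        return driver.page_source
    finally:
        driver.quit()
```

You would call it as fetch_rendered_html('https://www.poder360.com.br/agregador-de-pesquisas') and pass the result to BeautifulSoup as before. If the wait times out, the chart is probably under a different selector or inside an iframe, and you would need to switch into that frame first.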

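If you click "Inspect" in your browser and look at the live DOM rather than the page source, you can search for "circle" and find the chart's data points.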
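Once you have the rendered HTML as a string, pulling out those circle elements can be done with the standard library alone. This is a minimal sketch: the SAMPLE markup is made up for illustration and is not copied from the site, and the names CircleCollector and extract_circles are mine.

```python
from html.parser import HTMLParser


class CircleCollector(HTMLParser):
    """Collect the attributes of every <circle> element in an HTML/SVG string."""

    def __init__(self):
        super().__init__()
        self.circles = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <circle ... />.
        if tag == "circle":
            self.circles.append(dict(attrs))


def extract_circles(rendered_html):
    """Return a list of attribute dicts, one per <circle> found."""
    parser = CircleCollector()
    parser.feed(rendered_html)
    parser.close()
    return parser.circles


# Hypothetical markup standing in for the rendered chart; the real page's
# attributes and nesting may differ.
SAMPLE = """
<svg>
  <circle cx="10" cy="80" r="3"></circle>
  <circle cx="40.5" cy="62" r="3"/>
</svg>
"""

# Convert the position attributes to floats for further processing.
points = [(float(c["cx"]), float(c["cy"])) for c in extract_circles(SAMPLE)]
print(points)
```

Keep in mind that cx and cy are pixel positions inside the SVG, not the plotted values themselves. To recover the numbers you would still need the axis scale, or some other attribute or tooltip on the element that carries the value, if the page provides one.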

Maybe this will help: Using python Requests with javascript pages

Solution 1
0 2022-09-20 20:02:04