Beautiful Soup and extracting a div by ID

Question

I am trying to extract the number of "confirmados" (confirmed) COVID-19 cases from this page: https://coronavirus.gob.mx/datos/

This is my line of code: table_div = soup.find('div', {"id": "gsPosDIV"}), but it is not working. I am a complete beginner at web scraping. What is the correct way to extract this data?

This is the HTML: <div id="gsPosDIV" class="h5 mb-0 font-weight-bold text-gray-800">47,144</div>

Answer 1

The data is loaded dynamically via JavaScript, so the value is not in the HTML that Beautiful Soup parses. You can reproduce the JavaScript request with the requests module and then parse the response with the re module:

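import re
import requests

# Form fields sent with the POST request for the "Confirmados" figures
data = {
    'sPatType': 'Confirmados',
    'cve': '000',
    'nom': 'Nacional',
}

url = 'https://coronavirus.gob.mx/datos/Overview/info/getInfo.php'

# The response body contains JavaScript that fills in the page's elements
raw_data = requests.post(url, data=data).text

# Capture the digits assigned to the gsPosDIV element's innerHTML
positivos = re.search(r'document\.getElementById\("gsPosDIV"\)\.innerHTML = \((\d+)', raw_data).group(1)
print(positivos)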

This prints the current number of confirmed cases.
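The regex step can be tried offline, without touching the live endpoint. The following is a minimal sketch, not part of the original answer: SAMPLE_RESPONSE is a made-up string (only the `innerHTML = (<digits>` shape is implied by the regex above), and extract_count and the gsOtherDIV id are hypothetical names introduced for illustration. Unlike the one-liner, it returns None instead of raising AttributeError when the pattern is not found, which may help if the site changes its response format.

```python
import re
from typing import Optional

# Hypothetical sample of the JavaScript the endpoint might return.
# Only the 'innerHTML = (<digits>' shape is implied by the answer's regex;
# the rest of each line is invented for this example.
SAMPLE_RESPONSE = (
    'document.getElementById("gsPosDIV").innerHTML = (47144).toString();\n'
    'document.getElementById("gsOtherDIV").innerHTML = (123).toString();\n'
)


def extract_count(raw_js: str, element_id: str) -> Optional[int]:
    """Return the integer assigned to element_id's innerHTML, or None if absent."""
    pattern = (
        r'document\.getElementById\("' + re.escape(element_id) + r'"\)'
        r'\.innerHTML = \((\d+)'
    )
    match = re.search(pattern, raw_js)
    # re.search returns None when nothing matches, so guard before .group()
    return int(match.group(1)) if match else None


print(extract_count(SAMPLE_RESPONSE, "gsPosDIV"))    # 47144
print(extract_count(SAMPLE_RESPONSE, "missingDIV"))  # None
```

With the real response, the call would be extract_count(raw_data, "gsPosDIV"), assuming the endpoint still returns JavaScript in the shape the answer's regex expects.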

Answer 1: 0 accepted, 2020-05-17 21:57:51
