How to use BeautifulSoup and pandas to scrape data from a dataframe with a date filter?

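I'm new to Python and I'm trying to scrape data from a website. The problem is that the page has a date filter, and I'm struggling to work out how to extract data for multiple dates. Are there any good resources on this, or does anyone have advice on how to do it? I can't seem to find what I need online.

My code pulls whatever is displayed for today: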

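import requests
import pandas as pd
from bs4 import BeautifulSoup

res = requests.get("https://www.inmo.ie/Trolley_Ward_Watch")
soup = BeautifulSoup(res.content, 'lxml')
table = soup.find_all('table')[0]  # first table on the page
df = pd.read_html(str(table))      # read_html returns a list of DataFrames
print(df[0].to_json(orient='records'))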

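The data is loaded via JavaScript. However, you can simulate the AJAX request with the requests library, for example (change the DateTrolley parameter to the date you want):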

import requests
import pandas as pd
from bs4 import BeautifulSoup

url = 'https://www.inmo.ie/Trolley_Ward_Watch'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
form_url = 'https://www.inmo.ie' + soup.form['action']

data = {'DateTrolley': '01/05/2020'}  # change to e.g. '05/05/2020' to get another date

soup = BeautifulSoup(requests.post(form_url, data=data).content, 'html.parser')

df = pd.read_html(str(soup.table))
print(df)

Prints:

[          Date                                 Hospital   Region  Trolley Total  Ward Total  Total
0   01/05/2020                        Beaumont Hospital  Eastern              0           0      0
1   01/05/2020        Connolly Hospital, Blanchardstown  Eastern              0           0      0
2   01/05/2020        Connolly Hospital, Blanchardstown  Eastern              0           0      0
3   01/05/2020  Mater Misericordiae University Hospital  Eastern              0           0      0
4   01/05/2020                    Naas General Hospital  Eastern              0           0      0
5   01/05/2020                       St James' Hospital  Eastern              2           0      2
6   01/05/2020         St Vincent's University Hospital  Eastern              0           0      0
7   01/05/2020             Tallaght University Hospital  Eastern              1           0      1
8   01/05/2020                  Bantry General Hospital  Country              0           0      0
9   01/05/2020                   Cavan General Hospital  Country              2           0      2
10  01/05/2020                 Cork University Hospital  Country              2           0      2
11  01/05/2020          Letterkenny University Hospital  Country              0           0      0
12  01/05/2020                 Mayo University Hospital  Country              0           0      0
13  01/05/2020          Mercy University Hospital, Cork  Country              0           0      0
14  01/05/2020     Mid Western Regional Hospital, Ennis  Country              0           0      0
15  01/05/2020     Midland Regional Hospital, Mullingar  Country              1           0      1
16  01/05/2020    Midland Regional Hospital, Portlaoise  Country              0           0      0
17  01/05/2020     Midland Regional Hospital, Tullamore  Country              0           0      0
18  01/05/2020                  Nenagh General Hospital  Country              0           1      1
19  01/05/2020   Our Lady of Lourdes Hospital, Drogheda  Country              0           0      0
20  01/05/2020               Our Lady's Hospital, Navan  Country              0           0      0
21  01/05/2020          Portiuncula University Hospital  Country              0           0      0
22  01/05/2020                Sligo University Hospital  Country              0           0      0
23  01/05/2020         South Tipperary General Hospital  Country              0           0      0
24  01/05/2020              St Lukes Hospital, Kilkenny  Country              0           0      0
25  01/05/2020       University College Hospital Galway  Country              0           0      0
26  01/05/2020                University Hospital Kerry  Country              0           0      0
27  01/05/2020            University Hospital Waterford  Country              0           0      0
28  01/05/2020            University Hospital, Limerick  Country              8           0      8
29  01/05/2020                 Wexford General Hospital  Country              1           0      1]
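
To pull several dates, as the question asks, the same POST request can be repeated once per day and the resulting tables stacked. The following is only a sketch built on the answer's approach: the helper names (date_strings, parse_table, scrape_range) are hypothetical, and it assumes the DateTrolley field expects DD/MM/YYYY, which the example value '01/05/2020' suggests but does not confirm.

```python
from datetime import date, timedelta
from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

BASE_URL = 'https://www.inmo.ie'


def date_strings(start, end, fmt='%d/%m/%Y'):
    """Yield one formatted string per day from start to end, inclusive.

    The DD/MM/YYYY default is an assumption about what the form expects.
    """
    day = start
    while day <= end:
        yield day.strftime(fmt)
        day += timedelta(days=1)


def parse_table(html):
    """Return the first <table> found in html as a DataFrame."""
    soup = BeautifulSoup(html, 'html.parser')
    # read_html returns a list of DataFrames; take the only one.
    return pd.read_html(StringIO(str(soup.table)))[0]


def scrape_range(start, end):
    """POST the date form once per day and concatenate the tables."""
    # Look up the form's target URL once, as in the answer above.
    page = requests.get(BASE_URL + '/Trolley_Ward_Watch')
    soup = BeautifulSoup(page.content, 'html.parser')
    form_url = BASE_URL + soup.form['action']

    frames = []
    for day in date_strings(start, end):
        res = requests.post(form_url, data={'DateTrolley': day})
        frames.append(parse_table(res.content))
    return pd.concat(frames, ignore_index=True)
```

Calling scrape_range(date(2020, 5, 1), date(2020, 5, 5)) should then return a single DataFrame covering those five days, which can be passed to to_json(orient='records') as in the question. For long ranges it may be worth adding a short pause between requests so as not to overload the site.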
