Scraping data from a website that refreshes every 10 minutes in Python

I am very new to web scraping and to Python in general. I am working on a project that requires me to scrape data from a website that refreshes its data every 10 minutes. I was able to scrape the data for the current 10-minute window, but once the data refreshes, the previous data is no longer valid. I need help with 3 things:

  1. There is an input timestamp at the top of the website. How can I change the time in that input so that I only fetch data for that particular time period?

  2. My current code is:

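    from datetime import datetime
    from pathlib import Path
    
    import pandas as pd
    
    URL1 = "URL.com"  # placeholder; the real URL cannot be shared
    
    # read_html returns a list with one DataFrame per <table> on the page
    tables1 = pd.read_html(URL1)
    print("There are:", len(tables1), "tables")
    
    # The required table is the one at index 8
    PartUsage = tables1[8]
    
    # The table has no timestamp of its own, so tag every row with the scrape time
    now = datetime.now()
    PartUsage["Date"] = now
    PartUsage.set_index("Date", inplace=True)
    
    # Write the snapshot to CSV, creating the parent folder if needed
    filepath = Path('Path.csv')
    filepath.parent.mkdir(parents=True, exist_ok=True)
    PartUsage.to_csv(filepath)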

I added a timestamp because the required table does not contain one. How can I link that timestamp so it is used as the input?

This is company-specific data, so I cannot provide the link or any further details. Any help will be appreciated. Thank you.

You can use cron for this. It is an application that runs scripts on a specific schedule. For convenience, you can also deploy it in an auto-running Docker container. You can find more about cron here: How do I get a Cron like scheduler in Python?
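
If setting up cron is not an option, the same schedule can be approximated from inside Python. The following is only a minimal sketch using the standard library: `next_run_time`, `run_every`, and `scrape_and_save` are hypothetical names, the interval is assumed to divide 60 evenly, and `scrape_and_save` is a placeholder for the `read_html` / `to_csv` steps from the question.

```python
import time
from datetime import datetime, timedelta


def next_run_time(now: datetime, interval_minutes: int = 10) -> datetime:
    """Return the next wall-clock time that falls on an interval boundary."""
    # Round down to the start of the current interval, then step one interval forward.
    # Assumes interval_minutes divides 60 evenly (10, 15, 30, ...).
    floored = now.replace(
        minute=now.minute - now.minute % interval_minutes,
        second=0,
        microsecond=0,
    )
    return floored + timedelta(minutes=interval_minutes)


def run_every(job, interval_minutes: int = 10, max_runs=None) -> None:
    """Call job(timestamp) at each interval boundary; loops forever if max_runs is None."""
    runs = 0
    while max_runs is None or runs < max_runs:
        target = next_run_time(datetime.now(), interval_minutes)
        # Sleep until the boundary; never pass a negative value to sleep().
        time.sleep(max(0.0, (target - datetime.now()).total_seconds()))
        try:
            job(target)
        except Exception as exc:
            # One failed scrape should not stop the following ones.
            print(f"{target:%Y-%m-%d %H:%M} scrape failed: {exc}")
        runs += 1


def scrape_and_save(timestamp: datetime) -> None:
    """Hypothetical placeholder: put the scraping and CSV-writing code here."""
    print(f"scraping snapshot for {timestamp:%Y-%m-%d %H:%M}")
```

Calling `run_every(scrape_and_save)` starts the loop and passes each run the boundary time, which can be stored in the `Date` column instead of `datetime.now()`. Since the script in the question overwrites its CSV on every run, each snapshot would need its own file name, or `to_csv` would need `mode="a"`, to keep earlier data. With cron itself, the equivalent schedule is a crontab line starting with `*/10 * * * *` followed by the command that runs the script.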
