简体   繁体   English

Python:限制代码运行一个小时

[英]Python: Restrict the code to be run for an hour

I have written a scraper that does html scraping and then use API to get some data, since its a very lengthy code I haven't put it here. 我写了一个刮板,它做了html抓取,然后使用API​​来获取一些数据,因为它是一个非常冗长的代码,我没有把它放在这里。 I have implemented random sleep method and using it within my code to monitor throttle. 我已经实现了随机睡眠方法并在我的代码中使用它来监控油门。 But I want to make sure I don't over run this code, so my idea is to run for an 3-4 hours then taker breather and then run again. 但是我想确保我不会过度使用这个代码,所以我的想法是运行3-4个小时然后接受呼吸,然后再次运行。 I haven't done anything like this in python I was trying to search but not really sure where to start from, it would be great if I get some guidance on this. 我没有在python中做过这样的事情,我试图搜索但不确定从哪里开始,如果我得到一些指导就会很好。 If python has a specific module link to that would be a great help. 如果python有一个特定的模块链接,那将是一个很大的帮助。

Also is this relevant? 这也是相关的吗? I don't I need this level of complication? 我不需要这种程度的并发症吗?

Suggestions for a Cron like scheduler in Python? 在Python中建议像Cron一样的调度程序?

I have functions for every single scraping task, and I have main method calling all those functions. 我有每个单一抓取任务的函数,我有main方法调用所有这些函数。

You could just note the time you have started and each time you want to run something make sure you haven't exceeded the given maximum. 你可以记下你开始的时间,每次你想要运行的东西,确保你没有超过给定的最大值。 Something like this should get you started: 这样的事情应该让你开始:

from datetime import datetime
MAX_SECONDS = 3600

# note the time you have started
start = datetime.now()

while True:
    current = datetime.now()
    diff = current-start
    if diff.seconds >= MAX_SECONDS:
        # break the loop after MAX_SECONDS
        break

    # MAX_SECONDS not exceeded, run more tasks
    scrape_some_more() 

Here's the link to the datetime module documentation . 这是datetime模块文档链接

You can use a threading.Timer object to schedule an interrupt signal to the main thread after the time is exceeded: 您可以使用threading.Timer对象在超过时间后为主线程调度中断信号:

import thread, threading

def longjob():
    try:
        # do your job
        while True:
            print '*', 
    except KeyboardInterrupt:
        # do your cleanup
        print 'ok, giving up'

def terminate():
    print 'sorry, pal'
    thread.interrupt_main()

time_limit = 5  # terminate in 5 seconds
threading.Timer(time_limit, terminate).start()
longjob()

Put this in your crontab and run every time_limit + 2 minutes. 把它放在你的crontab中,每隔time_limit + 2分钟运行一次。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

相关问题 运行代码后如何禁用/限制一行代码? (蟒蛇) - How to disable/restrict a line of code after it has been run? (Python) Python - 计划:每小时运行 function + “整点” - Python - Schedule: Run function every hour + "on the hour" 尝试每小时使用crontab运行python脚本,但是python代码的一部分未执行 - Trying to run a python script with crontab every hour, but one section of the python code does not execute 如何在Selenium中运行代码以在一天中的特定时间在python脚本而非cron中执行功能 - How can I run code in selenium to execute a function at a specific hour of the day within the python script not cron 安排一个 python 脚本每小时运行一次 - schedule a python script to run every hour 安排 Python 脚本每小时准确运行 - Scheduling Python Script to run every hour accurately Python 代码插入本地时间戳(日期+小时) - Python code to insert the local timestamp (date + hour) 是否可以限制对python中的代码块的全局访问? - Is it possible to restrict access to globals for a block of code in python? schedule (Python lib) - 如何在多个特定分钟内运行作业 - schedule (Python lib) - How to run a job in multiple certain minutes of hour 在 Python 中安排一个简单的通知脚本每小时运行一次 - Scheduling a simple notification script in Python to run every hour
 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM