
Threading / While Loop with Python

I'm trying to make this script run every 30 minutes. At the moment it only runs once, and I'm a bit confused about why it isn't running more often.

Any idea where I'm going wrong with my code? Basically, this script takes data from an API, in XML, and then puts it into a CSV file.

I'm trying to use threading to make it run every so many seconds. I'm running it on PythonAnywhere at the moment, and it will only run once. I'm a little confused about why that would be!

I have also tried using a while loop; an example of what I tried is next to the threading code.

from lxml import etree
import urllib.request
import csv
import threading

#Pickle is not needed
#append to list next

def handleLeg(leg):
   # print this leg as text, or save it to file maybe...
   text = etree.tostring(leg, pretty_print=True)
   # also process individual elements of interest here if we want
   tagsOfInterest=["noTrafficTravelTimeInSeconds", "lengthInMeters", "departureTime", "trafficDelayInSeconds"]  # whatever
   #list to use for data analysis
   global data
   data = []
   #create header dictionary that includes the data to be appended within it. IE, Header = {TrafficDelay[data(0)]...etc
   for child in leg:
       if 'summary' in child.tag:
          for elem in child:
              for item in tagsOfInterest:
                  if item in elem.tag:
                      data.append(elem.text)


def parseXML(xmlFile):
   """While option
   lastTime = time.time() - 600
   while time.time() >= lastTime + 600:
    lastTime += 600"""
   #Parse the xml
   threading.Timer(5.0, parseXML).start()
   with urllib.request.urlopen("https://api.tomtom.com/routing/1/calculateRoute/-37.79205923474775,145.03010268799338:-37.798883995180496,145.03040309540322:-37.807106781970354,145.02895470253526:-37.80320743019992,145.01021142594075:-37.7999012967757,144.99318476311566:?routeType=shortest&key=xxx&computeTravelTimeFor=all") as fobj:
       xml = fobj.read()

   root = etree.fromstring(xml)

   for child in root:
       if 'route' in child.tag:
           handleLeg(child)
           # Write CSV file
           with open('datafile.csv', 'w') as fp:
            writer = csv.writer(fp, delimiter=' ')
            # writer.writerow(["your", "header", "foo"])  # write header
            writer.writerows(data)
           """for elem in child:
               if 'leg' in elem.tag:
                   handleLeg(elem)
"""


if __name__ == "__main__":
   parseXML("xmlFile")

with open('datafile.csv', 'r') as fp:
    reader = csv.reader(fp, quotechar='"')
    # next(reader, None)  # skip the headers
    data_read = [row for row in reader]

print(data_read)

How do you know it runs only once? Have you debugged it, or do you just expect to see the correct result when the code reaches this part?

with open('datafile.csv', 'r') as fp:
    ....

And in general, what do you expect to happen, and when is your program supposed to enter this part? I can't say how to fix this without knowing what you want it to do, but I think I know where your problem is.

This is what your program does. I will call the main thread M:

  1. M: the if __name__ == "__main__" check matches and parseXML is called
  2. M: parseXML launches a new thread, which we call T1, with threading.Timer()
  3. M: parseXML finishes and with open... is reached; T1: sleep(5)
  4. M: print(data_read); T1: probably still in sleep(5)
  5. M: exit (just waits for the other threads to terminate); T1: parseXML
  6. M: -; T1: launches a new thread T2
  7. M: -; T1: completes parseXML; T2: sleep(5)
  8. M: -; T1: thread exits; T2: still in sleep(5)
  9. M: -; T1: -; T2: parseXML
  10. M: -; T1: -; T2: launches a new thread T3
  11. ...

The way your program is built, parseXML does launch a delayed copy of itself in a new thread (probably; I can't run your code, but it looks about right). However, the main program that handles the results has already finished by then, so it never reads datafile.csv again after a timed thread has modified it.

You can verify this by setting daemon=True on your timer threads (meaning the threads will exit as soon as your main program exits). Now your program does not "hang": it displays the results after the first iteration of parseXML and immediately kills the timed thread:
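The timeline above can be reproduced with a small sketch. Here `job` is a hypothetical stand-in for parseXML (no network or file access), and the interval is shortened to 0.05 seconds so the demo finishes quickly; the `done` event exists only so the script can wait and print the full order of events.

```python
import threading

events = []                  # records who did what, in order
done = threading.Event()     # demo only: signals that the last run finished

def job(run_number=1, max_runs=3, interval=0.05):
    """Stand-in for parseXML: schedule the next run first, then do the 'work'."""
    if run_number < max_runs:
        threading.Timer(interval, job,
                        args=(run_number + 1, max_runs, interval)).start()
    events.append(f"job run {run_number} in {threading.current_thread().name}")
    if run_number == max_runs:
        done.set()

job()                                       # first run happens in the main thread
events.append("main: reads results once")   # main code continues immediately
done.wait(timeout=5)                        # demo only: let the timers finish
print("\n".join(events))
```

The main thread's single "read" is recorded right after the first run, before any timer thread has fired, which is the situation described in steps 3 to 5.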

#Parse the xml
   _t = threading.Timer(5.0, parseXML)
   _t.daemon = True
   _t.start()
   with urllib.request.urlopen(....)
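One detail worth checking while you are editing that line: threading.Timer calls its function with no arguments unless you pass `args` or `kwargs`. Since parseXML as posted takes one parameter, the timed call would need that argument supplied, otherwise it looks like it would fail with a TypeError inside the timer thread. A minimal sketch, where `parse_xml` is a hypothetical stand-in that only records what it was given:

```python
import threading

calls = []

def parse_xml(xml_file):
    # Hypothetical stand-in for parseXML; records the argument it received.
    calls.append(xml_file)

# Pass the positional argument through the timer explicitly.
t = threading.Timer(0.05, parse_xml, args=("xmlFile",))
t.daemon = True      # do not keep the process alive just for this timer
t.start()
t.join(timeout=5)    # demo only: wait so the effect is observable
print(calls)
```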

Do you really need threads here at all? Or could you just move the datafile.csv processing and display into parseXML, put a while True loop there, and sleep 5 seconds between iterations?

Another possibility is to move the data reader part to another thread that sleeps N seconds and then runs the reader. BUT in this case you will need locks. If you process the same file in different threads, eventually the unexpected will happen: your writer will have written only part of the file when the reader decides to read it. Your parser will then most likely crash with a syntax error. To avoid this, create a global lock and use it to protect your file read and write operations:
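A rough sketch of that single-threaded version might look like the following. `fetch_rows` is a hypothetical placeholder for the API request and XML parsing, the file goes into a temporary directory, and `max_iterations` is an assumption added only so the example terminates; with `max_iterations=None` it behaves like a plain while True loop.

```python
import csv
import os
import tempfile
import time

def fetch_rows(iteration):
    # Hypothetical stand-in for the API call plus XML parsing.
    return [[str(iteration), "120", "4500"]]

def run_periodically(path, interval_seconds, max_iterations=None):
    """Fetch, write, read back and report in one thread, then sleep."""
    seen = []
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        rows = fetch_rows(iteration)
        with open(path, "w", newline="") as fp:
            csv.writer(fp, delimiter=" ").writerows(rows)
        # Read with the same delimiter that was used for writing.
        with open(path, "r", newline="") as fp:
            data_read = [row for row in csv.reader(fp, delimiter=" ")]
        print(data_read)
        seen.append(data_read)
        iteration += 1
        if max_iterations is None or iteration < max_iterations:
            time.sleep(interval_seconds)
    return seen

csv_path = os.path.join(tempfile.mkdtemp(), "datafile.csv")
results = run_periodically(csv_path, interval_seconds=0.01, max_iterations=3)
```

For the 30-minute schedule from the question, `interval_seconds` would be 1800. Because writing and reading happen in the same thread, one after the other, no lock is needed in this variant.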

foo = threading.Lock()
....
....

with foo:
    with open(...) as fp:
        ....

Now your file operations stay atomic.

Sorry about the lengthy explanation; hope this helps.
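To make the lock idea concrete, here is a small self-contained sketch with one writer thread and one reader thread sharing a file. The names `file_lock`, `write_rows` and `read_rows` are illustrative (`file_lock` plays the role of `foo` above), and the file lives in a temporary directory rather than being the real datafile.csv.

```python
import csv
import os
import tempfile
import threading

file_lock = threading.Lock()   # plays the role of `foo` above
csv_path = os.path.join(tempfile.mkdtemp(), "datafile.csv")

def write_rows(rows):
    # Hold the lock for the whole write so a reader never sees a partial file.
    with file_lock:
        with open(csv_path, "w", newline="") as fp:
            csv.writer(fp).writerows(rows)

def read_rows():
    # Hold the same lock while reading.
    with file_lock:
        with open(csv_path, "r", newline="") as fp:
            return list(csv.reader(fp))

write_rows([["0", "0"]])       # make sure the file exists before reading starts
snapshots = []

def writer_thread():
    for i in range(50):
        write_rows([[str(i), str(i)]] * 100)

def reader_thread():
    for _ in range(50):
        snapshots.append(read_rows())

threads = [threading.Thread(target=writer_thread),
           threading.Thread(target=reader_thread)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(snapshots), "complete snapshots read")
```

Each snapshot is either the initial one-row file or a full 100-row file from a single write. This only holds if every access to the file goes through the same lock.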
