Python, multithreading: is it safe to use pandas "to_csv" on a common file?
I've got some code that works pretty nicely. It's a while-loop that goes through a list of dates, finds files on my HDD that correspond to those dates, does some calculations with those files, and then appends the output to a "results.csv" file using the command:
my_df.to_csv("results.csv",mode = 'a')
I'm wondering if it's safe to create a new thread for each date and run the body of the while loop on several dates at a time?
My code:
    import datetime, time, os
    import sys
    import threading
    import helperPY  # a python file containing the logic I need

    class myThread(threading.Thread):
        def __init__(self, threadID, name, counter, sn, m_date):
            threading.Thread.__init__(self)
            self.threadID = threadID
            self.name = name
            self.counter = counter
            self.sn = sn
            self.m_date = m_date

        def run(self):
            print("Starting " + self.name)
            # use the values stored on this thread, not module-level globals
            m_runThis(self.sn, self.m_date)
            print("Exiting " + self.name)

    def m_runThis(sn, m_date):
        helperPY.helpFn(sn, m_date)  # this is where the "my_df.to_csv()" is called

    sn = 'XXXXXX'
    today = datetime.datetime(2016, 9, 22)
    yesterday = datetime.datetime(2016, 6, 13)
    threadList = []
    i_threadlist = 0
    # start one thread per date, walking backwards one day at a time
    while today > yesterday:
        threadList.append(myThread(i_threadlist, str(today), i_threadlist, sn, today))
        threadList[i_threadlist].start()
        i_threadlist = i_threadlist + 1
        today = today - datetime.timedelta(1)
Writing to the same file from multiple threads is not safe. But you can create a lock to protect that one operation while letting the rest run in parallel. Your to_csv call isn't shown, but you could create the lock
csv_output_lock = threading.Lock()
and pass it to helperPY.helpFn. When you get to the operation, do
    with csv_output_lock:
        my_df.to_csv("results.csv", mode='a')
You get parallelism for the other operations (subject to the GIL, of course), but the file access is protected.
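Since helperPY isn't shown, here is only a minimal, self-contained sketch of the idea, not the asker's actual code. The names help_fn and run_all are hypothetical, and the standard-library csv.writer stands in for my_df.to_csv("results.csv", mode='a') so the example runs without pandas. In the real code the to_csv call would sit inside the same with-block.

```python
import csv
import datetime
import os
import tempfile
import threading

# One lock shared by every thread that appends to the results file.
csv_output_lock = threading.Lock()


def help_fn(sn, m_date, lock, path):
    """Hypothetical stand-in for helperPY.helpFn."""
    # Placeholder for the per-date calculations; this part can overlap
    # with the work of other threads.
    row = [sn, m_date.strftime("%Y-%m-%d")]
    # Only the append is serialized. With pandas this would be
    # my_df.to_csv(path, mode='a') inside the same with-block.
    with lock:
        with open(path, "a", newline="") as f:
            csv.writer(f).writerow(row)


def run_all(sn, start, end, path):
    """Start one thread per date, from start back to (but excluding) end."""
    threads = []
    day = start
    while day > end:
        t = threading.Thread(target=help_fn,
                             args=(sn, day, csv_output_lock, path))
        t.start()
        threads.append(t)
        day -= datetime.timedelta(days=1)
    for t in threads:
        t.join()  # wait until every row has been written
    return len(threads)


if __name__ == "__main__":
    out = os.path.join(tempfile.mkdtemp(), "results.csv")
    n = run_all("XXXXXX", datetime.datetime(2016, 9, 22),
                datetime.datetime(2016, 9, 12), out)
    print(n, "threads finished, output in", out)
```

Note that the lock only prevents interleaved writes. The rows still land in whatever order the threads reach the lock, so sort the file afterwards if date order matters.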