I have the following code:
while (i < 10):
    for i in range(0, len(df_1)):
        new_df_1 = df_1.iloc[i]
        for j in (len(df_2)):
            new_df_2 = df_2.iloc[j]
            client.compute(self.func(i, new_df_1, new_df_2), scheduler="processes"),
            break
I don't know how to use Dask with nested loops like these to speed up the code. I tried moving the inner loop into a separate function, as shown below, but it raises an error.
This is what I have tried:
while (i < 10):
    for i in range(0, len(df_1)):
        new_df_1 = df_1.iloc[i]
        def process_l(i, client, new_df_1, new_df_2):
            for j in (len(df_2)):
                new_df_2 = df_2.iloc[j]
                client.compute(self.func(i, new_df_1, new_df_2), scheduler="processes"),
                break
        client.submit(process_l(i, new_df_1, new_df_2)
Calling .compute() will stop further execution of the code until the results of .compute() are ready. Instead, you might want to use delayed or client.submit. Here's a rough suggestion:
futs = []
# to avoid the while loop
for i in range(0, min(10, len(df_1))):
    new_df_1 = df_1.iloc[i]
    for j in range(0, len(df_2)):
        new_df_2 = df_2.iloc[j]
        # this will submit a future and proceed with the code without
        # waiting for the result; pass the function and its arguments
        # separately (no scheduler= here, since client.submit forwards
        # extra keyword arguments to the function itself)
        fut = client.submit(self.func, i, new_df_1, new_df_2)
        futs.append(fut)
results = client.gather(futs)  # this waits for all results
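For completeness, here is a minimal sketch of the delayed variant mentioned above. It is only an illustration under some assumptions: func is a hypothetical stand-in for self.func, the two small DataFrames and their column names "a" and "b" are made up, and the threaded scheduler is used so the snippet runs without a cluster (scheduler="processes" could be passed in the same place if the function and data pickle cleanly).

```python
import dask
import pandas as pd
from dask import delayed


def func(i, row_1, row_2):
    # Hypothetical stand-in for self.func: combine one row from each frame.
    return i, float(row_1["a"] * row_2["b"])


# Made-up example data; the real df_1 and df_2 come from the question.
df_1 = pd.DataFrame({"a": [1, 2, 3]})
df_2 = pd.DataFrame({"b": [10, 20]})

tasks = []
for i in range(min(10, len(df_1))):
    row_1 = df_1.iloc[i]
    for j in range(len(df_2)):
        row_2 = df_2.iloc[j]
        # delayed(func)(...) only records the call; nothing runs yet.
        tasks.append(delayed(func)(i, row_1, row_2))

# A single compute call runs the whole batch in parallel and blocks
# until every task has finished; this is where scheduler= belongs.
results = dask.compute(*tasks, scheduler="threads")
```

The point is the same as with client.submit: build up all the work inside the loops first, and only wait for results once, after the loops have finished.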