
How to recursively traverse a directory using ThreadPoolExecutor?

My real task is to recursively traverse a remote directory using paramiko with multiple threads. For simplicity, I use the local filesystem to demonstrate it:

from pathlib import Path
from typing import List
from concurrent.futures import ThreadPoolExecutor, Executor

def listdir(root: Path, executor: Executor) -> List[Path]:
    if root.is_dir():
        xss = executor.map(lambda d: listdir(d, executor), root.glob('*'))
        return sum(xss, [])
    return [root]

with ThreadPoolExecutor(4) as e:
    listdir(Path('.'), e)

However, the above code never finishes.

What's wrong with my code? And how can I fix it (preferably using an Executor rather than raw Thread objects)?

EDIT: I have confirmed @Sraw's answer with the following code:

In [4]: def listdir(root: Path, executor: Executor) -> List[Path]:
   ...:     print(f'Enter {root}', flush=True)
   ...:     if root.is_dir():
   ...:         xss = executor.map(lambda d: listdir(d, executor), root.glob('*'))
   ...:         return sum(xss, [])
   ...:     return [root]
   ...:

In [5]: with ThreadPoolExecutor(4) as e:
   ...:     listdir(Path('.'), e)
   ...:
Enter .
Enter NonRestrictedShares
Enter corporateActionData
Enter RiskModelAnnualEPS
Enter juyuan

There is a deadlock in your code.

Because you are using ThreadPoolExecutor(4), the executor has only four worker threads, so at most four tasks can run at the same time.
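The limit is easy to observe in isolation with a pool of a single worker. The sketch below is not from the question: the names outer and child are made up for illustration, and a timeout is used so that the script terminates instead of hanging.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def outer(executor):
    # The child task is queued, but the only worker is busy running `outer`,
    # so nothing can pick the child up while we wait for it here.
    child = executor.submit(lambda: 'child done')
    try:
        return child.result(timeout=0.5)
    except TimeoutError:
        return 'timed out waiting for child'

with ThreadPoolExecutor(1) as executor:
    result = executor.submit(outer, executor).result()

print(result)  # timed out waiting for child
```

Without the timeout, child.result() would block forever, which is the same situation as in your listdir, only with one worker instead of four.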

Imagine the following simple structure:

test
----script.py
----test1
--------test2
------------test3
----------------test4
--------------------test5

If you run python script.py, the first worker thread handles test1, the second handles test1/test2, the third handles test1/test2/test3, and the fourth handles test1/test2/test3/test4. Each of these tasks is blocked waiting for the result of its subdirectory's task, so every worker thread is now occupied. The task for test1/test2/test3/test4/test5 has been put on the work queue, but no thread is free to run it.

So it will hang forever.
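One possible way around this, as a minimal sketch rather than the definitive fix, is to make sure no task ever waits on another task. Below, each task lists a single directory and returns, and the calling thread submits follow-up tasks for the subdirectories it gets back. The helper scan and the temporary test tree are assumptions made for the example; for paramiko you would presumably replace the body of scan with the corresponding SFTP listing call.

```python
import tempfile
from concurrent.futures import FIRST_COMPLETED, Executor, ThreadPoolExecutor, wait
from pathlib import Path
from typing import List, Tuple

def scan(directory: Path) -> Tuple[List[Path], List[Path]]:
    """List one directory only; this task never waits on another task."""
    files, subdirs = [], []
    for entry in directory.iterdir():
        (subdirs if entry.is_dir() else files).append(entry)
    return files, subdirs

def listdir(root: Path, executor: Executor) -> List[Path]:
    if not root.is_dir():
        return [root]
    result: List[Path] = []
    pending = {executor.submit(scan, root)}
    while pending:
        # Only the calling thread blocks here; the workers stay free.
        done, pending = wait(pending, return_when=FIRST_COMPLETED)
        for future in done:
            files, subdirs = future.result()
            result.extend(files)
            pending |= {executor.submit(scan, d) for d in subdirs}
    return result

# Rebuild the nested layout from above (deeper than the pool size) in a
# temporary directory, with one extra file at the bottom.
with tempfile.TemporaryDirectory() as tmp:
    deep = Path(tmp, 'test1', 'test2', 'test3', 'test4', 'test5')
    deep.mkdir(parents=True)
    (deep / 'leaf.txt').touch()
    (Path(tmp) / 'script.py').touch()
    with ThreadPoolExecutor(4) as e:
        found = listdir(Path(tmp), e)
    names = sorted(p.name for p in found)

print(names)  # ['leaf.txt', 'script.py']
```

Like the original, this returns only non-directory entries, but the order of the results depends on which tasks finish first, so sort them if the order matters.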


 