
What is the fastest way to check whether a directory is empty in Python?

I work on a Windows machine and want to check whether a directory on a network path is empty.

The first thing that came to mind was calling os.listdir() and checking whether the result has length 0,

i.e.:

import os

def dir_empty(dir_path):
    return len(os.listdir(dir_path)) == 0

Because this is a network path where I do not always have good connectivity and because a folder can potentially contain thousands of files, this is a very slow solution. Is there a better one?

The fastest solution I found so far:

def dir_empty(dir_path):
    return not any((True for _ in os.scandir(dir_path)))

Or, as proposed in the comments below:

def dir_empty(dir_path):
    return not next(os.scandir(dir_path), None)

On the slow network I was working on, this took seconds instead of minutes (minutes for the os.listdir() version). It is faster because any() stops at the first truthy item, so only the first directory entry has to be retrieved rather than the whole listing.
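The short-circuit behaviour can be observed locally, without a network share. The following is a minimal sketch: counting_entries, entries_consumed_by_any and make_demo_dir are hypothetical helpers introduced only for illustration (they are not part of the original post). They count how many directory entries any() actually pulls from os.scandir().

```python
import os
import tempfile

def counting_entries(iterator, counter):
    """Yield items from iterator, recording in counter[0] how many were consumed."""
    for item in iterator:
        counter[0] += 1
        yield item

def entries_consumed_by_any(dir_path):
    """Return how many directory entries any() pulls before it stops."""
    counter = [0]
    with os.scandir(dir_path) as it:
        any(True for _ in counting_entries(it, counter))
    return counter[0]

def make_demo_dir(n_files):
    """Create a temporary directory holding n_files empty files (illustrative helper)."""
    tmp = tempfile.TemporaryDirectory()
    for i in range(n_files):
        open(os.path.join(tmp.name, f"file_{i}.txt"), "w").close()
    return tmp

if __name__ == "__main__":
    with make_demo_dir(50) as demo_path:
        print("entries pulled by any():", entries_consumed_by_any(demo_path))
        print("entries read by listdir():", len(os.listdir(demo_path)))
```

With a non-empty directory, any() consumes a single entry, while os.listdir() has to read every entry before len() can be applied. On a slow network path, that is plausibly where the difference in running time comes from.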

From Python 3.4 onwards you can use pathlib.Path.iterdir(), which yields Path objects for the directory contents:

>>> from pathlib import Path
>>>
>>> def dir_empty(dir_path):
...     path = Path(dir_path)
...     has_next = next(path.iterdir(), None)
...     if has_next is None:
...         return True
...     return False

Since the OP is asking about the fastest way, I expected that using os.scandir and returning as soon as the first entry is found would be the fastest. os.scandir returns an iterator, so we avoid creating a whole list just to check whether it is empty.

The test directory contains about 100,000 files:

from pathlib import Path    
import os

path = 'jav/av'
len(os.listdir(path))

>>> 101204

Then run the test:

def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)
    
def check_empty_by_listdir(path):
    return not os.listdir(path)

def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())


%timeit check_empty_by_scandir(path)
>>> 179 µs ± 878 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit check_empty_by_listdir(path)
>>> 28 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit check_empty_by_pathlib(path)
>>> 27.6 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

As we can see, check_empty_by_listdir and check_empty_by_pathlib are about 155 times slower than check_empty_by_scandir. Their timings are nearly identical because Path.iterdir() calls os.listdir() internally (at least in the Python version used here), building the whole list in memory before yielding anything.
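The %timeit lines above need IPython. As a rough sketch of how the same comparison could be run as a plain script, the following uses the standard timeit module. The benchmark helper, the repeats parameter, and the 2,000-file temporary directory are assumptions made for illustration; the original test used an existing directory with about 100,000 files, so absolute numbers will differ.

```python
import os
import tempfile
import timeit
from pathlib import Path

def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)

def check_empty_by_listdir(path):
    return not os.listdir(path)

def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())

def benchmark(path, repeats=20):
    """Return the average seconds per call for each check (hypothetical helper)."""
    checks = (check_empty_by_scandir, check_empty_by_listdir, check_empty_by_pathlib)
    return {
        check.__name__: timeit.timeit(lambda: check(path), number=repeats) / repeats
        for check in checks
    }

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Assumed directory size, chosen only to keep the demo quick.
        for i in range(2000):
            open(os.path.join(tmp, f"f{i}"), "w").close()
        for name, seconds in benchmark(tmp).items():
            print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

On a local disk the gap may be smaller than on a network share, and the pathlib result depends on how the installed Python version implements Path.iterdir().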

Additionally, as people point out, reading os.stat is not an option: on Linux, st_size for an empty directory is typically 4096 rather than 0.
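To see why st_size is not a usable signal, it can be printed next to a scandir-based check for the same directory, before and after a file is added. This is a small sketch; report is a hypothetical helper, and the actual st_size values depend on the operating system and filesystem, so none are shown here.

```python
import os
import tempfile

def dir_empty(dir_path):
    """Return True if the directory has no entries (scandir-based check)."""
    with os.scandir(dir_path) as it:
        return next(it, None) is None

def report(dir_path):
    """Return (st_size, dir_empty) for dir_path (hypothetical helper)."""
    return os.stat(dir_path).st_size, dir_empty(dir_path)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("empty:   ", report(tmp))
        open(os.path.join(tmp, "a.txt"), "w").close()
        print("one file:", report(tmp))
```

The second element of each tuple changes from True to False once the file exists, whereas the first element is whatever the filesystem reports for the directory itself.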

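os.listdir() builds a complete list, whereas os.scandir() returns an iterator, so you can stop after the first entry, which may be more performant.

import os

def dir_empty(dir_path):
    # The with block closes the scandir iterator as soon as we are done.
    with os.scandir(dir_path) as it:
        try:
            next(it)
            return False
        except StopIteration:
            return True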

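On Windows, the shell API provides PathIsDirectoryEmpty (PathIsDirectoryEmptyA for ANSI strings, PathIsDirectoryEmptyW for wide strings). We can call it through ctypes to check whether a folder is empty. The wide variant takes a Python str directly and avoids encoding problems with non-ASCII paths.

def is_dir_empty(path: str) -> bool:
    import ctypes
    # WinDLL rather than OleDLL: the function returns a BOOL, not an HRESULT.
    shlwapi = ctypes.WinDLL('shlwapi')
    shlwapi.PathIsDirectoryEmptyW.argtypes = [ctypes.c_wchar_p]
    return bool(shlwapi.PathIsDirectoryEmptyW(path))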
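Since the shell API only exists on Windows, one possible way to keep a single function usable everywhere is to dispatch on sys.platform and fall back to os.scandir elsewhere. This is a hedged sketch rather than a tested Windows implementation: it assumes the wide-character variant PathIsDirectoryEmptyW so that a Python str can be passed directly, and the fallback branch is an addition not found in the original answer.

```python
import ctypes
import os
import sys

def is_dir_empty(path: str) -> bool:
    """Return True if path is an empty directory."""
    if sys.platform == "win32":
        # Windows branch: ask the shell API (assumed wide-character variant).
        func = ctypes.WinDLL("shlwapi").PathIsDirectoryEmptyW
        func.argtypes = [ctypes.c_wchar_p]
        func.restype = ctypes.c_int  # BOOL
        return bool(func(path))
    # Other platforms: stop after the first directory entry.
    with os.scandir(path) as it:
        return next(it, None) is None
```

Note that the two branches differ on bad input: the Windows API is documented to return FALSE when the path is not a directory, while os.scandir raises an OSError subclass such as NotADirectoryError or FileNotFoundError.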

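Using os.stat (unreliable: a directory's st_size does not track its contents; as noted above, an empty directory reports 4096 on Linux):

import os

is_empty = os.stat(dir_path).st_size == 0

The same check with Python's pathlib, which has the same limitation:

from pathlib import Path

is_empty = Path(dir_path).stat().st_size == 0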

