I work on a Windows machine and want to check whether a directory on a network path is empty.
The first thing that came to mind was calling os.listdir() and checking whether the result has length 0, i.e.:
def dir_empty(dir_path):
    return len(os.listdir(dir_path)) == 0
Because this is a network path where I do not always have good connectivity, and because a folder can contain thousands of files, this is very slow. Is there a better way?
The fastest solution I found so far:
def dir_empty(dir_path):
    return not any(True for _ in os.scandir(dir_path))
Or, as proposed in the comments below:
def dir_empty(dir_path):
    return not next(os.scandir(dir_path), None)
On the slow network I was working on, this took seconds instead of minutes (minutes for the os.listdir() version). It is faster because any() short-circuits: it stops at the first truthy item, so the full directory listing never has to be built.
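To make the short-circuit behaviour visible, here is a minimal sketch that counts how many entries any() actually pulls from os.scandir. The helper names (counting, entries_pulled_by_any, demo) are made up for this illustration, and it runs against a local temporary directory, not a network share.

```python
import os
import tempfile


def counting(iterable, counter):
    """Yield items from `iterable`, recording how many were pulled."""
    for item in iterable:
        counter["pulled"] += 1
        yield item


def entries_pulled_by_any(dir_path):
    """Return how many directory entries any() consumed before stopping."""
    counter = {"pulled": 0}
    with os.scandir(dir_path) as it:
        any(counting(it, counter))
    return counter["pulled"]


def demo(n_files=1000):
    """Return the pull counts for an empty and a populated temp directory."""
    with tempfile.TemporaryDirectory() as tmp:
        empty = entries_pulled_by_any(tmp)
        for i in range(n_files):
            open(os.path.join(tmp, f"f{i}.txt"), "w").close()
        populated = entries_pulled_by_any(tmp)
    return empty, populated


if __name__ == "__main__":
    print(demo())  # (0, 1): nothing to pull when empty, one entry otherwise
```

At the Python level only one entry is consumed no matter how many files the directory holds. The operating system may still read a batch of entries per request under the hood.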
From Python 3.4 onwards you can use Path.iterdir() from pathlib, which yields Path objects for the directory contents:
>>> from pathlib import Path
>>>
>>> def dir_empty(dir_path):
...     path = Path(dir_path)
...     has_next = next(path.iterdir(), None)
...     if has_next is None:
...         return True
...     return False
Since the OP is asking about the fastest way, I expected that using os.scandir and returning as soon as the first entry is found would be the fastest. os.scandir returns an iterator, so we avoid creating a whole list just to check whether it is empty.
The test directory contains about 100,000 files:
from pathlib import Path
import os
path = 'jav/av'
len(os.listdir(path))
>>> 101204
Then start our test:
def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)
def check_empty_by_listdir(path):
    return not os.listdir(path)
def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())
%timeit check_empty_by_scandir(path)
>>> 179 µs ± 878 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit check_empty_by_listdir(path)
>>> 28 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit check_empty_by_pathlib(path)
>>> 27.6 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
As we can see, check_empty_by_listdir and check_empty_by_pathlib are about 155 times slower than check_empty_by_scandir. The results for os.listdir() and Path.iterdir() are nearly identical because Path.iterdir() uses os.listdir() in the background (at least in the Python version tested here), creating a whole list in memory.
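For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using timeit from the standard library. The benchmark helper, the file count and the repeat count are arbitrary choices for this illustration. The numbers will differ from the ones above, since this uses a small local temporary directory.

```python
import os
import tempfile
import timeit
from pathlib import Path


def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)


def check_empty_by_listdir(path):
    return not os.listdir(path)


def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())


def benchmark(n_files=2000, repeats=20):
    """Time each check on a temp directory holding `n_files` empty files."""
    checks = (check_empty_by_scandir, check_empty_by_listdir, check_empty_by_pathlib)
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n_files):
            open(os.path.join(tmp, f"f{i}"), "w").close()
        # Seconds per call, averaged over `repeats` calls.
        return {
            check.__name__: timeit.timeit(lambda: check(tmp), number=repeats) / repeats
            for check in checks
        }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

The gap between the variants should grow with the number of files, because only the scandir version stops after the first entry.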
Additionally, as people have pointed out, checking the size reported by os.stat is not an option: on Linux it typically returns 4096 even for empty directories.
listdir gives a list. scandir gives an iterator, which may be more performant.
def dir_empty(dir_path):
    try:
        next(os.scandir(dir_path))
        return False
    except StopIteration:
        return True
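One caveat with the version above: the scandir iterator is never closed explicitly, so Python may emit a ResourceWarning when it is garbage-collected before being exhausted. A possible variant (Python 3.6+, where os.scandir supports the context manager protocol) is sketched below. The demo helper is only there for illustration.

```python
import os
import tempfile


def dir_empty(dir_path):
    """Return True if `dir_path` is a directory with no entries.

    OSError subclasses such as FileNotFoundError or NotADirectoryError
    are not caught here and propagate to the caller.
    """
    # The with-block closes the iterator even though we stop after one entry.
    with os.scandir(dir_path) as entries:
        return next(entries, None) is None


def demo():
    """Check a temp directory before and after creating a file in it."""
    with tempfile.TemporaryDirectory() as tmp:
        before = dir_empty(tmp)
        open(os.path.join(tmp, "file.txt"), "w").close()
        after = dir_empty(tmp)
    return before, after


if __name__ == "__main__":
    print(demo())  # (True, False)
```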
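On Windows there is PathIsDirectoryEmptyA (and its wide-character variant PathIsDirectoryEmptyW) in shlwapi. We can use it to check whether a folder is empty. It returns a plain BOOL, so load the library with WinDLL rather than OleDLL, and call the W variant so that a Python str can be passed without encoding it:
def is_dir_empty(path: str) -> bool:
    import ctypes
    shlwapi = ctypes.WinDLL('shlwapi')
    # Wide-character variant: takes the str directly, no manual encoding needed.
    shlwapi.PathIsDirectoryEmptyW.argtypes = [ctypes.c_wchar_p]
    # The API returns a BOOL (nonzero when the directory is empty).
    return bool(shlwapi.PathIsDirectoryEmptyW(path))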
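If the same helper also has to run on non-Windows machines, one option is to guard the ctypes call and fall back to os.scandir elsewhere. This is only a sketch: the function name is_dir_empty_portable is made up, and the two branches are not strictly equivalent. As far as I know the Windows API returns FALSE for a path that is not a directory, while os.scandir raises an exception.

```python
import os
import sys
import tempfile


def is_dir_empty_portable(path: str) -> bool:
    """Use shlwapi on Windows, os.scandir everywhere else."""
    if sys.platform == "win32":
        import ctypes

        shlwapi = ctypes.WinDLL("shlwapi")
        shlwapi.PathIsDirectoryEmptyW.argtypes = [ctypes.c_wchar_p]
        shlwapi.PathIsDirectoryEmptyW.restype = ctypes.c_int  # Win32 BOOL
        return bool(shlwapi.PathIsDirectoryEmptyW(path))
    # Fallback: stop after the first entry and close the iterator.
    with os.scandir(path) as entries:
        return next(entries, None) is None


def demo():
    """Check a temp directory before and after creating a file in it."""
    with tempfile.TemporaryDirectory() as tmp:
        before = is_dir_empty_portable(tmp)
        open(os.path.join(tmp, "file.txt"), "w").close()
        after = is_dir_empty_portable(tmp)
    return before, after


if __name__ == "__main__":
    print(demo())
```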
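Using os.stat (note that this relies on an empty directory reporting a size of 0, which is filesystem-dependent; on Linux an empty directory typically reports 4096):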
is_empty = os.stat(dir_path).st_size == 0
Using Python's pathlib:
from pathlib import Path
is_empty = Path(dir_path).stat().st_size == 0
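Before relying on st_size, it may be worth checking how it behaves on your own platform and filesystem. The sketch below prints the reported size next to a scandir-based emptiness check for the same temporary directory, before and after a file is added. The helper names size_and_emptiness and demo are hypothetical.

```python
import os
import tempfile


def size_and_emptiness(dir_path):
    """Return (st_size, is_empty) so the two can be compared side by side."""
    with os.scandir(dir_path) as entries:
        is_empty = next(entries, None) is None
    return os.stat(dir_path).st_size, is_empty


def demo():
    """Collect both values for a temp directory, empty and then non-empty."""
    with tempfile.TemporaryDirectory() as tmp:
        before = size_and_emptiness(tmp)
        open(os.path.join(tmp, "file.txt"), "w").close()
        after = size_and_emptiness(tmp)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("empty dir:     st_size=%d, empty=%s" % before)
    print("non-empty dir: st_size=%d, empty=%s" % after)
```

If the two st_size values are the same, or the first one is not 0, the st_size check cannot tell the two cases apart on that system.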