
What is the fastest way to check whether a directory is empty in Python

I work on a Windows machine and want to check whether a directory on a network path is empty.

The first thing that came to mind was calling os.listdir() and checking whether the result has length 0, i.e.:

import os

def dir_empty(dir_path):
    return len(os.listdir(dir_path)) == 0

Because this is a network path where I do not always have good connectivity, and because a folder can contain thousands of files, this is a very slow solution. Is there a better one?

The fastest solution I have found so far:

def dir_empty(dir_path):
    return not any((True for _ in os.scandir(dir_path)))

Or, as proposed in the comments below:

def dir_empty(dir_path):
    return not next(os.scandir(dir_path), None)

On the slow network I was working on, this took seconds instead of minutes (minutes for the os.listdir() version). This seems to be faster because any() stops as soon as it sees the first True value, so only the first directory entry has to be read.
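
The short-circuiting described above can be checked with a small sketch that counts how many entries are actually pulled from os.scandir before any() returns. The names entries_consumed and demo are hypothetical helpers introduced only for this illustration, and the demo uses a local temporary directory rather than a network path.

```python
import os
import tempfile


def entries_consumed(dir_path):
    """Return (is_empty, number of entries read from os.scandir)."""
    consumed = 0

    def counting(iterator):
        # Wrap the scandir iterator and count every entry that is requested.
        nonlocal consumed
        for entry in iterator:
            consumed += 1
            yield entry

    with os.scandir(dir_path) as it:
        # any() stops at the first True, so at most one entry is read.
        is_empty = not any(True for _ in counting(it))
    return is_empty, consumed


def demo(n_files=50):
    # Hypothetical demo: an empty directory first, then the same one with files.
    with tempfile.TemporaryDirectory() as tmp:
        empty_result = entries_consumed(tmp)
        for i in range(n_files):
            with open(os.path.join(tmp, f"file_{i}.txt"), "w") as fh:
                fh.write("x")
        filled_result = entries_consumed(tmp)
    return empty_result, filled_result
```

However many files the directory holds, only one entry should be requested from the iterator, which is why the check does not scale with directory size.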

From Python 3.4 onwards you can use Path.iterdir() from pathlib, which yields path objects for the directory contents:

>>> from pathlib import Path
>>>
>>> def dir_empty(dir_path):
...     path = Path(dir_path)
...     has_next = next(path.iterdir(), None)
...     if has_next is None:
...         return True
...     return False

Since the OP is asking about the fastest way, I think using os.scandir and returning as soon as the first entry is found should be the fastest. os.scandir returns an iterator, so we avoid creating a whole list just to check whether it is empty.

The test directory contains about 100 thousand files:

from pathlib import Path    
import os

path = 'jav/av'
len(os.listdir(path))

>>> 101204

Then run the test:

def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)
    
def check_empty_by_listdir(path):
    return not os.listdir(path)

def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())


%timeit check_empty_by_scandir(path)
>>> 179 µs ± 878 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit check_empty_by_listdir(path)
>>> 28 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit check_empty_by_pathlib(path)
>>> 27.6 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

As we can see, check_empty_by_listdir and check_empty_by_pathlib are about 155 times slower than check_empty_by_scandir. The timings for os.listdir() and Path.iterdir() are nearly identical because, at least in the Python version used for this test, Path.iterdir() calls os.listdir() in the background, which builds the whole list in memory.
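
The %timeit lines above only work in IPython or Jupyter. As a rough sketch, the same comparison can be run as a plain script with the standard timeit module. The benchmark helper, its parameters, and the temporary local directory are assumptions made for this example; timings on a local disk will differ from those on a slow network share.

```python
import os
import tempfile
import timeit
from pathlib import Path


def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)


def check_empty_by_listdir(path):
    return not os.listdir(path)


def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())


def benchmark(n_files=1000, number=100):
    """Return the mean seconds per call for each check (hypothetical helper)."""
    checks = (check_empty_by_scandir, check_empty_by_listdir, check_empty_by_pathlib)
    with tempfile.TemporaryDirectory() as tmp:
        # Fill a throwaway directory with empty files.
        for i in range(n_files):
            Path(tmp, f"file_{i}.txt").touch()
        return {
            check.__name__: timeit.timeit(lambda: check(tmp), number=number) / number
            for check in checks
        }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds * 1e6:.1f} µs per call")
```

Raising n_files should widen the gap between the scandir version and the other two, since only the latter depend on the number of entries.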

Additionally, as others have pointed out, checking the st_size reported by os.stat is not an option: on Linux, for example, it returns 4096 for an empty directory.
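
To see what os.stat reports on a given system, a minimal sketch like the following prints st_size for a directory before and after a file is added. The function name directory_sizes is hypothetical, and the values depend on the operating system and filesystem, so no particular output is assumed.

```python
import os
import tempfile


def directory_sizes():
    """Return st_size of a temporary directory while empty and after adding a file."""
    with tempfile.TemporaryDirectory() as tmp:
        empty_size = os.stat(tmp).st_size
        with open(os.path.join(tmp, "example.txt"), "w") as fh:
            fh.write("x")
        non_empty_size = os.stat(tmp).st_size
    return empty_size, non_empty_size


if __name__ == "__main__":
    empty_size, non_empty_size = directory_sizes()
    print(f"empty directory:     st_size = {empty_size}")
    print(f"non-empty directory: st_size = {non_empty_size}")
```

If both values come out the same, st_size cannot distinguish an empty directory from a non-empty one on that filesystem.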

listdir returns a list. scandir returns an iterator, which may be more performant.

def dir_empty(dir_path):
    try:
        next(os.scandir(dir_path))
        return False
    except StopIteration:
        return True
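
On an unreliable network path the directory may also be missing or unreadable. The sketch below is one possible variant, not part of the original answer: it closes the scandir iterator with a context manager and returns None when the path cannot be read. The name dir_empty_or_none and the choice to return None rather than raise are assumptions for illustration.

```python
import os


def dir_empty_or_none(dir_path):
    """Return True or False for a readable directory, or None if it cannot be read."""
    try:
        # The context manager closes the underlying directory handle promptly.
        with os.scandir(dir_path) as entries:
            # next() reads at most one entry; None means the directory is empty.
            return next(entries, None) is None
    except (FileNotFoundError, NotADirectoryError, PermissionError):
        # Missing path, path that is not a directory, or access denied.
        return None
```

Callers can then tell "empty" (True) apart from "could not check" (None) instead of treating both as the same case.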

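On Windows there is PathIsDirectoryEmptyA (and its wide-character variant PathIsDirectoryEmptyW) in shlwapi. We can use it to check whether a folder is empty.

def is_dir_empty(path: str) -> bool:
    import ctypes
    # Windows only. The function returns a BOOL, so load shlwapi with WinDLL
    # rather than OleDLL, which would interpret the result as an HRESULT.
    shlwapi = ctypes.WinDLL('shlwapi')
    # The W variant accepts a Python str directly as a wide string.
    return bool(shlwapi.PathIsDirectoryEmptyW(path))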
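
Since the shlwapi call exists only on Windows, a script that must also run elsewhere needs a fallback. The following is a sketch under that assumption: it calls the wide-character PathIsDirectoryEmptyW through ctypes.WinDLL so that a Python str can be passed directly, and uses os.scandir on other platforms. The name is_dir_empty_portable is hypothetical.

```python
import os
import sys


def is_dir_empty_portable(path: str) -> bool:
    """Check for an empty directory via shlwapi on Windows, os.scandir elsewhere."""
    if sys.platform == "win32":
        import ctypes

        # WinDLL is used because the function returns a BOOL, not an HRESULT.
        shlwapi = ctypes.WinDLL("shlwapi")
        # ctypes passes a Python str as a wide string, matching the W variant.
        return bool(shlwapi.PathIsDirectoryEmptyW(path))
    # Fallback for other platforms: read at most one directory entry.
    with os.scandir(path) as entries:
        return next(entries, None) is None
```

Note that the two branches may not behave identically for a path that is not a directory: the fallback raises an OSError subclass, whereas the Windows call is expected to simply return a false value.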

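Using os.stat (note that the st_size of a directory depends on the filesystem, so this check is not reliable; as mentioned in another answer, an empty directory on Linux reports 4096):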

is_empty = os.stat(dir_path).st_size == 0

Using Python's pathlib:

from pathlib import Path

is_empty = Path(dir_path).stat().st_size == 0

