
How to find the extension of all files in a directory and all its sub-directories?

I'm trying to list the extension (i.e., type) of every file in a directory and all its sub-directories. I have successfully listed all the file names, and here's my code:

import os
path = os.getcwd()
for dirpath, dirnames, filenames in os.walk(path):
    for file in filenames:
        file_path = os.path.join(dirpath, file)
        file_size = os.path.getsize(file_path)
        print("{} : {}".format(file_path, round(file_size, 3)))
    for dirname in dirnames:
        dir_path = os.path.join(dirpath, dirname)
        dir_size = os.path.getsize(dir_path)
        print("{} : {}".format(dir_path, round(dir_size, 3)))

I want to get the extension (i.e., type) of every file in the directory and all its sub-directories. Any help will be appreciated!

This collects the unique extensions found in the directory and its subdirectories. Files without an extension contribute an empty string to the set.

import os
path = os.getcwd()
extensions = set()

for dirpath, dirnames, filenames in os.walk(path):
    for file in filenames:
        file_extension = os.path.splitext(file)[1]
        extensions.add(file_extension)

print("Extensions found:", extensions)
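If you also want to know how often each extension occurs, the same walk can feed a counter instead of a set. This is only a sketch: the helper name count_extensions is hypothetical, and lower-casing the extension (so that .TXT and .txt are counted together) is an assumption you may not want.

```python
import os
from collections import Counter


def count_extensions(root):
    """Return a Counter mapping each file extension under root to its frequency."""
    counts = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext returns (stem, ext); ext includes the leading dot
            # and is '' when the file name has no extension.
            ext = os.path.splitext(name)[1].lower()
            counts[ext] += 1
    return counts
```

Calling count_extensions(os.getcwd()).most_common() then lists the extensions from most to least frequent.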

Here's how you can extract the file extension using regular expressions.

\.([^.]*)$ means "at the end of the string, match a dot followed by any run of characters that are not dots, and capture the non-dot part as the result".

Note that if a file has no extension (such as the Windows hosts file), the search returns None, in which case we substitute an empty string.

import os
import re

prog = re.compile(r'\.([^.]*)$')

path = os.getcwd()

for dirpath, dirnames, filenames in os.walk(path):
    for file in filenames:
        file_path = os.path.join(dirpath, file)
        m = prog.search(file)
        file_ext = m.group(1) if m else ''
        file_size = os.path.getsize(file_path)
        print("{} : {} : {}".format(file_path, file_ext, round(file_size, 3)))
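One thing worth checking before choosing between the regular expression and os.path.splitext is how each treats unusual names. The sketch below wraps both in small helpers (ext_regex and ext_splitext are hypothetical names, and the sample file names are made up) and strips the leading dot from the splitext result so the two are comparable.

```python
import os
import re

prog = re.compile(r'\.([^.]*)$')  # same pattern as in the answer above


def ext_regex(name):
    """Extension without the dot, using the regular expression."""
    m = prog.search(name)
    return m.group(1) if m else ''


def ext_splitext(name):
    """Extension without the dot, using os.path.splitext."""
    # splitext keeps the leading dot, so strip it for comparison.
    return os.path.splitext(name)[1].lstrip('.')


# Illustrative names only; none of these files need to exist.
for name in ["report.txt", "archive.tar.gz", "hosts", ".bashrc"]:
    print("{!r}: regex={!r} splitext={!r}".format(
        name, ext_regex(name), ext_splitext(name)))
```

The two agree on ordinary names, but for a dotfile such as .bashrc the regex reports bashrc as the extension, while os.path.splitext treats the leading dot as part of the name and reports no extension.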


