
Find summed size of files with specific extensions within a directory?

I want to sum the sizes of files that match a particular extension, and do so for several extensions. Below is partially working code for a single extension, but I need help applying it to every extension found in a directory.

import glob
import os

path = '/tmp'
files = glob.glob(path + "/**/*.txt")
total_size = 0
for file in files:
    total_size += os.path.getsize(os.path.join(path, file))
print(len(files), total_size)

So, I want to end up with variables holding the total amount of .txt or .mp3 file data. Something like:

Data1[] = { .mp3, 1209879834 bytes };
Data2[] = { .txt, 134213443 bytes };
DataX[] = { .X, X bytes };

I have taken the liberty of assuming you want the combined size of all files matching a given set of extensions within a directory (my pending edit to your question will reflect that if approved). The function below returns a single total across all the listed extensions:

import glob
import os


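def summed_sizes(extensions: list, directory: str = '.') -> int:
    """Return the combined size in bytes of files under `directory` with the given extensions."""
    total = 0

    # recursive=True is needed for "**" to match the directory itself and all nested subdirectories
    grouped_files = [glob.glob(os.path.join(directory, f"**/*.{ext}"), recursive=True) for ext in extensions]

    for ext_group in grouped_files:
        for file in ext_group:
            total += os.path.getsize(file)

    return total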


print(summed_sizes(['jpg', 'txt'], '/tmp'))
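One detail that is easy to miss with `**` patterns: in Python 3, glob.glob only treats `**` as "any depth" when recursive=True is passed; otherwise it behaves like a plain `*` and matches exactly one directory level. The sketch below shows the difference on a throwaway tree. The helper name find_by_extension and the file names are made up for illustration.

```python
import glob
import os
import tempfile


def find_by_extension(directory: str, ext: str, recursive: bool) -> list:
    """Return sorted paths matching <directory>/**/*.<ext> (hypothetical helper)."""
    pattern = os.path.join(directory, "**", f"*.{ext}")
    return sorted(glob.glob(pattern, recursive=recursive))


# Build a small temporary tree: one file at the top level, one nested two levels down.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for rel in ("top.txt", os.path.join("a", "b", "deep.txt")):
        with open(os.path.join(root, rel), "w") as fh:
            fh.write("x")

    # Without recursive=True, "**" only matches a single directory level.
    shallow = find_by_extension(root, "txt", recursive=False)
    # With recursive=True, "**" matches zero or more nested directories.
    deep = find_by_extension(root, "txt", recursive=True)
    print(len(shallow), len(deep))
```

On this tree the non-recursive call misses both files, because neither sits exactly one directory below the root.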

You can search for all names under the directory and filter by extension yourself. glob does something similar internally, comparing every name against the pattern with fnmatch. Note that glob already returns each path with the directory prefix included, so you don't need to join it with the path again. A list comprehension is enough to build the filtered list.

import glob
import os

path = '/tmp'
extensions = {'.txt', '.foo', '.bar'}

# recursive=True lets "**" descend into every subdirectory (Python 3.5+)
files = [fn for fn in glob.glob(path + "/**/*", recursive=True)
         if os.path.splitext(fn)[1] in extensions]
total_size = sum(os.path.getsize(fn) for fn in files)
print(len(files), total_size)
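
The question also asks for a separate total per extension, for every extension found in the directory. A minimal sketch of that, assuming os.walk is an acceptable substitute for glob and that extensions should be compared case-insensitively (both are my assumptions, not something the question states). The function name sizes_by_extension and the demo file names are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def sizes_by_extension(directory: str) -> dict:
    """Map each extension found under `directory` to its total size in bytes."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(directory):
        for name in filenames:
            # splitext returns '' for names without an extension;
            # lowercasing (an assumption) groups '.TXT' with '.txt'
            ext = os.path.splitext(name)[1].lower()
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):  # skip broken symlinks and special files
                totals[ext] += os.path.getsize(full)
    return dict(totals)


# Demo on a temporary tree so the example does not depend on what is in /tmp.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    demo_files = {"a.txt": b"abc", os.path.join("sub", "b.TXT"): b"hello", "c.mp3": b"xy"}
    for rel, payload in demo_files.items():
        with open(os.path.join(root, rel), "wb") as fh:
            fh.write(payload)

    result = sizes_by_extension(root)
    for ext, size in sorted(result.items()):
        print(ext or "(no extension)", size, "bytes")
```

The result is a plain dict, so result['.mp3'] gives the total for one extension, which is roughly the Data1/Data2 shape described in the question.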
