I want to sum the sizes of files that match a particular extension (and do so for several extensions). Below is partially working code, but I need help applying it to all extensions found in a directory.
import glob
import os

path = '/tmp'
files = glob.glob(path + "/**/*.txt")
total_size = 0
for file in files:
    total_size += os.path.getsize(os.path.join(path, file))
print(len(files), total_size)
So I want to end up with variables holding the total amount of .txt, .mp3, etc. file data. Something like:
Data1[] = { .mp3, 1209879834 bytes);
Data2[] = { .txt, 134213443 bytes);
DataX[] = { .X, X bytes);
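In other words (as Python rather than pseudocode), the desired end result is a mapping from extension to total bytes, with the numbers above as placeholders:

```python
# Desired result: one total per extension (numbers are placeholders from the examples above).
totals = {
    '.mp3': 1209879834,  # total bytes of .mp3 data found
    '.txt': 134213443,   # total bytes of .txt data found
}
print(totals['.mp3'])  # → 1209879834
```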
I have taken the liberty of assuming that your intention is to find the total size of all files matching a certain set of extensions within a directory:
import glob
import os

def summed_sizes(extensions: list, directory: str = '.') -> int:
    """Sum the sizes of all files under `directory` with one of the given extensions."""
    total = 0
    # recursive=True is required for "**" to match nested subdirectories
    grouped_files = [glob.glob(os.path.join(directory, f"**/*.{ext}"), recursive=True)
                     for ext in extensions]
    for ext_group in grouped_files:
        for file in ext_group:
            total += os.path.getsize(file)
    return total

print(summed_sizes(['jpg', 'txt'], '/tmp'))
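If you want a separate total per extension rather than one grand total (closer to the Data1/Data2 layout you sketched), a dict-comprehension variant of the same idea might look like this (a sketch; `sizes_per_extension` is an illustrative name, not an existing function):

```python
import glob
import os

def sizes_per_extension(extensions, directory='.'):
    """Return {extension: total bytes} for matching files under `directory` (recursive)."""
    return {
        ext: sum(os.path.getsize(f)
                 for f in glob.glob(os.path.join(directory, f"**/*.{ext}"),
                                    recursive=True))
        for ext in extensions
    }

print(sizes_per_extension(['jpg', 'txt'], '/tmp'))
```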
You can search for all names in the subdirectory and filter the extensions yourself. glob does something similar internally by comparing all names with fnmatch. Notice that glob returns the full path, so you don't need to add it again. You can use list comprehensions to build the lists.
import glob
import os

path = '/tmp'
extensions = {'.txt', '.foo', '.bar'}
# recursive=True is required for "**" to descend into subdirectories
files = [fn for fn in glob.glob(path + "/**/*", recursive=True)
         if os.path.splitext(fn)[1] in extensions]
total_size = sum(os.path.getsize(fn) for fn in files)
print(len(files), total_size)
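For comparison, the same walk-and-filter approach also reads naturally with pathlib (a sketch, assuming Python 3; `Path.rglob('*')` recurses like a `**` glob, and `p.suffix` is the extension including the dot):

```python
from pathlib import Path

path = Path('/tmp')
extensions = {'.txt', '.foo', '.bar'}
# rglob('*') yields every entry recursively; filter to plain files with a wanted suffix.
files = [p for p in path.rglob('*') if p.is_file() and p.suffix in extensions]
total_size = sum(p.stat().st_size for p in files)
print(len(files), total_size)
```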