
Python - adding file names (not full paths) to list from directory and subfolders

This is for Python 2.

I have a chunk of code that creates an object (dtry) containing three identical lists. Each list holds all of the files (excluding folders) within a folder. This works, but I want to extend it to also cover subfolders.

My working code is as follows:

import os

# Raw string so the backslashes in the Windows path are never read as escapes
fldr = r"C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing"
dtry[:] = []  # clear the existing list in place

for i in range(3):
    dtry.append([tup for tup in os.listdir(fldr)
                 if os.path.isfile(os.path.join(fldr, tup))])

This successfully creates three lists containing the names (not the full paths) of the files, and only the files, not the folders, inside fldr.

I want this to also search within the subfolders of fldr.

Unfortunately I can't figure out how to get it to do so.

I have cobbled together another piece of code that does list all of the files in the subfolders as well (and so kind of works), but it lists the full paths not just the file names. This is as follows:


import os

# Raw string so the backslashes in the Windows path are never read as escapes
fldr = r"C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing"
dtry[:] = []  # clear the existing list in place

for i in range(3):
    dtry.append([os.path.join(root, name)
                 for root, dirs, files in os.walk(fldr)
                 for name in files
                 if os.path.isfile(os.path.join(root, name))])

I have tried changing the line:

dtry.append([os.path.join(root, name)

to

tup for tup in os.listdir(fldr)

but this is not working for me.

Can anyone tell me what I am missing here?

Again, I am trying to get dtry to be three lists, each list being all of the files within fldr and the files within all of its subfolders.

Here's the simplest way I can think of to get all of the filenames without any subpaths, using just os.listdir():

import os
from pprint import pprint

def getAllFiles(dir, result=None):
    # Create the accumulator on the first (non-recursive) call
    if result is None:
        result = []
    for entry in os.listdir(dir):
        entrypath = os.path.join(dir, entry)
        if os.path.isdir(entrypath):
            # Recurse into the subfolder, sharing the same result list
            getAllFiles(entrypath, result)
        else:
            result.append(entry)
    return result

def main():
    result = getAllFiles("/tmp/foo")
    pprint(result)

main()

This uses the recursion idea I mentioned in my comment.

With test directory structure:

/tmp/foo
├── D
│   ├── G
│   │   ├── h
│   │   └── i
│   ├── e
│   └── f
├── a
├── b
└── c

I get:

['a', 'c', 'i', 'h', 'f', 'e', 'b']

If I change this line:

result.append(entry)

to:

result.append(entrypath)

then I get:

['/tmp/foo/a',
 '/tmp/foo/c',
 '/tmp/foo/D/G/i',
 '/tmp/foo/D/G/h',
 '/tmp/foo/D/f',
 '/tmp/foo/D/e',
 '/tmp/foo/b']

To get the exact result you wanted, you can do

dtry = [getAllFiles("/tmp/foo")]
dtry.append(list(dtry[0]))
dtry.append(list(dtry[0]))
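
The list(...) calls matter if any of the three lists is modified later. As a minimal sketch of the difference, using a hypothetical stand-in list of names rather than a real directory scan:

```python
# Hypothetical stand-in for the list returned by getAllFiles("/tmp/foo").
names = ['a', 'b', 'c']

# Three independent copies: list(...) builds a new list each time.
dtry = [list(names) for _ in range(3)]

# Three references to the very same list, for comparison.
shared = [names] * 3

dtry[0].append('x')    # only the first copy changes
shared[0].append('y')  # every entry changes, because all three are `names`
```

If the lists are only ever read, sharing one list is harmless; the copies only matter once something appends to or removes from one of them.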

And if you want to use os.walk, which is more compact, here are the two flavors of that:

def getAllFiles2(dir):
    result = []
    for root, dirs, files in os.walk(dir):
        result.extend(files)
    return result

def getAllFilePaths2(dir):
    result = []
    for root, dirs, files in os.walk(dir):
        result.extend([os.path.join(root, f) for f in files])
    return result

These produce the same results (order aside) as the recursive versions.
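
Applied to the comprehension in the question, the same idea amounts to appending name instead of os.path.join(root, name), since os.walk already separates subfolder names from the rest. The following is only a sketch: the helper name file_names and the temporary directory are assumptions made for illustration, and the throwaway tree mirrors the /tmp/foo example above.

```python
import os
import shutil
import tempfile

def file_names(top):
    """Return the names (not paths) of all files under top, recursively."""
    return [name
            for root, dirs, files in os.walk(top)
            for name in files]

# Build a throwaway tree with the same layout as the /tmp/foo example.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'D', 'G'))
for rel in ['a', 'b', 'c', 'D/e', 'D/f', 'D/G/h', 'D/G/i']:
    open(os.path.join(top, *rel.split('/')), 'w').close()

# Three independent lists, as asked for in the question.
dtry = [file_names(top) for _ in range(3)]
print(sorted(dtry[0]))  # ['a', 'b', 'c', 'e', 'f', 'h', 'i']

shutil.rmtree(top)  # remove the temporary tree
```

The output is sorted here only to make it predictable; os.walk, like os.listdir, does not promise any particular order.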

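You're making an easy problem harder than it needs to be. On Python 3.5 or later (the recursive option does not exist in Python 2), glob can do the walk for you. Note that it returns full paths and includes the subfolders themselves, not just file names: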

from glob import glob

files = glob(r'C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing\**\*', recursive=True)
result = [files for _ in range(3)]

Note that this produces a list with three references to the same list. If you need three independent copies:

from glob import glob

files = glob(r'C:\Users\jonsnow\OneDrive\Documents\my_python\Testing\Testing\**\*', recursive=True)
result = [files.copy() for _ in range(3)]
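
Because a recursive glob yields full paths and also matches directories, getting bare file names takes one more step. A possible sketch, assuming Python 3.5 or later; the helper name file_names_glob and the temporary tree are hypothetical, introduced only for illustration:

```python
import glob
import os
import shutil
import tempfile

def file_names_glob(top):
    """Return base names of regular files under top, via a recursive glob."""
    # glob.escape guards against special characters in the folder name itself.
    pattern = os.path.join(glob.escape(top), '**', '*')
    return [os.path.basename(p)
            for p in glob.glob(pattern, recursive=True)
            if os.path.isfile(p)]  # drop the directories that ** also matches

# Small throwaway tree to exercise the helper.
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'D', 'G'))
for rel in ['a', 'D/e', 'D/G/h']:
    open(os.path.join(top, *rel.split('/')), 'w').close()

names = file_names_glob(top)
result = [list(names) for _ in range(3)]  # three independent copies

shutil.rmtree(top)  # remove the temporary tree
```

One caveat with this approach: the * pattern skips names that begin with a dot, so hidden files are left out, whereas os.walk and os.listdir include them.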
