Can anybody help me write a function that creates a list of all files under a given directory using the pathlib library?
I have the following files:
c:\desktop\test\A\A.txt
c:\desktop\test\B\B_1\B.txt
c:\desktop\test\123.txt
I expected to have a single list which would have the paths above, but my code returns a nested list.
Here is my code:
from pathlib import Path

def searching_all_files(directory: Path):
    file_list = []  # A list for storing files existing in directories
    for x in directory.iterdir():
        if x.is_file():
            file_list.append(x)
        else:
            file_list.append(searching_all_files(directory/x))
    return file_list

p = Path('C:\\Users\\akrio\\Desktop\\Test')
print(searching_all_files(p))
I hope somebody can correct me.
Use Path.glob() to list all files and directories, then filter the result with a list comprehension.
from pathlib import Path
p = Path(r'C:\Users\akrio\Desktop\Test').glob('**/*')
files = [x for x in p if x.is_file()]
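As a quick check of the glob approach, the sketch below rebuilds the question's layout in a temporary directory and confirms that the result is a single flat list. The helper name `build_sample_tree` is hypothetical and exists only for this illustration; the paths are printed relative to the temporary root so the output does not depend on the machine.

```python
import tempfile
from pathlib import Path


def build_sample_tree(root: Path) -> None:
    """Recreate the question's layout under root (hypothetical helper)."""
    (root / "A").mkdir()
    (root / "B" / "B_1").mkdir(parents=True)
    (root / "A" / "A.txt").write_text("a")
    (root / "B" / "B_1" / "B.txt").write_text("b")
    (root / "123.txt").write_text("c")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_sample_tree(root)
    # '**/*' matches every entry at any depth; is_file() drops the directories.
    files = sorted(p for p in root.glob("**/*") if p.is_file())
    relative = [p.relative_to(root).as_posix() for p in files]

print(relative)
```

Every element of `relative` is a plain path string, not a list, which is the flat structure the question asks for.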
With the pathlib module:
from pathlib import Path
from pprint import pprint

def searching_all_files(directory):
    dirpath = Path(directory)
    assert dirpath.is_dir()
    file_list = []
    for x in dirpath.iterdir():
        if x.is_file():
            file_list.append(x)
        elif x.is_dir():
            file_list.extend(searching_all_files(x))
    return file_list

pprint(searching_all_files('.'))
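With pathlib, listing the immediate contents of a directory is as simple as the command below. Note that iterdir() is not recursive and returns subdirectories as well as files.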
path = Path('C:\\Users\\akrio\\Desktop\\Test')
list(path.iterdir())
If you can assume that only file objects have a . in the name (i.e., .txt, .png, etc.), you can do a glob or recursive glob search:
from pathlib import Path
# Search the directory
list(Path('testDir').glob('*.*'))
# Search directories and subdirectories, recursively
list(Path('testDir').rglob('*.*'))
But that's not always the case. Sometimes there are hidden directories like .ipynb_checkpoints and files that do not have extensions. In that case, use a list comprehension or a filter to keep only the Path objects that are files.
# Search Single Directory
list(filter(lambda x: x.is_file(), Path('testDir').iterdir()))
# Search Directories Recursively
list(filter(lambda x: x.is_file(), Path('testDir').rglob('*')))
# Search Single Directory
[x for x in Path('testDir').iterdir() if x.is_file()]
# Search Directories Recursively
[x for x in Path('testDir').rglob('*') if x.is_file()]
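To make the difference concrete, here is a small sketch that compares the `'*.*'` pattern with the `is_file()` filter. The directory contents (a hidden directory, an extensionless file named `README`, and `notes.txt`) are made up for this illustration and are created in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".ipynb_checkpoints").mkdir()          # directory with a dot in its name
    (root / "README").write_text("no extension")   # file without a dot
    (root / "notes.txt").write_text("text")

    # Pattern-based: matches any name containing a dot, file or not.
    by_pattern = sorted(p.name for p in root.glob("*.*"))
    # Type-based: keeps only entries that really are files.
    by_is_file = sorted(p.name for p in root.iterdir() if p.is_file())

print(by_pattern)
print(by_is_file)
```

The pattern picks up the hidden directory and misses `README`, while the `is_file()` filter returns exactly the two files.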
If your files have the same suffix, like .txt, you can use rglob to list the main directory and all subdirectories, recursively.
paths = list(Path(INPUT_PATH).rglob('*.txt'))
You can also apply any Path attribute or method to each result, for example the name property:
[k.name for k in Path(INPUT_PATH).rglob('*.txt')]
Where INPUT_PATH is the path to your main directory, and Path is imported from pathlib.
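Other Path attributes work the same way as name. The sketch below, which assumes a made-up layout in a temporary directory standing in for INPUT_PATH, collects the stem and the parent folder of every .txt file found by rglob.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    input_path = Path(tmp)  # stands in for INPUT_PATH
    (input_path / "sub").mkdir()
    (input_path / "top.txt").write_text("x")
    (input_path / "sub" / "deep.txt").write_text("y")
    (input_path / "sub" / "image.png").write_text("z")  # ignored by '*.txt'

    txt_files = sorted(input_path.rglob("*.txt"))
    # File names without the suffix.
    stems = [p.stem for p in txt_files]
    # Containing folder of each file, relative to the search root.
    parents = [p.parent.relative_to(input_path).as_posix() for p in txt_files]

print(stems)
print(parents)
```

A parent of `.` means the file sits directly in the search root.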
A similar, more functional solution to @prasastoadi's can be achieved with Python's built-in filter function:
from pathlib import Path
my_path = Path(r'C:\Users\akrio\Desktop\Test')
list(filter(Path.is_file, my_path.glob('**/*')))
With pathlib2 (a backport of pathlib for older Python versions), iterating over a directory looks the same:
from pathlib2 import Path
path = Path("/test/test/")
for x in path.iterdir():
    print(x)
from pathlib import Path

def searching_all_files(directory: Path):
    file_list = []  # A list for storing files existing in directories
    for x in directory.iterdir():
        if x.is_file():
            file_list.append(x)  # a single file is appended
        else:
            # extend, not append, so the sub-list is flattened into file_list;
            # x already includes the parent directory, so it is passed as-is
            file_list.extend(searching_all_files(x))
    return file_list
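The difference between the two list methods is what produces the nested result in the question. A minimal illustration, with plain strings standing in for paths and `sub_result` standing in for the list a recursive call would return:

```python
flat = ["123.txt"]
nested = ["123.txt"]
sub_result = ["A/A.txt"]  # stands in for the list returned by a recursive call

nested.append(sub_result)  # adds the whole list as ONE element
flat.extend(sub_result)    # adds each element of the list individually

print(nested)  # ['123.txt', ['A/A.txt']]
print(flat)    # ['123.txt', 'A/A.txt']
```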
import pathlib

def get_all_files(dir_path_to_search):
    filename_list = []
    file_iterator = dir_path_to_search.iterdir()
    for entry in file_iterator:
        if entry.is_file():
            # print(entry.name)
            filename_list.append(entry.name)
    return filename_list
The function can be tested as follows:
dir_path_to_search = pathlib.Path("C:\\Users\\akrio\\Desktop\\Test")
print(get_all_files(dir_path_to_search))
from pathlib import Path
data_path = Path.home() / 'Desktop/My-Folder/'
paths = sorted(data_path.iterdir())
files = sorted(f for f in Path(data_path).iterdir() if f.is_file())
png_files = sorted(data_path.glob('*.png'))
You can use a generator expression like this one with inline filtering:
for file in (entry for entry in directory.iterdir() if entry.is_file()):
    ...
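The same lazy idea can be extended to subdirectories with a generator function. This is only a sketch: the name `iter_files` is hypothetical, and the sample files are created in a temporary directory for the demonstration.

```python
import tempfile
from pathlib import Path
from typing import Iterator


def iter_files(directory: Path) -> Iterator[Path]:
    """Lazily yield every file below directory (hypothetical helper)."""
    for entry in directory.iterdir():
        if entry.is_file():
            yield entry
        elif entry.is_dir():
            # Delegate to the recursive call; results stay flat.
            yield from iter_files(entry)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "inner").mkdir()
    (root / "a.txt").write_text("a")
    (root / "inner" / "b.txt").write_text("b")
    names = sorted(p.name for p in iter_files(root))

print(names)
```

Because `yield from` passes each path through one at a time, no intermediate lists are built and nothing can end up nested.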
You can use this:
folder: Path = Path('/path/to/the/folder/')
files: list = [file.name for file in folder.iterdir()]
You can use os.listdir(). It will get you everything that's in a directory - files and directories.
If you want just files, you could either filter this down using os.path:
from os import listdir
from os.path import isfile, join
onlyfiles = [files for files in listdir(mypath) if isfile(join(mypath, files))]
or you could use os.walk(), which yields a (dirpath, dirnames, filenames) tuple for each directory it visits, splitting the entries into files and directories for you. If you only want the top directory, you can break the first time it yields:
from os import walk
files = []
for (dirpath, dirnames, filenames) in walk(mypath):
    files.extend(filenames)
    break
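Without the break, os.walk() keeps descending into subdirectories, so it can also build the recursive listing the question asks for. A minimal sketch, assuming a made-up layout in a temporary directory; joining dirpath with each file name gives a full path, which is shown here relative to the starting directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as mypath:
    os.makedirs(os.path.join(mypath, "B", "B_1"))
    for rel in ("123.txt", os.path.join("B", "B_1", "B.txt")):
        with open(os.path.join(mypath, rel), "w") as fh:
            fh.write("x")

    all_files = []
    for dirpath, dirnames, filenames in os.walk(mypath):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            all_files.append(os.path.relpath(full_path, mypath))

all_files.sort()
print(all_files)
```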