I'm using os.walk to traverse the directory "foo". I want to process .dat files, but how do I check the directory name so that only one specific directory is processed?
If the directory is "bar", process its .dat files. Do not process "notbar". I'm probably missing something simple.
C:\data\foo
- notbar
  - 123
    - file1.dat
  - 456
    - file2.dat
    - file3.dat
- bar
  - 123
    - file1.dat
  - 456
    - file2.dat
    - file3.dat
This finds all .dat files:
    import os

    for (root, dirnames, filenames) in os.walk(base_path):
        print('Found directory: {0}'.format(root))
        for filename in filenames:
            if filename.endswith(".dat"):
                print(filename)
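To stay with os.walk, one option is to look at the directory components of each root relative to the base path and skip any root that does not pass through "bar". The following is a minimal sketch, not a definitive implementation: the function name find_dat_files_under and its wanted_dir parameter are hypothetical, and it assumes "bar" may sit at any depth below base_path.

```python
import os


def find_dat_files_under(base_path, wanted_dir="bar"):
    """Return .dat paths whose location below base_path passes through wanted_dir."""
    matches = []
    for root, dirnames, filenames in os.walk(base_path):
        # Split the path relative to base_path into its directory components,
        # so that "notbar" is never mistaken for "bar".
        rel_parts = os.path.relpath(root, base_path).split(os.sep)
        if wanted_dir not in rel_parts:
            continue
        for filename in filenames:
            if filename.endswith(".dat"):
                matches.append(os.path.join(root, filename))
    return sorted(matches)
```

Comparing whole path components rather than testing for a substring is what keeps "notbar" out of the results.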
glob is really good for this. It returns all the files that match a certain pattern.
There is a reference for the patterns, but the most useful are:
* matches everything except path slashes ( \\ for Windows, / for Mac / Linux)
** matches zero or more directories (only when glob is called with recursive=True; otherwise it behaves like *)
In your example, you want to find the .dat ( *.dat ) files in any sub-directory ( * ) of a sub-directory ( bar ) inside a base path base_path. To get these files we can write
from glob import glob
filenames = glob(base_path + "\\bar\\*\\*.dat")
It is better to use os.path.join so the pattern works across platforms:
import os
from glob import glob
filenames = glob(os.path.join(base_path, "bar", "*", "*.dat"))
Check out the results here
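If bar is not necessarily the immediate sub-directory of base_path, but nested further down, you could use ** together with recursive=True (without it, ** only matches a single directory level):
import os
from glob import glob
filenames = glob(os.path.join(base_path, "**", "bar", "*", "*.dat"), recursive=True)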
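The recursive pattern can be wrapped in a small helper to make the behaviour easier to check. This is a sketch under assumptions: the name find_dat_in_nested_bar and the wanted_dir parameter are hypothetical, and the pattern still expects exactly one directory level between "bar" and the .dat files, as in the example layout.

```python
import os
from glob import glob


def find_dat_in_nested_bar(base_path, wanted_dir="bar"):
    """Find .dat files one level below any wanted_dir, at any depth under base_path."""
    pattern = os.path.join(base_path, "**", wanted_dir, "*", "*.dat")
    # recursive=True lets "**" match zero or more directories;
    # without it, "**" behaves like a single "*".
    return sorted(glob(pattern, recursive=True))
```

Because ** can match zero directories, this also finds "bar" when it is an immediate sub-directory of base_path.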
Finally, glob will not necessarily return the files in any particular order. To get them in alphabetical order use sorted(filenames). To get them in modified order use sorted(filenames, key=os.path.getmtime) as per this answer.
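The two orderings can be combined into one helper. This is only an illustrative sketch: dat_files_sorted and its by_mtime flag are hypothetical names, and it assumes the matched files are not modified or deleted while the sort runs.

```python
import os
from glob import glob


def dat_files_sorted(pattern, by_mtime=False):
    """Return glob matches sorted alphabetically, or oldest-first by modification time."""
    filenames = glob(pattern)
    if by_mtime:
        # os.path.getmtime returns the last-modification time in seconds.
        return sorted(filenames, key=os.path.getmtime)
    return sorted(filenames)
```

For example, dat_files_sorted(os.path.join(base_path, "bar", "*", "*.dat"), by_mtime=True) would list the files under "bar" from oldest to newest.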