
Why doesn't Python's glob.glob return the files I want for the pattern I pass in?

For example:

20190108JPYUSDabced.csv
20190107JPYUSDabced.csv
20190106JPYUSDabced.csv

When I list the first two files from the terminal:

bash: ls /Users/Downloads/201901{08,07}JPYUSDabced.csv
it returns the first two files (excluding 20190106JPYUSDabced.csv).

When I do the same in Python:

import glob
glob.glob('/Users/Downloads/201901{08,07}JPYUSDabced.csv')
it returns an empty list: []

According to the docs for the glob module, glob uses fnmatch.fnmatch under the hood. The only patterns the fnmatch docs describe are:

Pattern | Meaning
--------- | -----------------------------
* | matches everything
? | matches any single character
[seq] | matches any character in seq
[!seq] | matches any character not in seq

For a literal match, wrap the meta-characters in brackets. For example, '[?]' matches the character '?'.
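To see how these patterns behave without touching the file system, here is a minimal sketch that calls fnmatch.fnmatch directly. The file names are only illustrative; "what?.csv" is a made-up name used to show the literal-match rule.

```python
import fnmatch

name = "20190108JPYUSDabced.csv"

# Braces are ordinary characters to fnmatch, so this pattern would only
# match a name that literally contains the text "{08,07}".
brace_hit = fnmatch.fnmatch(name, "201901{08,07}JPYUSDabced.csv")

# A bracket expression is supported: [78] matches exactly one of '7' or '8'.
bracket_hit = fnmatch.fnmatch(name, "2019010[78]JPYUSDabced.csv")

# Wrapping a meta-character in brackets makes it literal.
literal_hit = fnmatch.fnmatch("what?.csv", "what[?].csv")
literal_miss = fnmatch.fnmatch("whatX.csv", "what[?].csv")

print(brace_hit, bracket_hit, literal_hit, literal_miss)
# False True True False
```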

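Braces are not among these patterns, so glob treats {08,07} as literal text. In the terminal it is bash itself that expands the braces before ls runs. Use a character set in brackets instead: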

glob.glob('/Users/Downloads/2019010[87]JPYUSDabced.csv')
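A bracket set only covers single characters. If you want to keep the bash-style {a,b} syntax, one option is to expand the braces yourself and glob each resulting pattern. The sketch below is only an illustration: expand_braces and glob_with_braces are hypothetical helper names, not part of the glob module, and the expansion handles flat comma lists only (no nested braces, no ranges like {1..5}).

```python
import glob
import re


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of plain patterns.

    Hypothetical helper: flat comma lists only, no nesting or ranges.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that later brace groups in the tail are expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def glob_with_braces(pattern):
    """Run glob.glob on every brace-expanded pattern and merge the hits."""
    hits = []
    for expanded in expand_braces(pattern):
        hits.extend(glob.glob(expanded))
    return hits


print(expand_braces("/Users/Downloads/201901{08,07}JPYUSDabced.csv"))
# ['/Users/Downloads/20190108JPYUSDabced.csv', '/Users/Downloads/20190107JPYUSDabced.csv']
```

Calling glob_with_braces('/Users/Downloads/201901{08,07}JPYUSDabced.csv') would then return whichever of those two files actually exist.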

Using os.walk

If you need to match specific date ranges, you may need os.walk combined with the re module to express the more complex pattern you're looking for.

Caveat: os.walk recursively visits every directory below the starting location, which may not be what you want.

You'd have to tailor the regex to whatever your situation is, but here's an example:

The regex matches file names that start with either date 20181208 or date 20190107, followed by the identifier JPYUSDabced.csv.

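import os
import re

# Match either date, followed by the literal identifier; the dot is escaped
# so it cannot match an arbitrary character.
regex = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")

files = []
# os.walk yields (dirpath, dirnames, filenames) for every directory in the tree.
for dirpath, dirnames, filenames in os.walk('/Users/Downloads'):
    for f in filenames:
        # fullmatch requires the whole file name to match, not just a prefix.
        if regex.fullmatch(f):
            files.append(os.path.join(dirpath, f))
print(files)
# ['/Users/Downloads/20190107JPYUSDabced.csv', '/Users/Downloads/20181208JPYUSDabced.csv']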
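If the recursion mentioned in the caveat is unwanted, a non-recursive variant can use os.scandir, which lists only the entries directly inside one directory. This is a sketch rather than a drop-in replacement: find_in_dir is a hypothetical helper name, and the regex reuses the two dates from the example above.

```python
import os
import re


def find_in_dir(directory, regex):
    """Return sorted paths of regular files directly inside `directory`
    whose names fully match `regex`; subdirectories are not searched."""
    with os.scandir(directory) as entries:
        return sorted(
            entry.path
            for entry in entries
            if entry.is_file() and regex.fullmatch(entry.name)
        )


# The dot is escaped so that it matches a literal '.' only.
regex = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")

# Guarded so the sketch also runs on machines without this directory.
if os.path.isdir("/Users/Downloads"):
    print(find_in_dir("/Users/Downloads", regex))
```

The result is sorted because os.scandir returns entries in arbitrary order.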
