Why can't Python's glob.glob give me the files I want from the regular expression I pass in?

Question

For example, given these files:

20190108JPYUSDabced.csv
20190107JPYUSDabced.csv
20190106JPYUSDabced.csv

When I search for the first two files from the terminal:

bash: ls /Users/Downloads/201901{08,07}JPYUSDabced.csv
it gives me the first two files (excluding 20190106JPYUSDabced.csv).

When I do the same in Python:

import glob
glob.glob('/Users/Downloads/201901{08,07}JPYUSDabced.csv')
it gives me an empty list, [].

Answer 1

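According to the docs for the glob module, glob uses fnmatch.fnmatch under the hood. Brace expansion such as {08,07} is performed by the shell before ls ever runs; it is not part of the glob or fnmatch pattern syntax, and these patterns are not regular expressions either. The only patterns the fnmatch docs describe are: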

Pattern | Meaning
------- | ---------------------------------
*       | matches everything
?       | matches any single character
[seq]   | matches any character in seq
[!seq]  | matches any character not in seq

For a literal match, wrap the meta-characters in brackets. For example, '[?]' matches the character '?'.
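As a quick check of the rules in the table, here is a minimal sketch that runs fnmatch.filter over the three filenames from the question. The variable names are only illustrative. It suggests that braces are treated as ordinary characters, while * and [seq] behave as wildcards.

```python
import fnmatch

names = [
    "20190108JPYUSDabced.csv",
    "20190107JPYUSDabced.csv",
    "20190106JPYUSDabced.csv",
]

# Braces have no special meaning to fnmatch, so they are matched literally.
brace_hits = fnmatch.filter(names, "201901{08,07}JPYUSDabced.csv")

# '*' and '[seq]' are among the wildcards fnmatch does understand.
star_hits = fnmatch.filter(names, "201901*JPYUSDabced.csv")
seq_hits = fnmatch.filter(names, "2019010[87]JPYUSDabced.csv")

print(brace_hits)  # no matches: none of the names contains a literal '{'
print(star_hits)   # all three names
print(seq_hits)    # only the 08 and 07 files
```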

Try using a sequence of characters in brackets instead:

glob.glob('/Users/Downloads/2019010[87]JPYUSDabced.csv')
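The bracket form only covers alternatives that differ by a single character. If you want something closer to the shell's behaviour, one option is to expand the braces yourself and glob each resulting pattern. The sketch below is an illustration under assumptions, not part of the glob module: expand_braces and brace_glob are hypothetical helpers, nested braces are not a goal, and the demo uses a temporary directory instead of /Users/Downloads.

```python
import glob
import os
import re
import tempfile


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of patterns (hypothetical helper)."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that any later brace groups are expanded as well.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def brace_glob(pattern):
    """Glob every brace-expanded variant of pattern and concatenate the matches."""
    results = []
    for expanded_pattern in expand_braces(pattern):
        results.extend(glob.glob(expanded_pattern))
    return results


# Demo in a throwaway directory so the sketch does not depend on real files.
with tempfile.TemporaryDirectory() as tmp:
    for day in ("08", "07", "06"):
        open(os.path.join(tmp, f"201901{day}JPYUSDabced.csv"), "w").close()
    found = brace_glob(os.path.join(tmp, "201901{08,07}JPYUSDabced.csv"))
    found_names = sorted(os.path.basename(p) for p in found)

print(found_names)  # the 07 and 08 files, but not the 06 file
```

Each expanded pattern is still an ordinary glob pattern, so wildcards such as * continue to work alongside the braces.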

Using os.walk

Assuming you want to search for specific date ranges, you may need os.walk combined with re regexes to express the more complex pattern you're looking for.

Caveat: os.walk recursively goes through every directory below the starting location, which may not be what you want.

You'd have to tailor the regex to your own situation, but here's an example:

The regex matches filenames that start with either date 20181208 or date 20190107, immediately followed by the identifier JPYUSDabced.csv.

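import os
import re

# Escape the dot so it matches a literal '.' rather than any character.
regex = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")

files = []
# os.walk visits the starting directory and every subdirectory below it.
for dirpath, dirnames, filenames in os.walk('/Users/Downloads'):
    for f in filenames:
        # fullmatch requires the whole filename to match, not just a prefix.
        if regex.fullmatch(f):
            files.append(os.path.join(dirpath, f))
print(files)
# e.g. ['/Users/Downloads/20190107JPYUSDabced.csv', '/Users/Downloads/20181208JPYUSDabced.csv'] (order may vary)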
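If the recursion mentioned in the caveat is unwanted, a non-recursive variant can be sketched with os.scandir, which lists only the entries directly inside one directory. This is an illustration under assumptions rather than a definitive implementation: match_in_dir is a hypothetical helper, and the demo builds its own temporary directory with a nested copy of one file to show that subdirectories are skipped.

```python
import os
import re
import tempfile

# Same illustrative dates as in the answer; the dot is escaped to match a literal '.'.
pattern = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")


def match_in_dir(directory, regex):
    """Return sorted paths of files directly inside directory whose whole name matches regex."""
    with os.scandir(directory) as entries:
        return sorted(
            entry.path
            for entry in entries
            if entry.is_file() and regex.fullmatch(entry.name)
        )


# Demo in a throwaway directory so the sketch does not depend on real files.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "nested"))
    demo_names = (
        "20181208JPYUSDabced.csv",
        "20190107JPYUSDabced.csv",
        "20190106JPYUSDabced.csv",
        os.path.join("nested", "20190107JPYUSDabced.csv"),
    )
    for name in demo_names:
        open(os.path.join(tmp, name), "w").close()
    matched = [os.path.basename(p) for p in match_in_dir(tmp, pattern)]

print(matched)  # the two top-level matches; the copy under nested/ is not visited
```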

Answer 1, posted 2019-01-14 19:37:52