
Why does Python's glob.glob not give me the files I want with the regex I passed in?

For example, given these files:

20190108JPYUSDabced.csv
20190107JPYUSDabced.csv
20190106JPYUSDabced.csv

When I search for the first two files from the terminal:

bash: ls /Users/Downloads/201901{08,07}JPYUSDabced.csv
it gives me the first two files (excluding 20190106JPYUSDabced.csv).

When I do the same in Python:

import glob
glob.glob('/Users/Downloads/201901{08,07}JPYUSDabced.csv')
it gives me []

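According to the docs for the glob module, glob uses fnmatch.fnmatch under the hood. These are shell-style wildcards, not regular expressions, and the brace expansion in {08,07} is performed by bash itself before ls runs, so glob treats the braces as literal characters. The only patterns the fnmatch docs describe are: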

Pattern | Meaning
--------- | -----------------------------
* | matches everything
? | matches any single character
[seq] | matches any character in seq
[!seq] | matches any character not in seq

For a literal match, wrap the meta-characters in brackets. For example, '[?]' matches the character '?'.
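The rules in the table can be checked directly with fnmatch, without touching the file system. This is a small sketch using one of the file names from the question; the variable names are only illustrative.

```python
import fnmatch

name = "20190108JPYUSDabced.csv"

# Braces are not special to fnmatch, so they only match a literal '{' and '}'.
brace_match = fnmatch.fnmatch(name, "201901{08,07}JPYUSDabced.csv")

# A bracketed sequence matches exactly one character from the set.
seq_match = fnmatch.fnmatch(name, "2019010[87]JPYUSDabced.csv")

# Wrapping a meta-character in brackets makes it literal.
literal_match = fnmatch.fnmatch("what?.csv", "what[?].csv")

print(brace_match, seq_match, literal_match)
```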

Try using a sequence of characters in brackets instead:

glob.glob('/Users/Downloads/2019010[87]JPYUSDabced.csv')
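If the alternatives differ by more than one character, a bracket sequence is not enough. One possible workaround, sketched below, is to expand the braces in Python first and then glob each resulting pattern. The helpers expand_braces and glob_with_braces are hypothetical names, not part of the glob module, and this simplified version only handles flat, comma-separated groups (no nesting and no bash ranges like {1..3}).

```python
import glob
import re


def expand_braces(pattern):
    """Expand each flat {a,b,...} group into separate patterns (hypothetical helper)."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if m is None:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    expanded = []
    for option in m.group(1).split(","):
        # Recurse so that any later groups in the pattern are expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def glob_with_braces(pattern):
    """Run glob.glob on every brace-expanded pattern and collect the results."""
    files = []
    for p in expand_braces(pattern):
        files.extend(glob.glob(p))
    return files
```

With the files from the question, glob_with_braces('/Users/Downloads/201901{08,07}JPYUSDabced.csv') would be expected to return the two matching paths that exist.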

Using os.walk

Assuming you want to search for specific date ranges, you may need to combine os.walk with re regexes to express the more complex pattern you're looking for.

Caveat: os.walk recursively visits every directory below the starting location, which may not be what you want.

You'd have to tailor the regex to your situation, but here's an example:

The regex matches a file name that starts with either date 20181208 or date 20190107, followed by the identifier JPYUSDabced.csv.

import os
import re

# Raw string with the dot escaped so it only matches a literal "."
regex = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")

files = []
for dirpath, dirnames, filenames in os.walk('/Users/Downloads'):
    for f in filenames:
        if regex.match(f):
            files.append(os.path.join(dirpath, f))
print(files)
# ['/Users/Downloads/20190107JPYUSDabced.csv', '/Users/Downloads/20181208JPYUSDabced.csv']
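If the recursion mentioned in the caveat is not wanted, the same regex idea can be applied to a single directory. The following is a minimal sketch using os.scandir; find_in_dir is a hypothetical helper name, and it uses fullmatch so that the whole file name has to match rather than just its beginning.

```python
import os
import re


def find_in_dir(directory, regex):
    """Return sorted paths of files directly inside `directory` whose names fully match `regex`."""
    matches = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Only consider regular files at the top level; subdirectories are not entered.
            if entry.is_file() and regex.fullmatch(entry.name):
                matches.append(entry.path)
    return sorted(matches)


pattern = re.compile(r"(?:20181208|20190107)JPYUSDabced\.csv")
```

Calling find_in_dir('/Users/Downloads', pattern) would then list only the matching files in that one directory.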

