glob() to exclude sub-directories

I'm working on a script that goes through a bunch of log files looking for strings and server names.

In my testing I was using glob() to build the list of files to trawl through.

However, to improve my testing I copied a log directory from a live system (11 GB!), and things aren't as smooth as they were before. It looks like glob() returns the sub-directories alongside the files, so the code that opens each entry and calls readlines() fails on them.

I don't care about files in the sub-directories; I just want to scan the files that sit directly in the top-level directory.

I think I can use os.walk() to achieve this, with something like:

logs = next(os.walk('/var/opt/server/log/current'))[2]

As opposed to:

logs = glob('/var/opt/server/log/current/*')

Because I'm learning Python, I want to make sure I learn things the correct way. Am I correct in what I'm saying above, or should I use glob() in a slightly different way to achieve this goal?

Use glob and filter out all directories:

logs = [log for log in glob('/var/opt/server/log/current/*') if not os.path.isdir(log)]
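For comparison, here is a minimal sketch of the two approaches from the thread side by side, plus an os.scandir() variant that is not mentioned above and is only offered as an alternative. The function names and the throwaway demo directory are hypothetical, made up for illustration. One difference worth knowing: glob() returns full paths and skips dot-files with a plain '*' pattern, while next(os.walk(...))[2] returns bare file names and includes dot-files.

```python
import os
import tempfile
from glob import glob


def files_via_glob(directory):
    """Full paths of regular files directly inside `directory` (glob + filter)."""
    pattern = os.path.join(directory, '*')
    # os.path.isfile() drops sub-directories (and anything else that is not a file).
    return sorted(path for path in glob(pattern) if os.path.isfile(path))


def files_via_walk(directory):
    """Bare file names from the first step of os.walk() only."""
    # next() consumes just the top-level (dirpath, dirnames, filenames) tuple,
    # so sub-directories are listed in dirnames but never descended into.
    _, _, filenames = next(os.walk(directory))
    return sorted(filenames)


def files_via_scandir(directory):
    """Full paths of regular files, using os.scandir() (an alternative, not from the thread)."""
    with os.scandir(directory) as entries:
        return sorted(entry.path for entry in entries if entry.is_file())


# Demonstration on a temporary directory standing in for the real log directory.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, 'app.log'), 'w').close()
    os.mkdir(os.path.join(demo_dir, 'archive'))
    open(os.path.join(demo_dir, 'archive', 'old.log'), 'w').close()

    print(files_via_glob(demo_dir))     # full path to app.log only
    print(files_via_walk(demo_dir))     # ['app.log']
    print(files_via_scandir(demo_dir))  # full path to app.log only
```

If the later code opens each entry by path, the glob or scandir version saves an os.path.join() step; with the os.walk() version the directory has to be joined back onto each name before opening it.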
