
Discrepancies with Python os.walk

I've written a script to crawl directories on my system and record file metadata. I've used os.walk to do this.

It has worked for the most part, but when run on different machines it returns a different list of files.

Right now I'm testing on my Dropbox folder. On my MacBook Pro (Lion) it crawls the folder and returns the correct number of files. On my iMac (Mountain Lion) it does not, normally skipping 1 to 3 files per run. Additional crawls will pick up a straggler, but usually it continues to ignore a few files in the directory.

Here's a short snippet of the code:

import os

directory = '/Users/user/Dropbox/'
for dirname, dirnames, filenames in os.walk(directory):
    for subdirname in dirnames:
        for filename in filenames:
            if os.path.isfile(filename):
                # collect file info using os.path and os.stat
                pass

I obviously want to ignore directories. Is there a better way to do this? Preferably something that is OS-agnostic.

The trick is what @MartijnPieters suggested: it is unnecessary to loop over the sub-directories as well, because os.walk visits each of them in a later iteration of the loop. This was the cause of the discrepancies between my two machines.
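To make the effect of the extra loop concrete, here is a minimal sketch that counts how often each file is reached with and without the loop over dirnames. The helper names (make_tree, visits_nested, visits_flat) and the tiny directory layout are assumptions made up for illustration; they are not from the original script.

```python
import os
import tempfile


def make_tree(root):
    """Create a hypothetical layout: root/a.txt, root/sub1/b.txt, root/sub2/ (empty)."""
    os.makedirs(os.path.join(root, "sub1"))
    os.makedirs(os.path.join(root, "sub2"))
    for rel in ("a.txt", os.path.join("sub1", "b.txt")):
        with open(os.path.join(root, rel), "w") as handle:
            handle.write("x")


def visits_nested(root):
    """Count visits per file when the file loop sits inside a loop over dirnames."""
    visits = {}
    for dirname, dirnames, filenames in os.walk(root):
        for subdirname in dirnames:
            for filename in filenames:
                rel = os.path.relpath(os.path.join(dirname, filename), root)
                visits[rel] = visits.get(rel, 0) + 1
    return visits


def visits_flat(root):
    """Count visits per file when only filenames is iterated."""
    visits = {}
    for dirname, dirnames, filenames in os.walk(root):
        for filename in filenames:
            rel = os.path.relpath(os.path.join(dirname, filename), root)
            visits[rel] = visits.get(rel, 0) + 1
    return visits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_tree(root)
        print("nested:", visits_nested(root))
        print("flat:  ", visits_flat(root))
```

In this layout the nested version reaches a.txt twice (once per subdirectory of the root) and never reaches sub1/b.txt, because sub1 has no subdirectories of its own. The flat version reaches every file exactly once.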

It is also worth noting that OS X has a very odd way of counting the files in a given directory. You can see this by running df on a given directory, then doing 'Get Info' on it and comparing the results.

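import os

directory = '/Users/user/Dropbox/'
for dirname, dirnames, filenames in os.walk(directory):
    for filename in filenames:
        # os.walk yields bare names, so join them with dirname before testing
        path = os.path.join(dirname, filename)
        if os.path.isfile(path):
            # collect file info using os.path and os.stat
            info = os.stat(path)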
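Building on the loop above, the following is a rough sketch of how the metadata collection might be wrapped in a function. The name collect_file_info and the choice of fields (path, size, mtime) are assumptions for illustration only. It joins each name with dirname before touching the file, and it skips entries that disappear between listing and stat, which seems plausible in a folder that is being synced.

```python
import os


def collect_file_info(root):
    """Walk root and return one metadata dict per regular file (hypothetical helper)."""
    records = []
    for dirname, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # Names from os.walk are relative to dirname, so build the full path.
            path = os.path.join(dirname, filename)
            if not os.path.isfile(path):
                # Not a regular file (for example a broken symlink); skip it.
                continue
            try:
                st = os.stat(path)
            except OSError:
                # The file vanished or became unreadable after listing; skip it.
                continue
            records.append({
                "path": path,
                "size": st.st_size,    # size in bytes
                "mtime": st.st_mtime,  # last modification time, seconds since epoch
            })
    return records


if __name__ == "__main__":
    for record in collect_file_info(os.getcwd()):
        print(record["path"], record["size"])
```

Only os.walk, os.path and os.stat are used, so the sketch should behave the same way on OS X, Linux and Windows.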

