Python os.walk() function vs. find command
I am writing a program to walk the filesystem and collect file information to put into a database. I am trying to learn Python after a lifetime of shell scripting, and I am seeing a difference between what find returns and what os.walk returns:
find THIS_PATH -print

import os

for dirpath, dirs, files in os.walk(THIS_PATH):
    print(dirpath)  # the directory currently being visited
    for fname in files:
        print(os.path.join(dirpath, fname))
The issue I have is that the OS find returns symlinks to directories, but the Python version does not, and I have no idea how to make it do that. I don't want it to follow them (i.e. followlinks=True), since that would also produce a different result from find. But I do want to be able to print the entries that are symlinks to directories.
Thanks, c
If you want to get the same output (sorting may vary), you need to print both the directories and the files for a given path. find returns directories as well as links (to anything). A minimal change to your code would be:
print(THIS_PATH)
for dirpath, dirs, files in os.walk(THIS_PATH):
    for fname in dirs + files:  # iterate over items from both lists
        print(os.path.join(dirpath, fname))
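To also flag which of those entries are symlinks to directories (the part the question asks about), a minimal sketch could check each name in dirs with os.path.islink. The helper name walk_like_find and the "->" output format are assumptions made for illustration; they are not part of the original code.

```python
import os


def walk_like_find(top):
    """Yield (path, is_dir_symlink) for top and everything below it.

    Hypothetical helper: the name and the tuple shape are illustrative.
    Symlinks are never followed (os.walk defaults to followlinks=False).
    """
    yield top, os.path.islink(top) and os.path.isdir(top)
    for dirpath, dirs, files in os.walk(top):
        for name in dirs:
            path = os.path.join(dirpath, name)
            # Symlinks to directories are listed in dirs but not descended into.
            yield path, os.path.islink(path)
        for name in files:
            yield os.path.join(dirpath, name), False


if __name__ == "__main__":
    for path, is_dir_symlink in walk_like_find("."):
        if is_dir_symlink:
            print(path, "->", os.readlink(path))
        else:
            print(path)
```

Because os.walk already places symlinks to directories in dirs, no extra traversal is needed; the islink check only labels them.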
This may be a bit easier to do with pathlib:
from pathlib import Path

mypath = Path(THIS_PATH)
for found_item in mypath.rglob('*'):
    # rglob() already yields paths prefixed with mypath
    print(found_item)
For instance, I've created the following tree:
.
├── d1
│   ├── d2
│   │   └── f2
│   └── f1
├── f2 -> d1/d2/f2
└── l1 -> d1
Running find yields the following (note that directories and links to directories appear the same way):
$ find .
.
./f2
./l1
./d1
./d1/.h
./d1/d2
./d1/d2/f2
./d1/f1
Running the first snippet with THIS_PATH='.' yields the same items in a slightly different order: find descends into each directory as soon as it encounters it, whereas the snippet prints every entry of a directory before os.walk moves on to its subdirectories. For the pathlib example, just beware if THIS_PATH is '.', as it will chomp the leading ./ off.
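If the leading ./ matters, for example when comparing against find output, one possible workaround is to rebuild each path as a plain string with os.path.join instead of printing the Path object. This is only a sketch, and the helper name find_style_paths is hypothetical.

```python
import os
from pathlib import Path


def find_style_paths(top):
    """Return path strings under top, keeping the prefix exactly as given.

    Hypothetical helper for illustration. pathlib normalises './f2' to 'f2',
    so the prefix is re-attached with os.path.join on plain strings.
    """
    root = Path(top)
    paths = [top]  # find also prints the starting point itself
    for item in root.rglob("*"):
        paths.append(os.path.join(top, str(item.relative_to(root))))
    return paths


if __name__ == "__main__":
    for p in find_style_paths("."):
        print(p)
```

The order of the returned entries is whatever rglob produces, so sort both sides before comparing against find.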