Different results achieved using Python's os.walk function and ls command

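#!/bin/python
import os

# Expr1: count the lines of `ls -alR` output that start with '-' or 'l'
pipe = os.popen("ls /etc -alR | grep \"^[-l]\" | wc -l")
a = int(pipe.read())
pipe.close()

# Expr2: count every name that os.walk reports as a file
b = sum(len(files) for root, dirs, files in os.walk("/etc"))

print(a)
print(b)
print("a equals to b ? " + str(a == b))  # False
print("Why?")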

What is the difference between what Expr1 computes and what Expr2 computes? I think Expr1 gives the right answer, but I am not sure.

Short answer:

ls -laR | grep "^[-l]" counts symlinks to directories. The pattern matches every line that begins with - or l, and the lines that begin with l include symlinks to directories.

In contrast, [files for root, dirs, files in os.walk('/etc')] does not count symlinks to directories. os.walk reports a symlink that points to a directory in dirs rather than in files, so a count of files alone leaves it out.
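To make the difference concrete, here is a minimal sketch that builds a throwaway tree and shows where os.walk puts each entry. The directory and file names (regular.conf, real_dir, link_to_dir) are made up for illustration, and it assumes a POSIX system where os.symlink is permitted.

```python
import os
import tempfile

# Build a small tree: one regular file, one real directory,
# and one symlink that points at that directory.
demo = tempfile.mkdtemp()
open(os.path.join(demo, 'regular.conf'), 'w').close()
os.mkdir(os.path.join(demo, 'real_dir'))
os.symlink(os.path.join(demo, 'real_dir'), os.path.join(demo, 'link_to_dir'))

# The first tuple yielded by os.walk describes the top directory itself.
root, dirs, files = next(os.walk(demo))

print(sorted(dirs))   # the symlink to a directory is listed here
print(sorted(files))  # only the regular file is listed here
```

In ls -l output, link_to_dir would appear on a line starting with l, so the grep pattern counts it, while len(files) does not.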


Long answer:

Here is how I identified the discrepancies:

import os
import subprocess
import itertools

def line_to_filename(line):
    # This assumes that filenames have no spaces, which is a false assumption
    # Ex: /etc/NetworkManager/system-connections/Wired connection 1
    idx = line.rfind('->')
    if idx > -1:
        return line[:idx].split()[-1]
    else:
        return line.split()[-1]

line_to_filename tries to extract the filename from a line of ls -laR output.

The following defines expr1 and expr2 and is essentially the same as your code.

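proc = subprocess.Popen(
    "ls /etc -alR 2>/dev/null | grep -s \"^[-l]\" ", shell=True,
    stdout=subprocess.PIPE,
    universal_newlines=True)  # Expr1; read the output as text, not bytes
out, err = proc.communicate()
# list(...) so that expr1.remove() below also works on Python 3
expr1 = list(map(line_to_filename, out.splitlines()))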

expr2 = list(itertools.chain.from_iterable(
    files for root,dirs,files in os.walk('/etc') if files))  #Expr2

for expr in ('expr1', 'expr2'):
    print('{e} is of length {l}'.format(e=expr, l=len(vars()[expr])))

This removes from expr1 every name that is also in expr2:

for name in expr2:
    try:
        expr1.remove(name)
    except ValueError:
        print('{n} is not in expr1'.format(n = name))

After removing the filenames that expr1 and expr2 have in common,

print(expr1) 

yields

['i386-linux-gnu_xorg_extra_modules', 'nvctrl_include', 'template-dkms-mkdsc', 'run', '1', 'conf.d', 'conf.d']

I then used find to locate these files in /etc and tried to guess what was unusual about them. They were symlinks to directories (rather than files).
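If the goal is a Python count that matches the grep pattern, one option is to classify every entry with os.lstat instead of relying on the dirs/files split. This is only a sketch: count_like_ls is a hypothetical helper name, the demo tree is made up, and it assumes a POSIX system.

```python
import os
import stat
import tempfile

def count_like_ls(top):
    """Count entries that `ls -l` would show with a leading '-' or 'l'."""
    total = 0
    for root, dirs, files in os.walk(top):
        for name in dirs + files:
            # lstat does not follow symlinks, so a link stays a link.
            mode = os.lstat(os.path.join(root, name)).st_mode
            if stat.S_ISREG(mode) or stat.S_ISLNK(mode):
                total += 1
    return total

# Demo tree: two regular files and one symlink to a directory.
demo = tempfile.mkdtemp()
open(os.path.join(demo, 'a.conf'), 'w').close()
os.mkdir(os.path.join(demo, 'conf.d'))
open(os.path.join(demo, 'conf.d', 'b.conf'), 'w').close()
os.symlink(os.path.join(demo, 'conf.d'), os.path.join(demo, 'conf.link'))

walk_count = sum(len(files) for root, dirs, files in os.walk(demo))
ls_style_count = count_like_ls(demo)
print(walk_count, ls_style_count)
```

On this tree the two counts differ by exactly one, the symlink to conf.d. On a real /etc the result can still differ from the shell pipeline for other reasons, such as directories that cannot be read.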

If you use os.walk, errors are ignored by default (see the onerror parameter in its documentation), whereas ls prints a message for each error. These count as words.
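To see which errors os.walk is skipping, you can pass a callback as onerror. A minimal sketch, using a deliberately missing directory so that the error is reproducible (the names remember and missing are made up for this example):

```python
import os
import tempfile

errors = []

def remember(err):
    # os.walk calls this with the OSError it would otherwise swallow.
    errors.append(err)

# A path that is guaranteed not to exist inside a fresh temp directory.
missing = os.path.join(tempfile.mkdtemp(), 'does_not_exist')

silent = list(os.walk(missing))                      # error ignored
reported = list(os.walk(missing, onerror=remember))  # error recorded

print(silent, reported)
print(len(errors))
```

Running the same walk over /etc with onerror set would list the directories that os.walk could not read, which are the same places where ls reports an error.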

On my machine, /etc is a symlink to /private/etc, so ls /etc has only one line of output. ls /etc/ (with a trailing slash) gives the expected equivalence between ls and os.walk.
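A quick way to check for this situation from Python is os.path.islink on the top directory. The sketch below reproduces the layout in a temp directory (private_etc and etc are stand-in names, POSIX assumed) and shows that os.walk still descends into a top directory that is itself a symlink:

```python
import os
import tempfile

base = tempfile.mkdtemp()

# Stand-in for /private/etc with a single file in it.
target = os.path.join(base, 'private_etc')
os.mkdir(target)
open(os.path.join(target, 'hosts'), 'w').close()

# Stand-in for /etc: a symlink pointing at the real directory.
link = os.path.join(base, 'etc')
os.symlink(target, link)

is_link = os.path.islink(link)
walked_files = sum(len(files) for root, dirs, files in os.walk(link))

print(is_link)
print(walked_files)
```

So os.walk sees the contents either way, and it is only the ls side of the comparison that depends on the trailing slash.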
