Using pure Python instead of grep?

Question

I'm not familiar with grep, as I've always been on Windows, so when someone suggested I add these lines to my code, I was a little confused:

grep = 'grep -E \'@import.*{}\' * -l'.format(name)
proc = Popen(grep, shell=True, cwd=info['path'], stdout=PIPE, stderr=PIPE)

From my understanding, this tries to find all files in cwd that contain @import given_file_name, right?
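For readers who have not used grep: -E enables extended regular expressions, -l prints only the names of files that contain a match, and * is expanded by the shell to the (non-hidden) entries of the current directory, so nothing is searched recursively. A rough pure-Python stand-in might look like the sketch below. The names grep_l and demo are made up for illustration, and Python's re syntax is not identical to grep's extended syntax, so treat this as an approximation rather than an exact port.

```python
import os
import re
import tempfile


def grep_l(pattern, directory):
    """Return names of regular files in `directory` with a line matching `pattern`.

    Rough stand-in for `grep -E PATTERN * -l` run with cwd=directory:
    no recursion, and hidden files are skipped because the shell's `*` skips them.
    """
    regex = re.compile(pattern)
    matches = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.startswith('.') or not os.path.isfile(path):
            continue
        # errors='ignore' so that a stray binary file does not raise
        with open(path, 'r', errors='ignore') as f:
            if any(regex.search(line) for line in f):
                matches.append(name)
    return matches


def demo():
    # Throwaway directory with two stylesheets; only main.scss imports "colors".
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, 'main.scss'), 'w') as f:
            f.write('@import "colors";\nbody { margin: 0; }\n')
        with open(os.path.join(tmp, 'other.scss'), 'w') as f:
            f.write('p { color: red; }\n')
        return grep_l(r'@import.*colors', tmp)


print(demo())  # ['main.scss']
```

Like grep -l, this stops reading a file at its first matching line, because any() short-circuits.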

If that is how grep works, I would need to write something in pure Python that does the same thing, but I'm worried about how long it might take.

The script is in a Sublime Text 3 plugin that uses the sublime_plugin.EventListener method on_post_save to find all files that import the just-saved file and build a list of file names to compile.

def files_that_import(filename, project_root):
    files = []
    for root, dirnames, files in os.walk(project_root):
        for fn in files:
            if fn.endswith(('.scss', '.sass')):
                with open(fn, 'r') as f:
                    data = f.read()
                if re.search(r'@import.*["\']{}["\'];'.format(fn), data):
                    files.append(fn)
    return files

Not knowing exactly how grep works, this was the best I could think of. However, as I said, I'm worried about the time it would take to scan through all the .scss and .sass files. While there shouldn't be a ton of them, reading the contents of each one seems more complicated than it needs to be.

Update

I updated the code using @nneonneo's corrections. I also noticed that the code I was using checked each file for an @import statement of itself.

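import os
import re

def files_that_import(filename, project_root):
    # re.escape() keeps the '.' in names like 'colors.scss' from matching any character
    pattern = re.compile('''@import.*["']{}["'];'''.format(re.escape(filename)))
    found = []
    for root, dirnames, files in os.walk(project_root):
        for fn in files:
            if fn.endswith(('.scss', '.sass')):
                # os.walk() yields bare names, so join with root before opening
                path = os.path.join(root, fn)
                with open(path, 'r') as f:
                    if any(pattern.search(line) for line in f):
                        found.append(path)
    return found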

Update: If anyone finds this useful and wants to use the code, I changed files = [] to found = [] because files is also defined by the for loop over os.walk(), which caused an error.
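To make the naming problem concrete, here is a small sketch of what happens when the result list and the os.walk() loop variable share a name. The functions buggy and fixed and the fake_walk data are hypothetical, written only for this illustration; fake_walk is a hand-written stand-in for the (root, dirnames, filenames) tuples that os.walk() yields, so the example runs without touching the disk.

```python
def buggy(walk_results):
    files = []  # intended result list
    for root, dirnames, files in walk_results:  # rebinds `files` on every iteration
        pass
    return files  # now the last directory's file list, not the results


def fixed(walk_results):
    found = []  # separate name, so the loop variable cannot clobber it
    for root, dirnames, files in walk_results:
        for fn in files:
            if fn.endswith(('.scss', '.sass')):
                found.append(fn)
    return found


# Hand-written stand-in for what os.walk() yields: (root, dirnames, filenames)
fake_walk = [('proj', ['sub'], ['a.scss', 'notes.txt']), ('proj/sub', [], ['b.sass'])]

print(buggy(fake_walk))  # ['b.sass']
print(fixed(fake_walk))  # ['a.scss', 'b.sass']
```

In the original function the situation is arguably worse than a wrong return value: appending to files inside the inner loop modifies the very list being iterated.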

Answer 1

You've mostly got it. You can make it a bit more efficient by doing the following:

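# Compile once, before looping over files; filename is the file that was just saved
import_pattern = re.compile(r'''@import.*["']{}["'];'''.format(re.escape(filename)))
# Then, for each candidate file fn (found is the result list):
with open(fn, 'r') as f:
    for line in f:
        # search() rather than match(): match() only matches at the start of the line
        if import_pattern.search(line):
            found.append(fn)
            break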

This scans the file line by line and breaks as soon as it finds what it is looking for. It should be faster than reading the whole file.
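If you want to check the speed claim on your own machine, a sketch along these lines could help. Everything here is an assumption made for the illustration: the helper names whole_file, line_by_line and make_sample, the "colors" import, and the synthetic file with the import on its first line. The early exit only pays off when a match appears early in the file; when there is no match, both approaches end up reading everything.

```python
import os
import re
import tempfile
import timeit

pattern = re.compile(r'''@import.*["']colors["'];''')


def whole_file(path):
    # Read the entire file into memory, then search it once.
    with open(path, 'r') as f:
        return pattern.search(f.read()) is not None


def line_by_line(path):
    # any() stops at the first matching line, so the rest is never read.
    with open(path, 'r') as f:
        return any(pattern.search(line) for line in f)


def make_sample(directory, filler_lines=50000):
    # Synthetic stylesheet: one import at the top, then many filler rules.
    path = os.path.join(directory, 'main.scss')
    with open(path, 'w') as f:
        f.write('@import "colors";\n')
        f.write('.rule { margin: 0; }\n' * filler_lines)
    return path


with tempfile.TemporaryDirectory() as tmp:
    sample = make_sample(tmp)
    same_answer = whole_file(sample) == line_by_line(sample)
    t_whole = timeit.timeit(lambda: whole_file(sample), number=20)
    t_lines = timeit.timeit(lambda: line_by_line(sample), number=20)

print('same answer:', same_answer)
print('whole file: {:.4f}s, line by line: {:.4f}s'.format(t_whole, t_lines))
```

The timings depend on the machine and the file sizes, so no expected numbers are shown. For a handful of small stylesheets the difference is likely to be negligible either way.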

Solution 1
3, accepted, 2015-06-20 09:28:56
