Search directories for a multi-line string
I am looking for a way to recursively search a repository for all files that contain a multi-line string, and to return the names of the files that contain it. The string is just a header of approximately 30 lines.
Below is the approach I am taking, but it is not working.
import glob
import os

repo = os.getcwd()
header = """ /*
/* .......paragraph
/* ..............
*/
"""
for file in glob.glob(repo):
    with open(file) as f:
        contents = f.read()
    if header in contents:
        print file
I am getting this error:
IOError: [Errno 21] Is a directory: '/home/test/python/repos/projects/one'
Edited: new version, following @zondo's answer:
def findAllFiles(directory):
    gen = os.walk(directory)
    next(gen)  # discards the first entry, i.e. the files directly inside `directory`
    return [os.path.join(path, f) for path, _, files in gen for f in files]

def main():
    print "Searching directory for copyright header"
    for file in findAllFiles(repo):
        with open(file) as f:
            contents = f.read()
        if header in contents:
            print file
With the os module, you can do this:
# Find not only the files in a folder, but also the files in all of its sub-directories
def find_all_files(folder):
    return [os.path.join(path, f) for path, _, files in os.walk(folder) for f in files]

for file in find_all_files(repo):
    with open(file) as f:
        contents = f.read()
    if header in contents:
        print file
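
For context on the original error: glob.glob(repo) is given a path with no wildcard, so it returns just that directory, and open() is then called on a directory. Walking the tree with os.walk avoids this. For anyone on Python 3, here is a minimal sketch of the same idea wrapped in a function. The name files_containing and its parameters are my own, not from the question, and it assumes the files are text that can be decoded as UTF-8 (undecodable bytes are dropped, unreadable files are skipped).

```python
import os


def files_containing(root, needle, encoding="utf-8"):
    """Return the paths under `root` whose text contains `needle`.

    Hypothetical helper for illustration only.
    """
    # Text-mode reads translate "\r\n" to "\n", so normalise the needle too.
    needle = needle.replace("\r\n", "\n")
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" drops bytes that are not valid in `encoding`
                with open(path, encoding=encoding, errors="ignore") as f:
                    contents = f.read()
            except OSError:
                # Unreadable file (permissions, broken symlink, ...): skip it
                continue
            if needle in contents:
                matches.append(path)
    return matches
```

Calling files_containing(os.getcwd(), header) and printing each returned path should give the same kind of output as the loop above. Note that the match is exact, so any difference in whitespace between the header string and the file will prevent a match.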
Try using subprocess and pcregrep to match multiple lines across different directories.
from subprocess import call
call(["pcregrep", "-rM","<regular_exp>","<path to directory>"])
I have never tried this; it just came to mind.
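
To make the pcregrep idea a bit more concrete, here is an untested-against-your-repo sketch in Python 3. It assumes pcregrep is installed and on the PATH. The helper names build_multiline_pattern and files_matching are made up for this example. The first one escapes each line of the literal header and joins the lines with \n so the whole thing can be passed as a single pattern; the second adds -l so that pcregrep prints only file names, which is what the question asks for.

```python
import re
import subprocess


def build_multiline_pattern(text):
    """Turn literal text into a regex with line breaks written as \\n.

    Hypothetical helper; assumes the text uses "\n" line endings.
    """
    lines = text.strip("\n").split("\n")
    # Escape each line so characters such as * and . are matched literally
    return r"\n".join(re.escape(line) for line in lines)


def files_matching(pattern, directory):
    """Return the file names pcregrep reports for `pattern` under `directory`."""
    # -r: recurse into sub-directories
    # -M: allow a match to span more than one line
    # -l: print only the names of files that contain a match
    result = subprocess.run(
        ["pcregrep", "-r", "-M", "-l", "-e", pattern, directory],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    # No check=True here: like grep, pcregrep exits with 1 when nothing matches
    return result.stdout.splitlines()
```

Something like files_matching(build_multiline_pattern(header), repo) would then be the call. Files with Windows line endings may not match a pattern built this way.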