Find all instances of string between two other strings that are on other lines
I feel like I should know how to do this, but I can't quite get it.
I'm trying to find all instances (in all files) where a string ending in _START appears between two other strings, @GROUP and @END_GROUP, which are normally on other lines.
So there might be some code like this:
// @GROUP GroupName OtherStuff
#define GROUPNAME_START 1
#define GROUPNAME_FOO 2
.... (more defines)
#define GROUPNAME_END 10
// @END_GROUP
#define GROUPTWO_START 1
// @GROUP GroupTwo MoreStuff
#define GROUPTWO_FOO 2
.... (some defines)
#define GROUPTWO_BAR 70
// @END_GROUP
I want to match the first group (really just the line with _START, but matching the whole group would be OK), but not the second group or the _START line that sits outside the @GROUP comments.
I figure grep would be the best way to search through all the files, but I can't quite get the regex I need. Thanks for the help.
edit: My bad for not making it clear that I want to search through files in multiple directories at the same time, the same way grep -r "foo" * does. The answers have been good; I just didn't make that clear.
edit2: Multiple great answers each solved it in a slightly different way, and I really don't know which one is best. I accepted the one that came first, but anyone looking at this should check out all the answers, since a different one might fit your problem better.
grep only sees one line at a time, so it doesn't know whether that line is between the group comments or not. sed can use addresses, though:
sed '/@GROUP/,/@END_GROUP/!d' input_file | grep '_START'
! negates the address range and d deletes a line, i.e. we're telling sed to remove lines that are not between the group comments. grep then operates only on the "interesting" lines.
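As a quick check of the pipeline, here is a minimal sketch that runs it on a shortened copy of the sample from the question. The temporary directory and the file name sample.h are made up for the demonstration.

```shell
# Build a throwaway copy of the sample (hypothetical file name).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.h" <<'EOF'
// @GROUP GroupName OtherStuff
#define GROUPNAME_START 1
#define GROUPNAME_FOO 2
#define GROUPNAME_END 10
// @END_GROUP
#define GROUPTWO_START 1
// @GROUP GroupTwo MoreStuff
#define GROUPTWO_FOO 2
#define GROUPTWO_BAR 70
// @END_GROUP
EOF

# Keep only the lines inside @GROUP ... @END_GROUP, then pick the _START lines.
result=$(sed '/@GROUP/,/@END_GROUP/!d' "$tmpdir/sample.h" | grep '_START')
printf '%s\n' "$result"   # prints: #define GROUPNAME_START 1
rm -rf "$tmpdir"
```

GROUPTWO_START is dropped by sed because it sits before the second @GROUP line, so grep never sees it.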
To make it work for subdirectories, too, add find to the toolbox:
find /path/to/dir -type f -exec sed '/@GROUP/,/@END_GROUP/!d' {} + | grep '_START'
Or, if a group comment could appear without the corresponding @END_GROUP, use the slower but safer form below. It runs sed once per file, so an unterminated range cannot carry over into the next file:
find /path/to/dir -type f -exec sed '/@GROUP/,/@END_GROUP/!d' {} \; | grep '_START'
Or, let xargs operate on the output of grep -l:
grep -lr @GROUP /path/to/dir | xargs sed '/@GROUP/,/@END_GROUP/!d' | grep '_START'
Note: This won't work if your filenames contain whitespace, because xargs splits its input on blanks and newlines by default.
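One possible way around the whitespace problem, assuming GNU grep, xargs and sed are available, is to pass the file names NUL-terminated. The directory layout and file names below are invented for the demonstration.

```shell
# Demo tree with spaces in the directory and file names (hypothetical names).
dir=$(mktemp -d)
mkdir "$dir/sub dir"
printf '%s\n' '// @GROUP A x' '#define A_START 1' '// @END_GROUP' > "$dir/sub dir/my file.h"
printf '%s\n' '#define B_START 1' '// @GROUP B y' '#define B_FOO 2' '// @END_GROUP' > "$dir/other.h"

# grep -Z ends each file name with a NUL byte and xargs -0 splits on NUL only.
# xargs -r skips sed entirely when grep finds no files.
# sed -s treats every file as its own stream, so a missing @END_GROUP
# cannot leak into the next file.
found=$(grep -lrZ '@GROUP' "$dir" | xargs -0 -r sed -s '/@GROUP/,/@END_GROUP/!d' | grep '_START')
printf '%s\n' "$found"   # prints: #define A_START 1
rm -rf "$dir"
```

The -Z, -0, -r and -s options are GNU extensions; other implementations may spell them differently or lack them.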
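With awk you can use an empty RS and do all of that in a single search. An empty RS puts awk in paragraph mode, where records are separated by blank lines, so this assumes each group sits in its own blank-line-delimited block: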
awk -v RS= '/@GROUP.*_START.*@END_GROUP/' file
// @GROUP GroupName OtherStuff
#define GROUPNAME_START 1
#define GROUPNAME_FOO 2
.... (more defines)
#define GROUPNAME_END 10
// @END_GROUP
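If the groups are not separated by blank lines, a line-by-line approach with a flag may be easier to reason about. The following is a sketch of that alternative, not the method above; the variable in_group and the file name sample.h are made up for illustration.

```shell
# Throwaway sample (hypothetical file name) with no blank lines between groups.
dir=$(mktemp -d)
printf '%s\n' \
  '// @GROUP GroupName OtherStuff' \
  '#define GROUPNAME_START 1' \
  '// @END_GROUP' \
  '#define GROUPTWO_START 1' \
  '// @GROUP GroupTwo MoreStuff' \
  '#define GROUPTWO_FOO 2' \
  '// @END_GROUP' > "$dir/sample.h"

matches=$(cd "$dir" && awk '
  FNR == 1             { in_group = 0 }   # new file: drop any unterminated group
  /@END_GROUP/         { in_group = 0 }   # leaving a group
  in_group && /_START/ { print FILENAME ":" FNR ":" $0 }
  /@GROUP/             { in_group = 1 }   # entering a group
' sample.h)
printf '%s\n' "$matches"   # prints: sample.h:2:#define GROUPNAME_START 1
rm -rf "$dir"
```

The output mimics the file:line:text layout of grep -rn, and because the flag is reset on the first line of every file, the same awk program can be given many files at once, for example through find with -exec ... {} +.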
This is a job for sed, using its address syntax:
#!/bin/sed -f
/@GROUP/h # store the @GROUP line
/@GROUP/,/@END_GROUP/{
/_START/{
g # retrieve the @GROUP line
p # print it; the final d then discards it, and the next line is still tested
}
}
# otherwise, delete the line and continue
d
It's a little complicated by the nested blocks, but here is what it does: within @GROUP .. @END_GROUP, for any line matching _START, it prints the previously found @GROUP line. Using your example:
$ ./group.sed group.data
// @GROUP GroupName OtherStuff
Is that what you're trying to achieve?
Edit: It's not what you asked for. You just want the _START line, not the @GROUP line. Well, that's much easier:
#!/bin/sed -nf
/@GROUP/,/@END_GROUP/{
/_START/p
}
Addendum: Since you now ask for recursive directory searching, you can use find as described in other answers:
find . -type f -print0 | xargs -0 ./group.sed --separate
(I've used the GNU sed --separate argument here to protect against any file that has the group start but is missing the group end line.)
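To see what --separate guards against, here is a small sketch assuming GNU sed. It uses the script above written as a one-liner, and two invented files, a.h and b.h.

```shell
dir=$(mktemp -d)
# a.h opens a group but never closes it; b.h has a _START line outside any group.
printf '%s\n' '// @GROUP A x' '#define A_START 1' > "$dir/a.h"
printf '%s\n' '#define B_START 1' '// @GROUP B y' '// @END_GROUP' > "$dir/b.h"

# One continuous stream: the range opened in a.h is still open when b.h starts,
# so the stray B_START line is reported too.
joined=$(sed -n '/@GROUP/,/@END_GROUP/{/_START/p}' "$dir/a.h" "$dir/b.h")

# -s (short for --separate): each file is its own stream, so the range ends with a.h.
separate=$(sed -s -n '/@GROUP/,/@END_GROUP/{/_START/p}' "$dir/a.h" "$dir/b.h")

printf 'joined:\n%s\nseparate:\n%s\n' "$joined" "$separate"
rm -rf "$dir"
```

Without -s, both _START lines are printed; with -s, only the one from a.h is.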