
GREP lines from one file in another file until occurrence of a certain character

grep -A 10 -f smallfile bigfile

This greps every line from smallfile in bigfile, along with the 10 lines that follow each match.

Is it possible, using another flag instead of -A, to keep grepping the following lines until the occurrence of a character (let's say @) in bigfile? I need to do this for hundreds of lines from smallfile, and I don't know in advance how many lines after each matching line I need to grep; it changes for each one. Here is a small example to illustrate:

smallfile:

@123
@555

bigfile:

@123
abc
def
ghj
@789
sdf
tyu
rzx
@555
yui
wer
@435
teg
gdgd

So I want it to give me this:

@123
abc
def
ghj
@555
yui
wer

If you know another way of "grepping" lines from one file in another file that can do this, that would also work. I may try to write a Python script or a more complex loop, but I believe there should be a way to make grep do this using a flag like -m; I just couldn't make it work the way I want.

Many thanks!

This job is better handled with awk than grep. The script below seems to work OK in my tests:

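awk 'NR==FNR{a[$0];next}      # first file: store every line as an array key
$0 in a{print;f=0;next}       # wanted header in bigfile: print it and resume printing
{if ($0 !~ /^@/ && f!=1) {print} else {f=1}}' smallfile bigfile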

Or even:

awk 'NR==FNR{a[$0];next}$0 in a || ($0 !~ /^@/ && f!=1){print;f=0;next}{f=1}' smallfile bigfile
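
A shorter variant of the same idea, offered only as a sketch and not part of the tested scripts above: recompute a flag on every line that starts with @, and print while the flag is set. The function name extract_blocks and the variable names wanted and keep are made up for illustration. It assumes every block in bigfile begins with a line starting with @ and that smallfile is not empty (otherwise NR==FNR would also be true for bigfile).

```shell
# Hypothetical helper: print each block of the second file whose
# @-header line appears verbatim in the first file.
extract_blocks() {
    awk '
        NR == FNR { wanted[$0]; next }       # first file: remember the wanted headers
        /^@/      { keep = ($0 in wanted) }  # header line: decide whether to keep this block
        keep                                 # no action given, so the default (print $0) runs
    ' "$1" "$2"
}

# Usage with the file names from the question:
#   extract_blocks smallfile bigfile
```

Because keep starts out unset (false), lines that appear in bigfile before the first @ header are skipped, whereas the scripts above would print them since f is still unset at that point.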

Explanation:
awk scripts are built from pattern-action pairs: 'condition1{action1}condition2{action2}etc'
FNR = line number within the current file (resets when the next file is read)
NR = global line number, which keeps increasing across all files
NR==FNR = true only while the first file is being read
|| = logical OR operator
$0 = the whole line
a[$0] = creates an entry in array a with $0 as key/index
$0 in a = checks whether $0 (the whole line) is a key/index of array a
$0 !~ /^@/ = $0 does not match the regex /^@/, i.e. the line does not start with @
next = skip the remaining rules and read the next line
Files are read serially by awk.
A condition can be omitted and the action written directly. In that case the action is performed for every line that reaches it (equivalent to condition==1/true).
An action can be omitted for a given condition. In that case the default action is executed: print $0
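
To see how NR and FNR behave across two files, a small demo like the following may help. It is only an illustration: the function name show_counters and the demo file names are invented here, and any two non-empty text files would work.

```shell
# Hypothetical helper: for every line of the given files, print NR, FNR,
# whether NR==FNR still holds, and the line itself.
show_counters() {
    awk '{ print NR, FNR, (NR == FNR ? "first" : "later"), $0 }' "$@"
}

# Throwaway demo files.
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/first.txt"
printf 'c\nd\ne\n' > "$demo_dir/second.txt"

show_counters "$demo_dir/first.txt" "$demo_dir/second.txt"
# Output:
# 1 1 first a
# 2 2 first b
# 3 1 later c
# 4 2 later d
# 5 3 later e

rm -rf "$demo_dir"
```

NR==FNR holds only for the lines of the first file, which is why the scripts above can use it to load smallfile into the array before bigfile is processed.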
