
Looping with grep over several files

I have multiple files, /text-1.txt, /text-2.txt ... /text-20.txt, and I want to grep each one for two patterns and stitch the results into one file.

For example, I have:

grep "Int_dogs" /text-1.txt > /text-1-dogs.txt
grep "Int_cats" /text-1.txt > /text-1-cats.txt
cat /text-1-dogs.txt /text-1-cats.txt > /text-1-output.txt

I want to repeat this for all 20 files above. Is there an efficient way to do this in bash, awk, etc.?

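#!/bin/bash
# [[ ... ]] is a bash feature, so this script needs bash rather than plain sh.
count=1

# Process the next file, or stop once all 20 files are done.
next () {
    [[ "${count}" -lt 21 ]] && main
    [[ "${count}" -eq 21 ]] && exit 0
}

# Filter one file for both patterns, then move on to the next number.
main () {
    file="text-${count}"
    grep "Int_dogs" "${file}.txt" > "${file}-dogs.txt"
    grep "Int_cats" "${file}.txt" > "${file}-cats.txt"
    cat "${file}-dogs.txt" "${file}-cats.txt" > "${file}-output.txt"
    count=$((count+1))
    next
}

next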
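The recursion between next and main is not required for this task. As a minimal sketch, assuming the files sit in one directory and are numbered consecutively, the same steps can be written as an ordinary loop. The function name combine_matches and the sample data are made up for illustration; the sketch also skips the intermediate -dogs.txt and -cats.txt files by grouping both grep calls into one redirection.

```shell
#!/bin/sh
# combine_matches DIR FIRST LAST
# Hypothetical helper (not part of the original answer): for every
# DIR/text-N.txt with FIRST <= N <= LAST, write the Int_dogs lines followed
# by the Int_cats lines to DIR/text-N-output.txt.
combine_matches() {
    dir=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        file="${dir}/text-${i}"
        # Only process numbers that actually have an input file.
        if [ -f "${file}.txt" ]; then
            # The braces group both commands under a single redirection.
            # grep exits with status 1 when nothing matches; "|| true"
            # keeps that from aborting a script that runs under "set -e".
            {
                grep "Int_dogs" "${file}.txt" || true
                grep "Int_cats" "${file}.txt" || true
            } > "${file}-output.txt"
        fi
        i=$((i + 1))
    done
}

# Demo on throwaway data in a temporary directory.
demo_dir=$(mktemp -d)
printf 'Int_cats 1\nInt_dogs 2\nother 3\n' > "${demo_dir}/text-1.txt"
printf 'Int_dogs 4\nnothing\n' > "${demo_dir}/text-2.txt"
combine_matches "${demo_dir}" 1 20
cat "${demo_dir}/text-1-output.txt"
```

Because the two grep calls run one after the other, the output keeps the order of the original commands: all Int_dogs lines first, then all Int_cats lines.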

grep has some features you seem not to be aware of:

  1. grep can be run on a list of files, but the output differs. For a single file, the output contains only the matching lines, as in this example:

     $ cat text-1.txt
     I have a cat.
     I have a dog.
     I have a canary.
     $ grep "cat" text-1.txt
     I have a cat.

     For multiple files, the filename is also shown in the output. Let's add another text file:

     $ cat text-2.txt
     I don't have a dog.
     I don't have a cat.
     I don't have a canary.
     $ grep "cat" text-*.txt
     text-1.txt:I have a cat.
     text-2.txt:I don't have a cat.
  2. grep can search for multiple patterns at once with the -E switch (extended regular expressions). The patterns are separated with a pipe symbol, and matching lines are printed in the order they appear in the file:

     $ grep -E "cat|dog" text-1.txt
     I have a cat.
     I have a dog.
  3. Combining the previous two points (and noting that egrep is equivalent to grep -E):

     $ egrep "cat|dog" text-*.txt
     text-1.txt:I have a cat.
     text-1.txt:I have a dog.
     text-2.txt:I don't have a dog.
     text-2.txt:I don't have a cat.

So, to redirect this to an output file, you can simply say:

egrep "cat|dog" text-*.txt >text-1-output.txt
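Note that with several input files every saved line carries its filename prefix. If only the matching text is wanted, the -h option (available in GNU and BSD grep) suppresses the prefix. The following is a small sketch on throwaway data; the temporary directory and the output names with-names.txt and without-names.txt are invented for the demonstration, and they are deliberately chosen so that they do not match text-*.txt, which keeps a rerun from reading earlier results back in.

```shell
#!/bin/sh
# Build the two sample files from the examples in a temporary directory.
demo_dir=$(mktemp -d)
printf 'I have a cat.\nI have a dog.\nI have a canary.\n' > "$demo_dir/text-1.txt"
printf "I don't have a dog.\nI don't have a cat.\nI don't have a canary.\n" > "$demo_dir/text-2.txt"

(
    cd "$demo_dir" || exit 1

    # Default with several files: each match is prefixed with its filename.
    grep -E "cat|dog" text-*.txt > with-names.txt

    # -h drops the filename prefix, leaving only the matching lines.
    grep -h -E "cat|dog" text-*.txt > without-names.txt
)

echo "with filenames:"
cat "$demo_dir/with-names.txt"
echo "without filenames:"
cat "$demo_dir/without-names.txt"
```

Either way, all matches end up in a single file. To get one output file per input file, grep has to be run once per file inside a loop.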

Assuming you're using bash, try this:

for i in $(seq 1 20); do rm -f text-${i}-output.txt; grep -E "Int_dogs|Int_cats" text-${i}.txt >> text-${i}-output.txt; done

Details

This one-line script does the following:

  • The original files are expected to follow this naming pattern:
    • text-<INTEGER_NUMBER>.txt, for example text-1.txt, text-2.txt, ... text-100.txt.
  • The loop runs from 1 to <N>, where <N> is the number of files you want to process.
  • Warning: rm -f text-${i}-output.txt runs first and removes any existing output file, so that each run starts from a fresh output file.
  • grep -E "Int_dogs|Int_cats" text-${i}.txt matches lines containing either string in the original file, and >> text-${i}-output.txt redirects all matching lines to a newly created output file carrying the same number as the original. Example: for text-5.txt, the file text-5-output.txt is created and contains the matching lines (if any).
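If the number of files is not known in advance, the loop can iterate over the files that exist instead of counting from 1 to <N>. The following is a sketch under that assumption; the function name filter_all and the sample data are hypothetical. It uses > rather than >>, which truncates the output file on every run, so a separate rm -f step is not needed.

```shell
#!/bin/sh
# filter_all DIR
# Hypothetical helper: for every DIR/text-<number>.txt, write the lines that
# contain Int_dogs or Int_cats to DIR/text-<number>-output.txt.
filter_all() {
    for src in "$1"/text-*.txt; do
        # Keep only names of the form text-<digits>.txt. This skips output
        # files from earlier runs and the unexpanded pattern when nothing
        # matches.
        num=${src##*/}
        num=${num#text-}
        num=${num%.txt}
        case "$num" in
            '' | *[!0-9]*) continue ;;
        esac

        # ">" truncates any existing output file before writing.
        # "|| true" ignores grep's exit status 1 for "no match".
        grep -E "Int_dogs|Int_cats" "$src" > "${src%.txt}-output.txt" || true
    done
}

# Demo on throwaway data in a temporary directory.
demo_dir=$(mktemp -d)
printf 'Int_cats a\nskip me\nInt_dogs b\n' > "$demo_dir/text-5.txt"
printf 'nothing here\n' > "$demo_dir/text-12.txt"
filter_all "$demo_dir"
cat "$demo_dir/text-5-output.txt"
```

One difference from running two separate grep commands and concatenating the results: with -E the matching lines come out in the order they appear in the input file, so Int_dogs and Int_cats lines may be interleaved.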
