Linux - How to remove certain lines from a file based on a field value

I want to remove certain lines from a tab-delimited file and write the output to a new file.

a   b   c   2017-09-20
a   b   c   2017-09-19
es  fda d   2017-09-20
es  fda d   2017-09-19

The 4th column is a date. I want to keep only the lines whose 4th column is "2017-09-19" (lines 2 and 4) and write them to a new file. The new file should have the same format as the original file.

How do I write the Linux command for this example?

Note: The search must be restricted to the 4th field, because the real data has other fields that may contain the same value as the 4th field.

Use grep to filter:

grep '2017-09-19' file.txt > filtered_file.txt

This is not perfect, since grep matches the string 2017-09-19 anywhere on the line rather than only in the 4th column, but if your file looks like the example, it will work.
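To make that limitation concrete, here is a minimal sketch comparing the plain substring match with a pattern that only accepts the date in the 4th tab-separated field. The temporary file and the third sample row (with the date in column 2) are made up for illustration and are not part of the question's data.

```shell
# Build a small tab-delimited sample; the third row is a hypothetical
# line that carries the date in column 2 instead of column 4.
tmpdir=$(mktemp -d)
printf 'a\tb\tc\t2017-09-20\n'           > "$tmpdir/file.txt"
printf 'a\tb\tc\t2017-09-19\n'          >> "$tmpdir/file.txt"
printf 'x\t2017-09-19\tz\t2017-09-20\n' >> "$tmpdir/file.txt"

# Plain substring match: counts the third row too (a false positive).
loose=$(grep -c '2017-09-19' "$tmpdir/file.txt")

# Field-aware match: three tab-terminated fields, then the date,
# then either another tab or the end of the line.
tab=$(printf '\t')
strict=$(grep -cE "^([^$tab]*$tab){3}2017-09-19($tab|\$)" "$tmpdir/file.txt")

echo "loose=$loose strict=$strict"
```

The literal tab is placed in a shell variable so the pattern does not depend on grep understanding `\t`.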

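With awk:

awk 'BEGIN{FS=OFS="\t"} $4=="2017-09-19"' file > newfile

FS: input field separator; setting it to a tab makes awk split on tabs only instead of on any run of whitespace
OFS: output field separator, a space by default; it is used only when awk rebuilds a line after a field is modified, so matching lines here are printed unchanged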
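As a rough illustration of why the input separator matters for tab-delimited data, the sketch below counts matches with awk's default whitespace splitting and with tab splitting. The sample row with an empty 2nd field and the variable name `want` are assumptions made for the example.

```shell
# Sample data; the second row has an empty 2nd field (hypothetical).
tmp=$(mktemp)
printf 'a\tb\tc\t2017-09-20\n'     > "$tmp"
printf 'a\t\tc\t2017-09-19\n'     >> "$tmp"
printf 'es\tfda\td\t2017-09-19\n' >> "$tmp"

# Default splitting treats the two adjacent tabs as one separator,
# so the date in the second row lands in $3 and the row is missed.
default_count=$(awk -v want='2017-09-19' '$4 == want {n++} END {print n+0}' "$tmp")

# Splitting on tabs keeps empty fields, so column 4 is always $4.
tab_count=$(awk -F '\t' -v want='2017-09-19' '$4 == want {n++} END {print n+0}' "$tmp")

echo "default=$default_count tab=$tab_count"
```

Passing the date with `-v` keeps the awk program itself unchanged when the wanted value changes.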

Sed solution:

sed -nr "/^([^\t]*\t){3}2017-09-19/p" input.txt >output.txt

Explanation:

  • -n - do not print every line automatically
  • -r - use extended regular expressions (GNU sed; -E is the more portable spelling)
  • /regexp/p - print lines that match the regular expression regexp
  • ^ - beginning of the line
  • (regexp){3} - repeat regexp 3 times
  • [^\t] - any character except a tab
  • \t - tab character
  • * - repeat the preceding item zero or more times
  • 2017-09-19 - the text to search for

That is, skip the first 3 tab-separated columns from the beginning of the line, and then check that column 4 begins with the required value. The pattern does not anchor the end of the field, so a longer value that merely starts with 2017-09-19 also matches.
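If an exact match on column 4 is needed, one possible variation is to require a tab or the end of the line right after the date. The sketch below assumes a sed that accepts -E, and the sample rows (a 5th field, and a timestamp-like value in column 4) are invented for illustration.

```shell
# Sample rows: date followed by a 5th field, a longer value that only
# starts with the date, and the date as the last field (all hypothetical).
tmp=$(mktemp)
printf 'a\tb\tc\t2017-09-19\te\n'     > "$tmp"
printf 'a\tb\tc\t2017-09-19T10:00\n' >> "$tmp"
printf 'es\tfda\td\t2017-09-19\n'    >> "$tmp"

tab=$(printf '\t')

# Prefix-only pattern: every row matches, including the timestamp row.
prefix_count=$(sed -nE "/^([^$tab]*$tab){3}2017-09-19/p" "$tmp" | grep -c '')

# Bounded pattern: the date must be followed by a tab or the line end.
exact_count=$(sed -nE "/^([^$tab]*$tab){3}2017-09-19($tab|\$)/p" "$tmp" | grep -c '')

echo "prefix=$prefix_count exact=$exact_count"
```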

awk '/2017-09-19/' file >newfile

cat newfile
a   b   c   2017-09-19
es  fda d   2017-09-19
