
Show lines which do not contain a specific string on Linux

I have a text file on my Linux server with the following content:

  ID              DATA
MF00034657,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;M-DT_MAX_1;
MF00056578,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;UM-DT_MAX_123;

Now I need to extract the fields which do not contain "PUM-DT_MAX_1234" and save them, together with the ID, in another file.

Like this:

MF00034657,M-DT_MAX_1
MF00056578,UM-DT_MAX_123

I use:

grep -v 'PUM-DT_MAX_1234' file > file.out
awk '!/PUM-DT_MAX_1234/' file > file.out

But it doesn't work.

How can I fix it?

Use:

awk '$0 !~ /your_pattern/'

As found in what is (probably) the greatest AWK documentation.
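Note that this pattern, like grep -v, tests the whole line. A minimal sketch of what that means for the sample in the question, assuming the ^D in the data is a literal caret followed by D (the variable names are only for illustration):

```shell
# Sample data copied from the question, kept in a shell variable.
sample='  ID              DATA
MF00034657,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;M-DT_MAX_1;
MF00056578,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;UM-DT_MAX_123;'

# Both filters drop every line that contains the pattern anywhere.
awk_out=$(printf '%s\n' "$sample" | awk '$0 !~ /PUM-DT_MAX_1234/')
grep_out=$(printf '%s\n' "$sample" | grep -v 'PUM-DT_MAX_1234')

# Every data line contains the pattern, so only the header is left.
printf '%s\n' "$awk_out"
```

Since each data line contains the string at least once, a line-level filter cannot produce the output asked for; the fields have to be tested one by one.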

If you wish to remove any field containing "PUM-DT_MAX_1234" then you have to iterate over each field in your line:

awk -F "[;,]" -v OFS="," 'NR==1 { next; }; { for (i=1;i<=NF;i++) { if(!match($i,/.*PUM-DT_MAX_1234.*/) && length($i) > 0) { if (i==1) r=$i;  else r = r OFS $i }}; print r }' filter.txt

In a more readable view with comments:

  • -F "[;,]" Set the field separator to be ; or ,
  • -v OFS="," Set the output separator to be ,
  • 'NR==1 { next; }; ' Start of the AWK script. This first rule skips the header of your file (if the record number is 1, stop and go to the next line)
  • { for (i=1;i<=NF;i++) { Iterate over the number of fields ( NF )
  • if(!match($i,/.*PUM-DT_MAX_1234.*/) && length($i) > 0) { If the field is not empty and does not match the text
  • if (i==1) r=$i; else r = r OFS $i Concatenate the field to the previous ones (or, for the first field, just assign it, to avoid a leading , in the output)
  • print r }' Once the loop ends, print the result of the previous concatenation, and end the AWK script with ' for the shell
  • filter.txt Last argument is the file name.

OFS is the Output Field Separator, so you can change the separator used in the output by changing that variable on the command line.

Output from your example:

MF00034657,M-DT_MAX_1
MF00056578,UM-DT_MAX_123
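The same idea can be laid out as a multi-line script. This is only a sketch under a few assumptions: the function name filter_fields and the variable pat are made up for illustration, index() is swapped in for match() so that the string is compared literally rather than as a regular expression, and r is reset explicitly for every record.

```shell
# Sample data copied from the question.
sample='  ID              DATA
MF00034657,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;M-DT_MAX_1;
MF00056578,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;UM-DT_MAX_123;'

# filter_fields (hypothetical name): read records on stdin and print
# the fields that do not contain the string held in pat.
filter_fields() {
  awk -F '[;,]' -v OFS=',' -v pat='PUM-DT_MAX_1234' '
    NR == 1 { next }          # skip the header line
    {
      r = ""                  # start every record with an empty result
      for (i = 1; i <= NF; i++) {
        # keep non-empty fields in which pat does not occur
        if ($i != "" && index($i, pat) == 0)
          r = (r == "" ? $i : r OFS $i)
      }
      print r
    }'
}

printf '%s\n' "$sample" | filter_fields
```

To write the result to another file, redirect the output, for example filter_fields < filter.txt > file.out.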

I'll use an analogy of your problem with the command ls (because it is easy to reproduce). Let's say I want to display all files that are not mp4. I would do the following:

ls | awk '! /\.mp4/'

If you want to go further, you can combine conditions. Say I am looking for a file that does not have the mp4 extension but does contain a specific string, e.g. abc:

ls | awk '! /\.mp4/ &&  /abc/'

This should be analogous and applicable to your purposes (or at least, not hard to implement).
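To try both conditions without depending on the contents of a real directory, here is a small sketch in which made-up file names stand in for the output of ls:

```shell
# Hypothetical file names standing in for the output of ls.
names='abc.mp4
abc.txt
notes.txt
xabcx.mkv'

# Names that do not contain ".mp4".
not_mp4=$(printf '%s\n' "$names" | awk '! /\.mp4/')

# Names that do not contain ".mp4" but do contain "abc".
abc_not_mp4=$(printf '%s\n' "$names" | awk '! /\.mp4/ && /abc/')

printf '%s\n' "$abc_not_mp4"
```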

sed '1b
h;s/.*DRogan^D//;s/PUM-DT_MAX_1234;\{0,1\}//g;s/;$//;/./!d
H;g;s/,.*\n/,/' YourFile
  • Based on your sample data.

Concept:

  • keep a copy of the line
  • remove the head of the line and every "PUM-DT_MAX_1234" field, then check whether anything is left (if not, delete the line)
  • get the ID back (from the buffered copy of the line) and join it with the reduced line
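The same script with one command per line and comments, as a sketch for following the steps. It assumes the ^D in the data is a literal caret followed by D, as in the sample; the function name extract_with_sed is made up for illustration.

```shell
# Sample data copied from the question.
sample='  ID              DATA
MF00034657,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;M-DT_MAX_1;
MF00056578,12435464^DRogan^DPUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;PUM-DT_MAX_1234;UM-DT_MAX_123;'

# extract_with_sed (hypothetical name): same commands as the script above.
extract_with_sed() {
  sed '
    # Line 1 is the header: branch to the end, so it is printed unchanged.
    1b
    # Copy the whole line to the hold space.
    h
    # Drop everything up to and including the literal text DRogan^D.
    s/.*DRogan^D//
    # Remove every PUM-DT_MAX_1234 field and its optional trailing semicolon.
    s/PUM-DT_MAX_1234;\{0,1\}//g
    # Remove the final semicolon, then delete the line if nothing is left.
    s/;$//
    /./!d
    # Append the reduced line to the saved copy and load both lines.
    H
    g
    # Replace everything from the first comma through the newline with a comma.
    s/,.*\n/,/
  '
}

printf '%s\n' "$sample" | extract_with_sed
```

Because of 1b the header line is passed through as is; using 1d there instead would drop it from the output.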

In silgon's answer, the command worked for me after I removed the space in '! /\.mp4/'.

  • I wanted to remove "none" images from the 'docker images' output, using AWK:

 docker images | awk '!/<none>/'

  • I wanted to print only the name and tag from the 'docker images' output, i.e. columns 1 and 2, again excluding "none" images, using AWK:

 docker images | awk '!/<none>/ {print $1,$2}'
