
AWK removal of lines in CSV with two empty fields?

I rarely use awk, and I think I'm forgetting something basic about using it with CSV files. I have the following file, called new2.csv:

Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0,BR,,Till/WBR,
FILE____007P_1.DZT,0.042,BR,,Till/WBR,
FILE____007P_1.DZT,0.083,BR,,Till/WBR,
FILE____007P_1.DZT,0.125,BR,,Till/WBR,
FILE____007P_1.DZT,0.167,BR,,Till/WBR,
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
FILE____007P_1.DZT,0.292,BR,,Till/WBR,

I would like to only retain the rows that have values in the fourth or sixth column (lines 7 and 8) using awk.

I tried a few things to check what's going on in there:

awk -F',' '{print NR, "->", $4}' new2.csv

returns line 7 -> 92.58 for line 7 and nothing for the rest of the lines, so that's good. Next, I tried

awk -F',' '{print NR, "->", $6}' new2.csv

which returns line 8 -> 29.3, so we're still good.

Thinking I have it solved, I move on to

awk -F',' '$4!=""' new2.csv

and it prints the header line and the seventh line of the file, as expected. Moving on to column 6, I write the same expression and it returns the entire contents of new2.csv. In an attempt to troubleshoot, I try

awk -F',' '{print NR, "->", $6!=""}' new2.csv

and that returns line 1 -> 1, line2 -> 1, ..., line 8 ->1, ...etc, so there's my problem. What's going on? Is there a way I can fix it?

The comma at the end of the line seems like it might be the source of the problem, but after reading quite a few posts I'm still not sure what to do about it. awk '{print substr($0,0,length($0)-1)}' new.csv doesn't remove the last comma either. I generated the csv on a Windows 8 machine and am using awk on a Linux box in bash.

$ awk -F, '($4$6)~/./' file
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3

or if you want fields that contain only spaces to be considered "empty" too:

$ awk -F, '($4$6)~/[^[:space:]]/' file
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3

and if you want to exclude the header line:

$ awk -F, '(NR>1) && (($4$6)~/[^[:space:]]/)' file
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
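
One possible reason the original `$6!=""` test matched every line: the file was generated on Windows, so each line probably ends in a carriage return (CRLF), and that `\r` becomes part of the last field. The sketch below assumes that is the case. It builds a small CRLF sample (the file name crlf_sample.csv is made up for illustration) and strips the trailing `\r` before testing the fields.

```shell
# Build a tiny sample with Windows (CRLF) line endings; the file name is hypothetical.
printf 'Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time\r\n' > crlf_sample.csv
printf 'FILE____007P_1.DZT,0.167,BR,,Till/WBR,\r\n' >> crlf_sample.csv
printf 'FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,\r\n' >> crlf_sample.csv
printf 'FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3\r\n' >> crlf_sample.csv

# Without stripping, $6 holds at least "\r" on every line, so every row counts as non-empty.
unstripped=$(awk -F, '$6 != "" { n++ } END { print n + 0 }' crlf_sample.csv)

# Remove the trailing carriage return first; changing $0 makes awk re-split the fields.
kept=$(awk -F, '{ sub(/\r$/, "") } $4 != "" || $6 != ""' crlf_sample.csv)

echo "rows matched without stripping: $unstripped"
printf '%s\n' "$kept"
```

If the file turns out to have plain Unix line endings, the `sub()` call is a harmless no-op, so it is reasonably safe to leave in place.
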
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time FILE____007P_1.DZT,0,BR,,Till/WBR,
FILE____007P_1.DZT,0.042,BR,,Till/WBR,
FILE____007P_1.DZT,0.083,BR,,Till/WBR,
FILE____007P_1.DZT,0.125,BR,,Till/WBR,
FILE____007P_1.DZT,0.167,BR,,Till/WBR,
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
FILE____007P_1.DZT,0.292,BR,,Till/WBR,

hzhang@dell-work ~ $ cat test.awk 
#!/usr/bin/awk -f
BEGIN{
    FS = ","
}
{
    # skip the header line
    if(FNR > 1){
        # checks column 4 has a non-empty value
        if($4 !=""){
            print FNR,"->", $4
        }

        # checks column 6 has a non-empty value
        if($6 != ""){
            print FNR,"->", $6
        }

    }
}
hzhang@dell-work ~ $ awk -f test.awk sample.csv 
6 -> 92.58
7 -> 29.3

If you want to run it as a one-liner on the command line:

hzhang@dell-work ~ $ awk -F, '(FNR>1){if($4 != ""){ print FNR,"->",$4  }; if($6 != ""){ print FNR,"->",$6  }}' sample.csv 
6 -> 92.58
7 -> 29.3

You are mixing up patterns and actions.

awk -F',' '$4!=""' is a pattern without an action. It says "do the default action if $4 is not empty". The default action is to print the input line.

awk -F',' '{print NR, "->", $6!=""}' is an action without a pattern. A missing pattern makes every line match, so the action runs on all of them. It says: print the line number, then ->, then the result of comparing $6 to the empty string (a boolean 0 or 1, with 1 meaning "true").
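
To make the distinction concrete, here is a small sketch on inline data (the three input records are made up for illustration). The first command is a pattern with no action, so awk prints only the matching line; the second is an action with no pattern, so it runs on every line and prints the 0/1 result of the comparison.

```shell
# Three made-up records; only the second has a non-empty second field.
data='a,
b,x
c,'

# Pattern only: the default action prints each line where $2 is non-empty.
pattern_only=$(printf '%s\n' "$data" | awk -F, '$2 != ""')

# Action only: runs on every line and prints the truth value of the comparison.
action_only=$(printf '%s\n' "$data" | awk -F, '{ print NR, "->", ($2 != "") }')

printf '%s\n' "$pattern_only"
printf '%s\n' "$action_only"
```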

To do what you want, you can use a bare pattern:

awk -F, '$4!="" || $6!=""'

This says: "do the default action if $4 is not empty, or $6 is not empty". The default action is to print the input line.

Or you can use a bare action:

awk -F, '{ if ($4 != "" || $6 != "") { print $0; } }'

This one says: for every line, if $4 is not empty or $6 is not empty, print $0 (the whole input line).

PS: From your question and the commands you tried, it's not clear whether you want lines where both the 4th and 6th fields are non-empty, or just one of them. If it's both, then the tests above should use and ( && ) instead of or ( || ).
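
As a rough illustration of that last point, the sketch below runs both variants on a shortened copy of the sample data (written to a hypothetical file named demo.csv, assumed to have Unix line endings). With `||`, the header and the two rows that have a value in either column are kept; with `&&`, only the header survives, because no data row has both fields filled.

```shell
# Shortened copy of the sample data; demo.csv is a made-up file name.
cat > demo.csv <<'EOF'
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0.167,BR,,Till/WBR,
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
FILE____007P_1.DZT,0.292,BR,,Till/WBR,
EOF

# Keep lines where at least one of the two fields is non-empty.
either=$(awk -F, '$4 != "" || $6 != ""' demo.csv)

# Keep lines where both fields are non-empty.
both=$(awk -F, '$4 != "" && $6 != ""' demo.csv)

printf 'either:\n%s\n' "$either"
printf 'both:\n%s\n' "$both"
```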

