
AWK removal of lines in CSV with two empty fields?

I rarely use awk, and I think I'm forgetting something basic about using it with CSV files. I have the following file, called new2.csv:

Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0,BR,,Till/WBR,
FILE____007P_1.DZT,0.042,BR,,Till/WBR,
FILE____007P_1.DZT,0.083,BR,,Till/WBR,
FILE____007P_1.DZT,0.125,BR,,Till/WBR,
FILE____007P_1.DZT,0.167,BR,,Till/WBR,
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
FILE____007P_1.DZT,0.292,BR,,Till/WBR,

I would like to retain only the rows that have a value in the fourth or sixth column (lines 7 and 8) using awk.

I tried a few things to check what's going on in there:

awk -F',' '{print NR, "->", $4}' new2.csv

shows 7 -> 92.58 for line 7 and nothing for the rest of the lines, so that's good. Next, I tried

awk -F',' '{print NR, "->", $6}' new2.csv

which shows 8 -> 29.3 for line 8, so we're still good.

Thinking I had it solved, I moved on to

awk -F',' '$4!=""' new2.csv

and it prints the header line and the seventh line of the file, as expected. Moving on to column 6, I wrote the same expression, and it returned the entire contents of new2.csv. In an attempt to troubleshoot, I tried

awk -F',' '{print NR, "->", $6!=""}' new2.csv

and that returns 1 -> 1, 2 -> 1, ..., 8 -> 1, and so on, so there's my problem. What's going on? Is there a way I can fix it?

The comma at the end of the line seems like it might be the source of the problem, but after reading quite a few posts I'm still not sure what to do about it. awk '{print substr($0,0,length($0)-1)}' new.csv doesn't remove the last comma either. I generated the CSV on a Windows 8 machine and am using awk on a Linux box in bash.

$ awk -F, '($4$6)~/./' file
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3

or, if you want fields that contain only whitespace to be considered "empty" too:

$ awk -F, '($4$6)~/[^[:space:]]/' file
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3

and if you want to exclude the header line:

$ awk -F, '(NR>1) && (($4$6)~/[^[:space:]]/)' file
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
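
The question mentions that the CSV was generated on Windows, so one possible reason for $6!="" matching every line is CRLF line endings: the carriage return would land in the last field and make it non-empty, and a plain . would match it as well. This is an assumption, since the thread never confirms the file's line endings. A minimal sketch, using a few rows from the question written out with CRLF endings into a temporary file (crlf_demo.csv is a name chosen only for this demo), that strips a trailing carriage return before testing the fields:

```shell
# Recreate a few rows from the question with Windows (CRLF) line endings.
# crlf_demo.csv is a hypothetical file name used only for this demo.
demo=$(mktemp -d)
{
  printf 'Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time\r\n'
  printf 'FILE____007P_1.DZT,0.167,BR,,Till/WBR,\r\n'
  printf 'FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,\r\n'
  printf 'FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3\r\n'
} > "$demo/crlf_demo.csv"

# Without cleanup, $6 always holds at least "\r", so every line passes the test.
naive_count=$(awk -F, '$6 != "" { n++ } END { print n + 0 }' "$demo/crlf_demo.csv")

# Strip the trailing carriage return first; changing $0 makes awk re-split the fields.
kept=$(awk -F, '{ sub(/\r$/, "") } NR > 1 && ($4 != "" || $6 != "")' "$demo/crlf_demo.csv")

echo "lines matched without cleanup: $naive_count"
printf '%s\n' "$kept"
rm -rf "$demo"
```

The [^[:space:]] variant above should also tolerate this case without the sub() call, because a carriage return counts as whitespace in the [:space:] class.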
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time FILE____007P_1.DZT,0,BR,,Till/WBR,
FILE____007P_1.DZT,0.042,BR,,Till/WBR,
FILE____007P_1.DZT,0.083,BR,,Till/WBR,
FILE____007P_1.DZT,0.125,BR,,Till/WBR,
FILE____007P_1.DZT,0.167,BR,,Till/WBR,
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
FILE____007P_1.DZT,0.292,BR,,Till/WBR,

hzhang@dell-work ~ $ cat test.awk 
#!/usr/bin/awk -f
BEGIN{
    FS = ","
}
{
    # skip the header line
    if(FNR > 1){
        # checks column 4 has a non-empty value
        if($4 !=""){
            print FNR,"->", $4
        }

        # checks column 6 has a non-empty value
        if($6 != ""){
            print FNR,"->", $6
        }

    }
}
hzhang@dell-work ~ $ awk -f test.awk sample.csv 
6 -> 92.58
7 -> 29.3

If you want to run it as a one-liner on the command line:

hzhang@dell-work ~ $ awk -F, '(FNR>1){if($4 != ""){ print FNR,"->",$4  }; if($6 != ""){ print FNR,"->",$6  }}' sample.csv 
6 -> 92.58
7 -> 29.3

You are mixing up patterns and actions.

awk -F',' '$4!=""' is a pattern without an action. It says "perform the default action if $4 is not empty". The default action is to print the input line.

awk -F',' '{print NR, "->", $6!=""}' is an action without a pattern. A missing pattern makes every line match, so the action runs for all lines. It says: print the line number, then ->, then the result of comparing $6 to the empty string (a boolean: 1 for true, 0 for false).

To do what you want, you can use a bare pattern:

awk -F, '$4!="" || $6!=""'

This says: "perform the default action if $4 is not empty or $6 is not empty". The default action is to print the input line.

Or you can use a bare action:

awk -F, '{ if ($4 != "" || $6 != "") { print $0; } }'

This one says: if $4 is not empty or $6 is not empty, print $0 (the input line).

P.S. From your question and your attempts, it's not clear whether you want lines where both the 4th and 6th fields are non-empty, or lines where at least one of them is. If it's both, then the tests above should use and ( && ) instead of or ( || ).
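
To make the difference between the two readings concrete, here is a small sketch that runs both tests on a few rows copied from the question. It assumes the file has plain LF line endings, and it adds NR > 1 to leave the header out, which is my choice for the demo and not part of the commands above:

```shell
# Write a few rows from the question to a temporary file (LF line endings assumed).
demo=$(mktemp -d)
cat > "$demo/new2.csv" <<'EOF'
Filename,Dist.(ft),BR Name,BR 2-Way Time,Till/WBR Name,Till/WBR 2-Way Time
FILE____007P_1.DZT,0.167,BR,,Till/WBR,
FILE____007P_1.DZT,0.208,BR,92.58,Till/WBR,
FILE____007P_1.DZT,0.25,BR,,Till/WBR,29.3
EOF

# "or": keep data rows where at least one of the two fields has a value.
either=$(awk -F, 'NR > 1 && ($4 != "" || $6 != "")' "$demo/new2.csv")

# "and": count data rows where both fields have a value.
both_count=$(awk -F, 'NR > 1 && $4 != "" && $6 != "" { n++ } END { print n + 0 }' "$demo/new2.csv")

printf 'rows with $4 or $6 set:\n%s\n' "$either"
echo "rows with both \$4 and \$6 set: $both_count"
rm -rf "$demo"
```

On this sample the || test keeps the 0.208 and 0.25 rows, while no data row satisfies the && test, so || is probably the one that matches what the question describes.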
