How to display file columns containing a specific word using awk
I would like to print all columns that contain a given word, for example "watermelon". I was thinking about combining these two commands, because each works on its own (one does something for every column in the file and the other checks whether a column contains a specific word).
awk '{for(i=1;i<=NF-1;i++) printf $i" "; print $i}' a.csv
awk -F"," '{if ($2 == " watermelon") print $2}' a.csv
But when I try to put them together, my code doesn't work:
#!/bin/bash
awk '{for(i=1;i<=NF-1;i++)
awk -F"," '{if ($i == " watermelon")
print $i}' a.csv
}' a.csv
For example, this is my file a.csv:
lp, type, name, number, letter
1, fruit, watermelon, 6, a
2, fruit, apple, 7, b
3, vegetable, onion, 8, c
4, vegetable, broccoli, 6, b
5, fruit, orange, 5, c
And this is the result I would like to get when searching for the word watermelon:
name
watermelon
apple
onion
broccoli
orange
Here's one that processes the data twice:
$ awk -F', ' '                         # remember to set OFS if you need one
NR==FNR {                              # on the first run
    for(i=1;i<=NF;i++)                 # find
        if($i=="watermelon")           # watermelon fields
            a[i]                       # and mark them
    next
}
FNR==1 {                               # at the start of the second run
    found=0
    for(i in a) {                      # test whether any field was marked
        found=1
        break
    }
    if(!found)                         # no such field:
        exit                           # stop here
}                                      # otherwise fall through so the header is printed too
{                                      # on the second run
    for(i=1;i<=NF;i++)
        if(i in a)
            b=b (b==""?"":OFS) $i      # buffer those fields for output
    print b                            # and output
    b=""                               # clean that buffer for next record
}' file file
Output:
name
watermelon
apple
onion
broccoli
orange
$ cat tst.awk
BEGIN { FS=OFS=", " }
NR==FNR {
    for (inFldNr=1; inFldNr<=NF; inFldNr++) {
        if ( $inFldNr == tgt ) {
            hits[inFldNr]
        }
    }
    next
}
FNR==1 {
    for (inFldNr=1; inFldNr<=NF; inFldNr++) {
        if ( inFldNr in hits ) {
            out2in[++numOutFlds] = inFldNr
        }
    }
}
{
    for (outFldNr=1; outFldNr<=numOutFlds; outFldNr++) {
        inFldNr = out2in[outFldNr]
        printf "%s%s", $inFldNr, (outFldNr<numOutFlds ? OFS : ORS)
    }
}
$ awk -v tgt='watermelon' -f tst.awk file file
name
watermelon
apple
onion
broccoli
orange
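As a quick check that the two-pass idea is not tied to one word, here is a minimal self-contained sketch that recreates the sample data from the question and searches for "fruit" instead. The temporary file, the variable name `result`, and the shortened awk variable names are assumptions made for this illustration only; the logic is a condensed form of the script above.

```shell
# Recreate the sample data from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
lp, type, name, number, letter
1, fruit, watermelon, 6, a
2, fruit, apple, 7, b
3, vegetable, onion, 8, c
4, vegetable, broccoli, 6, b
5, fruit, orange, 5, c
EOF

# The file is read twice: pass 1 marks every column that holds tgt in some row,
# pass 2 prints only the marked columns.
result=$(awk -v tgt='fruit' '
BEGIN   { FS=OFS=", " }
NR==FNR { for (i=1; i<=NF; i++) if ($i == tgt) hits[i]; next }
FNR==1  { for (i=1; i<=NF; i++) if (i in hits) out2in[++n] = i }
        { for (j=1; j<=n; j++) printf "%s%s", $(out2in[j]), (j<n ? OFS : ORS) }
' "$tmp" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

With the sample data this selects the second column, so the header `type` is followed by the five values from that column.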
The main difference between the above and @JamesBrown's approach is that in the second pass over the file my script only loops over the fields to be output, while James' loops over all input fields and so will be slower in what is presumably the normal case, where not all input fields have to be output.
Regarding printf $i in your code, by the way: never do that. Always use printf "%s", $i for any input data instead, as the former will fail when your input contains printf formatting characters such as %s.
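To make the printf advice concrete, here is a small sketch using a made-up input value (`50%s`, chosen only for illustration) that contains a conversion specifier. The safe form passes the data as an argument to a fixed format string; the unsafe form is left commented out because its behaviour depends on the awk implementation.

```shell
# Hypothetical input whose only field contains a printf conversion specifier.
line='50%s'

# Safe: the field is an argument to the "%s" format, so it is printed literally.
safe=$(printf '%s\n' "$line" | awk '{ printf "%s\n", $1 }')
printf '%s\n' "$safe"

# Unsafe (commented out on purpose): here the field itself becomes the format
# string, so awk tries to interpret the "%s" inside the data.
# printf '%s\n' "$line" | awk '{ printf $1 }'
```

The safe command prints `50%s` unchanged.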