AWK count occurrences of column A based on uniqueness of column B

I have a file with several columns, and I want to count the occurrences of each value in one column based on how many unique values of a second column appear alongside it. For example:

column 10            column 15
orange               New York
green                New York
blue                 New York
gold                 New York
orange               Amsterdam
blue                 New York
green                New York
orange               Sweden
blue                 Tokyo
gold                 New York

I am fairly new to using commands like awk and am looking to gain more practical knowledge.

I've tried some different variations of

awk '{A[$10 OFS $15]++} END {for (k in A) print k, A[k]}' myfile

but since I don't quite understand the code, the output was not what I expected.

I am expecting output of

orange     3
blue       2
green      1
gold       1

With GNU awk, assuming tab is your field separator:

awk '{count[$10 FS $15]++}END{for(j in count) print j}' FS='\t' file | cut -d $'\t' -f 1 | sort | uniq -c | sort -nr

Output:

      3 orange
      2 blue
      1 green
      1 gold

I suppose it could be more elegant.
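To see why the pipeline above gives the right counts, it may help to run its stages separately. The following is only a sketch on a hypothetical two-column sample file (colors in column 1 and cities in column 2, instead of columns 10 and 15, and no header row); the variable names `sample`, `pairs`, and `counts` are made up for illustration.

```shell
# Hypothetical sample file: column 1 = color, column 2 = city (tab-separated, no header).
# printf reuses its format string for each pair of arguments, giving 10 lines.
sample=$(mktemp)
printf '%s\t%s\n' orange 'New York' green 'New York' blue 'New York' \
    gold 'New York' orange Amsterdam blue 'New York' green 'New York' \
    orange Sweden blue Tokyo gold 'New York' > "$sample"

# Stage 1: the array keys are the distinct (color, city) pairs,
# so repeated rows such as "blue / New York" collapse into one key.
pairs=$(awk -F'\t' '{ count[$1 FS $2]++ } END { for (j in count) print j }' "$sample" | sort)
printf '%s\n' "$pairs"

# Stage 2: keep only the color of each distinct pair, then count how often each color is left.
counts=$(printf '%s\n' "$pairs" | cut -f 1 | sort | uniq -c | sort -nr)
printf '%s\n' "$counts"

rm -f "$sample"
```

The original attempt in the question printed the per-pair counter itself, which is the number of times each row is repeated rather than the number of distinct cities per color.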

A version using a single GNU awk invocation (it works with non-GNU awk too, but the output is then unsorted):

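$ gawk 'BEGIN{ OFS=FS="\t" }
        # Skip the header; key on the (city, color) pair so duplicate rows collapse
        NR>1 { names[$2,$1]=$1 }
        # Count the distinct pairs per color, then print by descending count
        END { for (n in names) colors[names[n]]++;
              PROCINFO["sorted_in"] = "@val_num_desc";
              for (c in colors) print c, colors[c] }' input.tsv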
orange  3
blue    2
gold    1
green   1

Adjust column numbers as needed to match real data.
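
If sorted output is wanted without GNU awk, one possible approach is to count each pair only the first time it is seen and leave the ordering to sort. This is a minimal sketch, assuming a tab-separated file with a header row and the two columns of interest in positions 1 and 2; the temporary file and the array names `seen` and `count` are illustrative, not part of the answer above.

```shell
# Hypothetical sample data: header row, then color<TAB>city.
tmp=$(mktemp)
printf 'color\tcity\n' > "$tmp"
printf '%s\t%s\n' orange 'New York' green 'New York' blue 'New York' \
    gold 'New York' orange Amsterdam blue 'New York' green 'New York' \
    orange Sweden blue Tokyo gold 'New York' >> "$tmp"

tab=$(printf '\t')

# seen[...]++ is 0 (false) only the first time a (color, city) pair appears,
# so count[color] ends up as the number of distinct cities for that color.
result=$(awk -F'\t' '
    NR > 1 && !seen[$1, $2]++ { count[$1]++ }
    END { for (c in count) print c "\t" count[c] }
' "$tmp" | sort -t "$tab" -k2,2nr -k1,1)

printf '%s\n' "$result"
rm -f "$tmp"
```

Sorting by count in descending order and then by name makes the order of ties deterministic, which a plain `for (c in count)` loop does not guarantee.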


Bonus solution that uses sqlite3:

$ sqlite3 -batch -noheader <<EOF
.mode tabs
.import input.tsv names
SELECT "column 10", count(DISTINCT "column 15") AS total
FROM names
GROUP BY "column 10"
ORDER BY total DESC, "column 10";
EOF
orange  3
blue    2
gold    1
green   1
