
Print common values in columns using bash

I have a file with two columns:

apple apple
ball cat
cat hat
dog delta

I need to extract the values that are common to both columns (that is, values that occur in both), like this:

apple apple
cat cat 

The items in each column are in no particular order.

Could you please try the following and let me know if it helps.

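awk '
{
  col1[$1]++    # count occurrences of each value in column 1
  col2[$2]++    # count occurrences of each value in column 2
}
END{
  for(i in col1){
    if(i in col2){
      total=col1[i]+col2[i]
      # print the value once per occurrence across both columns
      for(count=1;count<=total;count++){
        printf("%s%s",i,(count==total?ORS:OFS))
      }
    }
  }
}' Input_file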

NOTE: A value that is found in both columns is printed as many times as it occurs in the two columns combined.
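To make the counting behaviour in the note concrete, here is a minimal sketch run against made-up data in which cat occurs twice in column 1 and once in column 2. The function name common_counted and the sample rows are assumptions for illustration only, and the output is piped through sort because awk does not guarantee the order of a for-in loop.

```shell
# Hypothetical helper: reads two-column input on stdin and prints each value
# found in both columns once per occurrence across the two columns.
common_counted() {
  awk '
  { col1[$1]++; col2[$2]++ }
  END {
    for (v in col1) {
      if (v in col2) {
        total = col1[v] + col2[v]
        line = v
        for (n = 2; n <= total; n++) line = line OFS v
        print line
      }
    }
  }' | sort
}

# Made-up sample: "cat" is in column 1 twice and in column 2 once.
printf '%s\n' 'apple apple' 'cat hat' 'cat cat' 'dog delta' | common_counted
```

With this input, apple is printed twice on its line and cat three times, since the two counts are added together.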

Assuming I can use Unix commands:

cut -d' ' -f2 fil | grep -E -x "$(cut -d' ' -f1 fil | paste -sd'|' -)"

Basically, this does the following:

The second cut command collects all the words in the first column. The paste command joins them with a pipe (i.e. apple|ball|cat|dog).

The first cut command takes the second column of words and pipes them into grep. The -E option enables extended regular expressions (what egrep provides), and -x only accepts a match that covers the whole line, so cat cannot match inside a longer word.
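If the words may contain regular expression metacharacters, a variant that treats the first column as fixed strings may be safer. The following is only a sketch under that assumption: the function name common_fixed and the temporary sample file are hypothetical, and it relies on grep accepting several newline-separated patterns in a single -e argument.

```shell
# Made-up sample file, removed again when the shell exits.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'apple apple' 'ball cat' 'cat hat' 'dog delta' > "$sample"

# Hypothetical helper: print column-2 values that also appear in column 1.
common_fixed() {
  # -F: patterns are fixed strings, not regular expressions
  # -x: a pattern must match the whole line
  # -e: the pattern list, one column-1 word per line
  cut -d' ' -f2 "$1" | grep -F -x -e "$(cut -d' ' -f1 "$1")"
}

common_fixed "$sample"
```

The results come out in the order in which they appear in the second column.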

Here is the closest I could get. It only prints rows where both columns hold the same value, so you would still need to loop through the whole file and print a value when it turns up again in the other column.

Code

cat file.txt | gawk   '$1==$2 {print $1,"=",$2}'

or

gawk '$1==$2 {print $1,"=",$2}' file.txt

$ awk '{a[$1];b[$2]} END{for(k in a) if(k in b) print k}' file
apple
cat

To print each value twice, change print k to print k,k.
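Because awk leaves the iteration order of for(k in a) unspecified, the output order of the one-liner above can vary between implementations. A two-pass variant that keeps the order of the first column might look like the sketch below. The function name common_in_order and the temporary sample file are assumptions, not part of the original answer.

```shell
# Made-up sample file, removed again when the shell exits.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'apple apple' 'ball cat' 'cat hat' 'dog delta' > "$sample"

# Hypothetical helper: the file is read twice.
# Pass 1 (NR==FNR) records every column-2 value.
# Pass 2 prints each column-1 value that was recorded, once, as "value value".
common_in_order() {
  awk 'NR==FNR { b[$2]; next }
       ($1 in b) && !seen[$1]++ { print $1, $1 }' "$1" "$1"
}

common_in_order "$sample"
```

For the sample file from the question, this prints the pairs in the format the question asks for, in the order the values appear in column 1.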

With sort/join:

$ join <(cut -d' ' -f1 file | sort) <(cut -d' ' -f2 file | sort)
apple
cat

Or perhaps, with a helper function:

$ function f() { cut -d' ' -f"$1" file | sort; }; join <(f 1) <(f 2)
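One thing to watch with join is repeated values: a value that occurs several times in either column is joined once per pairing, so it can appear more than once in the output. A possible alternative, sketched here under the assumption that each common value should be listed exactly once, deduplicates with sort -u and intersects with comm -12. The names sorted_col and common_once and the sample data are hypothetical, and temporary files are used instead of process substitution so the sketch also runs in a plain POSIX shell.

```shell
# Made-up sample file in which "cat" is repeated in column 1;
# it is removed again when the shell exits.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
printf '%s\n' 'apple apple' 'cat hat' 'cat cat' 'dog delta' > "$sample"

# Hypothetical helper: column $1 of file $2, sorted with duplicates removed.
sorted_col() { cut -d' ' -f"$1" "$2" | sort -u; }

# Hypothetical helper: values present in both columns, each listed once.
common_once() {
  c1=$(mktemp); c2=$(mktemp)
  sorted_col 1 "$1" > "$c1"
  sorted_col 2 "$1" > "$c2"
  comm -12 "$c1" "$c2"    # -12 hides lines unique to either input
  rm -f "$c1" "$c2"
}

common_once "$sample"
```

Here comm -12 keeps only the lines present in both sorted inputs, which is the set intersection of the two columns.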
