Join two files linux

Question

I am trying to join two files but they don't have the same number of lines. I need to join them by the second column.

File1:

 11#San Noor#New York, US
 22#Maria Shiry#Dubai, UA
 55#John Smith#London, England
 66#Viki Sam#Roman, Italy
 81#Sara Moheeb#Montreal, Canada

File2:

 C1#Steve White#11
 C2#Hight Look#21
 E1#The Heaven is more#52
 I1#The Roma Seen#55

The output should be:

The output for paired lines should look like:

 San Noor#Steve White

The output for unpairable lines should look like:

 Sara Moheeb#NA

(After joining, File3 should contain 5 lines and look as follows.)

  San Noor#Steve White
  Maria Shiry#Hight Look
  John Smith#The Heaven is more
  Viki Sam#The Roma Seen
  Sara Moheeb#NA

I have tried to join these two files using this command:

join -t '#' -j2 -e "NA" <(sort -t '#' -k2 File1) <(sort -t '#' -k2 File2) > File3

It says that both files are not sorted. Also, I need a way to fill in missing values after join.

Answer 1

Extract relevant columns and paste them together.

paste -d '#' <(cut -d '#' -f2 file1) <(cut -d '#' -f2 file2)

However, this does not produce NA when one file has fewer lines than the other: paste simply leaves the missing field empty. You could pipe the output through something along the lines of

awk -v OFS='#' -F'#' '{ for (i=1;i<=NF;++i) if (length($i) == 0) $i="NA"; print }'

to replace empty fields with the string NA.
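For reference, the paste-and-fill idea can be put together end to end roughly like this. It is only a sketch: the function name pair_by_line is made up for illustration, and it uses a temporary file instead of process substitution so that it also runs in a plain POSIX sh.

```shell
# pair_by_line FILE_A FILE_B
# Hypothetical helper (name is an assumption, not from the question):
# pairs the 2nd '#'-separated field of each file line by line and
# writes NA where one file has run out of lines.
pair_by_line() {
    tmp=$(mktemp) || return 1
    # Second column of FILE_B goes to a temporary file.
    cut -d '#' -f2 "$2" > "$tmp"
    # Second column of FILE_A arrives on stdin ("-") and is pasted
    # next to it; awk then replaces every empty field with NA.
    cut -d '#' -f2 "$1" | paste -d '#' - "$tmp" |
        awk -F'#' -v OFS='#' '{
            for (i = 1; i <= NF; ++i) if (length($i) == 0) $i = "NA"
            print
        }'
    rm -f "$tmp"
}
```

Called as pair_by_line File1 File2 > File3, this should give the five lines asked for in the question, with Sara Moheeb#NA as the last one.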

Your approach with join is also possible, but the files contain nothing to "join" on. So join on an imaginary column of line numbers:

join -t'#' -eNA -a1 -a2 -o1.2,2.2 <(cut -d'#' -f2 file1 | nl -w1 -s'#') <(cut -d'#' -f2 file2 | nl -w1 -s'#')
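One caveat with the line-number trick: join expects its join field to be in lexical order, and plain numbers stop being lexically ordered once there are ten or more lines ("10" sorts before "9"). A possible workaround, sketched below, is to let nl zero-pad the numbers. The function name join_by_line_number and the width of 6 digits are assumptions chosen for illustration; temporary files are used so the sketch does not depend on process substitution.

```shell
# join_by_line_number FILE_A FILE_B
# Hypothetical helper (name is an assumption): joins the 2nd
# '#'-separated field of both files on a generated line-number column.
join_by_line_number() {
    dir=$(mktemp -d) || return 1
    # -ba   number every line, including empty ones
    # -nrz  right-justified with leading zeros, so 000009 < 000010
    # -w6   width of the number column (assumes fewer than 1000000 lines)
    cut -d '#' -f2 "$1" | nl -ba -nrz -w6 -s '#' > "$dir/a"
    cut -d '#' -f2 "$2" | nl -ba -nrz -w6 -s '#' > "$dir/b"
    # -a 1 -a 2 keep unpairable lines from both inputs;
    # -e NA fills the fields that -o asks for but that are missing.
    join -t '#' -e NA -a 1 -a 2 -o 1.2,2.2 "$dir/a" "$dir/b"
    rm -rf "$dir"
}
```

Usage would be join_by_line_number file1 file2 > file3. For the two small sample files the result should be the same as with the one-liner above; the padding only matters for longer inputs.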

solution1
1 ACCPTED 2020-06-29 10:32:24
