I have three files, each with an ID and a value.
sdt5z@fir-s:~/test$ ls
a.txt b.txt c.txt
sdt5z@fir-s:~/test$ cat a.txt
id1 1
id2 2
id3 3
sdt5z@fir-s:~/test$ cat b.txt
id1 4
id2 5
id3 6
sdt5z@fir-s:~/test$ cat c.txt
id1 7
id2 8
id3 9
I want to create a file that looks like this...
id1 1 4 7
id2 2 5 8
id3 3 6 9
...preferably using a single command.
I'm aware of the join and paste commands. Paste will duplicate the id column each time:
sdt5z@fir-s:~/test$ paste a.txt b.txt c.txt
id1 1 id1 4 id1 7
id2 2 id2 5 id2 8
id3 3 id3 6 id3 9
Join works well, but for only two files at a time:
sdt5z@fir-s:~/test$ join a.txt b.txt
id1 1 4
id2 2 5
id3 3 6
sdt5z@fir-s:~/test$ join a.txt b.txt c.txt
join: extra operand `c.txt'
Try `join --help' for more information.
I'm also aware that paste can take STDIN as one of the arguments by using "-". Eg, I can replicate the join command using:
sdt5z@fir-s:~/test$ cut -f2 b.txt | paste a.txt -
id1 1 4
id2 2 5
id3 3 6
But I'm still not sure how to modify this to accomodate three files.
Since I'm doing this inside a perl script, I know I can do something like putting this inside a foreach loop, something like join file1 file2 > tmp1, join tmp1 file3 > tmp2, etc. But this gets messy, and I would like to do this with a one-liner.
join a.txt b.txt|join - c.txt
应该足够了
Since you're doing it inside a Perl script , is there any specific reason you're NOT doing the work in Perl as opposed to spawning in shell?
Something like (NOT TESTED! caveat emptor):
use File::Slurp; # Slurp the files in if they aren't too big
my @files = qw(a.txt b.txt c.txt);
my %file_data = map ($_ => [ read_file($_) ] ) @files;
my @id_orders;
my %data = ();
my $first_file = 1;
foreach my $file (@files) {
foreach my $line (@{ $file_data{$file} }) {
my ($id, $value) = split(/\s+/, $line);
push @id_orders, $id if $first_file;
$data{$id} ||= [];
push @{ $data{$id} }, $value;
}
$first_file = 0;
}
foreach my $id (@id_orders) {
print "$d " . join(" ", @{ $data{$id} }) . "\n";
}
pr -m -t -s\ file1.txt file2.txt|gawk '{print $1"\t"$2"\t"$3"\t"$4}'> finalfile.txt
Considering file1 and file2 have 2 columns and 1 and 2 represents columns from file1 and 3 and 4 represents columns from file2.
You can also print any column from each file in this way and it will take any number of files as input. If your file1 has 5 columns for example, then $6 will be the first column of the file2.
perl -lanE'$h{$F[0]} .= " $F[1]" END{say $_.$h{$_} foreach keys %h}' *.txt
Should work, can't test it as I'm answering from my mobile. You also could sort the output if you put a sort
between foreach
and keys
.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.