I need a shell script that prints lines from three files in a fixed pattern.
The files are file1.txt, file2.txt and file3.txt.
I need the output to be:
line1 of file1.txt
line2 of file1.txt
line1 of file2.txt
line2 of file2.txt
line1 of file3.txt
line2 of file3.txt
line3 of file1.txt
line4 of file1.txt
line3 of file2.txt
line4 of file2.txt
line3 of file3.txt
line4 of file3.txt
...
How can this be done in a shell script? It should also print only the non-blank lines.
Perl to the rescue:
perl -e 'open $FH[ @FH ], "<", $_ or die $! for @ARGV;
while (grep !eof $_, @FH) {
for my $fh (@FH) {
print scalar <$fh> for 1, 2;
}
}' -- file*.txt
It keeps all the files opened at the same time (the @FH array contains the filehandles). While at least one hasn't ended yet, it prints two lines from each.
What about the following script, which accepts the files as parameters:
TOTAL_LINES=$(wc -l < "$1")
for n in $(seq 1 2 $TOTAL_LINES); do
for file in "$@"; do
sed -n "$n{p;n;p}" "$file"
done
done
I've assumed that all files have the same number of lines, as suggested in the comments, but the script also works when they don't, provided you pass the longest file as the first parameter.
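Having to pass the longest file first can be avoided by taking the largest line count over all parameters. The following is only a sketch under assumptions: it relies on GNU seq and sed as in the script above, the function names max_lines and interleave_pairs are made up for illustration, and it uses a sed range address (n,n+1) in place of {p;n;p}.

```shell
#!/bin/bash
# Hypothetical helper: print the largest line count among the given files.
max_lines() {
    local max=0 count file
    for file in "$@"; do
        # The arithmetic expansion strips any padding that wc may add.
        count=$(( $(wc -l < "$file") ))
        if [ "$count" -gt "$max" ]; then
            max=$count
        fi
    done
    echo "$max"
}

# Hypothetical wrapper: two lines from each file per round, in argument order.
interleave_pairs() {
    local total n file
    total=$(max_lines "$@")
    for n in $(seq 1 2 "$total"); do
        for file in "$@"; do
            # Print lines n and n+1; nothing is printed once a file is exhausted.
            sed -n "${n},$((n + 1))p" "$file"
        done
    done
}
```

It would be called as interleave_pairs file1.txt file2.txt file3.txt, with the files in any order. Like the original, it rereads every file on each round, so it is better suited to small inputs.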
A little explanation of the parts of the script you are less likely to know:
seq generates the sequence of numbers that for iterates over. Its syntax is seq from increment upTo, and it is used instead of the {from..upTo..increment} syntax, which doesn't accept variables.
$@ is an array of the parameters passed to the script.
sed -n "$n{p;n;p}" is a sed command that won't display the text by default, but will execute p, n and p again for line $n; p prints the current line, n goes to the next line.
Consider four similar input files:
$ cat file1.txt
line1 of file1.txt
line2 of file1.txt
line3 of file1.txt
line4 of file1.txt
We create printer.sh as follows:
#!/bin/bash
LINES=2 # Configure this to set the number of consecutive lines per file
MAX_HANDLE=3
# Create descriptors 3,4,... for filename1,filename2....
for var in "$@"
do
eval exec "$MAX_HANDLE"'<"$var"'
((MAX_HANDLE++))
done
# Start infinite loop
while :
do
# First descriptor is 3
COUNTER=3
# Loop over all open file descriptors from 3 to MAX_HANDLE - 1
while [ $COUNTER -lt $MAX_HANDLE ]; do
# Read $LINES lines from the open file descriptor
LINE_COUNTER=0
while [ $LINE_COUNTER -lt $LINES ]; do
read -r line <&"$COUNTER" || DONE=true
if [[ "$DONE" = true ]]; then
exit
fi
# Print the line that was read
echo "$line"
((LINE_COUNTER++))
done
((COUNTER++))
done
done
On executing this, each input parameter is attached to a new file descriptor and read $LINES lines at a time (in this case 2 lines at a time). The script exits at the first end of file, so it only works for files of identical length, as the OP posited.
$ ./printer.sh file1.txt file2.txt file3.txt file4.txt
line1 of file1.txt
line2 of file1.txt
line1 of file2.txt
line2 of file2.txt
line1 of file3.txt
line2 of file3.txt
line1 of file4.txt
line2 of file4.txt
line3 of file1.txt
line4 of file1.txt
line3 of file2.txt
line4 of file2.txt
line3 of file3.txt
line4 of file3.txt
line3 of file4.txt
line4 of file4.txt
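printer.sh above stops at the first end of file and prints blank lines as well. A possible variant that tolerates files of different lengths and skips blank lines is sketched below. It is an illustration rather than a drop-in replacement: the function name interleave_fds is hypothetical, the {fd}<file redirection needs bash 4.1 or later, and blank lines still count toward the two lines read per round (they are just not printed).

```shell
#!/bin/bash
# Hypothetical variant of printer.sh: one open descriptor per file,
# two lines per file per round, blank lines suppressed.
interleave_fds() {
    local per_round=2          # consecutive lines read from each file per round
    local -a fds=()
    local fd file line i n open_count

    # Let bash pick a free descriptor for every file and remember it.
    for file in "$@"; do
        exec {fd}<"$file"
        fds+=("$fd")
    done

    open_count=${#fds[@]}
    while [ "$open_count" -gt 0 ]; do
        for i in "${!fds[@]}"; do
            fd=${fds[i]}
            # Skip files that have already reached end of file.
            [ -n "$fd" ] || continue
            for ((n = 0; n < per_round; n++)); do
                # "|| [ -n ... ]" keeps a final line that lacks a trailing newline.
                if IFS= read -r line <&"$fd" || [ -n "$line" ]; then
                    # Print only lines holding a non-whitespace character.
                    if [[ $line =~ [^[:space:]] ]]; then
                        printf '%s\n' "$line"
                    fi
                else
                    # End of file: close the descriptor and mark the slot as done.
                    exec {fd}<&-
                    fds[i]=
                    open_count=$((open_count - 1))
                    break
                fi
            done
        done
    done
}
```

It would be called as interleave_fds file1.txt file2.txt file3.txt file4.txt. Each file is read once, front to back, which avoids the repeated scans of the sed-based loops.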
You can use paste with awk to get your output:
paste -d $'\01' file[123].txt |
awk -F '\01' 'NR%2{for (i=1; i<=NF; i++) a[i]=$i; next}
{for (i=1; i<=NF; i++) print a[i] ORS $i}'
line1 of file1.txt
line2 of file1.txt
line1 of file2.txt
line2 of file2.txt
line1 of file3.txt
line2 of file3.txt
line3 of file1.txt
line4 of file1.txt
line3 of file2.txt
line4 of file2.txt
line3 of file3.txt
line4 of file3.txt
paste: we create side-by-side control-A (ASCII 1) delimited output.
awk: with control-A as the field separator, we output 2 lines from each column.
Lots of answers. This one is awk.
Create the test files:
for f in file{1,2,3}.txt; do rm -f "$f"; for n in {1,2,3,4}; do echo "line $n of file $f" >> "$f"; done; done
And the awk program:
awk '
FNR == 1 && NR>1 {
exit # exit after completing the first file
}
{
# print 2 lines from the first file
if (NF) print
getline; if (NF) print
# print 2 lines from each other file
for (i=2; i<ARGC; i++) {
getline < ARGV[i]; if (NF) print
getline < ARGV[i]; if (NF) print
}
}
' file{1,2,3}.txt
The if (NF) print lines exclude blank lines, since the number of whitespace-separated fields in a blank line is zero.
line 1 of file file1.txt
line 2 of file file1.txt
line 1 of file file2.txt
line 2 of file file2.txt
line 1 of file file3.txt
line 2 of file file3.txt
line 3 of file file1.txt
line 4 of file file1.txt
line 3 of file file2.txt
line 4 of file file2.txt
line 3 of file file3.txt
line 4 of file file3.txt
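The same NF test can also serve as a stand-alone filter at the end of the other pipelines in this thread, most of which do not drop blank lines on their own. A minimal sketch, with drop_blank being a made-up name:

```shell
#!/bin/bash
# Hypothetical helper: remove empty and whitespace-only lines from standard input.
# awk evaluates NF (the field count) as a condition and prints the line when it is non-zero.
drop_blank() {
    awk 'NF'
}
```

For example, ./printer.sh file1.txt file2.txt file3.txt | drop_blank would suppress the blank lines after interleaving. Filtering at the end means blank lines still occupy a slot in each pair of lines.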
This may not be the most efficient approach, but it will work, assuming that all your file names are in $files and $total_lines contains the number of lines in each file. The outer loop steps by two so that each pass prints a pair of lines from every file:
for line in $(seq 1 2 $total_lines)
do
for file in $files
do
sed '/^$/d' "$file" | sed -n "$line,$((line + 1))p"
done
done
sed '/^$/d' removes all the empty lines from the stream;
sed -n "$line,$((line + 1))p" prints line $line and the line after it
Using paste and awk.
$ cat test.sh
paste -d '|' file* | awk -F\| '{
if(NR % 2 == 1) {
file1 = $1;
file2 = $2;
file3 = $3;
} else {
file1 = file1 "\n" $1;
file2 = file2 "\n" $2;
file3 = file3 "\n" $3;
print file1;
print file2;
print file3;
}
}'
Because all the files have the same length, we can paste them side by side first and print whenever the row number is even.
If you don't mind creating intermediate/temporary files, split(1), which is part of coreutils on practically every Linux distribution, might be handy:
#!/bin/bash
# Split files every 2 lines using a numeric suffix
for f in file*.txt; do
split -d -l 2 "${f}" "${f}"split
done
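# Move the numeric suffix to the front of each intermediate file name, so we can glob them in numeric order
# (reversing the whole name would also reverse the digits, so piece 10 would sort before piece 01)
for f in file*split*; do
mv "${f}" "ordered${f##*split}${f%split*}"
done
cat ordered* && rm ordered*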
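If cluttering the working directory is a concern, the same split-based idea can keep its pieces in a temporary directory. The sketch below assumes GNU split (for -d and -a) and mktemp; the function name interleave_split and the piece naming scheme are invented for illustration. With four-digit suffixes it handles up to 10000 pieces per file.

```shell
#!/bin/bash
# Hypothetical variant: split every file into 2-line pieces inside a temporary
# directory, then concatenate the pieces ordered by piece number, then by file.
interleave_split() {
    local tmpdir f piece name i=0
    local -a ordered=()
    tmpdir=$(mktemp -d) || return 1

    for f in "$@"; do
        i=$((i + 1))
        # Pieces are named <file index>.<piece number>, e.g. 0001.0000
        split -d -a 4 -l 2 "$f" "$tmpdir/$(printf '%04d' "$i")."
    done

    # Rename to <piece number>_<file index> so that a plain glob sorts correctly.
    for piece in "$tmpdir"/*.*; do
        [ -e "$piece" ] || continue
        name=${piece##*/}
        mv "$piece" "$tmpdir/${name#*.}_${name%%.*}"
    done

    ordered=("$tmpdir"/*_*)
    # The -e test guards against an unmatched glob when all inputs are empty.
    if [ -e "${ordered[0]}" ]; then
        cat "${ordered[@]}"
    fi
    rm -rf "$tmpdir"
}
```

It would be called as interleave_split file1.txt file2.txt file3.txt. Files are taken in argument order, so the result does not depend on how the input names sort.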