
Copy each line of file1 into every other line of file2 (Python)

Sorry for the ridiculous title; it's probably why I couldn't find an answer on Google.

I have 5 text files that I want to combine into 1. I'd like to have the format like this:

line1 of file1
line1 of file2
line1 of file3
line1 of file4
line1 of file5
line2 of file1
line2 of file2
line2 of file3
line2 of file4
line2 of file5

and so on.

I tried using the bash command below, but it seems like it's too much for sed or something: it just inserts the text into the first line, not the line of the variable I'm calling.

for ((num=1; num<=66; num++)) ; do
    queryline=$(sed -n "${num}p" "file2.txt")
    sed -i "${num}i ${queryline}" "file1.txt"
done

(I tried this too)

for ((num=1; num<=66; num++)) ; do
    numa=$((num + 1))
    queryline=$(sed -n "${num}p" "file2.txt")
    sed -i "${numa}i ${queryline}" "file1.txt"
done

I'm thinking this might be easier with python (3.4), but I'm not sure how to do it. Tips please anyone?

Use contextlib.ExitStack() to handle the input files as a group and zip to read lines from all of the files:

import contextlib
import os

filenames = ['a','b','c','d','e']
output_file = 'fred'

# setup files for test
for filename in filenames:
    with open(filename, 'w') as fp:
        for i in range(10):
            fp.write('%s %d\n' % (filename, i))
if os.path.exists(output_file):
    os.remove(output_file)

# open all the files and use zip to interleave the lines    
with open(output_file, 'w') as out_file, contextlib.ExitStack() as in_files:
    files = [in_files.enter_context(open(fname)) for fname in filenames]
    for lines in zip(*files):
        # if you're not sure last line has a \n
        for line in lines:
            out_file.write(line)
            if not line.endswith('\n'):
                out_file.write('\n')
        # if you are sure last line has a \n
        # out_file.write(''.join(lines))

# show the result (the with block closes the file again)
with open(output_file) as fp:
    print(fp.read())
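
One thing to watch with zip: it stops as soon as the shortest file runs out, so trailing lines of longer files are silently dropped. If the files might differ in length, a minimal sketch using itertools.zip_longest could look like this. The function name interleave_files and its parameters are made up for illustration, and skipping the missing lines (rather than writing blank ones) is an assumption about what you want.

```python
import contextlib
import itertools


def interleave_files(in_paths, out_path):
    """Write line 1 of every input file, then line 2 of every file, and so on."""
    with open(out_path, 'w') as out_file, contextlib.ExitStack() as stack:
        files = [stack.enter_context(open(p)) for p in in_paths]
        # zip_longest pads exhausted files with None instead of stopping early
        for lines in itertools.zip_longest(*files):
            for line in lines:
                if line is None:
                    continue  # this file has no more lines
                # make sure every written line ends with a newline
                out_file.write(line if line.endswith('\n') else line + '\n')
```

Call it as interleave_files(filenames, output_file) with the names from the test setup above.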

If you are sure you have exactly 5 files, this will work. If you need to make this work on a varying number of files, it gets a bit more complex.

with open("file1.txt") as f:
    file1 = f.readlines()
with open("file2.txt") as f:
    file2 = f.readlines()
with open("file3.txt") as f:
    file3 = f.readlines()
with open("file4.txt") as f:
    file4 = f.readlines()
with open("file5.txt") as f:
    file5 = f.readlines()
with open("outfile.txt", "w") as outfile:
    # zip stops at the shortest file, so extra lines in longer files are dropped
    for lines in zip(file1, file2, file3, file4, file5):
        outfile.writelines(lines)
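
For a varying number of files, one possible sketch keeps the same readlines-and-zip idea but collects the line lists in a loop and flattens the zipped tuples with itertools.chain.from_iterable. The helper name interleave_lines is hypothetical, and like the code above it assumes all files have the same number of lines (zip truncates to the shortest).

```python
import itertools


def interleave_lines(paths):
    """Return the lines of the given files interleaved, one line from each in turn."""
    all_lines = []
    for path in paths:
        with open(path) as f:
            all_lines.append(f.readlines())
    # zip(*all_lines) yields one tuple per line number; chain flattens the tuples
    return list(itertools.chain.from_iterable(zip(*all_lines)))
```

The result can then be written in one go with outfile.writelines(interleave_lines(paths)).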

Your bash didn't work because you were trying to insert into a line which didn't exist before you insert.

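echo > file_to_insert.txt    # seed with one empty line so there is always a line to insert before
for i in {1..5}; do
  for ((num=1; num<=66; num++)); do
    line_num=$((num * i))    # position of this line once files 1..i are interleaved
    queryline=$(sed -n "${num}p" "file${i}.txt")
    sed -i "${line_num}i ${queryline}" file_to_insert.txt
  done
done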

Here is a GNU awk solution (GNU-specific because it relies on ARGIND, the index of the file currently being read):

awk -v t=5 '{c=c<FNR?FNR:c; for (i=1;i<=t;i++) if (ARGIND==i) a[i FS FNR]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=t;j++) print a[j FS i]}' file1 file2 file3 file4 file5

You set t to the number of files.

Example:

cat f1
file1 one
file1 two
file1 three
file1 four

cat f2
file2 one
file2 two
file2 three
file2 four

cat f3
file3 one
file3 two
file3 three
file3 four

awk -v t=3 '{c=c<FNR?FNR:c; for (i=1;i<=t;i++) if (ARGIND==i) a[i FS FNR]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=t;j++) print a[j FS i]}' f1 f2 f3
file1 one
file2 one
file3 one
file1 two
file2 two
file3 two
file1 three
file2 three
file3 three
file1 four
file2 four
file3 four

How does it work?

awk -v t=3 '                    # Set t to the number of files
    {c=c<FNR?FNR:c              # Track the largest line count of any file in c
    for (i=1;i<=t;i++)          # Loop through the files one by one
        if (ARGIND==i)          # Test which file we are on
            a[i FS FNR]=$0}     # Store the line in array a
END {
    for (i=1;i<=c;i++)          # Loop through the line numbers
        for (j=1;j<=t;j++)      # Loop through the file numbers
            print a[j FS i]}    # Print the stored line
' f1 f2 f3                      # Read the files
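
Since the question mentions Python, the same store-then-print idea can be sketched there as well. This is only a rough translation, not part of the awk answer: the function name interleave_like_awk is invented, a dict keyed by (file index, line number) stands in for the awk array, and a missing entry becomes an empty string in the same way an unset awk array element prints as an empty line.

```python
def interleave_like_awk(paths):
    """Mimic the awk script: store every line, then emit by line number and file."""
    store = {}       # (file index, line number) -> line text, like a[i FS FNR]
    max_lines = 0    # like c: the largest line count seen in any file
    for file_idx, path in enumerate(paths, start=1):       # file_idx plays the role of ARGIND
        with open(path) as f:
            for line_no, line in enumerate(f, start=1):    # line_no plays the role of FNR
                store[(file_idx, line_no)] = line.rstrip('\n')
                max_lines = max(max_lines, line_no)
    result = []
    for line_no in range(1, max_lines + 1):                # outer loop: line numbers
        for file_idx in range(1, len(paths) + 1):          # inner loop: files
            # a line that does not exist in a shorter file becomes an empty string
            result.append(store.get((file_idx, line_no), ''))
    return result
```

Printing the result with print('\n'.join(interleave_like_awk(['f1', 'f2', 'f3']))) should give the same ordering as the awk example above.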

A good way to do this is to stick with standard utilities. Here paste (specified by POSIX) is the recommended tool:

paste -d '\n' file1 file2 file3 file4 file5

or, if you like Bashisms:

paste -d '\n' file{1..5}

This generalizes trivially to any number of files.
