
Copy each line of file1 into every other line of file2 (Python)

Sorry for the ridiculous title; that is probably why I couldn't find an answer on Google.

I have 5 text files that I want to merge into 1. I want the result formatted like this:

line1 of file1
line1 of file2
line1 of file3
line1 of file4
line1 of file5
line2 of file1
line2 of file2
line2 of file3
line2 of file4
line2 of file5

And so on.

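I tried the bash commands below, but this seems to be too much for sed or the like: it just inserts the text at the first line, not at the line given by the variable I am passing in.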

for ((num=1; num<=66; num++)) ; do
    queryline=$(sed -n "${num}p" "file2.txt")
    sed -i "${num}i ${queryline}" "file1.txt"
done

(I also tried this)

for ((num=1; num<=66; num++)) ; do
    numa=$((num + 1))
    queryline=$(sed -n "${num}p" "file2.txt")
    sed -i "${numa}i ${queryline}" "file1.txt"
done

I think this might be easier with Python (3.4), but I'm not sure how to do it. Tips, anyone?

Use contextlib.ExitStack() to manage the input files as a group, and zip to read one line from every file at a time:

import contextlib
import os

filenames = ['a','b','c','d','e']
output_file = 'fred'

# setup files for test
for filename in filenames:
    with open(filename, 'w') as fp:
        for i in range(10):
            fp.write('%s %d\n' % (filename, i))
if os.path.exists(output_file):
    os.remove(output_file)

# open all the files and use zip to interleave the lines    
with open(output_file, 'w') as out_file, contextlib.ExitStack() as in_files:
    files = [in_files.enter_context(open(fname)) for fname in filenames]
    for lines in zip(*files):
        # if you're not sure last line has a \n
        for line in lines:
            out_file.write(line)
            if not line.endswith('\n'):
                out_file.write('\n')
        # if you are sure last line has a \n
        # out_file.write(''.join(lines))

# show the result, closing the file afterwards
with open(output_file) as result:
    print(result.read())
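
One thing to keep in mind with the code above: zip stops as soon as the shortest file runs out, so extra lines in longer files are silently dropped. If the files may differ in length, a variation along these lines keeps every line. This is only a sketch; `interleave_all` and `_MISSING` are names invented for this illustration.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel meaning "this source has no more lines"

def interleave_all(line_sources):
    """Yield lines round-robin from each source until all are exhausted."""
    for group in zip_longest(*line_sources, fillvalue=_MISSING):
        for line in group:
            if line is _MISSING:
                continue  # this source ran out before the others
            # make sure every emitted line ends with a newline
            yield line if line.endswith('\n') else line + '\n'

# Sources can be open file objects or plain lists of strings.
short = ['a 0\n', 'a 1\n']
longer = ['b 0\n', 'b 1\n', 'b 2']
print(''.join(interleave_all([short, longer])), end='')
```

With open files, `out_file.writelines(interleave_all(files))` would take the place of the inner loop.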

This works if you are sure there are only 5 files. If you need to do it for a varying number of files, it gets a bit more complicated.

with open("file1.txt") as f:
    file1 = f.readlines()
with open("file2.txt") as f:
    file2 = f.readlines()
with open("file3.txt") as f:
    file3 = f.readlines()
with open("file4.txt") as f:
    file4 = f.readlines()
with open("file5.txt") as f:
    file5 = f.readlines()
# zip yields one tuple per line number: (line of file1, ..., line of file5)
with open("outfile.txt", "w") as outfile:
    for lines in zip(file1, file2, file3, file4, file5):
        outfile.writelines(lines)
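
For a varying number of files, the same readlines-and-zip idea can be wrapped in a function that takes a list of names. This is a sketch, not part of the answer above; `interleave_files` and its parameters are hypothetical names. Like the five-file version, it reads everything into memory and stops at the shortest file.

```python
def interleave_files(filenames, output_name):
    """Write line 1 of every file, then line 2 of every file, and so on."""
    contents = []
    for name in filenames:
        with open(name) as f:
            contents.append(f.readlines())
    with open(output_name, "w") as outfile:
        # zip stops at the shortest file, just like the five-file version
        for lines in zip(*contents):
            outfile.writelines(lines)

# Example call (file names are placeholders):
# interleave_files(["file1.txt", "file2.txt", "file3.txt"], "outfile.txt")
```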

Your bash doesn't work because the line you are trying to insert at does not exist before the insertion.

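# Start with a single empty line so sed always has a line to insert before
# (it is left over as the last line of the result).
echo > file_to_insert.txt
for i in {1..5}; do
  for ((num=1; num<=66; num++)); do    # 66 lines per file, as in the question
    line_num=$((num * i))              # where line $num of file $i belongs
    queryline=$(sed -n "${num}p" "file${i}.txt")
    # GNU sed one-line "i" form; assumes lines are non-empty and contain no backslashes
    sed -i "${line_num}i ${queryline}" file_to_insert.txt
  done
done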
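
The position arithmetic in that loop (line num of file i goes in front of line num*i of the output built so far) can be checked without sed. The following is a small in-memory simulation, assuming all files have the same number of lines; `interleave_by_insertion` is a made-up name for this sketch.

```python
def interleave_by_insertion(files):
    """files: list of equal-length lists of lines, one list per input file."""
    result = [""]  # the single placeholder line the output starts with
    for i, lines in enumerate(files, start=1):
        for num, line in enumerate(lines, start=1):
            line_num = num * i                 # 1-based target line, as in the shell loop
            result.insert(line_num - 1, line)  # insert before that line
    return result[:-1]  # the placeholder always ends up last; drop it

files = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
print(interleave_by_insertion(files))
# → ['a1', 'b1', 'c1', 'a2', 'b2', 'c2']
```

After the first file, the output holds its lines in order; each later file i then slots its line num in right after the num-th group of i-1 lines, which is exactly position num*i.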

Here is a GNU awk solution (GNU-specific because of ARGIND, the file selector):

awk -v t=5 '{c=c<FNR?FNR:c; for (i=1;i<=t;i++) if (ARGIND==i) a[i FS FNR]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=t;j++) print a[j FS i]}' file1 file2 file3 file4 file5

Set t to the number of files.

Example:

cat f1
file1 one
file1 two
file1 three
file1 four

cat f2
file2 one
file2 two
file2 three
file2 four

cat f3
file3 one
file3 two
file3 three
file3 four

awk -v t=3 '{c=c<FNR?FNR:c; for (i=1;i<=t;i++) if (ARGIND==i) a[i FS FNR]=$0} END {for (i=1;i<=c;i++) for (j=1;j<=t;j++) print a[j FS i]}' f1 f2 f3
file1 one
file2 one
file3 one
file1 two
file2 two
file3 two
file1 three
file2 three
file3 three
file1 four
file2 four
file3 four

How does it work?

awk -v t=3 '                    # Set t to the number of files
    {c=c<FNR?FNR:c              # Track the largest record count of any file in c
    for (i=1;i<=t;i++)          # Loop through the files one by one
        if (ARGIND==i)          # Test which file we are on
            a[i FS FNR]=$0}     # Store the line in array a
END {
    for (i=1;i<=c;i++)          # Loop through line numbers
        for (j=1;j<=t;j++)      # Loop through file numbers
            print a[j FS i]}    # Print data from the array
' f1 f2 f3                      # Read the files
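
Since the question asks about Python, here is a rough port of the same store-then-print idea. It is a sketch that works on in-memory lists of lines rather than real files, and `interleave_like_awk` is a made-up name. As in awk, a missing entry (a file shorter than the longest one) comes out as an empty line.

```python
def interleave_like_awk(files):
    """files: list of lists of lines (without newlines), one list per input file."""
    a = {}  # a[(file_index, line_number)] = line, like a[i FS FNR] in the awk script
    c = 0   # largest line count seen in any file
    for i, lines in enumerate(files, start=1):        # i plays the role of ARGIND
        for fnr, line in enumerate(lines, start=1):   # fnr plays the role of FNR
            c = max(c, fnr)
            a[(i, fnr)] = line
    out = []
    for fnr in range(1, c + 1):                # loop over line numbers
        for i in range(1, len(files) + 1):     # loop over files
            out.append(a.get((i, fnr), ""))    # unset entries become empty lines
    return out

f1 = ["file1 one", "file1 two"]
f2 = ["file2 one", "file2 two"]
print("\n".join(interleave_like_awk([f1, f2])))
```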

A good way to achieve what you want is to stick with standard utilities: paste (specified by POSIX) is the suggestion here:

paste -d '\n' file1 file2 file3 file4 file5

Or, if you like Bashisms:

paste -d '\n' file{1..5}

This generalizes trivially to any number of files.
