Python: merge files by rules

I need to write a Python script that accepts two files and merges them into a new file according to the following rules:

1) Take 1 word from the first file, followed by 2 words from the second file.
2) When one file reaches its end, copy the rest of the other file to the merged file without change.

I wrote the script below, but I only managed to read 1 word from each file at a time. A complete script would be nice, but I really want to understand, in words, how I can do this on my own.

This is what I wrote:

def exercise3(file1,file2):
    lstFile1=readFile(file1)
    lstFile2=readFile(file2)

    with open("mergedFile", 'w') as outfile:
        merged = [j for i in zip(lstFile1, lstFile2) for j in i]
        for word in merged:
            outfile.write(word)


def readFile(filename):
    lines = []
    with open(filename) as file:
        for line in file:
            line = line.strip()
            for word in line.split():
                lines.append(word)
    return lines

Your immediate problem is that zip alternates items from the iterables you give it: in short, it's a 1:1 mapping, where you need 1:2. Try this:

lstFile2a = lstFile2[0::2]
lstFile2b = lstFile2[1::2]
... zip(lstFile1, lstFile2a, lstFile2b)

This is a bit inefficient (each slice copies half of the list), but it gets the job done.
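As a rough illustration of the slicing idea, here is a minimal sketch. The sample word lists are made up and stand in for whatever readFile returns; only the slicing and the three-way zip come from the answer above.

```python
# Hypothetical sample data standing in for the word lists returned by readFile.
lstFile1 = ["a", "b", "c"]
lstFile2 = ["1", "2", "3", "4", "5", "6"]

lstFile2a = lstFile2[0::2]  # words at even positions: "1", "3", "5"
lstFile2b = lstFile2[1::2]  # words at odd positions:  "2", "4", "6"

# Each zip step yields (one word from file 1, two consecutive words from file 2);
# the comprehension flattens those 3-tuples into one list.
merged = [word for group in zip(lstFile1, lstFile2a, lstFile2b) for word in group]
print(merged)  # ['a', '1', '2', 'b', '3', '4', 'c', '5', '6']
```

Note that zip stops at the shortest input, so any leftover words are silently dropped and still need separate handling.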

Another way is to zip up pairs (2-tuples) in lstFile2 before zipping it with lstFile1. A third way is to forget zipping altogether, and run your own indexing:

# Number of complete rounds: 1 word from file 1 plus 2 words from file 2.
for i in range(min(len(lstFile1), len(lstFile2)//2)):
    outfile.write(lstFile1[i])
    outfile.write(lstFile2[2*i])
    outfile.write(lstFile2[2*i+1])

However, this leaves you with the leftovers of the longer file to handle.
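One possible way to deal with the leftovers is sketched below. The function name merge_words is hypothetical, and the handling of a final partial round (when the second list has a single word left) is an assumption, since the question does not say what should happen there. The sketch also works on word lists, so the original whitespace of the leftover part is not preserved.

```python
def merge_words(words1, words2):
    """Interleave 1 word from words1 with 2 words from words2, then append leftovers."""
    rounds = min(len(words1), len(words2) // 2)  # number of complete 1:2 rounds
    merged = []
    for i in range(rounds):
        merged.append(words1[i])
        merged.append(words2[2 * i])
        merged.append(words2[2 * i + 1])

    rest1 = words1[rounds:]      # unused words from the first list
    rest2 = words2[2 * rounds:]  # unused words from the second list
    # Assumption: if both lists still have words, rest2 holds exactly one word,
    # so the pattern continues for one partial round before copying the rest.
    merged.extend(rest1[:1])
    merged.extend(rest2)
    merged.extend(rest1[1:])
    return merged


print(merge_words(["a", "b", "c"], ["1", "2"]))       # ['a', '1', '2', 'b', 'c']
print(merge_words(["a"], ["1", "2", "3", "4"]))       # ['a', '1', '2', '3', '4']
```

When writing the result to the output file, something like outfile.write(" ".join(merged)) keeps the words separated; writing each word on its own, as in the loops above, runs them together with no spaces.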

These aren't particularly elegant, but they should get you moving.
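For the second approach mentioned earlier (zipping lstFile2 into pairs first), one possible reading is the sketch below. The sample lists are made up, and passing the same iterator to zip twice is just one common way to form the pairs, not necessarily what was intended.

```python
# Hypothetical sample data standing in for the word lists returned by readFile.
lstFile1 = ["a", "b", "c"]
lstFile2 = ["1", "2", "3", "4", "5", "6"]

# Passing the same iterator to zip twice pulls two consecutive words per step.
it = iter(lstFile2)
pairs = list(zip(it, it))  # [('1', '2'), ('3', '4'), ('5', '6')]

merged = []
for word, pair in zip(lstFile1, pairs):
    merged.append(word)  # 1 word from the first list
    merged.extend(pair)  # 2 words from the second list
print(merged)  # ['a', '1', '2', 'b', '3', '4', 'c', '5', '6']
```

As with the other zip-based variants, this stops at the shorter input, so leftover words are not copied.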
