
Reading from two files

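I am trying to write a script that takes several two-column files, writes the first and second columns of the first file to a result file, and then appends only the second column of every other file to those lines.

Example: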

File one                         File two
Column 1     Column 2            don't take this column     Column 2
Line 1       Line 2              don't take this column     Line 2

The final result should be:

Result file
Column 1    Column 2    Column 2
Line 1      Line 2      Line 2
etc.

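I have almost everything working except appending the second columns to the first file's data. I have ResultFile open in r+ mode, and I want to read each line already in it (the first file's data), read the corresponding line from each of the other files, append it, and write the combined line back.

Here is my code for that second part: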

#Open each subsequent file for 2nd column data
while n < i:
    with open(FileNames[n], "r") as InputFile:
        with ResultFile:
            Temp2 = ResultFile.readline()
            for line in InputFile:
                Temp2 += line.split(",", 1)[-1]
                if line == LastValue:
                    break
            if len(ResultFile.readline()) == "":
                break
        YData += (Temp2 + "\n")
    n += 1
InputFile.close

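The if/break check isn't working properly at the moment; I just needed a way to end the infinite loop. Also, LastValue is equal to the last x-column value from the first file.

Any help would be appreciated.

EDIT: I am trying to do this without using itertools.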

It might help to open all the files first and store the handles in a list.

fileHandles = []
for f in fileNames:
    fileHandles.append(open(f))

Then, for each line of the first file, you can call readline() on each of the other files in turn.

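dataLine = fileHandles[0].readline()
while dataLine:
    # Strip the trailing newline first so it does not end up inside a field.
    outFields = dataLine.rstrip("\n").split(",")[0:2]
    for inFile in fileHandles[1:]:
        otherLine = inFile.readline()
        # Take only the second column from every other file.
        field = otherLine.rstrip("\n").split(",")[1]
        outFields.append(field)
    print(",".join(outFields))
    dataLine = fileHandles[0].readline()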
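
As a rough sketch of how the two snippets above could be combined in Python 3, the function below writes the merged lines to a result file and closes every handle at the end. The name merge_second_columns and its two parameters are assumptions made for illustration, not something from the question. It also assumes that every file has at least as many lines as the first one and that every line has at least two comma-separated fields.

```python
def merge_second_columns(file_names, result_name):
    """Write columns 1-2 of the first file plus column 2 of every other file."""
    file_handles = []
    try:
        for name in file_names:
            file_handles.append(open(name))
        with open(result_name, "w") as result:
            # The first file drives the loop; the others are read in step with it.
            for first_line in file_handles[0]:
                out_fields = first_line.rstrip("\n").split(",")[0:2]
                for in_file in file_handles[1:]:
                    other_line = in_file.readline()
                    out_fields.append(other_line.rstrip("\n").split(",")[1])
                result.write(",".join(out_fields) + "\n")
    finally:
        # Close whatever was opened, even if an error occurred part-way.
        for handle in file_handles:
            handle.close()
```

Because the loop ends when the first file runs out of lines, no sentinel such as LastValue is needed to stop it.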

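Fundamentally, you want to loop through all the input files at the same time, the way zip does with iterators.

This example shows the pattern without the distraction of files and CSVs: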

file_row_col = [[['1A1', '1A2'],  # File 1, Row A, Column 1 and 2
                 ['1B1', '1B2']], # File 1, Row B, Column 1 and 2
                [['2A1', '2A2'],  # File 2
                 ['2B1', '2B2']],
                [['3A1', '3A2'],  # File 3
                 ['3B1', '3B2']]]

outrows = []

for rows in zip(*file_row_col):
    outrow = [rows[0][0]] # Column 1 of the first file
    for row in rows:
        outrow.extend(row[1:]) # Only Column 2 and on
    outrows.append(outrow)

# outrows is now [['1A1', '1A2', '2A2', '3A2'],
#                 ['1B1', '1B2', '2B2', '3B2']]

The key is the transformation done by zip(*file_row_col).
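
To make that transformation concrete, here is a small standalone illustration (the list rows_per_file is made up for this sketch). Unpacking with * passes each inner list to zip as a separate argument, so zip yields one tuple per row position instead of one list per file:

```python
rows_per_file = [
    ["1A", "1B"],  # rows of file 1
    ["2A", "2B"],  # rows of file 2
    ["3A", "3B"],  # rows of file 3
]

# zip(*rows_per_file) is equivalent to
# zip(rows_per_file[0], rows_per_file[1], rows_per_file[2]).
rows_side_by_side = list(zip(*rows_per_file))
print(rows_side_by_side)
# [('1A', '2A', '3A'), ('1B', '2B', '3B')]

# zip stops at the shortest input, so extra rows in longer inputs are dropped.
truncated = list(zip(["a", "b", "c"], ["x", "y"]))
print(truncated)
# [('a', 'x'), ('b', 'y')]
```

The second call is worth keeping in mind: if the input files have different lengths, the zip-based loop silently ends at the shortest one.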

Now let's reimplement this pattern with actual files. I'll use the csv library to make reading and writing the CSVs easier and safer.

import csv

infilenames = ['1.csv','2.csv','3.csv']
outfilename = 'result.csv'

# Python 3: open csv files in text mode with newline='' (Python 2 used 'wb'/'rb').
with open(outfilename, 'w', newline='') as out:
    outcsv = csv.writer(out)
    infiles = []
    # We can't use a plain `with` statement for a list of resources, so we
    # use try...finally the old-fashioned way instead.
    try:
        incsvs = []
        for infilename in infilenames:
            infile = open(infilename, 'r', newline='')
            infiles.append(infile)
            incsvs.append(csv.reader(infile))
        for inrows in zip(*incsvs):
            outrow = [inrows[0][0]] # Column 1 of file 1
            for inrow in inrows:
                outrow.extend(inrow[1:])
            outcsv.writerow(outrow)
    finally:
        for infile in infiles:
            infile.close()

Given these input files:

#1.csv
1A1,1A2
1B1,1B2

#2.csv
2A1,2A2
2B1,2B2

#3.csv
3A1,3A2
3B1,3B2

the code produces this result.csv:

1A1,1A2,2A2,3A2
1B1,1B2,2B2,3B2
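
On Python 3.3 and later, contextlib.ExitStack is one alternative to the try...finally bookkeeping, since it can manage a variable number of files inside a single with block. The following is a minimal sketch of the same zip pattern under that assumption; the function name merge_csv_columns is hypothetical and not part of the original answer.

```python
import csv
from contextlib import ExitStack


def merge_csv_columns(infilenames, outfilename):
    """Keep column 1 of the first file plus columns 2+ of every file."""
    with ExitStack() as stack:
        # enter_context registers each file so that all of them are closed
        # when the with block exits, even if an exception is raised.
        out = stack.enter_context(open(outfilename, "w", newline=""))
        incsvs = [
            csv.reader(stack.enter_context(open(name, "r", newline="")))
            for name in infilenames
        ]
        outcsv = csv.writer(out)
        for inrows in zip(*incsvs):
            outrow = [inrows[0][0]]  # Column 1 of file 1
            for inrow in inrows:
                outrow.extend(inrow[1:])
            outcsv.writerow(outrow)
```

Calling merge_csv_columns(['1.csv', '2.csv', '3.csv'], 'result.csv') on the input files above should give the same two rows.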
