Python compare two csv files and append data to csv file

I have two CSV files in the following format:

The first is outputTweetsDate.csv:

Here is some text;13.09.13 16:45
Here is more text;13.09.13 16:45
And yet another text;13.09.13 16:46

The second file is apiSheet.csv:

13.09.13 16:46;89.56
13.09.13 16:45;90.40

I want to compare these two files and, if the two datetime values match, write the value and the text to a new file (finalOutput.csv):

|89.56|,|Here is some text|
|89.56|,|Here is more text|
|90.49|,|And yet another text|

This is the code I have so far:

with open("apiSheet.csv", "U") as in_file1, open("outputTweetsDate.csv", "rb") as in_file2,open("finalOutput.csv", "wb") as out_file:
   reader1 = csv.reader(in_file1,delimiter=';')
   reader2 = csv.reader(in_file2,delimiter='|')
   writer = csv.writer(out_file,delimiter='|')
   for row1 in reader1:
       for row2 in reader2:
           if row1[0] == row2[1]:
               data = [row1[1],row2[0]]
               print data
               writer.writerow(data)

I edited my code and it now partly works, but it does not iterate through all of my data correctly. Currently my output is this:

|89.56|,|Here is some text|
|89.56|,|Here is more text|

So it does not show me the third row, even though the datetime values match. It seems like it is not iterating through the files correctly.

Thank you!

Your inner loop reaches the end of file2 (outputTweetsDate.csv) during the first pass of the outer loop, so reader2 is already exhausted by the time the second line of file1 is read.
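To make that concrete, here is a minimal sketch (written for Python 3, with an in-memory string standing in for outputTweetsDate.csv and a ';' delimiter assumed from the sample data). A csv.reader is a one-shot iterator, so looping over it a second time yields nothing:

```python
import csv
import io

# In-memory stand-in for outputTweetsDate.csv (';'-separated, as in the sample).
tweets = io.StringIO(
    "Here is some text;13.09.13 16:45\n"
    "And yet another text;13.09.13 16:46\n"
)
reader = csv.reader(tweets, delimiter=";")

first_pass = list(reader)   # consumes every row
second_pass = list(reader)  # the reader is exhausted, so this is empty

print(len(first_pass))   # 2
print(len(second_pass))  # 0
```

This is the situation the inner loop is in on every iteration of the outer loop after the first one.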

Try this snippet:

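import csv

with open("apiSheet.csv", "U") as in_file1, open("outputTweetsDate.csv", "rb") as in_file2, open("finalOutput.csv", "wb") as out_file:
    reader1 = csv.reader(in_file1, delimiter=';')
    reader2 = csv.reader(in_file2, delimiter=';')  # the sample file is ';'-separated
    writer = csv.writer(out_file, delimiter='|')
    # Merge-style walk: only valid if both files are sorted ascending by timestamp.
    row2 = next(reader2, None)  # None once file2 is exhausted, instead of raising StopIteration
    for row1 in reader1:
        # Advance through file2 only while its timestamp is not past the current row1.
        while row2 and row2[1] <= row1[0]:
            if row1[0] == row2[1]:
                data = [row1[1], row2[0]]
                print data
                writer.writerow(data)
            row2 = next(reader2, None)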
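One caveat with the `<=` comparison in the snippet above: the timestamps are compared as strings, and a day-first format does not sort chronologically as text. A minimal sketch in Python 3, assuming the format is dd.mm.yy HH:MM as the sample data suggests (the helper name parse_stamp is hypothetical), parses the values before comparing them:

```python
from datetime import datetime

def parse_stamp(text):
    """Parse a 'dd.mm.yy HH:MM' timestamp (format assumed from the sample data)."""
    return datetime.strptime(text, "%d.%m.%y %H:%M")

a = "13.09.13 16:45"  # 13 September 2013
b = "02.10.13 09:00"  # 2 October 2013, i.e. later than a

print(a < b)                            # False: as text, "13..." sorts after "02..."
print(parse_stamp(a) < parse_stamp(b))  # True: chronological comparison
```

For a plain equality test, as in the brute-force version, comparing the raw strings is sufficient.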

Edit: The two files are sorted in opposite orders, which makes the approach above tricky. Let's stop trying to be clever and use brute force. It will work fine as long as the files are much smaller than your RAM.

import csv

with open("apiSheet.csv", "U") as in_file1, open("outputTweetsDate.csv", "rb") as in_file2, open("finalOutput.csv", "wb") as out_file:
    reader1 = csv.reader(in_file1, delimiter=';')
    reader2 = csv.reader(in_file2, delimiter=';')  # the sample file is ';'-separated
    writer = csv.writer(out_file, delimiter='|')

    rows2 = list(reader2)  # all the content of file2 goes in RAM, so it can be scanned repeatedly
    for row1 in reader1:
        for row2 in rows2:
            if row1[0] == row2[1]:
                data = [row1[1], row2[0]]
                print data
                writer.writerow(data)
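If the files grow, the nested loops compare every row of one file with every row of the other. A possible alternative, sketched here in Python 3 rather than the Python 2 used in this thread, is to index apiSheet.csv by timestamp in a dict and look each tweet up directly. The function name join_by_timestamp and the ';' delimiter for both inputs are assumptions based on the sample data, and the rows come out in the order of the tweet file.

```python
import csv

def join_by_timestamp(api_path, tweets_path, out_path):
    """Write [value, text] rows for every tweet whose timestamp is in the API sheet."""
    # Map timestamp -> value; assumes each timestamp occurs once in the API sheet.
    with open(api_path, newline="") as api_file:
        values = {
            row[0]: row[1]
            for row in csv.reader(api_file, delimiter=";")
            if len(row) >= 2  # skip blank or malformed lines
        }

    with open(tweets_path, newline="") as tweets_file, \
         open(out_path, "w", newline="") as out_file:
        writer = csv.writer(out_file, delimiter="|")
        for row in csv.reader(tweets_file, delimiter=";"):
            if len(row) >= 2 and row[1] in values:
                writer.writerow([values[row[1]], row[0]])
```

It could be called as join_by_timestamp("apiSheet.csv", "outputTweetsDate.csv", "finalOutput.csv"). Because the lookup is by key, the sort order of the two files no longer matters.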


 