
Python Script for comparing multiple columns in 2 csv files

Hoping someone can help me or point me to a previous post with the correct info (I've been searching for a while without success).

I am really new to Python scripting and am spending time studying to build my skills, but I suddenly need to do the following while I am still learning. I am hoping it will also help me understand Python a bit better while I learn the basics elsewhere.

I have two CSVs with the same column data but different headers. Example below:

----$csv1------
ID, FirstName, Surname
1, John, Smith
2, Steve, Davis
, John,Parrot,
4, Dave,Smith
5, Alan, Taylor

----$csv2------
Employee ID, First Name, Given Surname
1, John, Smith
2, Steven, Davis
3, John, Parrott
4, Dave, Allen
6, Mike, Angelo

The script needs to compare the two CSVs and create a third file with the results (results.csv):

  • If columns 1, 2 & 3 all match, append the row to results.csv with 'Correct'
  • If column 1 does not match but columns 2 & 3 do, append the row to results.csv with 'Wrong ID'
  • If column 2 does not match but columns 1 & 3 do, append the row to results.csv with 'Wrong Firstname'
  • If column 3 does not match but columns 1 & 2 do, append the row to results.csv with 'Wrong Surname'
  • If an entire row in $csv1 is not in $csv2, append the row to results.csv with 'In CSV1 not CSV2'
  • If an entire row in $csv2 is not in $csv1, append the row to results.csv with 'In CSV2 not CSV1'
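
The six rules above can be sketched roughly as follows. This is only one possible reading of the requirements, not a definitive implementation: it assumes each row of csv1 is paired with at most one row of csv2, that an exact match is preferred over a one-column mismatch, that every data row has at least three columns, and that values are compared as stripped strings. The names `load_rows`, `classify` and `compare` are made up for illustration.

```python
import csv
import io

# Label for a mismatch in column 1, 2 or 3 (index 0, 1, 2).
MISMATCH_LABELS = ["Wrong ID", "Wrong Firstname", "Wrong Surname"]


def load_rows(text):
    """Parse CSV text into 3-item lists, skipping the header and blank lines."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    rows = [[cell.strip() for cell in row[:3]] for row in reader if row]
    return rows[1:]  # assumption: the first non-blank line is the header


def classify(row1, row2):
    """Return a label if the rows differ in at most one column, else None."""
    differing = [i for i in range(3) if row1[i] != row2[i]]
    if not differing:
        return "Correct"
    if len(differing) == 1:
        return MISMATCH_LABELS[differing[0]]
    return None


def compare(rows1, rows2):
    """Label every row of rows1, then every row of rows2 left without a partner."""
    results = []
    unmatched2 = list(rows2)
    for row1 in rows1:
        candidates = [(classify(row1, row2), row2) for row2 in unmatched2]
        candidates = [c for c in candidates if c[0] is not None]
        # Assumption: an exact match wins over a one-column mismatch.
        candidates.sort(key=lambda c: c[0] != "Correct")
        if candidates:
            label, partner = candidates[0]
            unmatched2.remove(partner)
        else:
            label = "In CSV1 not CSV2"
        results.append(row1 + [label])
    results.extend(row2 + ["In CSV2 not CSV1"] for row2 in unmatched2)
    return results
```

The list returned by `compare` can then be written out with `csv.writer(f).writerows(...)` on a file opened with `newline=""`. With the sample data above, the row `, John,Parrot,` differs from `3, John, Parrott` in two columns, so under this reading both rows end up in the "not in the other file" categories.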

I know it's a big ask, but I'd be really grateful if anyone can provide a script with a bit of explanation to help me on my Python journey!

Thanks all.

----SCRIPT ADDED------

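import csv

db1_rows = []       # full (ID, first name, surname) tuples from DB1.csv
db1_names = set()   # (first name, surname) pairs from DB1.csv

with open("DB1.csv", newline="") as db1:
    for row in csv.reader(db1, skipinitialspace=True):
        db1_rows.append(tuple(row[0:3]))
        db1_names.add(tuple(row[1:3]))

with open("DB2.csv", newline="") as db2:
    for row in csv.reader(db2, skipinitialspace=True):
        if tuple(row[0:3]) in db1_rows:
            print(row[0:3], "In both DB1 & DB2")
        # Same name pair but the full row is not in DB1, so only the ID differs.
        # (A 2-item tuple can never be found in a list of 3-item tuples,
        # which is why the name pairs are kept in their own set.)
        elif tuple(row[1:3]) in db1_names:
            print(row[0:3], "Wrong ID")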
import csv


def get_csv_data(csv_file, row, cell=None):
    """
    :param csv_file: Name of csv file
    :param row: Row number that you want (counting starts at 1, from top to bottom)
    :param cell: Cell number that you want (counting starts at 1, from left to right)
    If you give a cell number, the content of that cell will be returned.
    If cell is None, the whole row is returned as a list.
    :return: cell content, or the list of cell values in the row
    """
    with open(csv_file, newline='') as csvfile:
        # skipinitialspace drops the blank that follows each comma in the sample data
        rows = list(csv.reader(csvfile, skipinitialspace=True))
    if cell is not None:
        return rows[row - 1][cell - 1]
    return rows[row - 1]

print(get_csv_data('csv1.csv', 2, 2))  # get row 2, cell 2 from csv1 -> returns John
print(get_csv_data('csv1.csv', 2))  # get row 2 from csv1 -> returns a list with all values from the row: ['1', 'John', 'Smith']


def write_to_csv(ls):
    """
    :param ls: list argument to be written in CSV file
    List item will be written as a row, with every list value on a separate cell
    :return: None
    """
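    # "a" appends, so repeated calls add rows instead of overwriting the file;
    # newline="" lets the csv module control line endings itself.
    with open("results.csv", "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(ls)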

This is how you read data from one CSV file and write it to another. You can build the if statements on top of this, for example:

row = get_csv_data('csv1.csv', 2)  # get row 2 (the first data row, after the header) from csv1.csv, as a list
row.append("Correct")  # add the Correct value
write_to_csv(row)  # write the row to results.csv - will be 1, John, Smith, Correct
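
Calling `get_csv_data` once per row reopens the file each time. A minimal sketch of an alternative that reads each file once and writes all labelled rows in a single call might look like this. The helper names `read_rows`, `write_results` and `label_rows` are hypothetical, the first line of each file is assumed to be a header, and only two of the labels are produced here; the remaining checks would go where the comment indicates.

```python
import csv


def read_rows(path):
    """Read a CSV file once and return its data rows (header skipped)."""
    with open(path, newline="") as f:
        reader = csv.reader(f, skipinitialspace=True)
        next(reader, None)  # assumption: the first line is a header
        return [row[:3] for row in reader if row]


def write_results(path, labelled_rows):
    """Write all labelled rows to path in one call, replacing any old file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(labelled_rows)


def label_rows(rows1, rows2):
    """Label each row of rows1 by whether the identical row exists in rows2."""
    seen = {tuple(row) for row in rows2}
    labelled = []
    for row in rows1:
        # The 'Wrong ID' / 'Wrong Firstname' / 'Wrong Surname' checks would go here.
        label = "Correct" if tuple(row) in seen else "In CSV1 not CSV2"
        labelled.append(row + [label])
    return labelled
```

A call such as `write_results("results.csv", label_rows(read_rows("csv1.csv"), read_rows("csv2.csv")))` would then produce the output file in one pass.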

I have to compare two files in a similar way to the example above, but looking only at column 1, which holds the IP. Both are text files.

file1.txt

IP - MAC-address - Port - IDF
10.2.1.5 00:07:5f:c2:9b:f2 gi1/0/2 2
10.2.1.3 0007.5fc2.9bf4 gi1/0/3 3
10.2.1.7 0007.5fc2.9bf5 gi1/0/4 4

file2.txt

IP - MAC-address - Port
10.2.1.5 0007.5fc2.9bf6 gi1/0/2
10.2.1.9 0007.5fc2.9bf7 gi1/0/2
10.2.1.10 0007.5fc2.9bf8 gi1/0/2

The output file (result.file) should contain the lines whose IP appears in both files. The match is on the IP only, and the rest of each line is taken from file 1.

Result file

10.2.1.5 00:07:5f:c2:9b:f2 gi1/0/2 2

Thanks
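
A minimal sketch of that IP-only match, under some assumptions: each file holds one record per line, fields are separated by whitespace, the IP is the first field, and the first line of each file is a header. The function names `match_by_ip` and `write_matches` are made up for illustration.

```python
def match_by_ip(file1_lines, file2_lines):
    """Return the lines of file 1 whose first field (the IP) also occurs in file 2."""
    # Assumption: the first line of each file is a header and is skipped.
    ips_in_file2 = {
        line.split()[0] for line in list(file2_lines)[1:] if line.strip()
    }
    matches = []
    for line in list(file1_lines)[1:]:
        fields = line.split()
        if fields and fields[0] in ips_in_file2:
            matches.append(" ".join(fields))
    return matches


def write_matches(path1, path2, out_path):
    """Read both text files and write the matching file 1 lines to out_path."""
    with open(path1) as f1, open(path2) as f2:
        matches = match_by_ip(f1, f2)
    with open(out_path, "w") as out:
        out.write("\n".join(matches) + "\n")
```

Using a set for the IPs of file 2 keeps each lookup cheap, and comparing whole strings means `10.2.1.1` would not be confused with `10.2.1.10`. Calling `write_matches("file1.txt", "file2.txt", "result.file")` would create the result file.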
