Python CSV comparison and multiplication

Question

I want to double-check my logic on how to put this together in Python, so examples are appreciated.

I need to compare 2 CSV files (exactly the same format, with 2 rows and 6 columns) and report the difference.

I need to read both files in, multiply row 2, columns 2-6 by specific values (5), total each file separately, then compare the totals with each other (CSV2 total / CSV1 total) and present the result as a percentage.

The csv module and its reader seem like the way to go, but the tricky part for me has been pulling the data into a list I can multiply against different values (or should I use a collection?), and then comparing the two in the most concise and efficient manner.

Code update (based on the 2nd answer, which was great, thanks! But I am now encountering an error when treating my row values as integers):

import csv
file1 = open('csv1.csv', 'rb')
csv1 = csv.DictReader(file1)

file2 = open('csv2.csv', 'rb')
csv2 = csv.DictReader(file2)


myList = csv2.fieldnames
myList.append('Difference')

outFile = open('outFilename.csv', 'wb')
outCsv = csv.DictWriter(outFile, myList)

file1Dict = dict()
file2Dict = dict()

for row in file1:
    file1Dict[row['key value']]['Total1'] = {'Total1':(int(row[1]) * .75 + int(row[2]) * 2.25 + int(row[3]) * 3.5 + int(row[4]) * 5 + int(row[5]) * 25)}

for row in file2:
    file2Dict[row['key value']]['Total2'] = {'Total2':(int(row[1]) * .75, int(row[2]) * 2.25, int(row[3]) * 3.5, int(row[4]) * 5, int(row[5]) * 25)}

outFile.writeheader()

for stuff in file1Dict:
    file1Dict[stuff]['Difference'] = str(int(int(file1Dict[stuff]['Total2']) / int(file1Dict[stuff]['Total1'])) * 100) + '\%'
    outFile.writerow(file1Dict[stuff])

print 'difference'

Answer 1

I think you should use pandas and its read_csv function: it is efficient and loads the data into a rectangular form in which any kind of math operation is trivial to apply and two imported data sets are easy to compare.

Note that after importing pandas, read_csv is available at the top level and is called simply as pandas.read_csv("/path/to/file.csv"), rather than needing to go through io.parsers as in the linked doc page above.

There's nothing wrong with doing this through standard modules; I just think that if you plan to do aggregate or broadcasted math operations, the rectangular array provided by pandas, which uses the efficient NumPy math operations, is the best choice.
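As a rough illustration of the pandas approach, here is a minimal sketch. The weights are taken from the question's code update; the sample data, the column names, and the helper name weighted_total_pandas are hypothetical, and the sketch assumes that "row 2" means the first data row below a header line.

```python
import io

import pandas as pd

# Per-column multipliers taken from the question's code update.
WEIGHTS = [0.75, 2.25, 3.5, 5, 25]

# Hypothetical sample data standing in for csv1.csv and csv2.csv.
CSV1 = "name,a,b,c,d,e\nx,1,2,3,4,5\n"
CSV2 = "name,a,b,c,d,e\nx,2,4,6,8,10\n"


def weighted_total_pandas(source):
    """Return the weighted sum of columns 2-6 of the first data row."""
    frame = pd.read_csv(source)  # the first line is read as the header
    values = frame.iloc[0, 1:6].astype(float)  # first data row, columns 2-6
    return float((values * WEIGHTS).sum())


total1 = weighted_total_pandas(io.StringIO(CSV1))
total2 = weighted_total_pandas(io.StringIO(CSV2))
print("{:.2%}".format(total2 / total1))
```

To read real files, a path such as "csv1.csv" can be passed in place of the io.StringIO object.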

Answer 2

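import csv

# Read every data row of both files. DictReader yields one dict per row,
# keyed by the column names found in the header line.
with open('filename1.csv', newline='') as file1:
    reader1 = csv.DictReader(file1)
    fieldnames = list(reader1.fieldnames)
    rows1 = list(reader1)

with open('filename2.csv', newline='') as file2:
    rows2 = list(csv.DictReader(file2))

# Assumption: the first column identifies the row (the 'key value' placeholder)
# and all remaining columns are numeric.
key = fieldnames[0]
valueColumns = fieldnames[1:]

def total(row):
    # Multiply each numeric column by 5 and add up the results.
    # Swap the 5 for per-column weights if each column needs its own factor.
    return sum(float(row[name]) * 5 for name in valueColumns)

file1Dict = dict()

for row in rows1:
    file1Dict[row[key]] = dict(row, Total1=total(row))

for row in rows2:
    file1Dict[row[key]]['Total2'] = total(row)

with open('outFilename.csv', 'w', newline='') as outFile:
    outCsv = csv.DictWriter(outFile, fieldnames + ['Total1', 'Total2', 'Difference'])
    outCsv.writeheader()
    for stuff in file1Dict:
        record = file1Dict[stuff]
        # CSV2 total expressed as a percentage of the CSV1 total
        record['Difference'] = '{:.2f}%'.format(record['Total2'] / record['Total1'] * 100)
        outCsv.writerow(record)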

This is just a quick sketch of what you described, using only standard-library modules.
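For the single-row comparison described in the question, a smaller standard-library sketch may be enough. The weights below come from the question's code update; the sample data and the names weighted_total and percent_ratio are hypothetical, and the sketch assumes Python 3 with a header line in row 1 and the data in row 2.

```python
import csv
import io

# Per-column multipliers taken from the question's code update.
WEIGHTS = (0.75, 2.25, 3.5, 5, 25)

# Hypothetical sample data standing in for csv1.csv and csv2.csv.
SAMPLE1 = "name,a,b,c,d,e\nx,1,2,3,4,5\n"
SAMPLE2 = "name,a,b,c,d,e\nx,2,4,6,8,10\n"


def weighted_total(lines):
    """Return the weighted sum of columns 2-6 of the second row."""
    reader = csv.reader(lines)
    next(reader)        # row 1: header, skipped
    row = next(reader)  # row 2: the data row
    return sum(float(cell) * weight for cell, weight in zip(row[1:6], WEIGHTS))


def percent_ratio(total1, total2):
    """Format total2 / total1 as a percentage with two decimals."""
    return "{:.2%}".format(total2 / total1)


total1 = weighted_total(io.StringIO(SAMPLE1))
total2 = weighted_total(io.StringIO(SAMPLE2))
print(percent_ratio(total1, total2))  # prints 200.00%
```

With real files, an open file object works in place of io.StringIO, for example open('csv1.csv', newline='') inside a with block.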
