Python: parsing a CSV file
I am parsing a CSV file whose first line is the header. I want to sum the Amount column by date, but I get an error message. To debug, I check whether the value is a digit and also whether it is a string, as the error message suggests, and it appears to be both. What could be the reason for this?
def parseDataFromFile(self, f):
    fh = open(f, 'r')
    s = 0
    for line in fh:
        # parse the line on commas and strip the '\n' char
        year, month, day, amount = line.strip('\n').split(',')
        # skip the header row; checking for the first row instead would be faster
        if (amount == "Amount"): continue
        # just for the debug checks
        # here is the question
        if isinstance(amount, str):
            print "amount is a string"
            #continue
        if amount.isdigit:
            print "amount is a digit"
        # sum on the amount column
        s = s + amount
Output:
amount is a string
amount is a digit
amount is a string
amount is a digit
Error:
s = s + amount
TypeError: unsupported operand type(s) for +: 'int' and 'str'
Your problem is that s is an integer: you initialize it to 0 and then try to add a string to it. amount is always a string. Nothing in your code turns the number-like data into actual numbers, so it stays a string.
If you expect amount to be a number, then use:
s += float(amount)
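To make the difference concrete, here is a minimal sketch in Python 3 syntax. The sample values are made up for illustration; they stand in for the fields that line.split(',') would return.

```python
# Hypothetical sample values, as they would come out of line.split(',')
amounts = ["10", "20.5", "7"]

s = 0
error_message = None
try:
    s = s + amounts[0]          # int + str is not defined
except TypeError as exc:
    error_message = str(exc)    # same kind of error as in the question

# Converting each field to a number first makes the sum work
total = 0.0
for amount in amounts:
    total += float(amount)

print(error_message)
print(total)
```

float() is used here rather than int() so that values with a decimal point are accepted as well.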
PS: you should use the csv module in the stdlib for reading CSV files.
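As a rough sketch of what that could look like (Python 3 syntax; the function name sum_amounts and the sample rows are hypothetical, and the data is read from an in-memory string instead of a real file):

```python
import csv
import io

# Hypothetical sample data standing in for the real file
sample = io.StringIO(
    "Year,Month,Day,Amount\n"
    "2012,1,5,10\n"
    "2012,1,6,20.5\n"
    "2012,2,1,7\n"
)

def sum_amounts(fh):
    """Sum the Amount column of an open CSV file, skipping the header row."""
    reader = csv.reader(fh)
    next(reader, None)              # skip the header line
    total = 0.0
    for year, month, day, amount in reader:
        total += float(amount)      # every field arrives as a string
    return total

total = sum_amounts(sample)
print(total)
```

With a real file, the same function could be called as sum_amounts(fh) inside a with open(path, newline='') as fh: block.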
if amount.isdigit:
    print "amount is a digit"
will always print "amount is a digit" because you're not calling the method (it should be if amount.isdigit():).
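The difference between referring to the method and calling it can be checked directly. This is a small sketch in Python 3 syntax with a made-up value:

```python
amount = "12.50"    # hypothetical field value; it contains a '.'

# Without parentheses this is the bound method object, which is always truthy
method_is_truthy = bool(amount.isdigit)

# With parentheses the check actually runs
call_result = amount.isdigit()

# A string made only of digits passes the real check
whole_number = "1250".isdigit()

print(method_is_truthy, call_result, whole_number)
```

Note that isdigit() returns False for strings containing a decimal point or a sign, so it is not a general test for "can be converted to a number".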
You can be sure that any field you get by splitting a line from a CSV file will be a string, so you need to convert it to an int first:
s = s + int(amount)
s is an int and amount is the string representation of a number, so change s = s + amount to s += int(amount).
Something like this? (Assuming the column headers are "Year", "Month", "Day", "Amount".)
from collections import defaultdict
import csv

sum_by_ym = defaultdict(float)
with open('input_file.csv') as f:
    for row in csv.DictReader(f):
        # DictReader yields strings, so convert Amount before adding it
        sum_by_ym[(row['Year'], row['Month'])] += float(row['Amount'])
print(sum_by_ym)
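Since the question asks for sums per date, the same grouping idea can also be keyed on the full (Year, Month, Day) tuple. The following is only a sketch in Python 3 syntax: the sample rows are invented, and an in-memory string replaces the input file so the result can be checked without anything on disk.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; two rows share the same date
sample = io.StringIO(
    "Year,Month,Day,Amount\n"
    "2012,1,5,10\n"
    "2012,1,5,2.5\n"
    "2012,1,6,20\n"
    "2012,2,1,7\n"
)

sum_by_date = defaultdict(float)
for row in csv.DictReader(sample):
    key = (row['Year'], row['Month'], row['Day'])   # keys stay as strings
    sum_by_date[key] += float(row['Amount'])        # convert before adding

for date, total in sorted(sum_by_date.items()):
    print(date, total)
```

The keys are left as strings here; converting them to int, or to a datetime.date, would be a reasonable variation if the dates need to be sorted chronologically.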