python parsing csv file

I am parsing a CSV file whose first line is the header. I want to sum the Amount column by date, but I am getting an error message. To debug, I check whether the value is a string and whether it is a digit, following the error message, and both checks print. What could be the reason for this?

def parseDataFromFile(self,f):
    fh = open(f,'r')
    s = 0
    for line in fh:
        #parsing the line according to comma and stripping the '\n' char
        year,month,day,amount = line.strip('\n').split(',')

        #checking the header row, could check if was first row as well - would be faster
        if (amount == "Amount"): continue

        #just for the debug checks
        #here is the question

        if isinstance(amount,str):
            print "amount is a string"
            #continue
        if amount.isdigit:
            print "amount is a digit"

        #sum on the amount column
        s = s + amount

Output:

amount is a string
amount is a digit
amount is a string
amount is a digit

Error:

s = s + amount 
TypeError: unsupported operand type(s) for +: 'int' and 'str'

Your problem is that s is an integer (you initialize it to 0), and you then try to add a string to it. amount is always a string: nothing in your code turns the number-like text into an actual number.

If you expect amount to be a number, then use:

s += float(amount)

PS: you should use the csv module in the stdlib for reading CSV files.
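
As a rough sketch of both suggestions together, the snippet below reads the rows with csv.reader, skips the header, and converts each amount with float() before adding it. It is written for Python 3 (so print is a function), and the function name sum_amounts and the sample data are made up for illustration, not taken from the question.

```python
import csv
import io

def sum_amounts(lines):
    """Sum the fourth column (the amount) of CSV rows, skipping the header."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    total = 0.0
    for year, month, day, amount in reader:
        total += float(amount)  # convert the string field before adding
    return total

# Hypothetical sample data standing in for the real file.
sample = io.StringIO("Year,Month,Day,Amount\n2012,1,5,10\n2012,1,6,2.5\n")
print(sum_amounts(sample))
```

With a real file, passing the object returned by open(path, newline='') in place of the StringIO should work the same way.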

if amount.isdigit:
    print "amount is a digit"

will always print "amount is a digit" because you're not calling the method: amount.isdigit without parentheses is the method object itself, which is always true. It should be if amount.isdigit(): instead.
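
A small illustration of the difference, written for Python 3; the value "abc" is an arbitrary example chosen because it is clearly not a number:

```python
amount = "abc"

# Without parentheses this is the bound method object itself, which is always truthy.
method_is_truthy = bool(amount.isdigit)

# With parentheses the method actually runs and inspects the string.
call_result = amount.isdigit()

print(method_is_truthy)  # True
print(call_result)       # False
```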

Any field you get by splitting a line from a CSV file is a string, so you need to convert it to an int first:

s = s + int(amount)
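
Note that int() only accepts whole-number text, so int("2.5") raises ValueError. If the amounts might contain decimals, one possible approach is to fall back to float(). The helper name to_number below is hypothetical, and the sketch targets Python 3:

```python
def to_number(text):
    """Convert a CSV field to int when possible, otherwise to float."""
    try:
        return int(text)
    except ValueError:
        # Not a whole number; float() still raises ValueError for non-numeric text.
        return float(text)

s = 0
for amount in ["10", "2.5", "7"]:  # made-up sample fields
    s += to_number(amount)
print(s)
```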

s is an int, while amount is the string representation of a number, so change s = s + amount to s += int(amount).

Something like this? (assuming the column headers are "Year", "Month", "Day", "Amount")

from collections import defaultdict
import csv

sum_by_ym = defaultdict(float)
with open('input_file.csv') as f:
    for row in csv.DictReader(f):
        sum_by_ym[(row['Year'], row['Month'])] += float(row['Amount'])

print sum_by_ym
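
Since the question asks for sums by date rather than by month, a variant keyed on the full (year, month, day) tuple might look like the sketch below. It is written for Python 3, and the function name sum_by_date and the in-memory sample data are assumptions for illustration.

```python
import csv
import io
from collections import defaultdict

def sum_by_date(lines):
    """Return a dict mapping (year, month, day) to the summed amount."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        key = (row['Year'], row['Month'], row['Day'])
        totals[key] += float(row['Amount'])  # fields are strings until converted
    return dict(totals)

# Hypothetical sample data standing in for input_file.csv.
sample = io.StringIO(
    "Year,Month,Day,Amount\n"
    "2012,1,5,10\n"
    "2012,1,5,2.5\n"
    "2012,2,1,4\n"
)
print(sum_by_date(sample))
```

The keys stay as strings here because DictReader does not convert field types; convert them with int() if numeric keys are needed.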
