
Processing a CSV file and combining values with Python

I have a CSV file with data in the following format:

"/file/Puppies";"$2,166.74";"2,502";"5.55%";"$48.10";"152,844";"45,044"
"/file/Kittens";"$1,498.59";"1,618";"3.54%";"$32.75";"157,560";"45,764"
"/file/Puppies/pup";"$1,174.92";"1,451";"3.72%";"$30.10";"116,268";"39,038"

I want to sum the 2nd column across rows whose first column is similar. The rest of the values don't matter.

So in the example both /file/Puppies and /file/Puppies/pup values in column 2 would be added together in the final output.

By similar I mean that, for example, /file/Puppies/, /file/Puppies/1 and /file/Puppies/ru would all be similar, but /file/Kittens would not.

Any ideas on how to get started?

Construct a dictionary to hold the totals, then add each row's value to the matching dictionary entry:

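import csv

Values = {}
# newline='' is what the csv module recommends when opening files
with open('CSVFile.csv', newline='') as filehandle:
    # Fields are separated by ';' and wrapped in double quotes
    for row in csv.reader(filehandle, delimiter=';'):
        # '/file/Puppies/pup' -> ['', 'file', 'Puppies', 'pup'] -> 'Puppies'
        Class = row[0].split('/')[2]
        # '$2,166.74' -> 2166.74
        Value = float(row[1].strip('$').replace(',', ''))
        if Class in Values:
            Values[Class] = Values[Class] + Value
        else:
            Values[Class] = Value
print(Values)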

Here, I've made a simplifying assumption about what you mean by "similar": I take it to be the first path segment that follows '/file/', running to the end of that field or to the next '/'. That's what I call the Class.
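To make that assumption concrete, here is a small sketch of the key extraction on its own. The helper name class_of is hypothetical (it is not part of the answer's code), and the fallback to an empty string for paths with no segment after '/file' is my own assumption about how to treat such rows.

```python
def class_of(path):
    """Return the first path segment after '/file/' (hypothetical helper)."""
    # '/file/Puppies/pup'.split('/') -> ['', 'file', 'Puppies', 'pup']
    parts = path.split('/')
    # Assumption: a path with nothing after '/file' maps to an empty key
    return parts[2] if len(parts) > 2 else ''


for example in ('/file/Puppies', '/file/Puppies/', '/file/Puppies/ru', '/file/Kittens'):
    print(example, '->', class_of(example))
```

With this rule, every path under /file/Puppies shares the key 'Puppies', while /file/Kittens gets its own key.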

Then, I find the value by taking the second column from your data, stripping off the '$', removing the commas, and converting to a float.
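The parsing step can be sketched in isolation as follows. Note that this variant swaps float for decimal.Decimal, which is my own substitution rather than what the code above does: it may be preferable if you want currency sums without binary floating-point rounding. The helper name parse_amount is hypothetical.

```python
from decimal import Decimal


def parse_amount(text):
    """Convert a string like '$2,166.74' to a Decimal (hypothetical helper)."""
    # Strip the leading '$', drop thousands separators, then convert
    return Decimal(text.strip('$').replace(',', ''))


total = parse_amount('$2,166.74') + parse_amount('$1,174.92')
print(total)
```

If exact cents do not matter for your report, the float conversion in the original code is simpler and works the same way.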

Then, because we're constructing a dictionary, we have to test whether we've already seen a Puppy, or whatever. If so, just add to the previous value; if not, set the value.
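As an alternative to the explicit membership test, collections.defaultdict can supply the initial value for you. The sketch below is one possible variant, not the only way to do it; it reads the three sample rows from the question out of an in-memory string (via io.StringIO) instead of CSVFile.csv so that it runs on its own. The name totals is my own.

```python
import csv
import io
from collections import defaultdict

# The three sample rows from the question, held in memory for the demo
sample = (
    '"/file/Puppies";"$2,166.74";"2,502";"5.55%";"$48.10";"152,844";"45,044"\n'
    '"/file/Kittens";"$1,498.59";"1,618";"3.54%";"$32.75";"157,560";"45,764"\n'
    '"/file/Puppies/pup";"$1,174.92";"1,451";"3.72%";"$30.10";"116,268";"39,038"\n'
)

# Missing keys start at 0.0, so no "have we seen this key?" check is needed
totals = defaultdict(float)
for row in csv.reader(io.StringIO(sample), delimiter=';'):
    key = row[0].split('/')[2]
    totals[key] += float(row[1].strip('$').replace(',', ''))

for key, amount in sorted(totals.items()):
    print(f'{key}: {amount:.2f}')
```

Both Puppies rows end up under one key and Kittens under another; the behaviour matches the if/else version, just with less bookkeeping.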
