Calculate percentage of a csv column in python

I have this csv file about logged hours by users that looks roughly like this, but it's much larger (more users and projects):

User,Project,Hours
User1,ProjectA,5
User1,ProjectB,10
User2,ProjectA,7
User2,ProjectB,12

My code so far prints the total logged hours for all users. It can also print the rows for a single user, followed by a line with that user's total hours.

What I want now is to use a user's total hours to calculate what percentage of that total each project accounts for. For example, what percentage of User1's time went to ProjectA? Can anyone help? I've been trying to figure this out but haven't managed it so far. I'm quite new to Python, so any hints or help are really appreciated.

Thanks in advance!

import csv
import collections

with open(<...>) as data_file:
    total_hours = collections.defaultdict(int)
    for row in csv.DictReader(data_file):
        total_hours[row['User']] += int(row['Hours'])

Or you could just read the data into a nested dictionary (user -> project -> hours) and use that:

import collections
import csv
import functools

with open(<...>) as data_file:
    data = collections.defaultdict(
        functools.partial(collections.defaultdict, int))
    for row in csv.DictReader(data_file):
        data[row['User']][row['Project']] += int(row['Hours'])

and then

# Iterate over .items(): looping over the dict itself yields only the keys.
total_hours = {user: sum(time.values()) for user, time in data.items()}
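
With the nested dictionary in place, the percentage the question asks for is each project's hours divided by that user's total. Here is a minimal sketch of that step. The function name `project_percentages` and the in-memory `SAMPLE_CSV` string are only there for illustration (they are not part of the original code), and the sample rows are the ones from the question; a real script would pass the opened CSV file instead.

```python
import collections
import csv
import functools
import io

# Sample rows copied from the question; stands in for the real CSV file.
SAMPLE_CSV = """User,Project,Hours
User1,ProjectA,5
User1,ProjectB,10
User2,ProjectA,7
User2,ProjectB,12
"""


def project_percentages(data_file):
    """Return {user: {project: percent of that user's total hours}}."""
    # user -> project -> summed hours, built the same way as above
    data = collections.defaultdict(
        functools.partial(collections.defaultdict, int))
    for row in csv.DictReader(data_file):
        data[row['User']][row['Project']] += int(row['Hours'])

    percentages = {}
    for user, projects in data.items():
        user_total = sum(projects.values())
        percentages[user] = {
            # Guard against a user whose logged hours sum to zero
            project: (100 * hours / user_total if user_total else 0.0)
            for project, hours in projects.items()
        }
    return percentages


percentages = project_percentages(io.StringIO(SAMPLE_CSV))
for user, projects in percentages.items():
    for project, percent in projects.items():
        print(f"{user} {project}: {percent:.1f}%")
```

For the sample rows, User1 has 15 hours in total, so ProjectA comes out at 5 / 15, roughly 33.3%, and ProjectB at roughly 66.7%.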
