简体   繁体   中英

How to get sum of specific column values of a list based on a condition in one column

I have lists like below of length 8 which contains datewise data

['2019-07-04', 'default_default', 'FB', 'ios', '0', '0', '0', '0']
['2019-07-04', 'default_default', 'FB', 'default', '1', '1', '0', '1']
['2019-07-04', 'default_default', 'FB', 'android', '0', '0', '0', '0']
['2019-07-04', 'default_default', 'NON-FB', 'android', '2', '1', '1', '1']
['2019-07-04', 'default_default', 'NON-FB', 'default', '1', '2', '1', '1']
['2019-07-04', 'default_default', 'NON-FB', 'ios', '1', '1', '1', '2']

I want to get the count of the 5th, the 6th column for FB and NON-FB so on for the date 2019-07-04

so the final list will be

date|FB|NON-FB|total

[2019-07-04,1,4,5]
[2019-07-05,1,2,4]
... 
.. so on

Below is my code

 dic={}
 data=[str(item) for item in datewise]
 data[0]="date"
 dic[data[0]]={"FB":0,"NON-FB":0}
 dic[data[0]]["FB"]=sum(row[5] for row in data if row=="FB")

Right now I'm doing with loops and dictionary, is there any simple way to get the desired output?

You could use loops, but data that looks like this and needs to be manipulated like you're describing is a good use case for pandas:

import pandas as pd

df = (pd.DataFrame(datewise, 
                   columns=['col0','col1','col2','col3','col4','col5','col6','col7'])
 .assign(col5= lambda d: d.col5.astype(int))
 .pivot_table(index='col0', columns='col2', values='col5', aggfunc=sum)
 .assign(total= lambda d: d.sum(axis=1))
)

print(df)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM