input:
       Date  letters  numbers  mixed         new
0  1/2/2014        a        6     z1  1/2/2014 a
1  1/2/2014        a        3     z1  1/2/2014 a
2  1/3/2014        c        1     x3  1/3/2014 c
I want to group by new and sum numbers so that the output is:
       Date  letters  numbers  mixed         new
0  1/2/2014        a        9     z1  1/2/2014 a
1  1/3/2014        c        1     x3  1/3/2014 c
I've looked through the groupby documentation here: http://pandas.pydata.org/pandas-docs/stable/groupby.html but had no luck.
Here is my code:
import pandas
a=[['Date', 'letters', 'numbers', 'mixed'], ['1/2/2014', 'a', '6', 'z1'], ['1/2/2014', 'a', '3', 'z1'], ['1/3/2014', 'c', '1', 'x3']]
df = pandas.DataFrame.from_records(a[1:],columns=a[0])
f=[]
for i in range(0,len(df)):
    f.append(df['Date'][i] + ' ' + df['letters'][i])
df['new']=f
Also, any trick that concatenates Date and letters without looping through the rows would be helpful.
You have to convert your numbers column to int:
import pandas as pd
a=[['Date', 'letters', 'numbers', 'mixed'], ['1/2/2014', 'a', '6', 'z1'], ['1/2/2014', 'a', '3', 'z1'], ['1/3/2014', 'c', '1', 'x3']]
df = pd.DataFrame.from_records(a[1:],columns=a[0])
df['new'] = df.Date + " " + df.letters
df.numbers = df.numbers.astype(int)
print(df)
       Date  letters  numbers  mixed         new
0  1/2/2014        a        6     z1  1/2/2014 a
1  1/2/2014        a        3     z1  1/2/2014 a
2  1/3/2014        c        1     x3  1/3/2014 c
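To illustrate why the conversion matters, here is a minimal sketch. The Series raw is a made-up stand-in for the numbers column as it is loaded from the list of strings, not something from the question. Summing an object column of strings concatenates them, whereas summing after astype(int) adds the values.

```python
import pandas as pd

# Hypothetical stand-in for the 'numbers' column before conversion:
# the values are strings, stored with object dtype.
raw = pd.Series(['6', '3'], dtype=object)

as_text = raw.sum()              # strings are concatenated
as_ints = raw.astype(int).sum()  # integers are added

print(repr(as_text), as_ints)
```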
You can get the dataframe to merge with by dropping the numbers column and removing the duplicate rows:
df_to_merge = df[df.columns[~df.columns.isin(['numbers'])]].drop_duplicates()
Then you can do your groupby:
df_grouped = pd.DataFrame(df.groupby('new').numbers.sum()).reset_index()
To get the result you posted, merge the two dataframes (they share only the new column, so the merge joins on it):
df_result = df_to_merge.merge(df_grouped)
print(df_result)
       Date  letters  mixed         new  numbers
0  1/2/2014        a     z1  1/2/2014 a        9
1  1/3/2014        c     x3  1/3/2014 c        1
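As an alternative sketch, not part of the approach above: if Date, letters and mixed are constant within each value of new (an assumption that holds for this sample data, but may not hold for yours), grouping on all of those columns at once avoids the separate drop_duplicates and merge steps. The names key_cols and df_alt are only illustrative.

```python
import pandas as pd

a = [['Date', 'letters', 'numbers', 'mixed'],
     ['1/2/2014', 'a', '6', 'z1'],
     ['1/2/2014', 'a', '3', 'z1'],
     ['1/3/2014', 'c', '1', 'x3']]
df = pd.DataFrame.from_records(a[1:], columns=a[0])
df['new'] = df['Date'] + ' ' + df['letters']
df['numbers'] = df['numbers'].astype(int)

# Assumption: every column listed here has a single value per 'new' group,
# so adding them as extra keys does not split any group.
key_cols = ['Date', 'letters', 'mixed', 'new']

# as_index=False keeps the keys as ordinary columns instead of an index.
df_alt = df.groupby(key_cols, as_index=False)['numbers'].sum()
print(df_alt)
```

If one of the extra key columns did vary within a group, this version would produce one row per distinct combination instead of one row per value of new, so the merge-based version is the safer choice when that is not guaranteed.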