Sum up non-unique rows in DataFrame

Question

I have a dataframe like this:

import pandas as pd

id = [1,1,2,3]
x1 = [0,1,1,2]
x2 = [2,3,1,1]

df = pd.DataFrame({'id':id, 'x1':x1, 'x2':x2})

df
id  x1  x2
1   0   2
1   1   3
2   1   1
3   2   1

Some rows have the same id. I want to sum up such rows (over x1 and x2) to obtain a new dataframe with unique ids:

df_new
id  x1  x2
1   1   5
2   1   1
3   2   1

An important detail is that the real number of columns x1, x2, ... is large, so I cannot apply a function that requires manual input of column names.

Answer 1

You can use the pandas groupby function to sum the rows that share an id value. All remaining columns are summed, so no column names need to be listed:

df.groupby(df.id).sum()
# or
df.groupby('id').sum()

If you don't want id to become the index, you can use either of:

df.groupby('id').sum().reset_index()
# or
df.groupby('id', as_index=False).sum()   # @John_Gait
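
As a quick check against the data in the question, here is a minimal sketch that builds the example frame and confirms the as_index=False form reproduces the desired df_new. The variable names df_new and rows are only for illustration.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({"id": [1, 1, 2, 3],
                   "x1": [0, 1, 1, 2],
                   "x2": [2, 3, 1, 1]})

# Group on id and sum every other column; no column names are listed,
# so this works the same way with many x columns.
df_new = df.groupby("id", as_index=False).sum()

# Rows of the result as plain Python lists: [id, x1, x2]
rows = df_new.values.tolist()
print(df_new)
```

Because id stays a regular column here, the result has a default integer index and the same column order as the input.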

Answer 2

With pivot_table:

In [31]: df.pivot_table(index='id', aggfunc='sum')
Out[31]:
    x1  x2
id
1    1   5
2    1   1
3    2   1
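
To compare the two answers, the sketch below runs both on the question's data and checks that they give the same sums. It assumes a pandas version that accepts the string 'sum' as aggfunc. The names by_pivot and by_groupby are illustrative only.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({"id": [1, 1, 2, 3],
                   "x1": [0, 1, 1, 2],
                   "x2": [2, 3, 1, 1]})

# pivot_table with only an index aggregates all remaining columns
by_pivot = df.pivot_table(index="id", aggfunc="sum")

# Equivalent groupby form; id becomes the index in both results
by_groupby = df.groupby("id").sum()

# Compare the summed values column by column
same_values = (by_pivot[["x1", "x2"]].values.tolist()
               == by_groupby[["x1", "x2"]].values.tolist())
print(same_values)
```

In both results id is the index. Append .reset_index() if it should be an ordinary column.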

2 answers

solution1
2 ACCEPTED 2016-01-27 14:11:56

solution2
0 2016-01-27 14:22:57