pandas - sum objects in DataFrame column and join with DataFrame

I have a DataFrame like this:

from pandas import DataFrame, Series

d = {'buy': Series([1., 0., 1., 0., 1., 1., 0., 0., 1., 1., 1., 0., 1., 0.]),
     'id': Series([1., 2., 4., 2., 3., 4., 1., 1., 2., 1., 3., 3., 2., 3.]),
     'datetime': Series(['01.02.2015', '01.02.2015', '01.03.2015', '03.01.2015', '06.02.2015',
                         '01.09.2015', '18.03.2015', '02.02.2015', '03.02.2015', '06.04.2015',
                         '01.04.2015', '03.04.2015', '02.04.2015', '20.03.2015'])}

df = DataFrame(d)
print(df)

    buy    datetime  id
0     1  01.02.2015   1
1     0  01.02.2015   2
2     1  01.03.2015   4
3     0  03.01.2015   2
4     0  06.02.2015   3
5     1  01.09.2015   4
6     0  18.03.2015   1
7     0  02.02.2015   1
8     1  03.02.2015   2
9     1  06.04.2015   1
10    1  01.04.2015   3
11    0  03.04.2015   3
12    1  02.04.2015   2
13    0  20.03.2015   3

First, I group it by 'id' and keep only the row with the latest 'datetime' for each 'id':

df1 = df.sort_values(by='datetime').drop_duplicates(subset='id', keep='last')
print(df1)

    buy    datetime  id
5     1  01.09.2015   4
8     1  03.02.2015   2
6     0  18.03.2015   1
13    0  20.03.2015   3

Next, I need to sum the 'buy' values for each id and join the result to my DataFrame as a new column (I named it 'buy_count'). I have something like this:

buys = df.groupby(by='id')['buy'].sum()

print(buys)

id
1    2
2    2
3    1
4    2

But I can't insert 'buy_count' into the DataFrame:

df1['buys_count'] = buys
print(df1)

    buy    datetime  id  buys_count
5     1  01.09.2015   4         NaN
8     1  03.02.2015   2         NaN
6     0  18.03.2015   1         NaN
13    0  20.03.2015   3         NaN

I guess there is some problem with the indexes. I tried changing the indexes and using loops, but nothing worked. How can I get this?

You can call map on the 'id' column of df1 and pass buys to perform a lookup:

In [270]:
df1['buy_count'] = df1['id'].map(buys)
df1

Out[270]:
    buy    datetime  id  buy_count
5     1  01.09.2015   4          2
8     1  03.02.2015   2          2
6     0  18.03.2015   1          2
13    0  20.03.2015   3          2
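
As for why the plain assignment filled the column with NaN: assigning a Series to a DataFrame column aligns on index labels, and buys is indexed by id (1 to 4) while df1 keeps its original row labels (5, 8, 6, 13), so no label matches. The sketch below illustrates this on a small stand-in frame; the values are copied from the printed output in the question rather than recomputed, and the variable names aligned, mapped and joined are only for illustration. It also shows join with on='id' as one possible alternative to map.

```python
import pandas as pd

# Stand-in data (assumed values taken from the question's printed output).
df1 = pd.DataFrame(
    {"id": [4.0, 2.0, 1.0, 3.0], "buy": [1.0, 1.0, 0.0, 0.0]},
    index=[5, 8, 6, 13],
)
buys = pd.Series(
    [2.0, 2.0, 1.0, 2.0],
    index=pd.Index([1.0, 2.0, 3.0, 4.0], name="id"),
)

# Assignment aligns buys' index (ids 1..4) with df1's index (5, 8, 6, 13).
# No labels match, so every value becomes NaN.
aligned = df1.assign(buys_count=buys)

# map looks up each value of the 'id' column in buys' index instead.
mapped = df1.assign(buy_count=df1["id"].map(buys))

# join with on='id' does the same lookup as a left join on the column.
joined = df1.join(buys.rename("buy_count"), on="id")

print(aligned)
print(mapped)
print(joined)
```

Both map and join keep df1's row labels, so either can be used when the per-id totals need to sit next to the de-duplicated rows.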

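By the way, I don't get the same output as you for buys. This looks like it comes from the data itself: the constructor in the question sets buy to 1 at index 4, while the printed frame shows 0 there, so from the posted code id 3 sums to 2 rather than 1.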
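
A separate point on the "latest datetime" step: the 'datetime' column holds strings, so sorting it orders the values as text, not as dates. For id 2, for example, '03.02.2015' sorts after '02.04.2015' even though 3 February is earlier than 2 April. A minimal sketch of the difference, assuming the strings are meant as day.month.year (the column name when and the two-row frame are made up for illustration):

```python
import pandas as pd

# Hypothetical mini-frame with two of the question's date strings for id 2.
df = pd.DataFrame({
    "id": [2, 2],
    "datetime": ["03.02.2015", "02.04.2015"],  # 3 Feb 2015 and 2 Apr 2015
})

# Sorted as text, "03.02.2015" comes last although it is the earlier date.
latest_as_text = df.sort_values("datetime").drop_duplicates("id", keep="last")

# Parsing with an explicit format gives chronological ordering.
# (The day.month.year format is an assumption about the data.)
df["when"] = pd.to_datetime(df["datetime"], format="%d.%m.%Y")
latest_by_date = df.sort_values("when").drop_duplicates("id", keep="last")

print(latest_as_text)
print(latest_by_date)
```

If the text ordering is what was intended, the original sort can stay as it is; otherwise parsing the column first changes which row is kept for some ids.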
