Pandas Groupby: counting a specific string value across multiple columns

Question

I have a data frame like this:

import pandas as pd

dummy = pd.DataFrame([
    ('01/09/2020', 'TRUE', 'FALSE'),
    ('01/09/2020', 'TRUE', 'TRUE'),
    ('02/09/2020', 'FALSE', 'TRUE'),
    ('02/09/2020', 'TRUE', 'FALSE'),
    ('03/09/2020', 'FALSE', 'FALSE'),
    ('03/09/2020', 'TRUE', 'TRUE'),
    ('03/09/2020', 'TRUE', 'FALSE')], columns=['date', 'Action1', 'Action2'])

Now I want to aggregate the 'TRUE' actions per day, i.e. count the 'TRUE' values in each action column for each date.

I tried groupby with sum, count, etc., but nothing works for me because I have to aggregate multiple columns. I don't want to split the table into multiple dataframes, resolve each one individually, and merge them back into one. Can someone please suggest a smart way to do it?

Answer 1

The True and False values in your dummy df are strings; you can convert them to integers and sum:

dummy.replace({'TRUE':1,'FALSE':0}).groupby('date',as_index = False).sum()

    date        Action1 Action2
0   01/09/2020  2       1
1   02/09/2020  1       1
2   03/09/2020  2       1
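As a self-contained illustration of the same idea, here is a minimal sketch that rebuilds the question's data and maps each string to an integer explicitly before summing. The names `action_cols`, `flags`, and `counts` are illustrative and not part of the original answer.

```python
import pandas as pd

dummy = pd.DataFrame(
    [
        ("01/09/2020", "TRUE", "FALSE"),
        ("01/09/2020", "TRUE", "TRUE"),
        ("02/09/2020", "FALSE", "TRUE"),
        ("02/09/2020", "TRUE", "FALSE"),
        ("03/09/2020", "FALSE", "FALSE"),
        ("03/09/2020", "TRUE", "TRUE"),
        ("03/09/2020", "TRUE", "FALSE"),
    ],
    columns=["date", "Action1", "Action2"],
)

# Assumed: these are the columns holding 'TRUE'/'FALSE' strings.
action_cols = ["Action1", "Action2"]

# The flags are stored as strings, not booleans, so summing them directly
# would not count anything.
print(dummy[action_cols].dtypes)

# Map each string to 1 or 0 column by column, then sum within each date.
flags = dummy[action_cols].apply(lambda col: col.map({"TRUE": 1, "FALSE": 0}))
counts = flags.groupby(dummy["date"]).sum().reset_index()
print(counts)
```

Grouping by the `date` Series while leaving it out of `flags` keeps the date strings from being summed along with the flags.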

Answer 2

You can also try:

dummy.set_index(['date']).eq('TRUE').groupby(level='date').sum()

Output:

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1
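Unpacked into steps, this approach first turns each cell into a boolean and then sums within each date. The following is a minimal sketch that uses `groupby(level=...)` for the per-date sum; this form also works on pandas 2.x, where `DataFrame.sum` no longer accepts a `level` argument. The names `is_true` and `per_day` are illustrative.

```python
import pandas as pd

dummy = pd.DataFrame(
    [
        ("01/09/2020", "TRUE", "FALSE"),
        ("01/09/2020", "TRUE", "TRUE"),
        ("02/09/2020", "FALSE", "TRUE"),
        ("02/09/2020", "TRUE", "FALSE"),
        ("03/09/2020", "FALSE", "FALSE"),
        ("03/09/2020", "TRUE", "TRUE"),
        ("03/09/2020", "TRUE", "FALSE"),
    ],
    columns=["date", "Action1", "Action2"],
)

# Step 1: move 'date' into the index so only the action columns are compared,
# then mark every cell that equals the string 'TRUE'.
is_true = dummy.set_index("date").eq("TRUE")

# Step 2: sum the booleans within each date (True counts as 1, False as 0).
per_day = is_true.groupby(level="date").sum()
print(per_day)
```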

Answer 3

Anyone seeing this answer should look at the answers by @QuangHoang or @Vaishali. They are much better answers. I can't control what the OP chooses, but you should go upvote those answers.

Inspired by @QuangHoang

dummy.iloc[:, 1:].eq('TRUE').groupby(dummy.date).sum()

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1

OLD ANSWER

Fix your dataframe so that it has actual True / False values

from ast import literal_eval

dummy = dummy.assign(**dummy[['Action1', 'Action2']].applymap(str.title).applymap(literal_eval))

Then use groupby

dummy.groupby('date').sum()

            Action1  Action2
date                        
01/09/2020        2        1
02/09/2020        1        1
03/09/2020        2        1
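The conversion step in the old answer depends on `str.title` running before `literal_eval`. A small sketch of why, on a single value (the variable names are illustrative):

```python
from ast import literal_eval

raw = "TRUE"

# str.title turns 'TRUE' into 'True', the spelling of Python's boolean literal.
as_literal = raw.title()

# literal_eval parses that text into the actual bool.
value = literal_eval(as_literal)

# Without the title step, 'TRUE' is an unknown name rather than a literal,
# so literal_eval rejects it.
try:
    literal_eval(raw)
    raw_parses = True
except ValueError:
    raw_parses = False

print(as_literal, value, raw_parses)
```

Once the cells are real booleans, `groupby('date').sum()` counts each True as 1.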

Answer 4

In [7]: dummy
Out[7]:
         date Action1 Action2
0  01/09/2020    TRUE   FALSE
1  01/09/2020    TRUE    TRUE
2  02/09/2020   FALSE    TRUE
3  02/09/2020    TRUE   FALSE
4  03/09/2020   FALSE   FALSE
5  03/09/2020    TRUE    TRUE
6  03/09/2020    TRUE   FALSE

In [9]: dummy.groupby(['date'], as_index=False).agg(lambda x: x.eq('TRUE').sum())
Out[9]:
         date  Action1  Action2
0  01/09/2020        2        1
1  02/09/2020        1        1
2  03/09/2020        2        1

Answer 5

You can also use a pivot table:

dummy.pivot_table(index='date', values=['Action1', 'Action2'], 
                  aggfunc=lambda x: (x=='TRUE').sum()).reset_index()

Output:

          date  Action1 Action2
0   01/09/2020        2       1
1   02/09/2020        1       1
2   03/09/2020        2       1

Answer 6

Along similar lines, using .resample:

...
dummy['date'] = pd.to_datetime(dummy['date'], dayfirst=True)
dummy[['Action1', 'Action2']] = dummy[['Action1', 'Action2']].replace({'TRUE':True, 'FALSE': False})

# set date to index
dummy.set_index('date', inplace=True)

dummy.resample('1D').sum()

See the resample documentation.
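One practical difference from grouping on the date is that `resample('1D')` produces a row for every calendar day in the range, including days with no records. The sketch below shows this on hypothetical data that skips 02/09/2020; the data and the names `df`, `flags`, `by_group`, and `by_day` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data with a gap: there are no rows for 02/09/2020.
df = pd.DataFrame(
    [
        ("01/09/2020", "TRUE", "FALSE"),
        ("01/09/2020", "TRUE", "TRUE"),
        ("03/09/2020", "FALSE", "TRUE"),
    ],
    columns=["date", "Action1", "Action2"],
)
df["date"] = pd.to_datetime(df["date"], dayfirst=True)

# Convert the string flags to 0/1 integers and index them by the parsed date.
flags = df[["Action1", "Action2"]].eq("TRUE").astype(int).set_index(df["date"])

# groupby only reports dates that actually occur in the data.
by_group = flags.groupby(level="date").sum()

# resample reports every day between the first and last date; the missing
# day appears with a count of 0.
by_day = flags.resample("1D").sum()

print(by_group)
print(by_day)
```

Which behaviour is preferable depends on whether days without any records should show up as zero counts in the result.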
