
Return the count of unique column entries per day in a datetime DataFrame

I have a DataFrame which looks like this:

            Col1    Col2        Col3    Col4
Datetime                                    
2016-11-01     1    Male  01/11/2016  Durham
2016-11-01     2  Female  01/11/2016  Durham
2016-11-02     3  Female  02/11/2016     New
2016-11-02     4    Male  02/11/2016     Ips
2016-11-03     5    Male  03/11/2016  Durham

What I am trying to do is return the count of each unique Col4 entry per day, so the result would contain information like:

            ColA    ColB
Datetime                
2016-11-01  Durham     2
2016-11-02  New        1
2016-11-02  Ips        1
2016-11-03  Durham     1

I.e. Durham occurred twice on the 1st, so it has a count of 2. New and Ips each occurred once on the 2nd, so they both have a count of 1. Finally, Durham occurred once on the 3rd, so it will be given a count of 1.

Ultimately I am trying to define a "frequency" so that I can identify a "hotspot" by region: if something occurs at least once every day, I'll call it a "hotspot".
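
For reference, the sample frame above can be reconstructed roughly like this (a minimal sketch; the exact dtypes and the choice of a DatetimeIndex are assumptions):

import pandas as pd

# Rebuild the sample data shown above with a DatetimeIndex named 'Datetime'
df = pd.DataFrame(
    {'Col1': [1, 2, 3, 4, 5],
     'Col2': ['Male', 'Female', 'Female', 'Male', 'Male'],
     'Col3': ['01/11/2016', '01/11/2016', '02/11/2016', '02/11/2016', '03/11/2016'],
     'Col4': ['Durham', 'Durham', 'New', 'Ips', 'Durham']},
    index=pd.to_datetime(['2016-11-01', '2016-11-01', '2016-11-02',
                          '2016-11-02', '2016-11-03']))
df.index.name = 'Datetime'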

You can use groupby on (Datetime, Col4) + count here.

df = df.groupby([df.index, df.Col4]).Col4.count().reset_index(level=1, name='ColB')

Or,

df = df.groupby([df.index, df.Col4]).size().reset_index(level=1)

Next, set the column names (with size() the count column comes back unnamed, so this rename covers both approaches):

df.columns = ['ColA', 'ColB']

df

              ColA  ColB
Datetime                
2016-11-01  Durham     2
2016-11-02     Ips     1
2016-11-02     New     1
2016-11-03  Durham     1
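
For the follow-on "hotspot" idea, here is a minimal sketch, assuming a "hotspot" means a ColA value that appears on every distinct day in the index, and that the index is still named 'Datetime' as in the example:

# Count how many distinct days each region appears on,
# then keep only the regions seen on every day in the data
days_per_region = df.reset_index().groupby('ColA')['Datetime'].nunique()
total_days = df.index.nunique()
hotspots = days_per_region[days_per_region == total_days].index.tolist()

On the small sample above no region appears on all three days, so this would return an empty list; over a longer date range it picks out the regions that show up daily.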
