How to get frequency count for a column value, sorted by a categorical value in another column

Question

I have a pandas dataframe that includes two columns, vessel name and delay indicator. Vessel name is a string name of a vessel, and delay indicator is either a 0 or 1 (boolean).

My DataFrame:

import pandas as pd

df = pd.DataFrame({
    "Vessel.Name": ["Spirit of British Columbia", "Queen of New Westminster", "Spirit of Vancouver Island", "Coastal Celebration", "Spirit of British Columbia"],
    "Delay.Indicator": [0, 0, 0, 1, 0]
})

How it looks:

Vessel.Name                 Delay.Indicator
Spirit of British Columbia  0
Queen of New Westminster    0
Spirit of Vancouver Island  0
Coastal Celebration         1
Spirit of British Columbia  0

My goal is a DataFrame with one row per distinct vessel name and two new columns: the number of rows for that vessel, and the total number of 1s in its delay indicator. Are there pandas methods for this, or should I iterate through Python lists?

Answer 1

A simple groupby with aggregate functions applied should do the trick:

df.groupby("Vessel.Name")["Delay.Indicator"].agg(['count', 'sum'])

Output:

                            count   sum
Vessel.Name     
Coastal Celebration         1       1
Queen of New Westminster    1       0
Spirit of British Columbia  2       0
Spirit of Vancouver Island  1       0
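
If you would rather have the vessel name as a regular column and more descriptive column names, named aggregation is one option. The following is a minimal sketch using the question's sample data; the output names `trips` and `delays` are my own choice for illustration, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Vessel.Name": ["Spirit of British Columbia", "Queen of New Westminster",
                    "Spirit of Vancouver Island", "Coastal Celebration",
                    "Spirit of British Columbia"],
    "Delay.Indicator": [0, 0, 0, 1, 0],
})

# Named aggregation: each keyword becomes an output column.
# "trips" and "delays" are hypothetical names chosen for this example.
summary = (
    df.groupby("Vessel.Name")["Delay.Indicator"]
      .agg(trips="count", delays="sum")
      .reset_index()  # turn the Vessel.Name index back into a column
)

print(summary)
```

Because the indicator is only ever 0 or 1, summing it gives the number of delayed rows per vessel, and `count` gives the total number of rows for that vessel.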
