Python: Pandas: Groupby & Pivot Tables are missing rows

Question

I have a dataframe composed of individuals (identified by their IDs), activities, and corresponding scores. I'm trying to get the sum of the scores when grouping by student and activity type. I can do this with either of the following:

data_detail.pivot_table(["total_scored","total_scored_omitted"], index = ["id","activity"], aggfunc="sum")

data_detail.groupby(["id","activity"]).sum()

However, when I check the results by looking at a typical student:

data_detail[data_detail["id"]== 41824840].sort_values("activity")

I see that there are some activities listed for that given student which are missing from the groupby/pivot table. How can I ensure the final groupby/pivot table is complete and isn't missing any values?
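Rather than inspecting one student at a time, the check described above could be done for every student at once. The following is only a sketch under assumptions: the sample rows are made up for illustration, and only the column names come from the question. It compares the (id, activity) pairs present in the raw data against the index of the grouped result.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data_detail = pd.DataFrame({
    "id": [41824840, 41824840, 41824840, 7],
    "activity": ["quiz", "quiz", "homework", "quiz"],
    "total_scored": [1.0, 2.5, 3.0, 4.0],
    "total_scored_omitted": [0.0, 1.0, 0.5, 2.0],
})

score_cols = ["total_scored", "total_scored_omitted"]
totals = data_detail.groupby(["id", "activity"])[score_cols].sum()

# Every (id, activity) pair that occurs in the raw data.
expected = pd.MultiIndex.from_frame(
    data_detail[["id", "activity"]].drop_duplicates()
)

# Pairs present in the raw data but absent from the grouped result.
missing = expected.difference(totals.index)
print(missing.tolist())
```

An empty list means every pair made it into the grouped table; anything listed points at the rows worth inspecting.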

Answer 1

The problem was that the score columns didn't have a consistent data type. They should all have been floats.

Some of the values were strings. After I converted all of the scores to floats, the missing activities showed up.

As an added benefit, having uniform data types made the calculation much faster!
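The answer doesn't show how the conversion was done, so the following is just one possible approach, sketched with made-up sample rows. It first counts the Python types found in a score column, then converts the columns with `pd.to_numeric`. Using `errors="coerce"` is an assumption here: it turns unparseable values such as "n/a" into NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical sample data: scores stored as a mix of strings and floats.
data_detail = pd.DataFrame({
    "id": [41824840, 41824840, 41824840, 7],
    "activity": ["quiz", "quiz", "homework", "quiz"],
    "total_scored": [1.0, "2.5", "3", 4.0],
    "total_scored_omitted": ["0", 1.0, "n/a", 2.0],
})

score_cols = ["total_scored", "total_scored_omitted"]

# Diagnose: count how many values of each Python type a column holds.
type_counts = (
    data_detail["total_scored"]
    .map(lambda v: type(v).__name__)
    .value_counts()
)
print(type_counts)

# Convert every score column to a numeric dtype.
# Values that cannot be parsed become NaN (an assumption of this sketch).
for col in score_cols:
    data_detail[col] = pd.to_numeric(data_detail[col], errors="coerce")

totals = data_detail.groupby(["id", "activity"])[score_cols].sum()
print(totals)
```

If silently turning bad values into NaN is not acceptable, leaving `errors` at its default makes `pd.to_numeric` raise on the first value it cannot parse, which shows exactly which entry needs cleaning.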

1 answer

solution1
1 ACCEPTED 2016-08-17 20:56:36