Creating a dataframe from grouped questions using three columns

Question

I have the following dataframe:

       A               B                  C
  I am motivated     Agree                4
  I am motivated     Strongly Agree       5
  I am motivated     Disagree             6
  I am open-minded   Agree                4
  I am open-minded   Disagree             4
  I am open-minded   Strongly Disagree    3

Column A is the question, column B is the answer, and column C is how often that answer ("Strongly Agree", "Agree", "Disagree", or "Strongly Disagree") was given for the question in column A.

How can I convert it into the following dataframe?

                  Strongly Agree    Agree     Disagree   Strongly Disagree
I am motivated        5               4           6             0
I am open-minded      0               4           4             3

I tried looking at groupby() on columns in other posts but could not figure it out. I am using Python 3.

Answer 1

Use the DataFrame.pivot_table() method:

In [250]: df.pivot_table(index='A', columns='B', values='C', aggfunc='sum', fill_value=0)
Out[250]:
B                 Agree  Disagree  Strongly Agree  Strongly Disagree
A
I am motivated        4         6               5                  0
I am open-minded      4         4               0                  3
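
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample data from the question and applies the same pivot_table call. The names `order` and `wide` are illustrative and not part of the original answer, and the extra reindex step is an assumption about what the asker wants: it only puts the columns in the order shown in the question, since pivot_table returns them alphabetically.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["I am motivated"] * 3 + ["I am open-minded"] * 3,
    "B": ["Agree", "Strongly Agree", "Disagree",
          "Agree", "Disagree", "Strongly Disagree"],
    "C": [4, 5, 6, 4, 4, 3],
})

# Column order requested in the question (illustrative helper name).
order = ["Strongly Agree", "Agree", "Disagree", "Strongly Disagree"]

wide = (
    df.pivot_table(index="A", columns="B", values="C",
                   aggfunc="sum", fill_value=0)
      .reindex(columns=order, fill_value=0)  # reorder; fill any absent answer with 0
)
print(wide)
```

The reindex call is optional; without it the result matches the output shown above.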

Answer 2

Because these are already frequency counts, we can assume that each question/answer pair is unique. That means there is nothing to aggregate, so set_index and unstack are enough, and skipping the aggregation step should save some time. pivot would reach the same result, but it has no fill_value option, so the missing cells become NaN and the integer dtype is not preserved.

df.set_index(['A', 'B']).C.unstack(fill_value=0)

B                 Agree  Disagree  Strongly Agree  Strongly Disagree
A                                                                   
I am motivated        4         6               5                  0
I am open-minded      4         4               0                  3
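
The dtype point can be checked with a small sketch along these lines. It rebuilds the sample data from the question; `via_pivot` and `via_unstack` are illustrative names, and the duplicate check is an added guard for the uniqueness assumption rather than something from the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": ["I am motivated"] * 3 + ["I am open-minded"] * 3,
    "B": ["Agree", "Strongly Agree", "Disagree",
          "Agree", "Disagree", "Strongly Disagree"],
    "C": [4, 5, 6, 4, 4, 3],
})

# unstack cannot reshape duplicate (A, B) pairs, so confirm the assumption first.
if df.duplicated(["A", "B"]).any():
    raise ValueError("duplicate question/answer pairs; aggregate with pivot_table instead")

# pivot leaves NaN in the gaps, which turns the columns into floats;
# filling afterwards does not bring the integer dtype back.
via_pivot = df.pivot(index="A", columns="B", values="C").fillna(0)

# unstack fills the gaps while reshaping, so the integer dtype survives.
via_unstack = df.set_index(["A", "B"])["C"].unstack(fill_value=0)

print(via_pivot.dtypes)
print(via_unstack.dtypes)
```

Both frames hold the same numbers; only the column dtypes differ.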

Extra Credit
Turn 'B' into an ordered pd.Categorical and the columns will be sorted in category order

df['B'] = pd.Categorical(
    df['B'],
    categories=['Strongly Disagree', 'Disagree', 'Agree', 'Strongly Agree'],
    ordered=True)
df.set_index(['A', 'B'])['C'].unstack(fill_value=0)

B                 Strongly Disagree  Disagree  Agree  Strongly Agree
A                                                                   
I am motivated                    0         6      4               5
I am open-minded                  3         4      4               0

2 answers

Answer 1: score 5, 2017-05-03 21:57:33

Answer 2: score 3, accepted, 2017-05-03 21:58:30
