Problems transforming a pandas dataframe

I am having trouble converting a pandas dataframe into the format I need for further analysis. The data comes from a survey in which we asked people to rank their preferred means of communication (1 = highest, 4 = lowest). Every row is a respondent.

The current dataframe:

    A   B   C   D
0   1   2   4   3
1   2   3   1   4
2   2   1   4   3
3   2   1   4   3
4   1   3   4   2

...

For the analysis I want to transform this into the following dataframe, where every row is a means of communication and each column counts how often respondents ranked it in that position.

   1st  2nd  3rd  4th
A  2    3    0    0
B  2    1    2    0
C  1    0    0    4
D  0    1    3    1

I have tried applying custom functions to the original dataframe, and I have tried .groupby and .T, but I don't seem to get any closer to the result I want.

This is the function I wrote, but I can't figure out how to apply it correctly to get the desired result.

def count_values_rank(column,rank):
    total_count_n1 = 0
    for i in column:
        if i == rank:
            total_count_n1 += 1
    return total_count_n1

Running this code on a single column of my dataframe gives the desired result, but I am having trouble writing it so that I can apply it to the whole dataframe and get the result I am looking for. The line below returns 2.

count_values_rank(df.iloc[:,0],'1')

There is probably a really obvious solution, but I am having trouble seeing the easiest way to solve this.

Thanks a lot!

Use melt together with crosstab:

pd.crosstab(df.melt().variable,df.melt().value).add_suffix('st')
Out[107]: 
value        1st   2st   3st   4st
variable                        
A            2     3     0     0
B            2     1     2     0
C            1     0     0     4
D            0     1     3     1
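
Note that add_suffix('st') labels every column with "st", which is why the output reads 2st, 3st and 4st. If proper ordinal labels matter, one option is to rename the columns through a mapping. The sketch below is only an illustration: it rebuilds the five sample rows from the question, and the names ordinals, long, counts and alt are hypothetical helpers that do not appear in the original post. It also shows an alternative that counts values per column and transposes, which should give the same table for this data.

```python
import pandas as pd

# Sample data copied from the question (first five respondents only).
df = pd.DataFrame(
    {
        "A": [1, 2, 2, 2, 1],
        "B": [2, 3, 1, 1, 3],
        "C": [4, 1, 4, 4, 4],
        "D": [3, 4, 3, 3, 2],
    }
)

# Hypothetical mapping from rank value to a readable column label.
ordinals = {1: "1st", 2: "2nd", 3: "3rd", 4: "4th"}

# Melt once into long form, then cross-tabulate option against rank.
long = df.melt(var_name="option", value_name="rank")
counts = pd.crosstab(long["option"], long["rank"]).rename(columns=ordinals)

# Alternative: count the values in each column, then transpose.
# reindex makes sure every rank 1..4 appears, even if nobody chose it.
alt = (
    df.apply(pd.Series.value_counts)
    .reindex(sorted(ordinals))
    .fillna(0)
    .astype(int)
    .T
    .rename(columns=ordinals)
)

print(counts)
```

Melting once and reusing the result avoids calling df.melt() twice, and the explicit mapping keeps the labels correct if the number of ranked options changes, as long as the mapping is extended accordingly.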
