How to summarize data on categorical responses to get the percentage of each response type in R?

Question

I want to get the percentages of categorical answer types for different types of questions (TYPE). Each individual has multiple responses for each type, and each response is categorical (one of several levels).

1) each individual should be on a different row, and
2) the columns should be TYPE + response level, with the value being the percentage of times that particular response level was given for that question type by that individual.

The DATA looks like this:

SUBJECT TYPE    RESPONSE  
John    a   kappa                       
John    b   gamma  
John    a   delta  
John    a   gamma  
Mary    a   kappa   
Mary    a   delta       
Mary    b   kappa  
Mary    a   gamma  
Bill    b   delta  
Bill    a   gamma

The result should look like this:

SUBJECT a-kappa     a-gamma   a-delta   b-kappa     b-gamma b-delta
John    0.33        0.33      0.33      1.00        1.00    0.00
Mary    0.66        0.33      0.00      1.00        0.00    0.00
Bill    1.00        0.00      0.00      0.00        0.00    1.00

Based on c1au61o_HH's answer I was able to create something that works for my actual data file, but it will still need some post-processing. (It is also not very elegant, but that's a minor concern.)

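 library(dplyr)

 Finaldf <- mydata %>%
   group_by(Subject, Type) %>%
   mutate(TOT = n()) %>%                  # responses per subject and type
   group_by(Subject, Response, Type) %>%
   mutate(RESPTOT = n())                  # responses per subject, type, and response level

 # Collapse repeated rows, then divide the level count by the type total
 Finaldf <- distinct(Finaldf)
 Finaldf$Percentage <- Finaldf$RESPTOT / Finaldf$TOT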

Any help is much appreciated, ideally with some explanation.

Answer 1

This is probably not the most efficient way, but if you want to use tidyverse you can unite the two columns and then use two different group_by calls: one to calculate the total for each subject, and one to calculate the percentages.

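library(tidyverse)
df %>% 
  unite(TYPE_RESPONSE, c("TYPE", "RESPONSE"), sep = "_") %>% 
  group_by(SUBJECT) %>% 
  mutate(TOT = n()) %>%                       # total responses per subject
  group_by(SUBJECT, TYPE_RESPONSE) %>% 
  # TOT is constant within a group, so take one value to get one row per group
  summarize(perc = n() / first(TOT) * 100) %>% 
  spread(TYPE_RESPONSE, perc, fill = 0)       # combinations never given become 0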

DATA:

df <- tibble( SUBJECT= rep(c("John", "Mary","Bill"), each = 4), 
                 TYPE = rep(c("a","b"), 6),
                 RESPONSE = rep(c("kappa", "gamma", "delta"), 4)
)
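
The DATA above is synthetic and does not reproduce the rows shown in the question. As a minimal sketch, the question's own sample could be entered as below to try the code on it. The name `qdata` is made up for illustration. The final `table()` call shows the per-subject, per-type counts, which are the denominators of the percentages the question asks for.

```r
# Sample rows copied from the question; `qdata` is a hypothetical name
qdata <- read.table(text = "
SUBJECT TYPE RESPONSE
John a kappa
John b gamma
John a delta
John a gamma
Mary a kappa
Mary a delta
Mary b kappa
Mary a gamma
Bill b delta
Bill a gamma
", header = TRUE, stringsAsFactors = FALSE)

# Number of responses each subject gave for each question type
sizes <- table(qdata$SUBJECT, qdata$TYPE)
print(sizes)
```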

EDIT in reply to comment:

I understand that you want to calculate the percentage by SUBJECT and TYPE, so the code would be something like this:

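library(tidyverse)
df %>% 
  group_by(SUBJECT, TYPE) %>% 
  mutate(TOT = n()) %>%                       # total responses per subject and type
  ungroup() %>%                               # drop grouping before TYPE is merged away
  unite(TYPE_RESPONSE, c("TYPE", "RESPONSE"), sep = "_") %>% 
  group_by(SUBJECT, TYPE_RESPONSE) %>% 
  # TOT is constant within a group, so take one value to get one row per group
  summarize(perc = n() / first(TOT) * 100) %>% 
  spread(TYPE_RESPONSE, perc, fill = 0)       # combinations never given become 0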
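
For comparison, here is a sketch of the same per-SUBJECT, per-TYPE calculation written with `count()` and `pivot_wider()`, run on the sample rows from the question. It is one possible variant, not the code from the answer: it assumes tidyr 1.0 or later, returns proportions (0 to 1, as in the question's desired layout) instead of percentages, and the names `qdata` and `result` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample rows copied from the question; `qdata` is a hypothetical name
qdata <- data.frame(
  SUBJECT  = c("John", "John", "John", "John", "Mary",
               "Mary", "Mary", "Mary", "Bill", "Bill"),
  TYPE     = c("a", "b", "a", "a", "a", "a", "b", "a", "b", "a"),
  RESPONSE = c("kappa", "gamma", "delta", "gamma", "kappa",
               "delta", "kappa", "gamma", "delta", "gamma"),
  stringsAsFactors = FALSE
)

result <- qdata %>%
  count(SUBJECT, TYPE, RESPONSE) %>%     # rows per subject, type, and response level
  group_by(SUBJECT, TYPE) %>%
  mutate(prop = n / sum(n)) %>%          # share of each level within subject and type
  ungroup() %>%
  select(-n) %>%
  pivot_wider(
    names_from  = c(TYPE, RESPONSE),     # one column per type-level pair, e.g. "a-kappa"
    values_from = prop,
    names_sep   = "-",
    values_fill = list(prop = 0)         # levels a subject never gave become 0
  )

print(result)
```

On these rows, each of John's three type-a responses comes out as one third, and his single type-b response (gamma) comes out as 1. Multiply `prop` by 100 if percentages are preferred.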

Answer 1 (2019-04-21 12:20:37)
