How can I convert this long format dataframe into a wide format?
I am using RStudio for data analysis in R. I currently have a dataframe in long format, and I want to convert it into wide format.
An extract of the dataframe (df1) is shown below. I have converted the first column into a factor.
Extract:
df1 <- read.csv("test1.csv", stringsAsFactors = FALSE, header = TRUE)
df1$Respondent <- factor(df1$Respondent)
df1
  Respondent Question            CS      Imp LOS Type Hotel
1          1       Q1 Fully Applied     High  12  SML   ABC
2          1       Q2     Optimized Critical  12  SML   ABC
I want a new dataframe (say, df2) to look like this:
Respondent          Q1CS Q1Imp      Q2CS    Q2Imp LOS Type Hotel
         1 Fully Applied  High Optimized Critical  12  SML   ABC
How can I do this in R?
Additional notes: I have tried looking at the tidyr package and its spread() function, but I am having a hard time applying it to this specific problem.
This can be achieved with a gather - unite - spread approach:
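library(dplyr)
library(tidyr)

df %>%
  group_by(Respondent) %>%
  gather(k, v, CS, Imp) %>%               # stack CS and Imp into key/value pairs
  unite(col, Question, k, sep = "") %>%   # build column names such as Q1CS
  spread(col, v)                          # spread the pairs back out into columns
#   Respondent LOS Type Hotel          Q1CS Q1Imp      Q2CS    Q2Imp
# 1          1  12  SML   ABC Fully Applied  High Optimized Critical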
# Sample data used above
df <- read.table(text =
" Respondent Question CS Imp LOS Type Hotel
1 1 Q1 'Fully Applied' High 12 SML ABC
2 1 Q2 'Optimized' Critical 12 SML ABC", header = TRUE)
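gather() and spread() are marked as superseded in current tidyr releases, so the same reshape can also be sketched with pivot_wider(). This is only an illustrative alternative, not part of the original answer. It assumes a reasonably recent tidyr (the names_glue argument arrived in 1.1.0, as far as I know) and rebuilds the two-row sample inline so it runs on its own. The name df2 comes from the question.

```r
library(tidyr)

# Same two-row sample as in the answer, rebuilt inline so the block is self-contained.
df <- data.frame(
  Respondent = c(1, 1),
  Question   = c("Q1", "Q2"),
  CS         = c("Fully Applied", "Optimized"),
  Imp        = c("High", "Critical"),
  LOS        = c(12, 12),
  Type       = c("SML", "SML"),
  Hotel      = c("ABC", "ABC"),
  stringsAsFactors = FALSE
)

# Spread CS and Imp across columns, one pair per Question.
# names_glue builds names such as "Q1CS" (question first, then measure).
df2 <- pivot_wider(
  df,
  names_from  = Question,
  values_from = c(CS, Imp),
  names_glue  = "{Question}{.value}"
)

# Reorder the columns to match the layout requested in the question.
df2 <- df2[, c("Respondent", "Q1CS", "Q1Imp", "Q2CS", "Q2Imp", "LOS", "Type", "Hotel")]
df2
```

Columns that are not named in names_from or values_from (here Respondent, LOS, Type and Hotel) are treated as identifiers, which is why they survive as a single row.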
In data.table, this can be done in a one-liner:
dcast(DT, Respondent ~ Question, value.var = c("CS", "Imp"), sep = "")[DT, `:=`(LOS = i.LOS, Type = i.Type, Hotel = i.Hotel), on = "Respondent"][]
#    Respondent          CSQ1      CSQ2 ImpQ1    ImpQ2 LOS Type Hotel
# 1:          1 Fully Applied Optimized  High Critical  12  SML   ABC
Explained step by step
Create sample data
DT <- fread("Respondent Question CS Imp LOS Type Hotel
1 Q1 'Fully Applied' High 12 SML ABC
1 Q2 'Optimized' Critical 12 SML ABC", quote = '\'')
Cast part of the data.table to the desired format, by Question.
The column names might not be what you want, but you can always change them using setnames().
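# Store the cast result so that it can be joined in the next step
result.from.dcast <- dcast(DT, Respondent ~ Question, value.var = c("CS", "Imp"), sep = "")
result.from.dcast
#    Respondent          CSQ1      CSQ2 ImpQ1    ImpQ2
# 1:          1 Fully Applied Optimized  High Critical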
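As a rough illustration of the setnames() remark, the cast columns could be renamed to the question-first form asked for in the question. The object name wide and the inline sample table are assumptions made only so the sketch runs on its own.

```r
library(data.table)

# Inline copy of the two-row sample (an assumption, to keep the sketch self-contained).
DT <- data.table(
  Respondent = c(1L, 1L),
  Question   = c("Q1", "Q2"),
  CS         = c("Fully Applied", "Optimized"),
  Imp        = c("High", "Critical"),
  LOS        = c(12L, 12L),
  Type       = c("SML", "SML"),
  Hotel      = c("ABC", "ABC")
)

wide <- dcast(DT, Respondent ~ Question, value.var = c("CS", "Imp"), sep = "")

# Rename by reference: CSQ1 becomes Q1CS, and so on.
setnames(wide,
         old = c("CSQ1", "CSQ2", "ImpQ1", "ImpQ2"),
         new = c("Q1CS", "Q2CS", "Q1Imp", "Q2Imp"))

# Optionally put the columns in question-first order as well.
setcolorder(wide, c("Respondent", "Q1CS", "Q1Imp", "Q2CS", "Q2Imp"))
wide
```

Both setnames() and setcolorder() modify the table in place, so no reassignment is needed.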
Then join by reference on the original DT to get the rest of the columns you need:
result.from.dcast[DT, `:=`( LOS = i.LOS, Type = i.Type, Hotel = i.Hotel), on = "Respondent"]
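If LOS, Type and Hotel are constant within each respondent, as they are in the sample, a possible shortcut is to list them on the left-hand side of the dcast() formula so that no join is needed. This is a sketch under that assumption, not the answer's own method. The name result and the inline sample table are introduced here only for illustration.

```r
library(data.table)

# Inline copy of the two-row sample (an assumption, to keep the sketch self-contained).
DT <- data.table(
  Respondent = c(1L, 1L),
  Question   = c("Q1", "Q2"),
  CS         = c("Fully Applied", "Optimized"),
  Imp        = c("High", "Critical"),
  LOS        = c(12L, 12L),
  Type       = c("SML", "SML"),
  Hotel      = c("ABC", "ABC")
)

# Every column on the left of ~ is kept as an identifier column,
# so LOS, Type and Hotel are carried along without a separate join.
result <- dcast(DT,
                Respondent + LOS + Type + Hotel ~ Question,
                value.var = c("CS", "Imp"),
                sep = "")
result
```

If those columns varied within a respondent, this would produce more than one row per respondent, so the join shown above is the safer choice in that case.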