Spreading over a column in R

Say I have a data frame like this:

df <- data.frame(x = c(1,1,1,3,3,3),y = c(12,32,43,16,32,65))

and I want to transform it into a data frame like this:

data.frame(x = c(1, 3), y_1 =  c(12,16), y_2 =c(32, 32),y_3= c(43, 65))

basically spreading the y values across columns, with one row for each unique x value. I've tried to do this using tidyr but can't quite see how it would work. Any ideas?

Thanks.

Here's a data.table solution:

library(data.table)

dat = as.data.table(df) # or setDT to convert in place

dat[, obs := paste0('y_', 1:.N), by=x]
dcast(dat, x ~ obs, value.var="y")

#   x y_1 y_2 y_3
#1: 1  12  32  43
#2: 3  16  32  65

This will work even if the number of rows is not the same for all x.
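To illustrate the unequal-group case, here is a small sketch with a made-up data frame (df2, dat2 and wide2 are hypothetical names, not from the question) in which x = 1 has three rows and x = 3 has only two. With dcast's default fill, the missing cell should come out as NA.

```r
library(data.table)

# Hypothetical input: x = 1 has three rows, x = 3 only two
df2 <- data.frame(x = c(1, 1, 1, 3, 3), y = c(12, 32, 43, 16, 32))
dat2 <- as.data.table(df2)

# Number the observations within each x, then cast to wide format
dat2[, obs := paste0("y_", seq_len(.N)), by = x]
wide2 <- dcast(dat2, x ~ obs, value.var = "y")

# The x = 3 group has no third observation, so its y_3 is NA
wide2
```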

We can use aggregate to collapse the y values for each x into one comma-separated string, and then cSplit from the splitstackshape package to split that string into separate columns:

library(splitstackshape)
df1 <- aggregate(y ~ x, df, paste, collapse = ',')
df1 <- cSplit(df1, 'y', ',', direction = 'wide')
#   x y_1 y_2 y_3
#1: 1  12  32  43
#2: 3  16  32  65

The answer given by Sotos using aggregate is particularly elegant, but the following approach using reshape might also be instructive:

df <- data.frame(x = c(1,1,1,3,3,3),y = c(12,32,43,16,32,65))
df[,"time"] <- rep(1:3, 2)
wide_df <- reshape(df, direction="wide", timevar="time", idvar="x")
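The time column above is hard-coded for two groups of three rows. As a sketch of one way to generalize it, the within-group counter can be computed with ave(), and the y.1, y.2, y.3 names that reshape produces can be renamed to match the question. The renaming step is an addition for illustration, not part of the original answer.

```r
df <- data.frame(x = c(1, 1, 1, 3, 3, 3), y = c(12, 32, 43, 16, 32, 65))

# Within-group counter: restarts at 1 for each x, whatever the group sizes
df$time <- ave(seq_along(df$x), df$x, FUN = seq_along)

wide_df <- reshape(df, direction = "wide", timevar = "time", idvar = "x")

# reshape() names the new columns y.1, y.2, y.3; switch to y_1, y_2, y_3
names(wide_df) <- sub("^y\\.", "y_", names(wide_df))
wide_df
```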

One option with dplyr/tidyr:

library(dplyr)
library(tidyr)
df %>% 
    group_by(x) %>% 
    mutate(n = paste("y", row_number(), sep="_")) %>%
    spread(n,y)
#     x   y_1   y_2   y_3
#   (dbl) (dbl) (dbl) (dbl)
#1     1    12    32    43
#2     3    16    32    65
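Since tidyr 1.0.0, spread() is superseded by pivot_wider(). A roughly equivalent sketch of the same idea, assuming a recent tidyr and using the question's data (the name wide is chosen here for illustration):

```r
library(dplyr)
library(tidyr)

df <- data.frame(x = c(1, 1, 1, 3, 3, 3), y = c(12, 32, 43, 16, 32, 65))

wide <- df %>%
  group_by(x) %>%
  mutate(n = row_number()) %>%   # index of each y within its x group
  ungroup() %>%
  # one column per index, named y_1, y_2, ... via names_prefix
  pivot_wider(names_from = n, values_from = y, names_prefix = "y_")
wide
```

Groups with fewer rows than the largest group would get NA in the trailing columns.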
