How to extract vectors of different lengths from a large data frame depending on multiple conditions in R

I have a data frame in R that consists of 3 columns. It looks a bit like this:

       x      id trialNumber
1 1.4788 subj_01    trial010
2 1.4794 subj_01    trial010
3 1.4823 subj_01    trial010
4 1.4845 subj_01    trial010
5 1.4889 subj_01    trial010
6 1.4901 subj_01    trial010
...
20121 -1.3597 subj_03    trial042
20122 -1.3601 subj_03    trial042
20123 -1.3667 subj_03    trial042
20124 -1.3713 subj_03    trial042
20125 -1.3800 subj_03    trial042
20126 -1.3857 subj_03    trial042

I want to create a new data frame that consists of multiple columns for x, where the columns are defined by id and trialNumber. The number of rows for each combination of id and trialNumber varies. The number of rows in the new data frame should equal the largest number of rows among all the id and trialNumber combinations, with shorter columns padded with NA. The result should look something like this:

x1      x2   ... xi
1.4788  1.5678  ...
1.4794  1.5789  ...
1.4823  1.5984  ...
1.4845  ...     ...
1.4889  NA      ...
1.4901  NA      -1.3713
...     ...     -1.3800
NA      ...     -1.3857

x1 to xi in the new data frame should correspond to each unique combination of id and trialNumber in the original data frame, e.g. x1 would correspond to all x where id == 'subj_01' and trialNumber == 'trial010'.

There are a lot of combinations of id and trialNumber, so I don't want to manually define the conditions by which to subset the original data frame.

You could try (a suggestion after reading the above comments):

tapply(df$x, paste0(df$id,df$trialNumber), function(x) data.frame(mean = mean(x), lower_limit = mean(x) - sd(x), upper_limit = mean(x) + sd(x)))
$subj_01trial010
      mean lower_limit upper_limit
1 1.484871    1.479965    1.489778

$subj_03trial042
       mean lower_limit upper_limit
1 -1.370583   -1.381177    -1.35999

Or, using aggregate, you get a nicer output format:

aggregate(x ~ id + trialNumber, data = df, FUN = function(x) c(mean = mean(x), lower_limit = mean(x) - sd(x), upper_limit = mean(x) + sd(x)))
       id trialNumber    x.mean x.lower_limit x.upper_limit
1 subj_01    trial010  1.484871      1.479965      1.489778
2 subj_03    trial042 -1.370583     -1.381177     -1.359990
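One caveat with the aggregate() output above: although the printed header shows x.mean, x.lower_limit and x.upper_limit, the result holds a single matrix column named x rather than three separate columns. The following is a minimal sketch of flattening it, assuming a small made-up data frame df with the same column names as in the question; the helper name summarise_x is hypothetical.

```r
# Toy data (made up for illustration), same column names as in the question
df <- data.frame(
  x           = c(1.0, 2.0, 3.0, 10.0, 20.0),
  id          = c("subj_01", "subj_01", "subj_01", "subj_03", "subj_03"),
  trialNumber = c("trial010", "trial010", "trial010", "trial042", "trial042")
)

# Mean and mean +/- one standard deviation (hypothetical helper name)
summarise_x <- function(x) {
  c(mean = mean(x), lower_limit = mean(x) - sd(x), upper_limit = mean(x) + sd(x))
}

agg <- aggregate(x ~ id + trialNumber, data = df, FUN = summarise_x)

ncol(agg)         # 3: id, trialNumber and one matrix column x
is.matrix(agg$x)  # TRUE

# Flatten the matrix column into ordinary numeric columns
flat <- do.call(data.frame, agg)
names(flat)
```

After flattening, columns such as flat$x.mean can be accessed directly, which is usually more convenient for further processing.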

Here's an approach if you really want columns of x for each combination of trial and subject bound together:

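# dat2 is the simulated data generated by the code further down
# (columns: trial, subject, x)

# step 1: create one vector of x per trial/subject combination
step1 <- split(dat2$x, list(dat2$trial, dat2$subject))

# find the longest vector (needed for padding)
max_length <- max(sapply(step1, length))

# step 2: make all vectors the same length by padding with NA
step2 <- lapply(step1, function(x) {
  length(x) <- max_length
  x
})

# step 3: combine the padded vectors column-wise into a matrix
res <- do.call(cbind, step2)
res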
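The same split-and-pad steps can be applied to the column names used in the question. The following is a sketch under stated assumptions: the toy data frame df is made up, the names x1, x2, ... and the lookup table key are illustrative choices, and drop = TRUE is added so that id/trialNumber combinations that never occur do not produce empty columns.

```r
# Toy data (made up for illustration), same column names as in the question
df <- data.frame(
  x           = c(1.4788, 1.4794, 1.4823, -1.3597, -1.3601),
  id          = c("subj_01", "subj_01", "subj_01", "subj_03", "subj_03"),
  trialNumber = c("trial010", "trial010", "trial010", "trial042", "trial042")
)

# One vector of x per id/trialNumber combination that actually occurs
groups <- split(df$x, list(df$id, df$trialNumber), drop = TRUE)

# Pad every vector with NA up to the length of the longest one
max_length <- max(lengths(groups))
padded <- lapply(groups, function(v) {
  length(v) <- max_length
  v
})

# Bind column-wise and rename the columns x1, x2, ...
wide <- as.data.frame(do.call(cbind, padded))
col_names <- paste0("x", seq_along(padded))

# Lookup table recording which combination each column came from
key <- data.frame(column = col_names, group = names(padded))
names(wide) <- col_names
wide
```

Keeping the lookup table separate means the short column names requested in the question can still be traced back to their id and trialNumber.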

Code used to generate the data:

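set.seed(100)

# all 30 combinations of 10 trials and 3 subjects
dat1 <- expand.grid(trial = sprintf("trial_%03d", 1:10),
                    subject = sprintf("subj_%02d", 1:3))

# sample 1000 rows with replacement so the group sizes differ
dat2 <- dat1[sample(nrow(dat1), 1000, replace = TRUE), ]
dat2$x <- rnorm(nrow(dat2))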
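To confirm that the simulated data has groups of unequal size, which is the situation the NA padding handles, one can count the rows per combination. This is a small check rather than part of the original answer; it repeats the generation code so that it runs on its own.

```r
set.seed(100)
dat1 <- expand.grid(trial = sprintf("trial_%03d", 1:10),
                    subject = sprintf("subj_%02d", 1:3))
dat2 <- dat1[sample(nrow(dat1), 1000, replace = TRUE), ]
dat2$x <- rnorm(nrow(dat2))

# Rows per trial/subject combination: a 10 x 3 table of counts
counts <- table(dat2$trial, dat2$subject)
dim(counts)
sum(counts)    # 1000, since every sampled row is counted exactly once
range(counts)  # smallest and largest group size
```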
