R - format a data.frame into another 'combined' data.frame based on common values within a column dependent across different columns
I'm starting with a data frame that consists of three columns. Column 1 contains ids that indicate 3 different time periods in which the weight (column 3) of some persons (column 2) has been measured in kg.
All persons have been measured irregularly, which means that some persons are measured multiple times or just once within a time period, but not across all time periods.
id person_name person_weight
1 Carol 51
1 Mike 76
1 Mike 81
1 Dave 66
1 Carol 59
2 James 78
2 Simone 55
2 Simone 49
2 David 85
3 Mike 93
3 Dave 110
3 Dave 98
Actually, the whole thing here is just a simplified example, so don't worry if this kind of data collection makes no sense.
Now, I want to calculate the mean weight for each person within a time period and then end up with a combined data frame that looks like the following one:
group_id Carol Mike Dave James Simone David
1 55 78.5 66 NA NA NA
2 NA NA NA 78 52 85
3 NA 93 104 NA NA NA
I tried some basic R functions (table, apply, etc.) but couldn't deal with the dependence across the columns.
Thanks in advance for any help that brings me closer to the second, 'combined' data frame.
Seems like a simple dcast:
library(reshape2)
# dat is the question's data frame (columns: id, person_name, person_weight)
dcast(dat, id ~ person_name,
      fun.aggregate = mean,
      value.var = "person_weight", fill = NA_real_)
id Carol Dave David James Mike Simone
1 1 55 66 NA NA 78.5 NA
2 2 NA NA 85 78 NA 52
3 3 NA 104 NA NA 93.0 NA
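For comparison, the same table can probably be built in base R without reshape2. The sketch below is only an illustration: it assumes the data sits in a data frame named dat (the name used in the dcast call; the question does not show how it is created), so dat is rebuilt here from the values in the question. The names m and res are made up for this example.

```r
# Rebuild the question's data (assumption: the data frame is called dat)
dat <- data.frame(
  id = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  person_name = c("Carol", "Mike", "Mike", "Dave", "Carol",
                  "James", "Simone", "Simone", "David",
                  "Mike", "Dave", "Dave"),
  person_weight = c(51, 76, 81, 66, 59, 78, 55, 49, 85, 93, 110, 98)
)

# tapply() computes the mean for every id x person_name combination;
# combinations that never occur in the data come back as NA
m <- with(dat, tapply(person_weight, list(id, person_name), mean))

# Turn the resulting matrix into a data frame with id as a regular column
res <- data.frame(id = as.integer(rownames(m)), m,
                  row.names = NULL, check.names = FALSE)
print(res)
```

The columns come out in alphabetical order of person_name, as in the dcast result above.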