R - Populating a dataframe based on another one with conditions
I'm new to R and I have a data set of 22x252; the 252 rows have many repeated values in column 1 (ID). I made another data set that has one row per unique ID (with those IDs already populated), and I want to populate the rest of its columns from the original data set, basically by summing all the values that share the same value in column 1.
Is there a basic function that enables me to do this?
Thanks & Regards
We can use aggregate in base R. Assuming the first column is named 'ID' and all other columns are numeric, we group by 'ID' and get the sum of the rest of the columns:
aggregate(.~ ID, df1, sum, na.rm = TRUE)
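To make the aggregate call concrete, here is a minimal sketch on a small made-up data frame (the column names x and y and all values are hypothetical, not from the question). It also shows one thing to watch for: with the formula interface, aggregate drops any row containing an NA before summing (the default na.action is na.omit), so na.rm = TRUE alone may not give the totals you expect. Passing na.action = na.pass keeps those rows.

```r
# Hypothetical toy data: 'ID' has repeated values, 'x' and 'y' are numeric.
df1 <- data.frame(
  ID = c("a", "a", "b", "b", "c"),
  x  = c(1, 2, 3, NA, 5),
  y  = c(10, 20, 30, 40, 50)
)

# Default formula interface: the row with the NA in 'x' is removed entirely,
# so its 'y' value (40) is not counted in the sum for ID "b".
res_default <- aggregate(. ~ ID, data = df1, FUN = sum, na.rm = TRUE)

# na.action = na.pass keeps rows with missing values; na.rm = TRUE then
# ignores only the individual NA cells inside sum().
res_pass <- aggregate(. ~ ID, data = df1, FUN = sum, na.rm = TRUE,
                      na.action = na.pass)

print(res_default)  # y for ID "b" is 30
print(res_pass)     # y for ID "b" is 70
```

Either way, the result already has one row per unique ID, so a separately pre-built table of unique IDs should not be needed.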
Or with dplyr:
library(dplyr)
df1 %>%
  group_by(ID) %>%
  summarise_at(vars(-group_cols()), sum, na.rm = TRUE)
Or, in newer versions of dplyr (1.0.0 and later), with across:
df1 %>%
  group_by(ID) %>%
  summarise(across(-group_cols(), sum, na.rm = TRUE))
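As a small runnable sketch of the across approach, the example below uses a made-up data frame (the columns x and y and their values are hypothetical). It writes the summary function as a lambda rather than passing na.rm = TRUE through across, since passing extra arguments that way is deprecated as of dplyr 1.1.0. It also relies on across skipping the grouping column automatically, so everything() selects only the non-ID columns here.

```r
library(dplyr)

# Hypothetical toy data with repeated IDs and one missing value.
df1 <- data.frame(
  ID = c("a", "a", "b", "b", "c"),
  x  = c(1, 2, 3, NA, 5),
  y  = c(10, 20, 30, 40, 50)
)

# Sum every non-grouping column within each ID, ignoring NA cells.
res <- df1 %>%
  group_by(ID) %>%
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE)))

print(res)
```

Unlike the default formula call to aggregate, this keeps the row with the missing x value, so only the NA cell itself is ignored.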