[英]R-new variables according to rank within groups
I have such a data frame(df) which is just a sample: 我有这样一个数据框(df),它只是一个示例:
group value condition
1 12 1
1 14 1
1 18 1
1 10 0
1 7 1
2 12 1
2 9 0
2 12 1
2 16 1
2 15 0
Namely; 即;
df<-data.frame(group=c(1,1,1,1,1,2,2,2,2,2), value=c(12,14,18,10,7,12,9,12,16,15), condition=c(1,1,1,0,1,1,0,1,1,0))
I want to create 3 new colums named "rank1", "rank2" and "rank3" where 我想创建3个名为“rank1”,“rank2”和“rank3”的新列
Namely, desired output is: 即,期望的输出是:
group rank1 rank2 ran3
1 7 12 14
2 12 12 16
How can I do that with R? 我怎么能用R做到这一点? I will be very glad for any help.
我会很高兴得到任何帮助。 Thanks a lot.
非常感谢。
With data.table : 使用data.table :
library(data.table)
setDT(df)[condition == 1,
setNames(as.list(sort(value)[1:3]), paste0("rank", 1:3)),
by = group]
# group rank1 rank2 rank3
# 1: 1 7 12 14
# 2: 2 12 12 16
Here is one way using dplyr/tidyr
这是使用
dplyr/tidyr
一种方法
library(dplyr)
library(tidyr)
df %>%
group_by(group) %>%
filter(condition!=0)
arrange(value) %>%
slice(1:3) %>%
mutate(n=paste0('rank', row_number())) %>%
select(-condition)
spread(n, value)
# group rank1 rank2 rank3
#1 1 7 12 14
#2 2 12 12 16
Or using data.table
或者使用
data.table
library(data.table)
dcast.data.table(setkey(setDT(df), value)[condition!=0,
list(rank=paste0('rank', 1:3), value[1:3]), group],
group~rank, value.var='V2')
# group rank1 rank2 rank3
#1: 1 7 12 14
#2: 2 12 12 16
Or using base R
或使用
base R
df1 <- subset(df[order(df$value),], condition!=0 , select=1:2)
df2 <- subset(transform(df1, .id=ave(group, group, FUN=seq_along)), .id<4)
reshape(df2, idvar='group', timevar='.id', direction='wide')
# group value.1 value.2 value.3
#5 1 7 12 14
#6 2 12 12 16
Yet another dplyr answer... 另一个dplyr答案......
myData <- read.csv(text=" group,value
1,12
1,14
1,18
1,10
1,7
2,12
2,9
2,12
2,16 ")
library(dplyr)
myData %>% filter(condition==1) %>% group_by(group) %>% summarise(rank1=nth(sort(value),1),
rank2=nth(sort(value),2),
rank3=nth(sort(value),3))
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.