Create a new column based on a condition
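I have a data frame in the following format. When a user buys a new item, they get a unique id value; if the same user buys another item, the child column holds the previous id.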
df <- data.frame(id= c('s123','s1004','s1009','s1010'),child = c("",'s123','s1004',""))
> df
     id child
1  s123
2 s1004  s123
3 s1009 s1004
4 s1010
Now I want to create a new column, parent, that holds the initial id value of each chain.
expect_df <- data.frame(id= c('s123','s1004','s1009','s1010'),child = c("",'s123','s1004',""),parent = c('s123','s123','s123','s1010'))
> expect_df
     id child parent
1  s123         s123
2 s1004  s123   s123
3 s1009 s1004   s123
4 s1010        s1010
Data (make sure your inputs are characters and not factors, and that your "" values are NA):
df <- data.frame(id= c('s123','s1004','s1009','s1010'),child = c(NA,'s123','s1004',NA),stringsAsFactors = F)
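If you are starting from the data frame in the question, where a missing previous id is an empty string (and where columns may be factors on R versions before 4.0), a conversion along these lines should bring it into the shape shown above. This is a minimal sketch; the name df_raw is only illustrative and does not appear in the original post.

```r
# Data as given in the question: "" means "no previous id"
df_raw <- data.frame(id = c('s123', 's1004', 's1009', 's1010'),
                     child = c("", 's123', 's1004', ""))

# Coerce every column to character (harmless if they already are)
df_raw[] <- lapply(df_raw, as.character)

# Turn empty strings into NA; %in% never yields NA, so the index is safe
df_raw$child[df_raw$child %in% ""] <- NA

str(df_raw)
```

After this, is.na(df_raw$child) can be used to find the rows that start a chain.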
Code:
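df$parent <- NA
repeat {
  # start a new chain at the first id that has no parent yet
  sid <- df$id[which(is.na(df$parent))[1]]
  # flag each row that mentions an id already in the chain, and add its values to the chain
  in_chain <- apply(df, 1, function(x) {
    x <- na.omit(x)
    if (any(x %in% sid)) { sid <<- c(sid, x); TRUE } else FALSE
  })
  df$parent[in_chain] <- sid[1]
  if (all(!is.na(df$parent))) break  # every row has a parent
}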
Result:
# id child parent
# 1 s123 <NA> s123
# 2 s1004 s123 s123
# 3 s1009 s1004 s123
# 4 s1010 <NA> s1010
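Another approach is a recursive lookup that follows child back to the first id in the chain:
# df1 is not defined in the original post; judging by the output, it is df plus one extra row
df1 <- rbind(df[c("id", "child")], data.frame(id = 's1103', child = 's1009', stringsAsFactors = FALSE))
m <- function(x, df) {
  n <- with(df, child[x == id])   # previous id recorded for x
  if (is.na(n)) x else m(n, df)   # no previous id means x is the initial id
}
transform(df1, parent = sapply(id, m, df1), row.names = NULL)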
     id child parent
1  s123  <NA>   s123
2 s1004  s123   s123
3 s1009 s1004   s123
4 s1010  <NA>  s1010
5 s1103 s1009   s123
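For larger data, the same chain-following idea can also be written without recursion or apply. The sketch below is not taken from either answer: find_root is a hypothetical helper name, and it assumes that id values are unique and that a missing previous id is NA. On each pass, every row moves one step up its chain using match().

```r
df <- data.frame(id = c('s123', 's1004', 's1009', 's1010'),
                 child = c(NA, 's123', 's1004', NA),
                 stringsAsFactors = FALSE)

# Hypothetical helper: returns, for each row, the first id of its chain
find_root <- function(id, child) {
  parent <- id
  prev <- child                          # previous id per row, NA if none
  # One pass per row is always enough unless the data contains a cycle
  for (i in seq_along(id)) {
    has_prev <- !is.na(prev)
    if (!any(has_prev)) break
    parent[has_prev] <- prev[has_prev]   # step one link up the chain
    prev <- child[match(parent, id)]     # NA when the root (or an unknown id) is reached
  }
  if (any(!is.na(prev))) stop("cycle detected in id/child links")
  parent
}

df$parent <- find_root(df$id, df$child)
df$parent
# "s123" "s123" "s123" "s1010"
```

If a child value refers to an id that is not present in the id column, this sketch stops at that value and reports it as the parent, which may or may not be what you want.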