R - Assign values to multiple cells based on a combination of values

I have the following data.frame, where the multiple X columns (1, 2, 3, ... N) are blank:

df1 <- data.frame( name = c("A","B","C"), 
                   X1 = c("","", ""), 
                   Y1 = c("aa","bb","cc"), 
                   Z1 = c("AA","BB","CC"),
                   X2 = c("","", ""), 
                   Y2 = c("dd","",""),
                   Z2 = c("AA","",""),
                   X3 = c("","", ""), 
                   Y3 = c("","","ee"), 
                   Z3 = c("","","CC"))

Another data.frame contains the values that should be assigned to the X columns according to the combination of values observed in the Y and Z columns:

df2 <- data.frame( Y = c("aa","bb","cc","dd","ee"), 
                   Z = c("AA","BB","CC","AA","CC"),
                   X = c (1,2,3,4,5))

How can I assign the values of X in df1 based on the information in df2, so that I get df3?

df3 <- data.frame( name = c("A","B","C"), 
                   X1 = c("1","2", "3"), 
                   Y1 = c("aa","bb","cc"), 
                   Z1 = c("AA","BB","CC"),
                   X2 = c("4","", ""), 
                   Y2 = c("dd","",""),
                   Z2 = c("AA","",""),
                   X3 = c("","", "5"), 
                   Y3 = c("","","ee"), 
                   Z3 = c("","","CC"))

Please note that in my real database each name may, but does not necessarily, have values in several sets of columns (for example, X1,Y1,Z1 ... X10,Y10,Z10).

This strategy reshapes your data from wide format to long format, does the merge, and then reshapes everything back to wide.

# go from wide to long
x1 <- reshape(df1, 
    varying=Map(function(x) paste0(x, 1:3), c("X","Y","Z")),
    v.names=c("X","Y","Z"),
    idvar="name",    
    timevar="time",
    direction="long")

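# look up X for each (Y, Z) pair; all.x=TRUE keeps rows with no match in df2
x2 <- merge(subset(x1, select=-X), df2, by=c("Y","Z"), all.x=TRUE)
# replace NA values with blanks (this coerces X to character)
x2[is.na(x2$X),"X"] <- ""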

# go back to wide
x3 <- reshape(x2,idvar="name",direction="wide", sep="")

x3 is then:

  name Y1 Z1 X1 Y2 Z2 X2 Y3 Z3 X3
1    A aa AA  1 dd AA  4         
2    B bb BB  2                  
3    C cc CC  3          ee CC  5

The columns come out in a slightly different order (Y, Z, X within each group instead of X, Y, Z), but you can easily fix that after the fact if necessary.
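If the original X, Y, Z order matters, one way to restore it is to build the wanted column names and index by them. This is only a sketch: the small x3 below is a stand-in copied from the output above so that the snippet runs on its own, and n and wanted are names introduced here for illustration.

```r
# Stand-in for the reshaped result, with columns in Y, Z, X order per group.
x3 <- data.frame(name = c("A", "B", "C"),
                 Y1 = c("aa", "bb", "cc"), Z1 = c("AA", "BB", "CC"), X1 = c("1", "2", "3"),
                 Y2 = c("dd", "", ""),     Z2 = c("AA", "", ""),     X2 = c("4", "", ""),
                 Y3 = c("", "", "ee"),     Z3 = c("", "", "CC"),     X3 = c("", "", "5"),
                 stringsAsFactors = FALSE)

n <- 3  # number of X/Y/Z groups (assumed known here)

# outer() pastes every prefix onto every suffix; as.vector() reads the
# resulting matrix column by column, giving X1, Y1, Z1, X2, Y2, Z2, ...
wanted <- c("name", as.vector(outer(c("X", "Y", "Z"), seq_len(n), paste0)))

# Reorder the columns by name.
x3 <- x3[, wanted]
names(x3)
```

Indexing by name rather than by position means the reorder does not depend on the exact order the reshape happened to produce.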

The one value I hard-coded is 1:3, the vector of column suffixes. If you have more repetitions of the columns (for example up to X10, Y10, Z10), adjust that vector to match.
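To avoid editing the suffix vector by hand, it could be derived from the column names instead. The sketch below assumes the suffixes run 1, 2, ..., n without gaps and that every X column has matching Y and Z columns; n and varying are names introduced here for illustration.

```r
# Example input from the question.
df1 <- data.frame(name = c("A", "B", "C"),
                  X1 = c("", "", ""), Y1 = c("aa", "bb", "cc"), Z1 = c("AA", "BB", "CC"),
                  X2 = c("", "", ""), Y2 = c("dd", "", ""),     Z2 = c("AA", "", ""),
                  X3 = c("", "", ""), Y3 = c("", "", "ee"),     Z3 = c("", "", "CC"),
                  stringsAsFactors = FALSE)

# Count the groups by counting columns named X followed by digits.
n <- length(grep("^X[0-9]+$", names(df1)))

# Same construction as in the answer, with seq_len(n) in place of 1:3.
varying <- Map(function(x) paste0(x, seq_len(n)), c("X", "Y", "Z"))

x1 <- reshape(df1,
              varying   = varying,
              v.names   = c("X", "Y", "Z"),
              idvar     = "name",
              timevar   = "time",
              direction = "long")
```

The rest of the answer (the merge and the reshape back to wide) should then work unchanged, since it never refers to the number of groups.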
