Replace values in one list of columns with values from another list of columns based on a condition, in R or Python

Question

(For Pythonistas: the code below is in R's format, before I get some #hatehard.)

This one has been frustrating me for way too long.

I have 2 datasets:

df1 <- data.frame(ID = c("Person.A", "Person.B", "Person.C", "Person.D", "Person.E", "Person.F"),
                  Aa = c(0,1,2,NA,1,1),
                  Ab = c(0,NA,2,1,1,1),
                  Ac = c(NA,NA,2,2,1,1),
                  no.match = c(0,1,2,2,1,2))

df2 <- data.frame(ID = c("Person.A", "Person.B", "Person.C", "Person.D", "Person.E"),
                  Ba = c(0,NA,2,1,1),
                  Bb = c(NA,1,2,2,1),
                  Bc = c(0,1,2,2,1))

I then merge these 2 datasets using merge(df1, df2, all.x = TRUE, by = "ID") to get:

         ID Aa Ab Ac no.match Ba Bb Bc
1 Person.A  0  0 NA        0  0 NA  0
2 Person.B  1 NA NA        1 NA  1  1
3 Person.C  2  2  2        2  2  2  2
4 Person.D NA  1  2        2  1  2  2
5 Person.E  1  1  1        1  1  1  1
6 Person.F  1  1  1        2 NA NA NA

The actual datasets are much more complicated, with lots of columns that have no match in the other dataset, so I don't think I can do anything that depends on the arrangement of the columns.

Columns Aa and Ba contain the same information, and so do columns Ab and Bb, and so on, but column no.match has no matching column.

I want to "map" the value from the same row of Ba to Aa if Aa is NA, and do the same for Ab and Bb, Ac and Bc, etc.

The resulting DF in this case would look like:

         ID Aa Ab Ac no.match Ba Bb Bc
1 Person.A  0  0  0        0  0 NA  0
2 Person.B  1  1  1        1 NA  1  1
3 Person.C  2  2  2        2  2  2  2
4 Person.D  1  1  2        2  1  2  2
5 Person.E  1  1  1        1  1  1  1
6 Person.F  1  1  1        2 NA NA NA

Here element [4,2] was replaced by element [4,6]. The rows and the columns need to match up.

I've tried an embarrassingly large number of things: apply, ifelse, iterating through lists of columns l1 = c('Aa','Ab','Ac'), l2 = c('Ba', 'Bb', 'Bc').

I can do the one-off: mdf$Aa[which(is.na(mdf$Aa))] <- mdf[which(is.na(mdf$Aa)), c("Ba")]

But how can I do this iteratively?

Thank you! (Sorry for the long-windedness.)

Answer 1

Here's one using data.table v1.9.5 (installation instructions here):

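require(data.table) # v1.9.5+
cols1 = names(df1)[2:4]   # columns to fill: Aa, Ab, Ac
cols2 = names(df2)[2:4]   # matching source columns: Ba, Bb, Bc

# Fill the NAs in x with the values at the same positions in y
foo <- function(x, y) {
    nas = is.na(x)
    x[nas] = y[nas]
    x
}
# Update join on ID: overwrite cols1 with the filled values and add cols2
setDT(df1)[df2, c(cols1, cols2) := c(Map(foo, mget(cols1), 
                   mget(cols2)), mget(cols2)), on = "ID"]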

> df1
#          ID Aa Ab Ac no.match Ba Bb Bc
# 1: Person.A  0  0  0        0  0 NA  0
# 2: Person.B  1  1  1        1 NA  1  1
# 3: Person.C  2  2  2        2  2  2  2
# 4: Person.D  1  1  2        2  1  2  2
# 5: Person.E  1  1  1        1  1  1  1
# 6: Person.F  1  1  1        2 NA NA NA

setDT() converts df1 to a data.table by reference.
setDT(df1)[df2, on = "ID"] performs a join. For each row of df2, we find the matching rows in df1 and extract the columns corresponding to those matching rows.
On the matching rows, we update the columns in cols1 and add the columns in cols2 as new columns, both by reference, using the := operator. To update, we extract the columns named in cols1 and cols2 and fill the NAs pairwise with the function foo(). To add, we simply pull the cols2 columns using mget(). We concatenate the two lists using c().
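For comparison, the same pairwise NA fill can be sketched in base R on the merged data frame from the question. This is only an illustration, not part of the data.table answer: the helper name fill_na is hypothetical (it repeats the logic of foo() above), and the column pairs are assumed to be supplied by name, so the order of the columns in the merged frame does not matter.

```r
# Data from the question.
df1 <- data.frame(ID = c("Person.A", "Person.B", "Person.C",
                         "Person.D", "Person.E", "Person.F"),
                  Aa = c(0, 1, 2, NA, 1, 1),
                  Ab = c(0, NA, 2, 1, 1, 1),
                  Ac = c(NA, NA, 2, 2, 1, 1),
                  no.match = c(0, 1, 2, 2, 1, 2))
df2 <- data.frame(ID = c("Person.A", "Person.B", "Person.C",
                         "Person.D", "Person.E"),
                  Ba = c(0, NA, 2, 1, 1),
                  Bb = c(NA, 1, 2, 2, 1),
                  Bc = c(0, 1, 2, 2, 1))

# Left join on ID, as in the question.
mdf <- merge(df1, df2, all.x = TRUE, by = "ID")

# Column pairs given by name (assumption: the pairing is known up front).
cols1 <- c("Aa", "Ab", "Ac")
cols2 <- c("Ba", "Bb", "Bc")

# Hypothetical helper: fill the NAs in x with the values at the
# same positions in y; non-NA values of x are left untouched.
fill_na <- function(x, y) {
  nas <- is.na(x)
  x[nas] <- y[nas]
  x
}

# Apply the helper to each (cols1[i], cols2[i]) pair and write the
# filled columns back into the merged frame.
mdf[cols1] <- Map(fill_na, mdf[cols1], mdf[cols2])
print(mdf)
```

Unlike the := update, this modifies a copy (mdf) and leaves df1 unchanged, which may be preferable when the original data is still needed.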

If you're interested, have a look at the HTML vignettes to learn more.


Answer 1: accepted, 2015-07-29 23:13:22
