R code for replacing missing values in one column with values from another column

Question

I have a data set named one with four columns: D1, D2, D3 and D4. D1 is the id. D2 has seven levels (a, b, c, d, e, f, g). D3 has missing data, which I want to fill by matching conditions from columns D2 and D4. For the rows where D2 is one of four levels (a, c, d, e), I want to replace the missing values of D3 with the corresponding values from D4.

D1  D2  D3  D4
1   a   .   5
2   c   12  6
3   e   .   3
4   b   .   7
5   f   .   8
6   e   .   9
7   e   11  8
8   c   .   3
9   c   52  5
10  a   .   6
11  b   4   7
12  f   .   2
13  f   .   10
14  d   .   12
15  d   .   13
16  e   .   24
17  a   1   54
18  b   2   19
19  c   5   21

I have the following solution, but it is not working. Any suggestions or help? Thanks.

index <- with(one, D2 %in% c('a','c','d','e'))
one$D4[index] <- one$D3[index]
one

Answer 1

Assuming that you actually do have "." in the data, and that the data are read in as characters instead of numbers/NAs, the following solution should be easier to understand than the with() call:

d <- read.table(header = TRUE, stringsAsFactors = FALSE, text =
"D1  D2  D3  D4
1   a   .   5
2   c   12  6
3   e   .   3
4   b   .   7
5   f   .   8
6   e   .   9
7   e   11  8
8   c   .   3
9   c   52  5
10  a   .   6
11  b   4   7
12  f   .   2
13  f   .   10
14  d   .   12
15  d   .   13
16  e   .   24
17  a   1   54
18  b   2   19
19  c   5   21"
)

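# Rows where D2 is one of the four target levels and D3 holds the "." placeholder
indices <- d$D2 %in% c("a", "c", "d", "e") & d$D3 == "."
# Copy D4 into D3 for those rows only (D3 is still a character column afterwards)
d$D3[indices] <- d$D4[indices]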

And if you actually have NAs instead of the "." characters, you can simply use is.na(d$D3) in place of d$D3 == "." when building the index vector.
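As a minimal sketch of the is.na() variant, the snippet below uses four rows taken from the question's data and assumes the "." entries are turned into NA at read time via na.strings (that choice is an assumption for the illustration, not something the question states about its own import step):

```r
# Four rows from the question's data; "." is read in as NA (assumed import setting).
d <- read.table(header = TRUE, stringsAsFactors = FALSE, na.strings = ".", text =
"D1  D2  D3  D4
1   a   .   5
2   c   12  6
4   b   .   7
14  d   .   12")

# Rows whose D2 level is one of the four targets and whose D3 is missing.
indices <- d$D2 %in% c("a", "c", "d", "e") & is.na(d$D3)

# Fill only those rows from D4; row 4 (level b) keeps its NA.
d$D3[indices] <- d$D4[indices]
d$D3
```

Because the column is numeric from the start here, no conversion from character is needed after the fill.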

Answer 2

Another way is to use na.strings when reading the table and then use ifelse. Slightly verbose but easy to understand!

d <- read.table(header = TRUE, stringsAsFactors = FALSE, na.strings = ".", text =
"D1  D2  D3  D4
1   a   .   5
2   c   12  6
3   e   .   3
4   b   .   7
5   f   .   8
6   e   .   9
7   e   11  8
8   c   .   3
9   c   52  5
10  a   .   6
11  b   4   7
12  f   .   2
13  f   .   10
14  d   .   12
15  d   .   13
16  e   .   24
17  a   1   54
18  b   2   19
19  c   5   21"
)


d$D3 <- ifelse(is.na(d$D3) & (d$D2 == 'a' | d$D2 == 'c' | d$D2 == 'd' | d$D2 == 'e'), d$D4, d$D3)
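The chain of == comparisons can probably be shortened with %in%, as the question itself does when building its index. A small sketch on four rows from the question's data (the variable name targets is made up for this illustration):

```r
# Four rows from the question's data; "." is read in as NA.
d <- read.table(header = TRUE, stringsAsFactors = FALSE, na.strings = ".", text =
"D1  D2  D3  D4
3   e   .   3
7   e   11  8
12  f   .   2
15  d   .   13")

# Levels of D2 for which a missing D3 should be filled from D4 (hypothetical name).
targets <- c("a", "c", "d", "e")

# Same logic as the ifelse call above, with %in% replacing the == chain.
d$D3 <- ifelse(is.na(d$D3) & d$D2 %in% targets, d$D4, d$D3)
d$D3
```

Row 12 (level f) stays NA because f is not among the target levels, and row 7 keeps its existing value because D3 was not missing.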
