Replace NA values in dataframe with variables in the next column (R)

I am new to R and still learning, and I could not find the answer I am looking for in any other thread.

I have a dataset with (for simplicity) 5 columns. Columns 1, 2, and 4 always have values, but in some rows column 3 doesn't. Below is an example:

Current:

A  B  C  D  E
1  1  2  3 
1  2  NA 4  5
1  2  3  4 
1  3  NA 9  7
1  2  NA 5  6

I want the NAs in column C to be replaced by the value in column D, with the value in column E then shifted into D, and so on.

Desired output:

A  B  C  D  E
1  1  2  3  NA
1  2  4  5  NA
1  2  3  4  NA
1  3  9  7  NA
1  2  5  6  NA

I tried what I found in different Stack Overflow threads, but none of it achieved what I wanted.

na.omit gets rid of the whole row. Any help is greatly appreciated.

Data

data <- structure(list(A = c(1L, 1L, 1L, 1L, 1L), B = c(1L, 2L, 2L, 3L, 
2L), C = c(2L, NA, 3L, NA, NA), D = c(3L, 4L, 4L, 9L, 5L), E = c(NA, 
5L, NA, 7L, 6L)), class = "data.frame", row.names = c(NA, -5L
))

Code

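library(dplyr)

data %>% 
  mutate(
    aux = C,                         # keep the original C to test against
    C = if_else(is.na(aux), D, C),   # where C was NA, pull D into C
    D = if_else(is.na(aux), E, D),   # where C was NA, pull E into D
    E = NA_integer_                  # E ends up empty in every row of this example
  ) %>% 
  select(-aux)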

Output

  A B C D  E
1 1 1 2 3 NA
2 1 2 4 5 NA
3 1 2 3 4 NA
4 1 3 9 7 NA
5 1 2 5 6 NA

Replacement operation all in one go:

dat[is.na(dat$C), c("C","D","E")] <- c(dat[is.na(dat$C), c("D","E")], NA)
dat
#  A B C D  E
#1 1 1 2 3 NA
#2 1 2 4 5 NA
#3 1 2 3 4 NA
#4 1 3 9 7 NA
#5 1 2 5 6 NA

Where dat was:

dat <- read.table(text="A  B  C  D  E
1  1  2  3 
1  2  NA 4  5
1  2  3  4 
1  3  NA 9  7
1  2  NA 5  6", fill=TRUE, header=TRUE)
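The one-line replacement works because the right-hand side is a list: c() applied to the selected D and E columns plus NA gives three elements, which are assigned column by column to C, D, and E in the selected rows. As a rough sketch of the same idea spelled out step by step (the variable name rows is a hypothetical helper, and dat is rebuilt here with data.frame() only to keep the example self-contained):

```r
# Same values as the example table, built directly instead of via read.table
dat <- data.frame(
  A = c(1L, 1L, 1L, 1L, 1L),
  B = c(1L, 2L, 2L, 3L, 2L),
  C = c(2L, NA, 3L, NA, NA),
  D = c(3L, 4L, 4L, 9L, 5L),
  E = c(NA, 5L, NA, 7L, 6L)
)

rows <- is.na(dat$C)         # hypothetical helper: rows where C is missing

dat$C[rows] <- dat$D[rows]   # D moves into C
dat$D[rows] <- dat$E[rows]   # E moves into D
dat$E[rows] <- NA            # E is vacated, but only in the shifted rows
dat
```

The order of the three assignments matters: C has to be filled from D before D is overwritten with E.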

Using shift_row_values

library(hacksaw)
shift_row_values(df1)
  A B C D  E
1 1 1 2 3 NA
2 1 2 4 5 NA
3 1 2 3 4 NA
4 1 3 9 7 NA
5 1 2 5 6 NA

data

df1 <- structure(list(A = c(1L, 1L, 1L, 1L, 1L), B = c(1L, 2L, 2L, 3L, 
2L), C = c(2L, NA, 3L, NA, NA), D = c(3L, 4L, 4L, 9L, 5L), E = c(NA, 
5L, NA, 7L, 6L)), class = "data.frame", row.names = c(NA, -5L
))

A universal base R approach using order, which needs no prior knowledge of the NA positions.

setNames(data.frame(t(apply(data, 1, function(x) 
  x[order(is.na(x))]))), colnames(data))
  A B C D  E
1 1 1 2 3 NA
2 1 2 4 5 NA
3 1 2 3 4 NA
4 1 3 9 7 NA
5 1 2 5 6 NA
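To see why this works, it may help to look at a single row. is.na(x) is FALSE for values and TRUE for NA, and order() sorts FALSE before TRUE while leaving ties in their original order, so the non-NA values keep their relative order and slide left. A minimal sketch, assuming a vector x that is simply row 2 of the example written out by hand (idx and shifted are illustrative names):

```r
# Row 2 of the example data, written out by hand for illustration
x <- c(A = 1L, B = 2L, C = NA, D = 4L, E = 5L)

is.na(x)                    # FALSE for A, B, D, E; TRUE for C
idx <- order(is.na(x))      # positions of non-NA values first, NA positions last
shifted <- unname(x[idx])   # drop names, since values no longer match their old columns
shifted
```

apply(data, 1, ...) runs this reordering on every row, and setNames() restores the original column names afterwards.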

Using dplyr

library(dplyr)

t(data) %>% 
  data.frame() %>% 
  mutate(across(everything(), ~ .x[order(is.na(.x))])) %>% 
  t() %>% 
  as_tibble()
# A tibble: 5 × 5
      A     B     C     D     E
  <int> <int> <int> <int> <int>
1     1     1     2     3    NA
2     1     2     4     5    NA
3     1     2     3     4    NA
4     1     3     9     7    NA
5     1     2     5     6    NA

Data

data <- structure(list(A = c(1L, 1L, 1L, 1L, 1L), B = c(1L, 2L, 2L, 3L, 
2L), C = c(2L, NA, 3L, NA, NA), D = c(3L, 4L, 4L, 9L, 5L), E = c(NA, 
5L, NA, 7L, 6L)), class = "data.frame", row.names = c(NA, -5L
))
