Replace NAs in a column based on values of another column

I have a dataframe like below:

name<-c("Fred","George","","Fred","George")
wife1<-c("fd","gf",NA,NA,NA)
wife2<-c("hj","fd",NA,NA,NA)
wife3<-c("bn","jk",NA,NA,NA)
label<-c("Fred","George","","Fred","George")
fam<-data.frame(name,wife1,wife2,wife3,label)

As you can see, the first 2 rows are identical to the last 2 rows, except that the wife1, wife2 and wife3 values in the last 2 rows are NA. The middle row has blank values and NAs and should remain as it is. I want to fill the last 2 rows with the values from the first 2 rows. Note that the solution should work on datasets with a different number of rows.

I tried fam %>% group_by(name) %>% mutate_all(~.[!is.na(.)]) but I get:

`mutate_all()` ignored the following grouping variables:
Column `name`
Use `mutate_at(df, vars(-group_cols()), myoperation)` to silence the message.
Error: Column `wife1` must be length 1 (the group size), not 0

You can match the name column with itself to get the index of the first occurrence of each name, and then use the values from that row for the columns you want to modify.

cols <- 2:4 # or if your column names contain a pattern: grep(pattern, names(fam))
fam[cols] <- fam[match(fam$name, fam$name), cols]

fam
#     name wife1 wife2 wife3  label
# 1   Fred    fd    hj    bn   Fred
# 2 George    gf    fd    jk George
# 3         <NA>  <NA>  <NA>       
# 4   Fred    fd    hj    bn   Fred
# 5 George    gf    fd    jk George
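To see why the match() call above works, it may help to look at the index vector on its own. This is a small sketch using a standalone vector with the same values as the name column; idx is just an illustrative variable name.

```r
# Same values as the name column in the question
name <- c("Fred", "George", "", "Fred", "George")

# match(x, x) gives, for each element, the position of its first occurrence
idx <- match(name, name)
idx
# [1] 1 2 3 1 2
```

Rows 4 and 5 point back to rows 1 and 2, so indexing fam with idx copies the values from the first occurrence of each name. Note that this relies on the first occurrence being the row that holds the non-NA values; if the filled row came later, the NAs would be copied instead.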

Here is one way to make your approach work. As the message suggests, use mutate_at, and return any group that has no non-NA values unchanged.

library(dplyr)

fam %>%
  group_by(name) %>% 
  mutate_at(vars(starts_with("wife")), ~ if (any(!is.na(.))) .[!is.na(.)] else .)


#  name   wife1 wife2 wife3 label 
#  <chr>  <chr> <chr> <chr> <chr> 
#1 Fred   fd    hj    bn    Fred  
#2 George gf    fd    jk    George
#3 ""     NA    NA    NA    ""    
#4 Fred   fd    hj    bn    Fred  
#5 George gf    fd    jk    George
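In dplyr 1.0.0 and later the scoped verbs such as mutate_at are superseded by across(), so the same idea could be written as below. This is only a sketch under that assumption; filled is an illustrative name, and taking the first non-NA value with [1] is a small change of mine so that a group with more than one filled row does not raise a length error.

```r
library(dplyr)

fam <- data.frame(
  name  = c("Fred", "George", "", "Fred", "George"),
  wife1 = c("fd", "gf", NA, NA, NA),
  wife2 = c("hj", "fd", NA, NA, NA),
  wife3 = c("bn", "jk", NA, NA, NA),
  label = c("Fred", "George", "", "Fred", "George"),
  stringsAsFactors = FALSE
)

filled <- fam %>%
  group_by(name) %>%
  # Within each name group: use the first non-NA value if there is one,
  # otherwise leave the column untouched (e.g. the blank-name row)
  mutate(across(starts_with("wife"),
                ~ if (any(!is.na(.x))) .x[!is.na(.x)][1] else .x)) %>%
  ungroup()

filled
```

The blank-name row keeps its NAs because its group contains no non-NA value to copy.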

data

name<-c("Fred","George","","Fred","George")
wife1<-c("fd","gf",NA,NA,NA)
wife2<-c("hj","fd",NA,NA,NA)
wife3<-c("bn","jk",NA,NA,NA)
label<-c("Fred","George","","Fred","George")
fam<-data.frame(name,wife1,wife2,wife3,label, stringsAsFactors = FALSE)
