Replace all values with NA in specific columns using mutate and gsub

In my data frame I want to replace all values in certain columns with NA.

Test2
       ID    Sex Location Obs1 Obs4        Obs5
1  291978 FEMALE        2 16.5 4836 0.563636364
2  292429 FEMALE        2 20.2 5428 0.584158416
3  292466 FEMALE        2 19.2   48 0.005208333
4  293656 FEMALE        2 15.8 2904 0.417721519
5  291993 FEMALE        2 18.1 6194 0.900552486
6  293263 FEMALE        2 17.9  616 0.078212291
7  290200 FEMALE        2 16.7  792 0.107784431
8  292511 FEMALE        2 18.3 4992 0.568306011
9  291510 FEMALE        2 18.6  350 0.037634409
10 293711 FEMALE        2 18.2  264 0.032967033
11 295234 FEMALE        2 16.5  216 0.036363636
12 293839 FEMALE        2 15.0 4114 0.806666667
13 291057 FEMALE        2 16.7   56 0.005988024
14 295094 FEMALE        2 16.5 3154 0.503030303
15 295562 FEMALE        2 14.7  966 0.142857143
16 292381 FEMALE        2 17.4 1980 0.258620690
17 289765 FEMALE        2 17.8 3492 0.544943820
18 293871 FEMALE        2 18.2 3760 0.516483516
19 291076 FEMALE        2 16.8   88 0.011904762
20 293851 FEMALE        2 16.1 2242 0.366459627

First, I want to specify the columns whose values should be replaced with NA. This can be a single column or several, which is why I prefer to store the column names in a vector.

> Obs <- c('Obs1')

Then I tried to replace all values in column 'Obs1' with NA, using:

> deselect <- Test2 %>% mutate(across(paste(Obs), gsub(".*",NA,paste(Obs))))

However, it gives me this error:

Error: Problem with `mutate()` input `..1`.
x Problem with `across()` input `.fns`.
i Input `.fns` must be NULL, a function, a formula, or a list of functions/formulas.
i Input `..1` is `across(paste(Obs), gsub(".*", NA, paste(Obs)))`.
Run `rlang::last_error()` to see where the error occurred.

Does anyone know how to use gsub within across, within mutate? Or should I use a different method?

Many thanks!

Or use mutate_at:

> Obs = c("Obs1", "Obs4")
> df %>% mutate_at(Obs, function(x) x = NA)
       ID    Sex Location Obs1 Obs4        Obs5
1  291978 FEMALE        2   NA   NA 0.563636364
2  292429 FEMALE        2   NA   NA 0.584158416
3  292466 FEMALE        2   NA   NA 0.005208333
4  293656 FEMALE        2   NA   NA 0.417721519
5  291993 FEMALE        2   NA   NA 0.900552486
6  293263 FEMALE        2   NA   NA 0.078212291
7  290200 FEMALE        2   NA   NA 0.107784431
8  292511 FEMALE        2   NA   NA 0.568306011
9  291510 FEMALE        2   NA   NA 0.037634409
10 293711 FEMALE        2   NA   NA 0.032967033
11 295234 FEMALE        2   NA   NA 0.036363636
12 293839 FEMALE        2   NA   NA 0.806666667
13 291057 FEMALE        2   NA   NA 0.005988024
14 295094 FEMALE        2   NA   NA 0.503030303
15 295562 FEMALE        2   NA   NA 0.142857143
16 292381 FEMALE        2   NA   NA 0.258620690
17 289765 FEMALE        2   NA   NA 0.544943820
18 293871 FEMALE        2   NA   NA 0.516483516
19 291076 FEMALE        2   NA   NA 0.011904762
20 293851 FEMALE        2   NA   NA 0.366459627
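
One thing to be aware of: `function(x) x = NA` returns a logical `NA`, so the replaced columns become logical. If the original column types should be kept, one option is to assign `NA` into the existing vector instead. This is a minimal sketch, assuming dplyr is installed; `df_small` is a two-row stand-in for the question's data and `na_keep_type` is a hypothetical helper name.

```r
library(dplyr)

# Two-row stand-in for the question's data frame
df_small <- data.frame(
  ID   = c(291978L, 292429L),
  Obs1 = c(16.5, 20.2),
  Obs4 = c(4836L, 5428L)
)
Obs <- c("Obs1", "Obs4")

# Overwrite every element with NA while keeping the vector's type
na_keep_type <- function(x) {
  x[] <- NA
  x
}

res <- df_small %>% mutate_at(Obs, na_keep_type)
str(res)
```

Here `Obs1` stays double and `Obs4` stays integer, which can matter if the columns are filled in again later.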

Here is how you would do it with mutate and across.

library(dplyr)

cols_na <- c("wt", "hp")

# all_of() is the current replacement for the superseded one_of()
mtcars %>% 
  mutate(across(all_of(cols_na), ~ NA))
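
The error in the question comes from what was passed as `.fns`: `across()` expects a function or formula, whereas `gsub(".*", NA, paste(Obs))` is a call that is evaluated straight away and yields a plain value. The sketch below applies the formula approach to the question's column vector. It assumes dplyr 1.0.0 or later, and `test2` is a hypothetical three-row stand-in for `Test2`.

```r
library(dplyr)

# Three-row stand-in for the question's Test2 data frame
test2 <- data.frame(
  ID   = c(291978L, 292429L, 292466L),
  Obs1 = c(16.5, 20.2, 19.2),
  Obs4 = c(4836L, 5428L, 48L)
)

# Columns to blank out, kept in a character vector as in the question
obs <- c("Obs1")

# all_of() selects the columns named in the vector;
# the formula ~ NA is the function across() applies to each of them
deselect <- test2 %>% mutate(across(all_of(obs), ~ NA))
deselect
```

Adding more names to `obs` extends the replacement to those columns without changing the `mutate()` call.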

I would suggest a base R approach, where you define the columns to replace in Obs:

#Data
df <- structure(list(ID = c(291978L, 292429L, 292466L, 293656L, 291993L, 
293263L, 290200L, 292511L, 291510L, 293711L, 295234L, 293839L, 
291057L, 295094L, 295562L, 292381L, 289765L, 293871L, 291076L, 
293851L), Sex = c("FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", 
"FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", 
"FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", "FEMALE", 
"FEMALE"), Location = c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Obs1 = c(16.5, 20.2, 
19.2, 15.8, 18.1, 17.9, 16.7, 18.3, 18.6, 18.2, 16.5, 15, 16.7, 
16.5, 14.7, 17.4, 17.8, 18.2, 16.8, 16.1), Obs4 = c(4836L, 5428L, 
48L, 2904L, 6194L, 616L, 792L, 4992L, 350L, 264L, 216L, 4114L, 
56L, 3154L, 966L, 1980L, 3492L, 3760L, 88L, 2242L), Obs5 = c(0.563636364, 
0.584158416, 0.005208333, 0.417721519, 0.900552486, 0.078212291, 
0.107784431, 0.568306011, 0.037634409, 0.032967033, 0.036363636, 
0.806666667, 0.005988024, 0.503030303, 0.142857143, 0.25862069, 
0.54494382, 0.516483516, 0.011904762, 0.366459627)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20"))

Next, the code:

#Define cols
Obs <- c('Obs1')
#Assign
index <- which(names(df) %in% Obs) 
df[,index] <- gsub(".*",NA,df[,index])

Output:

       ID    Sex Location Obs1 Obs4        Obs5
1  291978 FEMALE        2 <NA> 4836 0.563636364
2  292429 FEMALE        2 <NA> 5428 0.584158416
3  292466 FEMALE        2 <NA>   48 0.005208333
4  293656 FEMALE        2 <NA> 2904 0.417721519
5  291993 FEMALE        2 <NA> 6194 0.900552486
6  293263 FEMALE        2 <NA>  616 0.078212291
7  290200 FEMALE        2 <NA>  792 0.107784431
8  292511 FEMALE        2 <NA> 4992 0.568306011
9  291510 FEMALE        2 <NA>  350 0.037634409
10 293711 FEMALE        2 <NA>  264 0.032967033
11 295234 FEMALE        2 <NA>  216 0.036363636
12 293839 FEMALE        2 <NA> 4114 0.806666667
13 291057 FEMALE        2 <NA>   56 0.005988024
14 295094 FEMALE        2 <NA> 3154 0.503030303
15 295562 FEMALE        2 <NA>  966 0.142857143
16 292381 FEMALE        2 <NA> 1980 0.258620690
17 289765 FEMALE        2 <NA> 3492 0.544943820
18 293871 FEMALE        2 <NA> 3760 0.516483516
19 291076 FEMALE        2 <NA>   88 0.011904762
20 293851 FEMALE        2 <NA> 2242 0.366459627
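
Because `gsub()` coerces its input to character, `Obs1` in the output above is a character column, which is why it prints as `<NA>`. If no pattern matching is actually needed, direct assignment by column name is a simpler base R alternative. This is a minimal sketch on a hypothetical three-row stand-in (`df_small`), not the full data.

```r
# Three-row stand-in for the question's data frame
df_small <- data.frame(
  ID   = c(291978L, 292429L, 292466L),
  Obs1 = c(16.5, 20.2, 19.2),
  Obs4 = c(4836L, 5428L, 48L)
)

# Columns to blank out; one or several names both work
Obs <- c("Obs1")

# Assign NA to every selected column in one step
df_small[Obs] <- NA
df_small
```

The same line works unchanged when `Obs` holds several column names.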
