R: Trouble removing "#NAME?" (from an Excel import) in a data frame

Question

I have a .csv import from Excel that contains leftover formula errors that I am trying to remove. A simplified version of the data is below.

library(tidyverse)
df <- data.frame(
  species = letters[1:5],
  param1 = c("Place", "creek", "river", "#VALUE!", "desert"),
  param2 = c(-23.8, 43.23, "#NAME?", 45, 0.23),
  param3 = c(2.4, 2, 5.7, 0.00003, -2.5),
  stringsAsFactors = FALSE
) # This is a simplified version of the excel .csv import

df[df == "#VALUE!"] <- ""     # Removes excel cells where the formula left "#VALUE!"
df[df == "#NAME\\?"] <- ""   # This does not work

ndf <- df  # This is an attempt to reassign the columns to numeric
ndf
class(ndf$param2)
class(ndf$param3)

The main problem is that the column param2, with this value left in it, is stored as character when it needs to be numeric; otherwise the functions I have to run on it do not work.

I've tried many different things, but nothing seems to recognise the cell. How do I remove "#NAME?" across the df, please?

Answer 1

You are doing an exact match (not a regex match), so you don't need to escape special characters (like ? and !). Try:

df[df == "#VALUE!"] <- ""  
df[df == "#NAME?"] <- NA
df <- type.convert(df, as.is = TRUE)
df
#  species param1 param2   param3
#1       a  Place -23.80  2.40000
#2       b  creek  43.23  2.00000
#3       c  river     NA  5.70000
#4       d         45.00  0.00003
#5       e desert   0.23 -2.50000

str(df)
#'data.frame':  5 obs. of  4 variables:
# $ species: chr  "a" "b" "c" "d" ...
# $ param1 : chr  "Place" "creek" "river" "" ...
# $ param2 : num  -23.8 43.23 NA 45 0.23
# $ param3 : num  2.4 2 5.7 0.00003 -2.5
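
To make the difference between exact and regex matching concrete, here is a minimal base R sketch. The vector x is a hypothetical stand-in for a single imported column and is not part of the original data. It assumes only that == compares strings literally, while grepl() interprets its pattern as a regular expression unless fixed = TRUE is set.

```r
# Hypothetical stand-in for a single imported column
x <- c("-23.8", "43.23", "#NAME?", "45", "0.23")

# "#NAME\\?" is the 7-character string #NAME\? (with a literal backslash),
# so an exact comparison finds nothing
escaped_hits <- x == "#NAME\\?"

# The unescaped literal matches the offending cell
literal_hits <- x == "#NAME?"

# As a regular expression, the escaped pattern does match
regex_hits <- grepl("#NAME\\?", x)

# fixed = TRUE makes grepl() treat the pattern literally, so no escaping is needed
fixed_hits <- grepl("#NAME?", x, fixed = TRUE)

which(literal_hits)  # 3
```

The escaping in the question would be appropriate for grepl() or sub(), but not for ==.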

Answer 2

Here's a dplyr solution that uses sub to replace the unwanted values in one go, in every column whose name contains a digit (note that the replacement is the string "NA", not a missing value):

df %>%
  mutate(across(matches("\\d"), ~sub("#.*", "NA", .)))
  species param1 param2 param3
1       a  Place  -23.8    2.4
2       b  creek  43.23      2
3       c  river     NA    5.7
4       d     NA     45  3e-05
5       e desert   0.23   -2.5

This solution is helpful if you do not know in which columns the unwanted values occur:

library(stringr)
df %>% 
  mutate(across(where(~any(str_detect(.,"#"))), ~sub("#.*", "NA", .)))

This third solution both replaces the unwanted values anywhere and converts the columns to their correct type (thanks to @Ronak for inspiration):

df %>% 
  mutate(across(where(~any(str_detect(.,"#"))), ~sub("#.*", "NA", .)),
         across(everything(), ~type.convert(., as.is = TRUE)))
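
As a rough illustration of why the type.convert step matters, here is a small base R sketch on a single hypothetical vector x (not part of the original data). It assumes the default na.strings = "NA" of type.convert(): sub() only inserts the text "NA", and the conversion step is what turns it into a real missing value.

```r
# Hypothetical column containing an Excel error string
x <- c("-23.8", "43.23", "#NAME?", "45", "0.23")

# sub() with the replacement "NA" yields the two-character string "NA"
replaced <- sub("#.*", "NA", x)
anyNA(replaced)        # FALSE: "NA" is still an ordinary string here

# type.convert() treats "NA" as missing by default and picks a numeric type
converted <- type.convert(replaced, as.is = TRUE)
class(converted)       # "numeric"
which(is.na(converted))  # 3
```

Without the conversion, the column stays character and functions such as mean() would not work on it.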
