How to recode dot to NA in R?

Question

I have a data set where missing values have been coded with a dot. I would like these missing values to be NA instead.

Here is the data frame:

df <- data.frame(ITEM1 = c(6, 8, '.'),
                   ITEM2 = c(1, 6, 9),
                   ITEM3 = c(4, 2, 5),
                   ITEM4 = c('.', 3, 2),
                   ITEM5 = c(1, 6, 9)
)

df

  ITEM1 ITEM2 ITEM3 ITEM4 ITEM5
1     6     1     4     .     1
2     8     6     2     3     6
3     .     9     5     2     9

Answer 1

The columns that contain "." will be of class character. Create a logical matrix with == and assign NA to the matching elements, then convert the data.frame columns to their appropriate types with type.convert:

df[df == "." & !is.na(df)] <- NA
df <- type.convert(df, as.is = TRUE)
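As a quick check of the two steps above, here is a minimal sketch that rebuilds the data frame from the question and inspects the column classes before and after. It assumes R 4.0 or later, where data.frame() keeps strings as character rather than converting them to factors.

```r
# Same data as in the question: the "." forces ITEM1 and ITEM4 to character
df <- data.frame(ITEM1 = c(6, 8, "."),
                 ITEM2 = c(1, 6, 9),
                 ITEM3 = c(4, 2, 5),
                 ITEM4 = c(".", 3, 2),
                 ITEM5 = c(1, 6, 9))

sapply(df, class)   # ITEM1 and ITEM4 are character at this point

# Replace the dots with NA, then let type.convert pick a suitable type per column
df[df == "." & !is.na(df)] <- NA
df <- type.convert(df, as.is = TRUE)

sapply(df, is.numeric)   # every column should now be numeric
colSums(is.na(df))       # one NA each in ITEM1 and ITEM4
```

The as.is = TRUE argument keeps any remaining character columns as character instead of turning them into factors.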

Or in a single step with replace (which internally does the assignment)

df <- type.convert(replace(df, df == "." & !is.na(df), NA), as.is = TRUE)

Or another approach is

df[] <- lapply(df, function(x) replace(x, x %in% '.', NA))
df <- type.convert(df, as.is = TRUE)
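This last variant uses %in% rather than ==, which is presumably why it needs no !is.na() guard. A small sketch on a hypothetical vector x (not part of the original data) shows the difference in how the two operators treat NA:

```r
# Hypothetical vector with one real value, one existing NA, and one dot
x <- c("6", NA, ".")

eq  <- x == "."      # comparing NA with == yields NA
mem <- x %in% "."    # membership test yields FALSE for NA

eq    # FALSE    NA  TRUE
mem   # FALSE FALSE  TRUE
```

Because %in% never returns NA, its result can be used directly as a replacement index.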

Generally, this can be avoided altogether when reading the data, i.e. by specifying na.strings = "." in read.csv/read.table etc.
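A minimal sketch of that approach, assuming the data comes from a CSV source. The content of csv_text is made up to mirror the question's data and is passed inline through the text argument so the example runs without a file:

```r
# Hypothetical CSV content mirroring the data frame from the question
csv_text <- "ITEM1,ITEM2,ITEM3,ITEM4,ITEM5
6,1,4,.,1
8,6,2,3,6
.,9,5,2,9"

# Treat "." as missing at read time, so the columns are parsed as numbers
df <- read.csv(text = csv_text, na.strings = ".")

str(df)
```

Note that na.strings replaces the default of "NA", so if the file may also contain literal NA entries, na.strings = c(".", "NA") covers both.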

Answer 2

You could use the na_if function from dplyr. Note that the dot turns the affected columns into character columns, which is probably not what you want afterwards. The following code finds all character columns, replaces "." with NA, and converts each of those columns to numeric:

library(dplyr)
df <- df %>%
    mutate(across(where(is.character), ~as.numeric(na_if(., "."))))

Answer 3

Here is an alternative with set_na from the sjlabelled package. Note that the columns that contained a dot will remain character type.

library(sjlabelled)
set_na(df, na = ".", as.tag = FALSE)

Output:

  ITEM1 ITEM2 ITEM3 ITEM4 ITEM5
1     6     1     4  <NA>     1
2     8     6     2     3     6
3  <NA>     9     5     2     9

3 answers

solution1
5 ACCEPTED 2021-05-04 02:23:56

solution2
3 2021-05-04 08:04:11

solution3
3 2021-05-04 08:26:12
