R: running a function when is.na() is TRUE does not work ("the condition has length > 1 and only the first element will be used")

I'm trying to write an if statement that says: if a value is NA, then run a function on a different column.

I can't get it to work, and I keep getting an error:

  the condition has length > 1 and only the first element will be used

I've looked at the other questions about if statements, but I don't need to substitute one value for another. Instead, I need to run a function when is.na() is TRUE. The function I'm using (mutate_geocode) creates the new columns automatically, so I don't need to assign its result to a new column. Here's what I've been trying:

library(dplyr)
library(ggmap)

Enrollment_Report2 <- if (is.na(Enrollment_Report$lon)) {
  mutate_geocode(facility_city)
}

A sample of the data looks like this:

library(dplyr)
Enrollment_Report <- tibble(facility_city = c("Atlanta", "Boston", "Tokyo"),
lon = c(NA, NA, 139.65),
lat = c(NA, NA, 35.68))

We can filter the rows where lon is NA, geocode them, and bind the result to the rows that already have coordinates:

library(dplyr)
library(tidyr)   # provides unnest()
library(ggmap)
Enrollment_Report %>% 
   filter(is.na(lon)) %>%
   summarise(fac_city = list(facility_city), 
            out = list(geocode(facility_city))) %>% 
   unnest(cols = c(fac_city, out)) %>% 
   rename(facility_city = fac_city) %>% 
   bind_rows(Enrollment_Report %>%
   filter(!is.na(lon)))
# A tibble: 3 x 3
# facility_city   lon   lat
#  <chr>         <dbl> <dbl>
#1 Atlanta       -84.4  33.7
#2 Boston        -71.1  42.4
#3 Tokyo         140.   35.7

Or create a logical index and then update the rows

i1 <- is.na(Enrollment_Report$lon)
Enrollment_Report[i1, -1] <- geocode(Enrollment_Report$facility_city[i1])
Enrollment_Report
# A tibble: 3 x 3
#  facility_city   lon   lat
#  <chr>         <dbl> <dbl>
#1 Atlanta       -84.4  33.7
#2 Boston        -71.1  42.4
#3 Tokyo         140.   35.7
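
The index-based update can be tried without a Google API key. The sketch below is only an illustration: fake_geocode() is a hypothetical stand-in for ggmap::geocode() that reads from a small hard-coded table (the coordinates are approximate), and the columns are addressed by name rather than by position, on the assumption that lon and lat already exist in the data.

```r
# Hypothetical offline stand-in for ggmap::geocode(): it looks cities up in a
# small hard-coded table instead of calling the Google API.
fake_geocode <- function(city) {
  known <- data.frame(
    city = c("Atlanta", "Boston"),
    lon  = c(-84.39, -71.06),
    lat  = c(33.75, 42.36)
  )
  known[match(city, known$city), c("lon", "lat")]
}

Enrollment_Report <- data.frame(
  facility_city = c("Atlanta", "Boston", "Tokyo"),
  lon = c(NA, NA, 139.65),
  lat = c(NA, NA, 35.68)
)

# Logical index of the rows that still need coordinates
needs_geocode <- is.na(Enrollment_Report$lon)

# Geocode only those rows and write the result back by column name
Enrollment_Report[needs_geocode, c("lon", "lat")] <-
  fake_geocode(Enrollment_Report$facility_city[needs_geocode])

Enrollment_Report
```

Selecting c("lon", "lat") instead of -1 keeps the assignment correct if further columns are added to the data later.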

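I am posting this as a separate answer because I do not have enough reputation to comment on akrun's reply. The ifelse() function does what you are looking for. You get the message because if() expects a single TRUE or FALSE, but is.na(Enrollment_Report$lon) returns a logical vector with one element per row. (The message is a warning in R versions before 4.2.0 and an error from 4.2.0 onwards.) Here is a small example: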

a <- c(NA, 1, NA, 0)
if(is.na(a)){}
# NULL
# Warning message:
# In if (is.na(a)) { :
# the condition has length > 1 and only the first element will be used

If you instead wrote

result <- rep(NA, 4)
for(i in 1:length(a)){
  if(is.na(a[i])){
    result[i] <- 1
  } else {
    result[i] <- 0
  }
}
result
# [1] 1 0 1 0

you don't get the message, because each call to if() receives a single value. Instead of a for loop with if/else, you can use the vectorized ifelse() function. ifelse() checks each element of Enrollment_Report$lon and, where it is NA, takes the corresponding element of the replacement vector; otherwise it keeps the existing value. Note that ifelse() evaluates its arguments for the whole vector, so the geocoding has to be done up front for every row, and mutate_geocode(), which expects a data frame, cannot be called on a bare column:

coords <- geocode(Enrollment_Report$facility_city)  # looks up every city, not only the missing ones
Enrollment_Report2 <- mutate(Enrollment_Report, lon = ifelse(is.na(lon), coords$lon, lon), lat = ifelse(is.na(lat), coords$lat, lat))

ifelse() is a vectorized version of if () {} else {}.
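
As a quick check of that equivalence, here is a minimal sketch on the small vector a from the example above. The variable names result_loop, result_vec and has_missing are made up for illustration. It also shows any(), which is one way to reduce a logical vector to the single value that if() expects when the question really is "is anything missing?".

```r
a <- c(NA, 1, NA, 0)

# Loop version: if() sees one element at a time
result_loop <- numeric(length(a))
for (i in seq_along(a)) {
  result_loop[i] <- if (is.na(a[i])) 1 else 0
}

# Vectorized version: ifelse() handles the whole vector in one call
result_vec <- ifelse(is.na(a), 1, 0)

# Both approaches give the same result
identical(result_loop, result_vec)

# any() collapses the logical vector to a single TRUE/FALSE,
# which is safe to use as an if() condition
has_missing <- any(is.na(a))
if (has_missing) {
  message("at least one value is missing")
}
```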

I don't have a Google API key to test this, but I think this should work:

library(ggmap)
library(dplyr)
library(hablar)

Enrollment_Report %>% 
  mutate(geocode = if_else_(is.na(lon), geocode(facility_city), NA))
