Happy to award the answer points to someone who can help me vectorize this process. I'd like to search to see if a string is missing a city name and tack on the missing city name if it is indeed missing.
Suppose I have data like this:
df <- data.frame(X=c(1:5), Houston.Addresses=c("548 w 19th st", "6611 Portwest Dr. #190, houston, tx", "3555 Timmons Ln Ste 300, Houston, TX, 77027-6466", "3321 Westpark Dr", "16221 north freeway"))
I'd like data like this:
df.desired <- data.frame(X=c(1:5), Houston.Addresses=c("548 w 19th st, houston, tx", "6611 Portwest Dr. #190, houston, tx", "3555 Timmons Ln Ste 300, Houston, TX, 77027-6466", "3321 Westpark Dr, houston, tx", "16221 north freeway, houston, tx"))
My current method is very inefficient on large datasets, and I'm sure it can be vectorized. Can someone help me vectorize this loop?
foreach(i=1:nrow(df)) %do% {
  t <- tolower(df[i, "Houston.Addresses"])
  x <- grepl("houston", t)
  if(!isTRUE(x)){
    df[i, "Houston.Addresses"] <-
      paste0(df[i, "Houston.Addresses"], ", houston, tx")
  }
}
Thanks in advance!
Instead of looping over each row, we create a logical index with grepl (which is vectorized), convert 'Houston.Addresses' to character class, and then update the elements that correspond to the index 'i1' by pasting the suffix onto them:
i1 <- !grepl("houston", tolower(df$Houston.Addresses))
df$Houston.Addresses <- as.character(df$Houston.Addresses)
df$Houston.Addresses[i1] <- paste0(df$Houston.Addresses[i1], ", houston, tx")
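As a minimal sketch of the same logical-index idea, the steps can be wrapped in a small helper and checked against the sample data from the question. The function name `add_city` and its `city` and `suffix` arguments are hypothetical, introduced only for illustration; `stringsAsFactors = TRUE` is set on the assumption that the column may arrive as a factor, as it does by default in older R versions.

```r
# Sample data from the question; stringsAsFactors = TRUE mimics the
# factor column that older R versions create by default.
df <- data.frame(
  X = 1:5,
  Houston.Addresses = c("548 w 19th st",
                        "6611 Portwest Dr. #190, houston, tx",
                        "3555 Timmons Ln Ste 300, Houston, TX, 77027-6466",
                        "3321 Westpark Dr",
                        "16221 north freeway"),
  stringsAsFactors = TRUE
)

# Hypothetical helper: append `suffix` to every element that lacks `city`.
add_city <- function(x, city = "houston", suffix = ", houston, tx") {
  x <- as.character(x)                                 # factor -> character
  missing_city <- !grepl(city, x, ignore.case = TRUE)  # vectorized test
  x[missing_city] <- paste0(x[missing_city], suffix)   # update only those rows
  x
}

df$Houston.Addresses <- add_city(df$Houston.Addresses)
print(df$Houston.Addresses)
```

Using `ignore.case = TRUE` here is an alternative to calling `tolower()` first; either way the whole column is tested in a single call.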
If we wanted to make it more efficient, we could use data.table to do the assignment (:=):
library(data.table)
setDT(df)[, Houston.Addresses := as.character(Houston.Addresses)
][!grepl("houston", tolower(Houston.Addresses)),
Houston.Addresses := paste0(Houston.Addresses, ", houston, tx")]
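Another suggestion, using ifelse. The column is converted to character first, and the suffix is added only when "houston" is not found:
df$Houston.Addresses <- as.character(df$Houston.Addresses)
df$Houston.Addresses <- ifelse(grepl("houston", df$Houston.Addresses, ignore.case=TRUE),
                               df$Houston.Addresses,
                               paste0(df$Houston.Addresses, ", houston, tx"))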
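Since the question is about speed on large datasets, a rough way to compare the row-by-row approach with a vectorized one is to time both on a larger input. The sketch below is only an illustration: the data is the five sample addresses repeated (a made-up size), a plain `for` loop stands in for the `foreach` loop, and the function names are hypothetical. Timings will vary by machine, so none are shown.

```r
# Hypothetical benchmark data: the five sample addresses repeated 400 times.
addresses <- rep(c("548 w 19th st",
                   "6611 Portwest Dr. #190, houston, tx",
                   "3555 Timmons Ln Ste 300, Houston, TX, 77027-6466",
                   "3321 Westpark Dr",
                   "16221 north freeway"),
                 times = 400)

# Row-by-row version (a plain for loop standing in for the foreach loop).
loop_version <- function(x) {
  for (i in seq_along(x)) {
    if (!grepl("houston", tolower(x[i]))) {
      x[i] <- paste0(x[i], ", houston, tx")
    }
  }
  x
}

# Vectorized version: one grepl call builds the index for all rows.
vectorized_version <- function(x) {
  i1 <- !grepl("houston", tolower(x))
  x[i1] <- paste0(x[i1], ", houston, tx")
  x
}

loop_time <- system.time(loop_result <- loop_version(addresses))
vec_time  <- system.time(vec_result  <- vectorized_version(addresses))

# Both approaches should produce the same addresses.
stopifnot(identical(loop_result, vec_result))

cat("loop elapsed:      ", loop_time[["elapsed"]], "\n")
cat("vectorized elapsed:", vec_time[["elapsed"]], "\n")
```

Increasing `times` makes the difference easier to see, since the loop calls `grepl` and `paste0` once per row while the vectorized version calls each once in total.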