I have a CSV file with one column, Address. It has values like:
A <- structure(list(Address = structure(1:3, .Label = c("&2340 P St",
"&5656 N St", "456 B Street"), class = "factor")), .Names = "Address",
row.names = c(NA, 3L), class = "data.frame")
A
## Address
## 1 &2340 P St
## 2 &5656 N St
## 3 456 B Street
I need to clean the data: if an address contains “&”, all of its characters should be erased (or replaced with a space). I am expecting this result in my 2.csv file:
## Address
## 1 456 B Street
Here's the code:
A <- read.csv("U:/161/1.csv", header = TRUE, sep = ",")
B <- gsub("&", " ", A$ADDRESS1, ignore.case = TRUE)
write.table(B, file = "U:/161/2.csv", sep = ",",
            col.names = NA, qmethod = "double")
It only removes the “&”. How do I remove the rest of the address?
Use grep or grepl to identify which rows contain &, then exclude those rows:
B <- droplevels(A[!grepl('&', A$Address), ,drop=FALSE])
B
## Address
## 3 456 B Street
Notice that I wrap the [ call in droplevels to ensure that the unused factor levels (those containing &) are dropped as well. Because A has only one column, the subset would otherwise be simplified to a plain vector, so I included drop=FALSE to keep B as a data.frame.
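To make the effect of drop=FALSE and droplevels concrete, here is a minimal sketch that rebuilds the example data and compares the two forms. The names keep and v are only illustrative, and fixed = TRUE is an optional addition that treats & as a literal string rather than a regular expression.

```r
# Same example data as in the question (Address stored as a factor).
A <- data.frame(Address = factor(c("&2340 P St", "&5656 N St", "456 B Street")))

# TRUE for rows whose address does not contain "&".
keep <- !grepl("&", A$Address, fixed = TRUE)

# Without drop = FALSE, the one-column data.frame is simplified to a factor,
# and that factor still carries all three original levels.
v <- A[keep, ]
is.data.frame(v)    # FALSE
length(levels(v))   # 3

# With drop = FALSE and droplevels(), B stays a data.frame with a single level.
B <- droplevels(A[keep, , drop = FALSE])
is.data.frame(B)    # TRUE
levels(B$Address)   # "456 B Street"
```

If the column had been read as character rather than factor (for example with stringsAsFactors = FALSE), droplevels would be harmless but unnecessary.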
To use grep (which returns the indices of the matching rows) instead, negate those indices with -:
B <- droplevels(A[-grep('&', A$Address), ,drop=FALSE])
B
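One caveat with the negative-index form is worth checking before relying on it: if no row contains &, grep returns integer(0), and negating that selects no rows at all rather than all of them. The sketch below illustrates this with made-up data (the clean data frame and its addresses are hypothetical) and shows that the grepl form does not have this problem.

```r
# Hypothetical data in which no address contains "&".
clean <- data.frame(Address = c("12 A Street", "456 B Street"),
                    stringsAsFactors = FALSE)

# grep() finds nothing, so idx is integer(0).
idx <- grep("&", clean$Address, fixed = TRUE)

# -integer(0) is still integer(0), which selects zero rows.
n_grep <- nrow(clean[-idx, , drop = FALSE])

# The logical form keeps every row, as intended.
n_grepl <- nrow(clean[!grepl("&", clean$Address, fixed = TRUE), , drop = FALSE])

n_grep    # 0
n_grepl   # 2
```

For that reason the !grepl(...) version is usually the safer choice when the input may contain no matches.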