Replacing characters using R

I have a CSV file with one column, Address. It has values like:

A <- structure(list(Address = structure(1:3, .Label = c("&2340 P St", 
 "&5656 N St", "456 B Street"), class = "factor")), .Names = "Address", 
 row.names = c(NA, 3L), class = "data.frame")
A
##        Address
## 1   &2340 P St
## 2   &5656 N St
## 3 456 B Street

I need to clean the data: if an address contains “&”, the whole value should be erased (or replaced with a space). I expect this result in my 2.csv file:

##        Address
## 1 456 B Street

Here's the code:

A <- read.csv("U:/161/1.csv", header = TRUE, sep = ",")
B <- gsub("&", " ", A$ADDRESS1, ignore.case = TRUE)
write.table(B, file = "U:/161/2.csv", sep = ",",
            col.names = NA, qmethod = "double")

It only removes the “&”. How do I remove the rest of the address?

Use grep or grepl to identify which rows contain &, then exclude those rows:

B <- droplevels(A[!grepl('&', A$Address), ,drop=FALSE])
B
##        Address
## 3 456 B Street

Notice that I wrap the [ call in droplevels to ensure that the unused levels (those containing &) are dropped as well. Because A has only one column, the subset would otherwise collapse to a bare factor, so I included drop=FALSE to keep B as a data.frame.
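To see what each of these two safeguards does, and how the filtered result might be written back out as the question asks, here is a minimal sketch. Temporary files stand in for the question's 1.csv and 2.csv (an assumption, so the example is self-contained), and stringsAsFactors = TRUE is set explicitly because read.csv only creates factors by default in R versions before 4.0.0.

```r
# Temporary files stand in for the question's 1.csv and 2.csv (assumption).
infile  <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")
writeLines(c("Address", "&2340 P St", "&5656 N St", "456 B Street"), infile)

# stringsAsFactors = TRUE reproduces the factor column shown in the question.
A <- read.csv(infile, header = TRUE, stringsAsFactors = TRUE)
keep <- !grepl("&", A$Address)

# Without drop = FALSE, a one-column data frame collapses to a bare factor,
# and without droplevels() that factor still carries all three levels.
collapsed <- A[keep, ]
is.data.frame(collapsed)   # FALSE
nlevels(collapsed)         # 3

# With both safeguards the result stays a data frame with a single level.
B <- droplevels(A[keep, , drop = FALSE])
is.data.frame(B)           # TRUE
levels(B$Address)          # "456 B Street"

# row.names = FALSE keeps the original row number (3) out of the file.
write.csv(B, outfile, row.names = FALSE)
readLines(outfile)
```

With the real files, infile and outfile would simply be replaced by the paths from the question.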

To use grep (which returns the indices of the matching rows), you would negate the index:

B <- droplevels(A[-grep('&', A$Address), ,drop=FALSE])
B
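One caveat with the negative-index form is worth checking before relying on it: if no row contains &, grep returns integer(0), and negating that still selects nothing. The sketch below uses a small hypothetical data frame (clean, not part of the question) to compare the two forms in that situation.

```r
# Hypothetical data in which no address contains "&".
clean <- data.frame(Address = c("12 Main St", "456 B Street"))

idx <- grep("&", clean$Address)   # integer(0): nothing matched

by_index <- clean[-idx, , drop = FALSE]
by_mask  <- clean[!grepl("&", clean$Address), , drop = FALSE]

nrow(by_index)   # 0: -integer(0) is still integer(0), so no rows are selected
nrow(by_mask)    # 2: the logical mask keeps every row
```

For that reason the grepl version is usually the safer default when the input might not contain any matching rows.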
