Replacing characters using R

Question

I have a csv file with 1 column, Address. It has values like:

A <- structure(list(Address = structure(1:3, .Label = c("&2340 P St", 
 "&5656 N St", "456 B Street"), class = "factor")), .Names = "Address", 
 row.names = c(NA, 3L), class = "data.frame")
A
##        Address
## 1   &2340 P St
## 2   &5656 N St
## 3 456 B Street

I need to clean the data: if an address contains “&”, I want to erase the whole value (or replace it with a space). I am expecting this result in my 2.csv file:

##        Address
## 1 456 B Street

Here's the code:

 A <-read.csv("U:/161/1.csv", header=T,sep=",")
 B<-gsub("&", " ", A$ADDRESS1, ignore.case = TRUE)
 write.table(B, file = "U:/161/2.csv", sep = ","
 , col.names = NA, qmethod = "double")

It only removes the “&”. How do I remove the rest of the address?

Answer 1

Use grep or grepl to identify which rows contain &, then exclude those rows:

B <- droplevels(A[!grepl('&', A$Address), ,drop=FALSE])
B
##       Address
## 3 456 B Street

Notice that I wrap the [ call in droplevels so that the unused levels (those containing &) are dropped as well. Because A has only one column, the subset would otherwise be simplified to a plain vector, so I included drop=FALSE to keep B as a data.frame.
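As a rough illustration of what droplevels and drop=FALSE each contribute, the sketch below rebuilds the question's data and compares the variants. The names keep, no_drop and with_drop are made up for this example, and fixed = TRUE is an optional addition that treats & as a literal character.

```r
# Rebuild the example data from the question (Address stored as a factor).
A <- data.frame(
  Address = factor(c("&2340 P St", "&5656 N St", "456 B Street"))
)

# TRUE for rows whose address does not contain "&".
keep <- !grepl("&", A$Address, fixed = TRUE)

no_drop   <- A[keep, ]                # one column: simplified to a factor
with_drop <- A[keep, , drop = FALSE]  # stays a data.frame

class(no_drop)                          # "factor"
class(with_drop)                        # "data.frame"
nlevels(with_drop$Address)              # 3: unused levels are still attached
nlevels(droplevels(with_drop)$Address)  # 1: only "456 B Street" remains

# Same result as the one-liner in the answer.
B <- droplevels(with_drop)
```

If Address is read in as character rather than factor (for example with stringsAsFactors = FALSE in read.csv), there are no levels to drop and droplevels leaves the column unchanged.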

To use grep (which returns the indices of the matching rows), use negative indexing:

B <- droplevels(A[-grep('&', A$Address), ,drop=FALSE])
B
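One caveat with the negative-index form, sketched below on made-up data that contains no & at all: grep then returns integer(0), and indexing with -integer(0) selects zero rows instead of keeping everything. The names hits, dropped_all and kept are hypothetical, and the length check is just one possible guard.

```r
# Hypothetical data in which no address contains "&".
A <- data.frame(Address = factor(c("123 A St", "456 B Street")))

hits <- grep("&", A$Address, fixed = TRUE)  # integer(0): no matches

# -integer(0) is still integer(0), so this selects no rows at all.
dropped_all <- A[-hits, , drop = FALSE]
nrow(dropped_all)                           # 0

# The logical form from grepl keeps both rows, as intended.
kept <- A[!grepl("&", A$Address, fixed = TRUE), , drop = FALSE]
nrow(kept)                                  # 2

# One possible guard if the grep form is preferred.
B <- if (length(hits) > 0) A[-hits, , drop = FALSE] else A
B <- droplevels(B)
```

For this reason the grepl version is usually the safer default when the data may contain no matches.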

solution1
0 2012-10-16 23:27:43
