Replacing characters using R
I have a CSV file with one column, Address. It has values like:
A <- structure(list(Address = structure(1:3, .Label = c("&2340 P St",
"&5656 N St", "456 B Street"), class = "factor")), .Names = "Address",
row.names = c(NA, 3L), class = "data.frame")
A
## Address
## 1 &2340 P St
## 2 &5656 N St
## 3 456 B Street
I need to clean the data: whenever an address contains “&”, the whole address should be erased (or replaced with a space). I am expecting this result in my 2.csv file:
## Address
## 1 456 B Street
Here's the code:
A <- read.csv("U:/161/1.csv", header = TRUE, sep = ",")
B <- gsub("&", " ", A$ADDRESS1, ignore.case = TRUE)
write.table(B, file = "U:/161/2.csv", sep = ",",
            col.names = NA, qmethod = "double")
It only removes “&”. How do I remove the rest of the address?
Use grep or grepl to identify which rows contain &, then exclude those rows:
B <- droplevels(A[!grepl('&', A$Address), ,drop=FALSE])
B
## Address
## 3 456 B Street
Notice that I wrap the [ call in droplevels to ensure that the unused levels (those with &) are dropped as well. Because the result in this case is only one row of a one-column data frame, I included drop=FALSE to keep B as a data.frame.
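To see what droplevels and drop=FALSE each contribute, here is a small sketch that rebuilds the sample data from the question and compares the intermediate results. The names keep, collapsed and subset_only are only illustrative, and fixed = TRUE is my own addition (it treats "&" as a literal string; the result is the same without it).

```r
# Rebuild the sample data from the question (Address stored as a factor).
A <- data.frame(
  Address = factor(c("&2340 P St", "&5656 N St", "456 B Street"))
)

# TRUE for the rows that do NOT contain "&".
keep <- !grepl("&", A$Address, fixed = TRUE)

# Without drop = FALSE, subsetting a one-column data.frame
# collapses the result to a bare factor.
collapsed <- A[keep, ]
is.data.frame(collapsed)        # FALSE

# With drop = FALSE the result stays a data.frame, but the factor
# still carries all three original levels.
subset_only <- A[keep, , drop = FALSE]
nlevels(subset_only$Address)    # 3

# droplevels() discards the levels that no longer occur in the data.
B <- droplevels(subset_only)
levels(B$Address)               # "456 B Street"
```

If the column is read as character rather than factor (the default for read.csv since R 4.0.0), there are no levels to drop and droplevels leaves the column unchanged.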
To use grep (which returns the indices of the matching rows) you would negate those indices with -:
B <- droplevels(A[-grep('&', A$Address), ,drop=FALSE])
B
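One caveat with the negative-index form, shown as a small sketch: if no row contains &, grep returns an empty integer vector, and indexing with -integer(0) selects zero rows instead of all of them. The clean data frame below is made up for illustration and is not part of the question.

```r
# Hypothetical data in which no address contains "&".
clean <- data.frame(
  Address = c("12 A St", "34 B St"),
  stringsAsFactors = TRUE
)

# grep() finds nothing, so it returns integer(0).
hits <- grep("&", clean$Address, fixed = TRUE)
length(hits)        # 0

# -integer(0) is still integer(0), which selects no rows at all.
via_grep <- clean[-hits, , drop = FALSE]
nrow(via_grep)      # 0

# The logical form from grepl() keeps every row, as intended.
via_grepl <- clean[!grepl("&", clean$Address, fixed = TRUE), , drop = FALSE]
nrow(via_grepl)     # 2
```

For that reason the !grepl(...) version is probably the safer default when the data may contain no matches.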