
Replacing characters using R

I have a CSV file with one column, Address. It has values like:

A <- structure(list(Address = structure(1:3, .Label = c("&2340 P St", 
 "&5656 N St", "456 B Street"), class = "factor")), .Names = "Address", 
 row.names = c(NA, 3L), class = "data.frame")
A
##        Address
## 1   &2340 P St
## 2   &5656 N St
## 3 456 B Street

I need to clean the data: if an address contains “&”, erase the whole value (or replace it with a space). I am expecting this result in my 2.csv file:

##        Address
## 1 456 B Street

Here's the code:

 A <-read.csv("U:/161/1.csv", header=T,sep=",")
 B<-gsub("&", " ", A$ADDRESS1, ignore.case = TRUE)
 write.table(B, file = "U:/161/2.csv", sep = ","
 , col.names = NA, qmethod = "double")

It only removes “&”. How do I remove the rest of the address?

Use grep or grepl to identify which rows contain &, then exclude those rows:
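To see what the two functions return, here is a minimal sketch on the sample addresses. The vector name addr and the use of fixed = TRUE (treat the pattern as a literal string rather than a regular expression) are illustrative choices, not part of the original answer.

```r
# Sample addresses from the question, as a plain character vector
addr <- c("&2340 P St", "&5656 N St", "456 B Street")

# grepl() returns one logical per element: TRUE where "&" is found
has_amp <- grepl("&", addr, fixed = TRUE)
has_amp
## [1]  TRUE  TRUE FALSE

# grep() returns the integer positions of the matching elements
amp_idx <- grep("&", addr, fixed = TRUE)
amp_idx
## [1] 1 2
```

Negating the logical vector (!has_amp) or the integer index (-amp_idx) then selects the rows to keep.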

B <- droplevels(A[!grepl('&', A$Address), ,drop=FALSE])
B
##       Address
## 3 456 B Street

Notice that I wrap the [ call in droplevels to ensure that the unused levels (those with &) are dropped as well. Because A has only one column, subsetting its rows would otherwise return a plain vector, so I included drop=FALSE to keep B as a data.frame.
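The effect of both arguments can be checked directly. This is a small sketch that rebuilds the sample data with data.frame(); the names keep and v are introduced here only for illustration.

```r
# Rebuild the sample data with Address stored as a factor
A <- data.frame(Address = c("&2340 P St", "&5656 N St", "456 B Street"),
                stringsAsFactors = TRUE)
keep <- !grepl("&", A$Address, fixed = TRUE)

# Without drop = FALSE, the one-column data.frame collapses to a factor,
# and the levels of the removed rows are still attached
v <- A[keep, ]
is.data.frame(v)
## [1] FALSE
nlevels(v)
## [1] 3

# With drop = FALSE and droplevels(), the result stays a data.frame
# and only the level that is actually used remains
B <- droplevels(A[keep, , drop = FALSE])
is.data.frame(B)
## [1] TRUE
levels(B$Address)
## [1] "456 B Street"
```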

To use grep (which returns the indices of the matching rows), you would use:

B <- droplevels(A[-grep('&', A$Address), ,drop=FALSE])
B
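One caveat with the negative grep index: if no row contains &, grep returns integer(0), and indexing with -integer(0) selects zero rows rather than all of them. The sketch below uses made-up addresses without & to show the difference, then writes the result with write.csv; out_file is a temporary path standing in for the 2.csv file from the question.

```r
# Hypothetical data in which no address contains "&"
A <- data.frame(Address = c("123 A St", "456 B Street"),
                stringsAsFactors = TRUE)

# grep() finds nothing, so idx is integer(0); -idx is also integer(0),
# which selects zero rows instead of keeping everything
idx <- grep("&", A$Address, fixed = TRUE)
nrow(A[-idx, , drop = FALSE])
## [1] 0

# The logical form keeps every row when nothing matches
B <- droplevels(A[!grepl("&", A$Address, fixed = TRUE), , drop = FALSE])
nrow(B)
## [1] 2

# Write the cleaned data; row.names = FALSE omits the row-number column
out_file <- tempfile(fileext = ".csv")
write.csv(B, out_file, row.names = FALSE)
```

For that reason the grepl version is usually the safer default when the input may contain no matches.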

