
Removing a few characters from a factor in R

I don't know how to remove part of each value of a factor in R. I have data like this:

District                       X
District - Purba Champaran    12
District - Purba Champaran    86
District - Purba Champaran    56
District - Sheohar            13 
District - Sheohar            45
District - Sheohar            13

I want to remove the "District - " part from each district name. Also, what if some district names do not include "District - "? How can that case be handled?

Given this data:

df<-structure(list(District = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("District - Purba Champaran", 
"District - Sheohar"), class = "factor"), X = c(12, 86, 56, 13, 
45, 13)), .Names = c("District", "X"), class = "data.frame", row.names = c(NA, 
-6L))

We can use sub:

df[,1] <- sub('District - ', '', df[,1])

df
#          District  X
# 1 Purba Champaran 12
# 2 Purba Champaran 86
# 3 Purba Champaran 56
# 4         Sheohar 13
# 5         Sheohar 45
# 6         Sheohar 13

This removes "District - " from every value in the first column of df. If a value does not contain "District - ", sub leaves it unchanged.
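Note that sub returns a character vector, so the column is no longer a factor after the assignment. If the column should stay a factor, one option is to rewrite only its level labels. The following is a minimal sketch, not part of the original answer; the extra "Patna" row is a made-up value, added only to show that a name without the prefix passes through untouched.

```r
# Sample data modeled on the question. "Patna" is a hypothetical value
# without the prefix, included for illustration only.
df <- data.frame(
  District = factor(c("District - Purba Champaran", "District - Sheohar", "Patna")),
  X = c(12, 13, 7)
)

# Rewrite the level labels instead of the values, so the column stays a factor.
# "^" anchors the pattern to the start of each label.
levels(df$District) <- sub("^District - ", "", levels(df$District))

is.factor(df$District)  # the column is still a factor
levels(df$District)     # the cleaned labels
```

Working on the levels also means the pattern is applied once per distinct district rather than once per row.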
