
Renaming multiple cells in a data frame in R at once

I want to group each police station in the UK by its region. However, being a newbie, I don't know how to rename multiple elements at once.

Example of how it currently looks (image not available):

The police stations of Avon and Somerset, Dorset, Gloucester and Wiltshire are located in the South West. I need a function that renames the police stations above to "South West".

I would do it in the original CSV data sets I downloaded from the UK police website. However, my analysis ranges from January 2019 to November 2020, and the data can only be downloaded as one CSV file per month and per region (about 900 CSV files in total).

I am aware of the indexing syntax below for assigning to a single cell in a data frame, but this data set is far too big for that to be viable.

data[row number, col number] <- "South West"

Any suggestion would be greatly appreciated. Thanks in advance for rescuing a newbie.

PS: I merged the CSV data sets of every police station throughout 2019 and 2020 using

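# full.names=TRUE returns complete paths, so read.csv works from any working directory
crimedata19_20 <- list.files(path="C:/Users/X/Desktop/Crime data/2019-2020",
                    pattern="\\.csv$", full.names=TRUE)
# Read every file and stack the resulting data frames row-wise
crimedata19_20 <- do.call("rbind", lapply(crimedata19_20, read.csv))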

You can use gsub, which replaces a pattern. Example using the iris data set that comes with R:

iris[49:52, ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 49          5.3         3.7          1.5         0.2     setosa
# 50          5.0         3.3          1.4         0.2     setosa
# 51          7.0         3.2          4.7         1.4 versicolor
# 52          6.4         3.2          4.5         1.5 versicolor

Replace all "setosa" with "South West" in the "Species" column.

res <- transform(iris,
          Species=gsub(pattern="setosa", replacement="South West", Species))
res[49:52, ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 49          5.3         3.7          1.5         0.2 South West
# 50          5.0         3.3          1.4         0.2 South West
# 51          7.0         3.2          4.7         1.4 versicolor
# 52          6.4         3.2          4.5         1.5 versicolor

Edit

To make multiple replacements at once, separate the patterns with | (or).

res2 <- transform(iris,
                 Species=gsub(pattern="setosa|versicolor", replacement="South West", Species))
res2[49:52, ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 49          5.3         3.7          1.5         0.2 South West
# 50          5.0         3.3          1.4         0.2 South West
# 51          7.0         3.2          4.7         1.4 South West
# 52          6.4         3.2          4.5         1.5 South West
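Applied to the data in the question, the same idea might look like the sketch below. It is only an illustration on toy data: the column name Falls.within and the exact spelling of the force names are assumptions about the real files. The pattern is anchored with ^ and $ so that only whole values are replaced, not substrings inside longer names.

```r
# Toy data; the column name Falls.within is an assumption about the real data
crime <- data.frame(
  Falls.within = c("Avon and Somerset", "Dorset", "Gloucester",
                   "Wiltshire", "West Yorkshire"),
  stringsAsFactors = FALSE
)

# Forces named in the question as belonging to the South West
south_west <- c("Avon and Somerset", "Dorset", "Gloucester", "Wiltshire")

# Build one anchored pattern: ^(A|B|C|D)$ matches whole values only
pattern <- paste0("^(", paste(south_west, collapse = "|"), ")$")

# Write the result to a new column so the original force names are kept
crime$Region <- gsub(pattern, "South West", crime$Falls.within)
crime$Region
```

Values that do not match the pattern (here "West Yorkshire") are copied over unchanged, which is how gsub behaves in general.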

Using the same data as @jay.sf, you could store the unique values in a data frame and then do the replacement using match():

# Keys: lookup table mapping each unique value to its replacement
Keys <- data.frame(Species=unique(iris$Species),
                   Replace=c('South','North','East'), stringsAsFactors = FALSE)

It will look like this:

Keys
     Species Replace
1     setosa   South
2 versicolor   North
3  virginica    East

Next, the replacement:

#Replace
iris$Species <- Keys[match(iris$Species,Keys$Species),"Replace"]

Output:

head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2   South
2          4.9         3.0          1.4         0.2   South
3          4.7         3.2          1.3         0.2   South
4          4.6         3.1          1.5         0.2   South
5          5.0         3.6          1.4         0.2   South
6          5.4         3.9          1.7         0.4   South

tail(iris)
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
145          6.7         3.3          5.7         2.5    East
146          6.7         3.0          5.2         2.3    East
147          6.3         2.5          5.0         1.9    East
148          6.5         3.0          5.2         2.0    East
149          6.2         3.4          5.4         2.3    East
150          5.9         3.0          5.1         1.8    East
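For the police data, a lookup table along the same lines could map each force to its region. The sketch below uses toy data; the column name Falls.within, the Force and Region column names, and the "Unknown force" row are assumptions made for illustration. It writes to a new column instead of overwriting the force names, and it shows how forces missing from the lookup table can be spotted.

```r
# Toy data; Falls.within is assumed to hold the force name
crime <- data.frame(
  Falls.within = c("Dorset", "Wiltshire", "West Yorkshire", "Unknown force"),
  stringsAsFactors = FALSE
)

# Lookup table: one row per force (illustrative, not a complete list)
keys <- data.frame(
  Force  = c("Avon and Somerset", "Dorset", "Gloucester",
             "Wiltshire", "West Yorkshire"),
  Region = c("South West", "South West", "South West",
             "South West", "Yorkshire and the Humber"),
  stringsAsFactors = FALSE
)

# match() gives the row of keys for each force, or NA when there is none
crime$Region <- keys$Region[match(crime$Falls.within, keys$Force)]

# Forces that are missing from the lookup table end up as NA
missing_forces <- unique(crime$Falls.within[is.na(crime$Region)])
missing_forces
```

Checking for NA after the match is a cheap way to catch spelling differences between the lookup table and the downloaded files.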

Just to complete the list of methods, here is a data.table approach:

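library(data.table)
# Convert the merged data frame to a data.table
crimedata19_20 <- data.table(crimedata19_20)
# Names of the forces that belong to the South West (placeholder list)
West_cols <- c("name1", "name2", ...)
# Add an Area column, set to "South West" for the matching rows
crimedata19_20[Falls.within %in% West_cols, Area := "South West"]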

I would not use gsub and instead create a new column for your Areas. Maybe you need the information about the stations later on.
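A small runnable version of this data.table approach, on toy data, might look as follows. The force names come from the question; the three-row table itself is made up for illustration, and the data.table package is assumed to be installed.

```r
library(data.table)

# Toy data standing in for the merged crime data
crime <- data.table(
  Falls.within = c("Avon and Somerset", "Dorset", "West Yorkshire")
)

# Forces named in the question as belonging to the South West
south_west <- c("Avon and Somerset", "Dorset", "Gloucester", "Wiltshire")

# := adds the Area column by reference; rows outside the subset get NA
crime[Falls.within %in% south_west, Area := "South West"]
crime
```

Because the force names stay in Falls.within, the station-level information is still available later, and further regions can be filled in with additional lines of the same form.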
