
Renaming multiple cells in a data frame in R at once

I want to group each police station in the UK by its region, but being a newbie I don't know how to rename multiple elements at once.

Example: how it currently looks (image omitted)

The police stations of Avon and Somerset, Dorset, Gloucester and Wiltshire are located in the South West. I need a function that renames the police stations above to "South West".

I would do it in the original CSV data sets I downloaded from the UK police website, but my analysis ranges from January 2019 to November 2020 and the data can only be downloaded by month and by region (about 900 CSV files in total).

I am aware of the assignment below for changing a single cell in a data frame, but this data set is far too big for that to be viable.

data[row number, col number] <- "South West"

Any suggestion would be greatly appreciated. Thanks in advance for rescuing a newbie.

P.S. I merged every CSV data set of every police station throughout 2019 and 2020 using:

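# List every CSV file; full.names = TRUE keeps the folder in each path,
# so read.csv() works regardless of the current working directory
crime_files <- list.files(path = "C:/Users/X/Desktop/Crime data/2019-2020",
                          pattern = "\\.csv$", full.names = TRUE)
# Read each file and stack the rows into one data frame
crimedata19_20 <- do.call("rbind", lapply(crime_files, read.csv))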

You can use gsub to replace a pattern. Example using the iris data set that comes with R:

iris[49:52, ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 49          5.3         3.7          1.5         0.2     setosa
# 50          5.0         3.3          1.4         0.2     setosa
# 51          7.0         3.2          4.7         1.4 versicolor
# 52          6.4         3.2          4.5         1.5 versicolor

Replace all "setosa" with "South West" in the "Species" column:

res <- transform(iris,
          Species=gsub(pattern="setosa", replacement="South West", Species))
res[49:52, ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 49          5.3         3.7          1.5         0.2 South West
# 50          5.0         3.3          1.4         0.2 South West
# 51          7.0         3.2          4.7         1.4 versicolor
# 52          6.4         3.2          4.5         1.5 versicolor

Edit

To replace several values at once, separate the patterns with | (or).

res2 <- transform(iris,
                 Species=gsub(pattern="setosa|versicolor", replacement="South West", Species))
res2[49:52, ]
#    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 49          5.3         3.7          1.5         0.2 South West
# 50          5.0         3.3          1.4         0.2 South West
# 51          7.0         3.2          4.7         1.4 South West
# 52          6.4         3.2          4.5         1.5 South West
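One caveat with this approach: the pattern is a regular expression and gsub matches substrings, so a short name can also match inside a longer one. Below is a minimal sketch of anchoring the pattern with ^ and $ so that only whole values are replaced. The force names are made up for illustration and are not taken from the question's data.

```r
# Hypothetical force names, for illustration only
forces <- c("Dorset", "Dorset North", "Wiltshire", "Kent")

# Unanchored: "Dorset" is also replaced inside "Dorset North"
unanchored <- gsub("Dorset|Wiltshire", "South West", forces)

# Anchored: only values that equal one of the names exactly are replaced
anchored <- gsub("^(Dorset|Wiltshire)$", "South West", forces)

print(unanchored)
print(anchored)
```

If the names in the real data never overlap, the unanchored form behaves the same; anchoring only matters when one name is contained in another.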

Using the same data as @jay.sf, you could store the unique values in a data frame and then do the replacement using match():

#Keys
Keys <- data.frame(Species=unique(iris$Species),
                   Replace=c('South','North','East'),stringsAsFactors = FALSE)

It will look like this:

Keys
     Species Replace
1     setosa   South
2 versicolor   North
3  virginica    East

Next, the replacement:

#Replace
iris$Species <- Keys[match(iris$Species,Keys$Species),"Replace"]

Output:

head(iris)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2   South
2          4.9         3.0          1.4         0.2   South
3          4.7         3.2          1.3         0.2   South
4          4.6         3.1          1.5         0.2   South
5          5.0         3.6          1.4         0.2   South
6          5.4         3.9          1.7         0.4   South

tail(iris)
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
145          6.7         3.3          5.7         2.5    East
146          6.7         3.0          5.2         2.3    East
147          6.3         2.5          5.0         1.9    East
148          6.5         3.0          5.2         2.0    East
149          6.2         3.4          5.4         2.3    East
150          5.9         3.0          5.1         1.8    East
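The same kind of lookup can also be written with a named character vector instead of a key data frame. This is a sketch on a fresh copy of iris, using the same arbitrary labels as above. The as.character() call matters because indexing by a factor would use its integer codes rather than its labels, and any value missing from the lookup would come back as NA.

```r
# Work on a fresh copy so the built-in data set is left untouched
df <- datasets::iris

# Names are the old values, elements are the replacements
lookup <- c(setosa = "South", versicolor = "North", virginica = "East")

# Index by the character labels, then drop the names from the result
df$Species <- unname(lookup[as.character(df$Species)])

print(table(df$Species))
```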

Just to complete the list of methods, here is one using data.table:

library(data.table)
crimedata19_20 <-data.table(crimedata19_20)
West_cols<-c("name1", "name2", ...)
crimedata19_20[Falls.within %in% West_cols, Area:="South West"]

I would not use gsub; instead I would create a new column for your areas. You may need the information about the stations later on.
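A base R sketch of the same idea, on a small made-up data frame, might look like the following. The column name Falls.within is taken from the data.table snippet above; the toy rows, the Crime.type column and the name of the new Area column are assumptions for illustration.

```r
# Toy data standing in for the merged crime data (values are made up)
crimes <- data.frame(
  Falls.within = c("Avon and Somerset", "Dorset", "Kent", "Wiltshire"),
  Crime.type   = c("Burglary", "Robbery", "Burglary", "Drugs"),
  stringsAsFactors = FALSE
)

# Forces the question places in the South West
south_west <- c("Avon and Somerset", "Dorset", "Gloucester", "Wiltshire")

# New column: the original station names stay available in Falls.within
crimes$Area <- NA_character_
crimes$Area[crimes$Falls.within %in% south_west] <- "South West"

print(crimes)
```

Rows whose force is not listed keep NA in Area, which makes it easy to spot stations that have not been assigned to a region yet.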
