How to replace a specific word in a data frame
category.1 <- c("TM","TM","CPA","TM","CPC")
category.2 <- c("LS","LS","DSP","DSP","AF")
platform <- c("facebook","facebook","yahoo","google","google")
dat <- data.frame(platform,category.1,category.2)
dat
  platform category.1 category.2
1 facebook         TM         LS
2 facebook         TM         LS
3    yahoo        CPA        DSP
4   google         TM        DSP
5   google        CPC         AF
When category.1 is 'TM' and category.2 is 'LS', I want to replace 'LS' with 'LS1', like this:
  platform category.1 category.2
1 facebook         TM        LS1
2 facebook         TM        LS1
3    yahoo        CPA        DSP
4   google         TM        DSP
5   google        CPC         AF
I tried it this way, but it returns an error:
dat$category.1[dat$category.1=='TM'& dat$category.2=='LS',] <- 'LS1'
Thanks for reading.
Another approach, using dplyr and the base function ifelse:
library(dplyr)
dat <- dat %>%
  mutate(category.2 = ifelse(category.1 == "TM" & category.2 == "LS",
                             "LS1",
                             as.character(category.2)))
dat
  platform category.1 category.2
1 facebook         TM        LS1
2 facebook         TM        LS1
3    yahoo        CPA        DSP
4   google         TM        DSP
5   google        CPC         AF
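The as.character() call in the answer above matters when category.2 is a factor, which was the data.frame default before R 4.0.0. A minimal base R sketch of the difference, using a made-up factor `f` (the names `f`, `without_conversion` and `with_conversion` are illustrative, not from the original post):

```r
# A small factor standing in for dat$category.2.
f <- factor(c("LS", "DSP", "AF"))

# Without conversion, ifelse() drops the factor attributes, so the
# untouched values typically come back as integer codes, not labels.
without_conversion <- ifelse(f == "LS", "LS1", f)

# With as.character(), the untouched values keep their labels.
with_conversion <- ifelse(f == "LS", "LS1", as.character(f))

print(without_conversion)
print(with_conversion)
```

If the column is already character, the as.character() call is harmless, so keeping it makes the code work for both column types.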
You can set stringsAsFactors = FALSE when creating your data set (this has been the default since R 4.0.0):
dat <- data.frame(platform,category.1,category.2, stringsAsFactors = FALSE)
Then you can use your code; just remove the comma and assign to category.2 instead of category.1, like this:
dat$category.2[dat$category.1=='TM'& dat$category.2=='LS'] <- "LS1"
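The comma fails because dat$category.2 is a plain vector, which takes a single index, while `[rows, cols]` indexing only applies to the data frame itself. A rough end-to-end sketch of the base approach follows, along with a factor variant for the case where the columns cannot be rebuilt as character. The helper names `hit` and `fac` are made up for illustration.

```r
# Sample data from the question, built with character columns.
category.1 <- c("TM", "TM", "CPA", "TM", "CPC")
category.2 <- c("LS", "LS", "DSP", "DSP", "AF")
platform <- c("facebook", "facebook", "yahoo", "google", "google")
dat <- data.frame(platform, category.1, category.2, stringsAsFactors = FALSE)

# Logical mask with one TRUE/FALSE per row.
hit <- dat$category.1 == "TM" & dat$category.2 == "LS"

# Single index, no comma, because the target is a vector.
dat$category.2[hit] <- "LS1"

# Factor variant: the new label must exist as a level before assignment,
# otherwise R inserts NA and warns about an invalid factor level.
fac <- factor(category.2)
levels(fac) <- c(levels(fac), "LS1")
fac[category.1 == "TM" & fac == "LS"] <- "LS1"

print(dat)
print(fac)
```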
If you want a really efficient way of doing replacements by condition, check out the data.table package and its binary search and replacement by reference:
library(data.table)
setkey(setDT(dat), category.1, category.2)
dat[J("TM", "LS"), category.2 := "LS1"][]
#    platform category.1 category.2
# 1:    yahoo        CPA        DSP
# 2:   google        CPC         AF
# 3:   google         TM        DSP
# 4: facebook         TM        LS1
# 5: facebook         TM        LS1
setDT converts dat to a data.table object. setkey keys (and sorts) the data so that a binary join can be performed. J() performs the actual binary join. := performs assignment by reference and updates category.2 in place.
Though if your data set isn't big, you could just do:
dat[category.1 == "TM" & category.2 == "LS", category.2 := "LS1"][]
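Because := assigns by reference, any other name bound to the same data.table sees the change too, and copy() is what gives an independent object. A small sketch of that behaviour, assuming the data.table package is installed (the names `dt`, `alias` and `snapshot` are made up for illustration):

```r
library(data.table)

dt <- data.table(category.1 = c("TM", "TM", "CPA"),
                 category.2 = c("LS", "DSP", "DSP"))

alias <- dt           # second name for the same object, no copy made
snapshot <- copy(dt)  # independent copy taken before the update

# Update in place: only rows matching both conditions are changed.
dt[category.1 == "TM" & category.2 == "LS", category.2 := "LS1"]

print(alias$category.2)     # reflects the update made through dt
print(snapshot$category.2)  # still holds the original values
```

This is worth keeping in mind when a data.table is passed into a function that uses :=, since the caller's object is modified as well.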
Some benchmarks on a slightly bigger data set (I didn't test the base approach, because it requires converting the columns to character class first):
library(data.table)
library(dplyr)
library(microbenchmark)
dat2 <- data.frame(lapply(dat, rep, 1e5))
dat3 <- copy(dat2)
dat4 <- copy(dat2)
dplyrfunc <- function(x) {
  x <- x %>%
    mutate(category.2 =
             ifelse(category.1 == "TM" & category.2 == "LS",
                    "LS1", as.character(category.2)))
  x
}
data.tablefunc1 <- function(x) {
  setkey(setDT(x), category.1, category.2)
  x[J("TM", "LS"), category.2 := "LS1"][]
}
data.tablefunc2 <- function(x) {
  setDT(x)[category.1 == "TM" & category.2 == "LS", category.2 := "LS1"][]
}
# Call reconstructed from the expression names in the output below.
microbenchmark(dplyrfunc(dat2), data.tablefunc1(dat3), data.tablefunc2(dat4))
## Unit: milliseconds
## expr min lq mean median uq max neval
## dplyrfunc(dat2) 277.261833 291.647719 313.76279 302.337902 335.703250 401.38212 100
## data.tablefunc1(dat3) 5.371047 5.905744 8.12169 6.904871 8.266383 59.83116 100
## data.tablefunc2(dat4) 31.980348 32.870719 38.26239 34.745612 39.309186 88.91202 100
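You can use revalue from the plyr package. Note that this replaces every 'LS' in category.2 regardless of category.1; it matches the desired output here only because every 'LS' row also has 'TM' in category.1: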
library(plyr)
dat$category.2 <- revalue(dat$category.2, c("LS" = "LS1"))
dat
  platform category.1 category.2
1 facebook         TM        LS1
2 facebook         TM        LS1
3    yahoo        CPA        DSP
4   google         TM        DSP
5   google        CPC         AF