How to replace a specific word in a data frame

category.1 <- c("TM","TM","CPA","TM","CPC")
category.2 <- c("LS","LS","DSP","DSP","AF")
platform <- c("facebook","facebook","yahoo","google","google")

dat <- data.frame(platform,category.1,category.2)
dat
  platform category.1 category.2
1 facebook         TM         LS
2 facebook         TM         LS
3    yahoo        CPA        DSP
4   google         TM        DSP
5   google        CPC         AF

When category.1 is 'TM' and category.2 is 'LS', I want to replace 'LS' with 'LS1':

      platform category.1 category.2
    1 facebook         TM        LS1
    2 facebook         TM        LS1
    3    yahoo        CPA        DSP
    4   google         TM        DSP
    5   google        CPC         AF

I tried the following, but it returns an error:

 dat$category.1[dat$category.1=='TM'& dat$category.2=='LS',] <- 'LS1'

Thanks for reading.

Another approach, using dplyr and the base function ifelse:

> library(dplyr)
> dat <-
    dat %>%
    mutate(category.2 = ifelse(category.1 == "TM" & category.2 == "LS",
           "LS1",
           as.character(category.2)))


> dat
  platform category.1 category.2
1 facebook         TM        LS1
2 facebook         TM        LS1
3    yahoo        CPA        DSP
4   google         TM        DSP
5   google        CPC         AF
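
The as.character() call matters when category.2 is a factor, which was the data.frame() default before R 4.0.0: without it, ifelse() returns the factor's internal integer codes for the untouched rows rather than their labels. A minimal base R sketch of the same replacement follows. It uses transform() as a stand-in for mutate() so that dplyr is not required, and it rebuilds the data with stringsAsFactors = TRUE purely to show the factor case (both are assumptions made for illustration):

```r
# Rebuild the example data with factor columns (illustrative assumption)
dat <- data.frame(
  platform   = c("facebook", "facebook", "yahoo", "google", "google"),
  category.1 = c("TM", "TM", "CPA", "TM", "CPC"),
  category.2 = c("LS", "LS", "DSP", "DSP", "AF"),
  stringsAsFactors = TRUE
)

# Convert to character inside ifelse() so the "no" branch keeps the labels
fixed <- transform(
  dat,
  category.2 = ifelse(category.1 == "TM" & category.2 == "LS",
                      "LS1",
                      as.character(category.2))
)

fixed$category.2
```

The resulting category.2 column is a character vector; wrap it in factor() if a factor is needed downstream.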

You can set stringsAsFactors = FALSE when creating your data set:

dat <- data.frame(platform,category.1,category.2, stringsAsFactors = FALSE)     

Then you can use your own code; just remove the comma and assign to category.2 instead of category.1, like this:

dat$category.2[dat$category.1=='TM'& dat$category.2=='LS'] <- "LS1"
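
If the columns are already factors (the data.frame() default before R 4.0.0; from 4.0.0 on, stringsAsFactors defaults to FALSE), the assignment above yields NA with a warning, because "LS1" is not an existing level. A hedged base R sketch that adds the level first is shown below. The variable name hit and the droplevels() cleanup are illustrative extras, not part of the answer above:

```r
# Example data with factor columns (illustrative assumption)
dat <- data.frame(
  platform   = c("facebook", "facebook", "yahoo", "google", "google"),
  category.1 = c("TM", "TM", "CPA", "TM", "CPC"),
  category.2 = c("LS", "LS", "DSP", "DSP", "AF"),
  stringsAsFactors = TRUE
)

# Register the new level before assigning it
levels(dat$category.2) <- c(levels(dat$category.2), "LS1")

# Logical index of the rows to change
hit <- dat$category.1 == "TM" & dat$category.2 == "LS"
dat$category.2[hit] <- "LS1"

# Optional: drop levels that are no longer used (here "LS")
dat$category.2 <- droplevels(dat$category.2)

as.character(dat$category.2)
```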

If you want a really efficient way of doing replacements by condition, check out the data.table package and its binary search and replacement by reference:

library(data.table)
setkey(setDT(dat), category.1, category.2)
dat[J("TM", "LS"), category.2 := "LS1"][]
#    platform category.1 category.2
# 1:    yahoo        CPA        DSP
# 2:   google        CPC         AF
# 3:   google         TM        DSP
# 4: facebook         TM        LS1
# 5: facebook         TM        LS1

setDT converts the data frame to a data.table object. setkey keys the data so that a binary join can be performed. J() performs the actual binary join. := performs assignment by reference and updates category.2 in place.
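
To make "by reference" concrete, here is a small sketch, assuming the data.table package is installed. The names alias and snapshot are hypothetical and chosen only for this illustration. A plain assignment shares the underlying object, so it sees the update, while copy() produces an independent table that does not:

```r
library(data.table)

dt <- data.table(
  category.1 = c("TM", "TM", "CPA", "TM", "CPC"),
  category.2 = c("LS", "LS", "DSP", "DSP", "AF")
)

alias    <- dt        # same underlying object, no copy is made
snapshot <- copy(dt)  # independent deep copy

# := modifies dt in place; no reassignment with <- is needed
dt[category.1 == "TM" & category.2 == "LS", category.2 := "LS1"]

alias$category.2     # reflects the update
snapshot$category.2  # still holds the original values
```

This is also why the benchmark code below calls copy() to obtain separate inputs for each function.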

Though if your data set isn't big, you could just do:

dat[category.1 == "TM" & category.2 == "LS", category.2 := "LS1"][]

Some benchmarks on a slightly bigger data set (I didn't test base R because the columns need to be converted to character class for it to work):

library(data.table)
library(dplyr)
library(microbenchmark)

dat2 <- data.frame(lapply(dat, rep, 1e5))
dat3 <- copy(dat2)
dat4 <- copy(dat2)

dplyrfunc <- function(x) {
  x <- x %>%
    mutate(category.2 = 
          ifelse(category.1 == "TM" & category.2 == "LS",
          "LS1", as.character(category.2)))
  x
}

data.tablefunc1 <- function(x){
  setkey(setDT(x), category.1, category.2)
  x[J("TM", "LS"), category.2 := "LS1"][]
}

data.tablefunc2 <- function(x){
  setDT(x)[category.1 == "TM" & category.2 == "LS", category.2 := "LS1"][]
}

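# The call is not shown in the original answer; the expr column below suggests it was:
microbenchmark(dplyrfunc(dat2), data.tablefunc1(dat3), data.tablefunc2(dat4))

## Unit: milliseconds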
##                  expr        min         lq      mean     median         uq       max neval
##       dplyrfunc(dat2) 277.261833 291.647719 313.76279 302.337902 335.703250 401.38212   100
## data.tablefunc1(dat3)   5.371047   5.905744   8.12169   6.904871   8.266383  59.83116   100
## data.tablefunc2(dat4)  31.980348  32.870719  38.26239  34.745612  39.309186  88.91202   100
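
The base R approach was left out of the benchmark above. A hypothetical counterpart, basefunc (a name introduced here, not taken from the original answer), could convert the column to character first and then subset-assign. This is only a sketch for readers who want to time it themselves; no timings are claimed for it:

```r
# Same example data and enlargement step as in the benchmark above
dat <- data.frame(
  platform   = c("facebook", "facebook", "yahoo", "google", "google"),
  category.1 = c("TM", "TM", "CPA", "TM", "CPC"),
  category.2 = c("LS", "LS", "DSP", "DSP", "AF")
)
dat2 <- data.frame(lapply(dat, rep, 1e5))

# Hypothetical base R counterpart (not part of the original benchmark)
basefunc <- function(x) {
  # Convert first so the assignment also works when the column is a factor
  x$category.2 <- as.character(x$category.2)
  x$category.2[x$category.1 == "TM" & x$category.2 == "LS"] <- "LS1"
  x
}

res <- basefunc(dat2)

# Rough single-run timing; results depend on the machine
system.time(basefunc(dat2))
```

It could be passed to microbenchmark() alongside the other functions in the same way.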

You can use revalue from the plyr package:

library(plyr)
dat$category.2 <- revalue(dat$category.2, c("LS" = "LS1"))
dat

  platform category.1 category.2
1 facebook         TM        LS1
2 facebook         TM        LS1
3    yahoo        CPA        DSP
4   google         TM        DSP
5   google        CPC         AF
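
Note that revalue() maps every "LS" to "LS1" without looking at category.1; it gives the requested result here only because every "LS" row also has "TM". A small base R sketch of the difference follows. The data, with its extra CPA/LS row, is hypothetical, and a plain unconditional assignment is used as a stand-in for revalue() so that plyr is not required:

```r
# Hypothetical data: the second row has "LS" but not "TM"
dat <- data.frame(
  category.1 = c("TM", "CPA", "TM"),
  category.2 = c("LS", "LS", "DSP"),
  stringsAsFactors = FALSE
)

# Unconditional replacement (stand-in for what revalue() does)
unconditional <- dat$category.2
unconditional[unconditional == "LS"] <- "LS1"

# Conditional replacement, as asked in the question
conditional <- dat$category.2
conditional[dat$category.1 == "TM" & dat$category.2 == "LS"] <- "LS1"

unconditional  # the CPA row is changed as well
conditional    # only the TM row is changed
```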
