Replace special character in data frame

I have a data frame in which several cells contain a known special character sequence. An example of the structure:

df = data.frame(col_1 = c("21 myspec^ch2 12",NA), 
                col_2 = c("1 myspec^ch2 4","4 myspec^ch2 212"))

The sequence is myspec^ch2, and I would like to replace it, together with the spaces around it, with -. An example of the expected output:

df = data.frame(col_1 = c("21-12",NA), 
                col_2 = c("1-4","4-212"))

I tried this, but it does not work:

df [ df == " myspec^ch2 " ] <- "-"

To make gsub work across the whole data frame, use apply:

apply(df, 2, function(x) gsub(" myspec\\^ch2 ", "-", x))
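
One caveat, offered as a minimal sketch rather than as part of the original answer: apply() coerces the data frame to a character matrix and returns a matrix. If a data frame is needed back, assigning the result of lapply() into df[] keeps the structure. The name cleaned is only illustrative, and fixed = TRUE is an assumption made here so that the caret needs no escaping.

```r
df <- data.frame(col_1 = c("21 myspec^ch2 12", NA),
                 col_2 = c("1 myspec^ch2 4", "4 myspec^ch2 212"),
                 stringsAsFactors = FALSE)

cleaned <- df
# Assigning into cleaned[] replaces every column but keeps the data.frame class.
# fixed = TRUE treats the pattern as a plain string, so "^" is a literal caret.
cleaned[] <- lapply(df, function(x) gsub(" myspec^ch2 ", "-", x, fixed = TRUE))
```

NA cells pass through gsub() unchanged, so the missing value in col_1 stays NA.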

You really want a regex-style substitution here. In a regex, however, ^ is an anchor that matches the beginning of the string rather than a literal caret, so it has to be escaped. You can do something like this (using the stringr package, with dplyr for the pipe and mutate):

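library(dplyr)
library(stringr)

# Apply the replacement to every column; "\\^" matches a literal caret.
# across() requires dplyr >= 1.0.0.
fixed_df <- df %>%
    mutate(across(everything(), ~ str_replace_all(.x, " myspec\\^ch2 ", "-")))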

Note the double backslash in front of the caret: in an R string, "\\^" is the regex \^, which escapes the caret so that it is matched literally rather than interpreted as the beginning-of-line anchor.
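
A small base-R sketch of why the escape matters (the variable names here are illustrative, not from the original post): unescaped, the caret acts as an anchor in the middle of the pattern; escaped, or with fixed = TRUE, it is matched as an ordinary character.

```r
x <- "21 myspec^ch2 12"

# Unescaped: "^" is an anchor sitting mid-pattern, so the pattern cannot match
unescaped_hit <- grepl(" myspec^ch2 ", x)

# Escaped: "\\^" in an R string is the regex \^, a literal caret
escaped_hit <- grepl(" myspec\\^ch2 ", x)

# fixed = TRUE switches off regex interpretation altogether
replaced <- gsub(" myspec^ch2 ", "-", x, fixed = TRUE)
```

The fixed = TRUE form is often the simpler choice when the text to replace contains no pattern logic at all.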
