How to replace a string in every row of a specific column using dplyr and stringr
I have the following tibble:
library(tidyverse)
df <- tibble::tribble(
~sample, ~colB, ~colC,
"foo", 1, 2,
"bar_x", 2, 3,
"qux.6hr.ID", 3, 4,
"dog", 1, 1
)
df
#> # A tibble: 4 x 3
#> sample colB colC
#> <chr> <dbl> <dbl>
#> 1 foo 1 2
#> 2 bar_x 2 3
#> 3 qux.6hr.ID 3 4
#> 4 dog 1 1
# The sample column may also be a factor with a custom level order
df$sample <- factor(df$sample, levels = c("bar_x", "foo", "qux.6hr.ID", "dog"))
df$sample
#> [1] foo bar_x qux.6hr.ID dog
#> Levels: bar_x foo qux.6hr.ID dog
What I want to do is, for every row in the sample column, remove the substrings _x and .6hr if they exist. The final table should look like this:
sample colB colC
foo 1 2
bar 2 3
qux.ID 3 4
dog 1 1
How can I achieve that?
We can use gsub() inside mutate():
df %>%
mutate(sample = gsub("_x|\\.\\d+[A-Za-z]+", "", sample))
# A tibble: 4 x 3
# sample colB colC
# <chr> <dbl> <dbl>
#1 foo 1 2
#2 bar 2 3
#3 qux.ID 3 4
#4 dog 1 1
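Since the question asks about stringr, here is a minimal sketch of the same idea with str_remove_all(). It assumes only the two literal substrings from the question (_x and .6hr) need to be removed, so the pattern is narrower than the general one used with gsub() above. The name result is introduced here for illustration.

```r
library(dplyr)
library(stringr)

df <- tibble::tribble(
  ~sample,      ~colB, ~colC,
  "foo",        1,     2,
  "bar_x",      2,     3,
  "qux.6hr.ID", 3,     4,
  "dog",        1,     1
)

# Remove every occurrence of "_x" or ".6hr"; the dot is escaped so it
# matches a literal "." rather than any character
result <- df %>%
  mutate(sample = str_remove_all(sample, "_x|\\.6hr"))

result$sample
```

If the time suffix can vary (for example .12hr or .30min), the more general pattern "_x|\\.\\d+[A-Za-z]+" from the gsub() call can be passed to str_remove_all() unchanged.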
If the 'sample' column is of class factor, we can either wrap the output of gsub with factor, or apply the substitution directly to the levels of sample:
levels(df$sample) <- gsub("_x|\\.\\d+[A-Za-z]+", "", levels(df$sample))
df$sample
#[1] foo bar qux.ID dog
#Levels: bar foo qux.ID dog
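The other option mentioned, wrapping the output of gsub with factor, could look like the sketch below. One thing to watch for: calling factor() on the cleaned strings without a levels argument sorts the levels alphabetically, so this example passes the cleaned original levels explicitly to keep their order. The names pattern and result are assumptions made for illustration, and unique() is added as a precaution in case two levels become identical after the substitution.

```r
library(dplyr)

df <- tibble::tibble(
  sample = factor(c("foo", "bar_x", "qux.6hr.ID", "dog"),
                  levels = c("bar_x", "foo", "qux.6hr.ID", "dog")),
  colB = c(1, 2, 3, 1),
  colC = c(2, 3, 4, 1)
)

pattern <- "_x|\\.\\d+[A-Za-z]+"

# Clean the values, then rebuild the factor using the cleaned original
# levels so the custom level order is preserved
result <- df %>%
  mutate(sample = factor(gsub(pattern, "", sample),
                         levels = unique(gsub(pattern, "", levels(sample)))))

levels(result$sample)
```

Both approaches should give the same factor here; modifying levels() directly is shorter, while the mutate() version fits more naturally into a dplyr pipeline.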