How to replace strings in every row of a specific column using dplyr and stringr

Question

I have the following tibble:

library(tidyverse)

df <- tibble::tribble(
  ~sample, ~colB, ~colC,
  "foo",   1,  2,
  "bar_x",   2,  3,
  "qux.6hr.ID",   3,  4,
  "dog",   1,  1
)


df
#> # A tibble: 4 x 3
#>       sample  colB  colC
#>        <chr> <dbl> <dbl>
#> 1        foo     1     2
#> 2      bar_x     2     3
#> 3 qux.6hr.ID     3     4
#> 4        dog     1     1

df$sample <- factor(df$sample, levels=c("bar_x","foo","qux.6hr.ID","dog"))

df$sample
#> [1] foo        bar_x      qux.6hr.ID dog       
#> Levels: bar_x foo qux.6hr.ID dog

What I want to do is, for every row in the sample column, remove the substrings _x and .6hr if they exist. The final table should look like this:

     sample  colB  colC
        foo     1     2
        bar     2     3
     qux.ID     3     4
        dog     1     1

How can I achieve that?

Answer 1

We can use gsub inside mutate:

df %>% 
     mutate(sample = gsub("_x|\\.\\d+[A-Za-z]+", "", sample))
# A tibble: 4 x 3 
#   sample  colB  colC
#    <chr> <dbl> <dbl>
#1    foo     1     2
#2    bar     2     3
#3 qux.ID     3     4
#4    dog     1     1
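
Since the question mentions stringr, the same replacement can also be sketched with stringr::str_remove_all, which behaves like str_replace_all with an empty replacement. The literal pattern "_x|\\.6hr" used here is an assumption that only those two exact substrings need removing; the more general regex above could be passed in the same way.

```r
library(dplyr)
library(stringr)

df <- tibble::tribble(
  ~sample,      ~colB, ~colC,
  "foo",        1,     2,
  "bar_x",      2,     3,
  "qux.6hr.ID", 3,     4,
  "dog",        1,     1
)

# Remove every occurrence of "_x" or ".6hr";
# the dot is escaped so it matches a literal "." rather than any character.
cleaned <- df %>%
  mutate(sample = str_remove_all(sample, "_x|\\.6hr"))

cleaned$sample
```

Rows that contain neither substring ("foo", "dog") are returned unchanged, so no separate existence check is needed.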

If the 'sample' column is of class factor, we can either wrap the output of gsub in factor() or apply the substitution to the levels of sample:

levels(df$sample) <- gsub("_x|\\.\\d+[A-Za-z]+", "", levels(df$sample))
df$sample
#[1] foo    bar    qux.ID dog   
#Levels: bar foo qux.ID dog
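
The other option mentioned above, wrapping the gsub output in factor(), might look like the following sketch. It assumes the factor column is built as in the question; the names pattern and cleaned are made up for illustration. Passing the cleaned levels explicitly keeps the original level order instead of falling back to alphabetical order.

```r
library(dplyr)

# The question's data, with sample stored as a factor in a custom level order
df <- tibble::tibble(
  sample = factor(c("foo", "bar_x", "qux.6hr.ID", "dog"),
                  levels = c("bar_x", "foo", "qux.6hr.ID", "dog")),
  colB = c(1, 2, 3, 1),
  colC = c(2, 3, 4, 1)
)

pattern <- "_x|\\.\\d+[A-Za-z]+"

# Clean the values as character, then rebuild the factor with cleaned levels.
# Inside mutate(), levels(sample) still refers to the original column.
cleaned <- df %>%
  mutate(sample = factor(gsub(pattern, "", as.character(sample)),
                         levels = gsub(pattern, "", levels(sample))))

levels(cleaned$sample)
```

Both routes give the same factor here; editing the levels directly is shorter, while the mutate() version fits into a longer dplyr pipeline.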

Answer 1
2 accepted 2017-06-03 05:12:16