How to extract the part of a string after a specific word

Question

I have this string:

string <- "DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"

I need to extract only SOM_MT_ECT_CVE from it.

So for me the keyword is SOM (I need to identify its position).

I tried using this:

d <-substr(gregexpr(pattern ='SOM',"DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"),
           nchar("DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"),"DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE")

But it returns NA.

Answer 1

One option is sub: match any characters ( .* ) up to 'SOM', capture 'SOM' together with the rest of the string in a group ( (...) ), and use the backreference ( \\1 ) to that captured group as the replacement

sub(".*(SOM_.*)", "\\1", string)
#[1] "SOM_MT_ECT_CVE"
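
Two properties of this pattern are worth knowing, shown here as a small sketch: sub returns its input unchanged when the pattern does not match, and because the leading .* is greedy, the capture starts at the last 'SOM_' in the string. The vector x and its values are made up for illustration.

```r
# Hypothetical inputs: one match, no match, and two occurrences of 'SOM_'
x <- c("A-SOM_MT_ECT_CVE", "A-B_C", "SOM_A-SOM_B")

res <- sub(".*(SOM_.*)", "\\1", x)
# res[1] is "SOM_MT_ECT_CVE"
# res[2] is "A-B_C"  (no match, so the input comes back unchanged)
# res[3] is "SOM_B"  (greedy .* skips to the last 'SOM_')

# Mark non-matching inputs as NA instead of returning them unchanged
res[!grepl("SOM_", x, fixed = TRUE)] <- NA_character_
```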

Or using stringr

library(stringr)
str_extract(string, "SOM.*")
#[1] "SOM_MT_ECT_CVE"
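
The position-based attempt from the question can also be made to work. The following is a sketch in base R, not part of the original answer: substr expects the string first, then the start and stop positions, and regexpr (rather than gregexpr, which returns a list) gives the start of the first match as a single integer, or -1 when there is no match. The variable name pos is only for illustration.

```r
string <- "DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"

# Start position of the first 'SOM' (-1 if it does not occur)
pos <- regexpr("SOM", string, fixed = TRUE)

# Keep everything from that position to the end of the string
d <- if (pos > 0) substr(string, pos, nchar(string)) else NA_character_
d
#[1] "SOM_MT_ECT_CVE"
```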

Answer 2

You can split on the hyphen and take the last piece, i.e.

tail(strsplit(string, '-', fixed = TRUE)[[1]], 1)
#[1] "SOM_MT_ECT_CVE"

Or with word from stringr,

stringr::word(string, -1, sep = '-')
#[1] "SOM_MT_ECT_CVE"
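
Both variants assume that the part to extract is the last hyphen-separated field. If that is not guaranteed, one possible variation (a sketch, not part of the original answer) is to split the same way and keep the field that starts with the keyword. startsWith is available in base R from version 3.3.0.

```r
string <- "DIS_S_CD_EFS-NO_PCI-CD_ACT_CG-SOM_MT_ECT_CVE"

# Split into hyphen-separated fields
parts <- strsplit(string, "-", fixed = TRUE)[[1]]

# Keep only the field(s) beginning with the keyword
parts[startsWith(parts, "SOM")]
#[1] "SOM_MT_ECT_CVE"
```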

Answer 1
Score 1, accepted, 2019-02-06 12:18:38

Answer 2
Score 0, 2019-02-06 12:20:42
