
Extracting specific word followed by another in R

I have descriptions of open positions. I want to extract the grade from each description and put it in an adjacent column. This can be done by fetching the word that follows "Grade:" in the description text.

Sample data

  structure(list(description = structure(2:1, .Label = c("Grade: L3 Position title bla bla bla", 
"Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019"
), class = "factor"), division = structure(2:1, .Label = c("ABC", 
"XYZ"), class = "factor")), class = "data.frame", row.names = c(NA, 
-2L))

Requested result

Description     Division     Grade
sdsdsdsd         XYZ          L5
asdasdsadas      ABC          L3

I found the solution below. It extracts the word, but does not put it in a column.

Extract text that follows a specific word/s in R

You can use sub to extract the word after "Grade", allowing zero or more whitespace characters before and after the colon:

sub(".*Grade\\s*:\\s*(\\w+).*", "\\1", df$description)
#[1] "L5" "L3"
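To get the requested result, the extracted values can be assigned straight to a new column. The following is a minimal sketch, assuming the data frame is called df and the new column is called Grade (the name used in the requested result); the sample data is rebuilt here only so the snippet runs on its own.

```r
# Rebuild the sample data with factor columns, as in the question.
df <- data.frame(
  description = c(
    "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019",
    "Grade: L3 Position title bla bla bla"
  ),
  division = c("XYZ", "ABC"),
  stringsAsFactors = TRUE
)

# Capture the word after "Grade:" and store it in a new column.
# sub() coerces the factor to character before matching.
df$Grade <- sub(".*Grade\\s*:\\s*(\\w+).*", "\\1", df$description)

print(df[, c("division", "Grade")])
```

Because sub returns one value per input row, the result always lines up with the rows of df.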

You can use the stringr package like this:

library(stringr)
df[,"Grade"] <- sub("Grade: ", "", str_extract(df$description, "Grade: [^ ]+"))

Data:

df <- structure(list(
  description = structure(2:1, .Label = c(
    "Grade: L3 Position title bla bla bla",
    "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019"
  ), class = "factor"),
  division = structure(2:1, .Label = c("ABC", "XYZ"), class = "factor")
), class = "data.frame", row.names = c(NA, -2L))

EDIT: I have just seen that there are far better answers in the comments. It is better to use one of them, since they do not rely on an extra package.
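One difference between the two approaches is worth knowing: when a description contains no "Grade:", sub returns the whole string unchanged, whereas str_extract returns NA. The following is a rough base R sketch that returns NA for such rows without an extra package. The helper name extract_grade is hypothetical and not from the original answers.

```r
# Hypothetical helper: return the word after "Grade:", or NA when absent.
extract_grade <- function(x) {
  x <- as.character(x)  # factors become plain strings
  # regexec() records the position of the full match and of each capture group.
  m <- regexec("Grade\\s*:\\s*(\\w+)", x)
  parts <- regmatches(x, m)
  # Element 1 is the full match, element 2 the captured word.
  vapply(
    parts,
    function(p) if (length(p) >= 2) p[2] else NA_character_,
    character(1)
  )
}

grades <- extract_grade(c(
  "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019",
  "Grade: L3 Position title bla bla bla",
  "Position without a grade"
))
print(grades)
```

With a data frame like the one in the question, this could be used as df$Grade <- extract_grade(df$description).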


 