Extracting specific word followed by another in R
I have descriptions of open positions. I want to extract the grade from each one and put it in an adjacent column.
This can be done by fetching the word that follows "Grade:" in the text description.
Simulation
structure(list(description = structure(2:1, .Label = c("Grade: L3 Position title bla bla bla",
"Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019"
), class = "factor"), division = structure(2:1, .Label = c("ABC",
"XYZ"), class = "factor")), class = "data.frame", row.names = c(NA,
-2L))
Requested Result
Description Division Grade
sdsdsdsd XYZ L5
asdasdsadas ABC L3
I found this solution; it can extract the word, but it does not put it in a column.
Extract text that follows a specific word/s in R
You can use sub() to extract the word after "Grade", allowing zero or more whitespace characters before and after the colon:
sub(".*Grade\\s*:\\s*(\\w+).*", "\\1", df$description)
#[1] "L5" "L3"
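To get the result into a column, as the question asks, the same call can be assigned to a new column. The following is a minimal sketch: the example data is rebuilt inline as plain strings, and the NA handling for rows without a grade is an assumption added here, not part of the original answer.

```r
# Example data mirroring the question (plain strings instead of factors)
df <- data.frame(
  description = c(
    "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019",
    "Grade: L3 Position title bla bla bla"
  ),
  division = c("XYZ", "ABC"),
  stringsAsFactors = FALSE
)

pattern <- ".*Grade\\s*:\\s*(\\w+).*"

# sub() returns its input unchanged when the pattern does not match,
# so rows without a grade are set to NA instead of the full description.
df$Grade <- ifelse(
  grepl(pattern, df$description),
  sub(pattern, "\\1", df$description),
  NA_character_
)

df$Grade
#[1] "L5" "L3"
```

If every description is known to contain a grade, the plain assignment df$Grade <- sub(pattern, "\\1", df$description) is enough.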
You can use the stringr package like this:
library(stringr)
df[,"Grade"] <- sub("Grade: ", "", str_extract(df$description, "Grade: [^ ]+"))
Data:
df <- structure(list(description = structure(2:1, .Label = c("Grade: L3 Position title bla bla bla",
"Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019"
), class = "factor"), division = structure(2:1, .Label = c("ABC",
"XYZ"), class = "factor")), class = "data.frame", row.names = c(NA,
-2L))
EDIT: I have just seen that there are far better answers in the comments. It is better to use one of them, since they do not rely on an extra package.
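For comparison, the same extract-then-strip idea can be sketched in base R alone with regexpr() and regmatches(). The vector x and the third, grade-less string are hypothetical example data added to show how non-matching rows behave.

```r
x <- c(
  "Head of xxxxxxxx Grade: L5 Last Date to Apply: 22nd July 2019",
  "Grade: L3 Position title bla bla bla",
  "No grade mentioned here"
)

# Locate "Grade: " followed by a run of non-space characters;
# regexpr() returns -1 for elements without a match
m <- regexpr("Grade: [^ ]+", x)

# regmatches() drops non-matching elements, so start from an all-NA vector
# and fill in only the matched positions to stay aligned with the input
grade <- rep(NA_character_, length(x))
grade[m != -1] <- sub("Grade: ", "", regmatches(x, m))

grade
#[1] "L5" "L3" NA
```

Filling a pre-allocated NA vector keeps the result the same length as the input, which matters when it is assigned to a data frame column.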