R: extract numeric values from text
Using R version 3.0.3 (2014-03-06):
When UIM (unique identifying marker) is present, what is the recommended approach for extracting the numeric portion of the text string? Also, I don't want to extract the numeric portion when BAD is present.
text_entry<-rbind("Text words UIM 1234, 5678, 9012. More text and words.", #1
"Text words UIM 2143 6589 1032. More text and words.", #2
"Text words UIMs 4123 and 8657 and more text and words.", #3
"Text words BAD 4123 and 8657 and more text and words.", #4
"Text words UIM-2143, UIM-6589, and UIM-1032. More text and words.", #5
"Text words BAD-2143, BAD-6589, and BAD-1032. More text and words.", #6
"Text words UIM that follow: 2143, 6589, and UIM-1032. More text and words.", #7
"Text words UIM that follow: 2143 1032. More text and words.") #8
Extract_desired:
1234, 5678, 9012 #1
2143, 6589, 1032 #2
4123, 8657 #3
NA (due to BAD) #4
2143, 6589, 1032 #5
NA (due to BAD) #6
2143, 6589, 1032 #7
2143, 1032 #8
Unfortunately, I need to maintain traceability back to the row number, i.e. know which rows provided UIM info and which ones didn't.
Thanks a ton for any feedback and info.
You can use the stringr package. The result here is a list, but it gives what you want:
library(stringr)
extract_desired <- lapply(text_entry, function(x) {if (!grepl("BAD", x)) unlist(str_extract_all(x, "\\d+"))})
extract_desired[c(1,4)]
## [[1]]
## [1] "1234" "5678" "9012"
## [[2]]
## NULL
If you want to know which rows don't provide info, you can use:
which(sapply(extract_desired, is.null))
## [1] 4 6
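To keep each result tied to its row number, the list can also be flattened into a table with one line per input row. The following is only a sketch: the three-row `text_entry` is a shortened, hypothetical stand-in for the data in the question, the names `result`, `row` and `values` are made up for illustration, and base R's `regmatches()` is used so the snippet runs without extra packages.

```r
# Hypothetical sample input: three rows adapted from the question
text_entry <- c("Text words UIM 1234, 5678, 9012. More text and words.",
                "Text words BAD 4123 and 8657 and more text and words.",
                "Text words UIM-2143, UIM-6589, and UIM-1032. More text and words.")

# Extract digit runs per row; rows containing BAD yield NULL
extract_desired <- lapply(text_entry, function(x) {
  if (!grepl("BAD", x)) unlist(regmatches(x, gregexpr("[[:digit:]]+", x)))
})

# One line per input row: NULL entries become NA, the rest are comma-joined
result <- data.frame(
  row = seq_along(extract_desired),
  values = vapply(extract_desired,
                  function(v) if (is.null(v)) NA_character_ else paste(v, collapse = ", "),
                  character(1)),
  stringsAsFactors = FALSE
)
print(result)
```

Because `row` is just the position in the original input, rows that were skipped due to BAD stay visible as NA instead of disappearing.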
EDIT:
If you don't want to use stringr, you can do as sebastian-c says here:
extract_desired <- lapply(text_entry, function(x) {if (!grepl("BAD", x)) unlist(regmatches(x, gregexpr("[[:digit:]]+", x)))})
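Note that the code above only skips rows containing BAD; it does not check that UIM is actually present. If that matters, a stricter variant could look like the sketch below. The helper name `extract_uim` and the sample rows are hypothetical, and returning NA (rather than NULL) for skipped rows is an assumption made so the output matches the NA entries in the desired result.

```r
# Hypothetical helper: return the digit runs only when the row mentions UIM
# and does not mention BAD; otherwise return NA
extract_uim <- function(x) {
  if (grepl("UIM", x, fixed = TRUE) && !grepl("BAD", x, fixed = TRUE)) {
    unlist(regmatches(x, gregexpr("[[:digit:]]+", x)))
  } else {
    NA_character_
  }
}

# Hypothetical sample rows: UIM present, BAD present, neither marker present
rows <- c("Text words UIMs 4123 and 8657 and more text and words.",
          "Text words BAD-2143, BAD-6589, and BAD-1032. More text and words.",
          "Text words 1111 with no marker at all.")

# The list position of each result is the row number of the input
extracted <- lapply(rows, extract_uim)
print(extracted)

# Rows that contributed no UIM numbers
skipped <- which(vapply(extracted, function(v) all(is.na(v)), logical(1)))
print(skipped)
```

With NA as the placeholder, `which(sapply(..., is.null))` no longer finds the skipped rows, which is why the sketch tests with `is.na()` instead.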