I have the following data frame column:
ColumnA=c("Kuala Lumpur Sector 2 new","old Jakarta Sector31", "Sector 9, 7 Hong Kong","Jakarta new Sector22")
and am extracting the Sector number to a separate column
gsub(".*Sector ?([0-9]+).*","\\1",ColumnA)
Is there a more elegant way than an if/else statement to handle lines where 'Sector' does not appear?
If the word 'Sector' does not appear on one line I simply want to set the value of that row to blank.
I thought of using str_detect first to see if 'Sector' was there TRUE/FALSE, but this is quite an ugly solution.
Thanks for any help.
If the word 'Sector' does not appear on one line I simply want to set the value of that row to blank.
To achieve that, use the alternation operator |:
ColumnA=c("Kuala Lumpur 2 new","old Jakarta Sector31", "Sector 9, 7 Hong Kong","Jakarta new Sector22")
gsub("^(?:.*Sector ?([0-9]+).*|.*)$","\\1",ColumnA)
Result: [1] "" "31" "9" "22"
(As "Kuala Lumpur 2 new" has no "Sector", the second alternative, which has no capturing group, matched the whole string, so \\1 is empty.)
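To see both branches of the alternation at work on a data frame column, here is a minimal sketch. The names `df`, `Sector` and `sector_pattern` are illustrative assumptions and do not come from the question.

```r
# Hypothetical data frame wrapping the vector used in this answer
df <- data.frame(
  ColumnA = c("Kuala Lumpur 2 new", "old Jakarta Sector31",
              "Sector 9, 7 Hong Kong", "Jakarta new Sector22"),
  stringsAsFactors = FALSE
)

# First branch: captures the digits that follow "Sector".
# Second branch (.*): matches any other line, leaving group 1 empty.
sector_pattern <- "^(?:.*Sector ?([0-9]+).*|.*)$"

# Replace each whole string with the contents of capture group 1
df$Sector <- gsub(sector_pattern, "\\1", df$ColumnA)
print(df)
```

Because the pattern is anchored with ^ and $, every line is consumed in full, so a row without "Sector" ends up as an empty string rather than being left unchanged.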
library(stringr)
sectors <- str_extract(ColumnA, "(?<=Sector\\s{0,10})[0-9]+")
replace(sectors, is.na(sectors), "")
I think this is what you need.
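If stringr is not available, roughly the same idea can be sketched in base R with `regexec` and `regmatches`. This is an alternative technique rather than the code above, and the helper name `extract_sector` is a hypothetical one chosen for illustration.

```r
ColumnA <- c("Kuala Lumpur 2 new", "old Jakarta Sector31",
             "Sector 9, 7 Hong Kong", "Jakarta new Sector22")

# Hypothetical helper: returns the digits after "Sector", or "" if absent
extract_sector <- function(x) {
  # regexec() records the full match plus each capture group;
  # regmatches() yields character(0) for strings that do not match
  hits <- regmatches(x, regexec("Sector ?([0-9]+)", x))
  vapply(hits,
         function(h) if (length(h) >= 2) h[2] else "",
         character(1))
}

sector <- extract_sector(ColumnA)
print(sector)
```

The `vapply` call keeps the result a plain character vector of the same length as the input, so it can be assigned directly to a new column.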