Extracting century and year from a string
I have a large column containing strings such as:
20-1843PA-HY-4563-214DF
The "20" is the century while the "18" is the year. What is the simplest way to extract these two with a function and get an output of 2018 in R?
We can use sub with a regular expression. Starting at the beginning of the string ( ^ ), capture the leading digits as a group ( (\\d+) ), match the - that follows, then capture the next two digits ( (\\d{2}) ). The rest of the string ( .* ) is matched and dropped, and the whole match is replaced with the backreferences ( \\1\\2 ) to the two captured groups.
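# Define the sample data first so the call below can run
str1 <- "20-1843PA-HY-4563-214DF"
# Keep only the two captured groups and convert the result to a number
f1 <- function(nm) as.numeric(sub("^(\\d+)-(\\d{2}).*", "\\1\\2", nm))
f1(str1)
#[1] 2018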
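One caveat with the sub approach: when a string does not match the pattern, sub returns it unchanged, and as.numeric then yields NA with a warning. Below is a minimal sketch of a vectorised variant that returns NA for non-matching entries without the warning. The helper name extract_year and the sample vector codes are hypothetical, and the sketch assumes the century is always exactly two digits (hence \\d{2} rather than \\d+).

```r
# Hypothetical helper: returns NA for strings that do not fit the
# assumed "CC-YY..." layout instead of relying on a coercion warning.
extract_year <- function(x) {
  pattern <- "^(\\d{2})-(\\d{2}).*"
  matched <- grepl(pattern, x)           # which entries fit the layout
  out <- rep(NA_real_, length(x))        # default to NA
  out[matched] <- as.numeric(sub(pattern, "\\1\\2", x[matched]))
  out
}

# Illustrative data, not from the original question
codes <- c("20-1843PA-HY-4563-214DF", "19-9912AB-XY-0001-000ZZ", "bad-value")
extract_year(codes)
#[1] 2018 1999   NA
```

Because sub and grepl are already vectorised, the same function can be applied directly to a whole data frame column.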
I would do something like this:
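chr_collumn<-"20-1843PA-HY-4563-214DF"
# Split on "-" and keep the first two pieces: "20" and "1843PA"
chr_collumn<-strsplit(chr_collumn,"-")
chr_collumn<-unlist(chr_collumn)[1:2]
# Join the century with the first two characters of the second piece
chr_year<-paste0(chr_collumn[1],strtrim(chr_collumn[2],width=2))
chr_year<-as.numeric(chr_year)
chr_year
#[1] 2018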
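The split-and-trim steps above handle a single string. For a whole column, and if the layout really is fixed-width (two century digits, a hyphen, then two year digits, which is an assumption based on the one example in the question), the same result can be sketched with substr, which is vectorised. The vector codes and the result name years are illustrative only.

```r
# Illustrative data; assumes every entry starts with "CC-YY"
codes <- c("20-1843PA-HY-4563-214DF", "19-9912AB-XY-0001-000ZZ")

# Characters 1-2 are the century, characters 4-5 are the year
years <- as.numeric(paste0(substr(codes, 1, 2), substr(codes, 4, 5)))
years
#[1] 2018 1999
```

This avoids both regular expressions and list handling, but it silently gives wrong answers if any entry deviates from the fixed layout, so the regex approach is safer for messy data.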