Extracting the century and year from a string

Question

I have a large column containing strings such as:

20-1843PA-HY-4563-214DF

The "20" is the century and the "18" is the year. What is the simplest way to extract these two with a function in R and get 2018 as the output?

Answer 1

We can use sub with two capture groups. The first group, (\\d+), captures the digits at the start (^) of the string, up to the -. The second group, (\\d{2}), captures the two digits right after the -. The .* matches the rest of the string, so the whole string is replaced with the backreferences to the two captured groups (\\1\\2).

f1 <- function(nm) as.numeric(sub("^(\\d+)-(\\d{2}).*", "\\1\\2", nm))
f1(str1)
#[1] 2018

Data

str1 <- "20-1843PA-HY-4563-214DF"
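
Since the question is about a whole column, note that sub is vectorized, so the same pattern can be applied to a character vector without a loop. The following is a minimal sketch under some assumptions: the data frame df, its column name code, and the helper extract_year are hypothetical, and the grepl guard is an addition that is not part of the original answer. Without the guard, sub returns a non-matching string unchanged, and as.numeric then produces NA with a warning.

```r
# Hypothetical data frame; the column name `code` is an assumption.
df <- data.frame(
  code = c("20-1843PA-HY-4563-214DF", "19-9912XX-AB-0001-000ZZ", "no-match"),
  stringsAsFactors = FALSE
)

# Same pattern as in f1: leading digits, a hyphen, then two digits.
pattern <- "^(\\d+)-(\\d{2}).*"

# Hypothetical helper: returns NA for strings that do not match the
# pattern, instead of letting sub() pass them through unchanged.
extract_year <- function(x) {
  matched <- grepl(pattern, x)
  out <- rep(NA_real_, length(x))
  out[matched] <- as.numeric(sub(pattern, "\\1\\2", x[matched]))
  out
}

df$year <- extract_year(df$code)
df$year
# expected values: 2018, 1999, NA
```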

Answer 2

I would do something like this:

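chr_column <- "20-1843PA-HY-4563-214DF"
# Split on "-" and keep the first two pieces: "20" and "1843PA"
chr_column <- strsplit(chr_column, "-")
chr_column <- unlist(chr_column)[1:2]
# Century plus the first two characters of the second piece
chr_year <- paste0(chr_column[1], strtrim(chr_column[2], width = 2))
chr_year <- as.numeric(chr_year)
chr_year
#[1] 2018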
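
The steps above work on a single string. To apply the same idea to a column, one option is to wrap them in a function and loop over the result of strsplit with vapply. This is only a sketch: the function name year_from_code is hypothetical, and it assumes every string contains at least one "-" with at least two characters after it.

```r
# Hypothetical helper name; not from the original answer.
year_from_code <- function(x) {
  # strsplit() returns one character vector of pieces per input string.
  parts <- strsplit(x, "-", fixed = TRUE)
  vapply(parts, function(p) {
    # Century (first piece) plus the first two characters of the second piece.
    as.numeric(paste0(p[1], strtrim(p[2], width = 2)))
  }, numeric(1))
}

codes <- c("20-1843PA-HY-4563-214DF", "19-9912XX-AB-0001-000ZZ")
years <- year_from_code(codes)
years
# expected values: 2018, 1999
```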
