Calculating the difference between two two-digit years
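Is there a simple way in R to calculate the difference between two columns of two-digit years (whole years only, no months or days, since they are not needed here) in order to produce a column of ages?
I'm quite new to this and have been experimenting with 'if' statements and arithmetic without success.
The data look like this, only larger: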
dat <- data.frame(year1=c("98","99","00","01","02"),
year2=c("03","04","05","06","07"))
You can use strptime() with the format %y:
dat <- data.frame(year1=c("98","99","00","01","02"),
year2=c("03","04","05","06","07"),
stringsAsFactors = FALSE) # You might want to use this as a default!
dat$year1 <- strptime(dat$year1, format = "%y")
dat$year2 <- strptime(dat$year2, format = "%y")
as.vector(difftime(dat$year2,
dat$year1,
units = "days"))/365.242
4.999311 5.002163 4.999425 4.999425 4.999425
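The result above is a fractional number of years, and it can shift slightly with time zone and leap days. If whole years are wanted, one possible sketch is to read the year field of the POSIXlt object that strptime() returns. The helper name to_year and the choice to pin each value to 1 January in UTC are assumptions made for this illustration, not part of the answer above.

```r
dat <- data.frame(year1 = c("98", "99", "00", "01", "02"),
                  year2 = c("03", "04", "05", "06", "07"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: parse a two-digit year and return the four-digit year.
# Month and day are pinned to 1 January so the result does not depend on
# the date on which the code is run.
to_year <- function(yy) {
  parsed <- strptime(paste0(yy, "-01-01"), format = "%y-%m-%d", tz = "UTC")
  parsed$year + 1900  # POSIXlt stores years as an offset from 1900
}

dat$age <- to_year(dat$year2) - to_year(dat$year1)
dat$age
```

This still relies on the %y century rule, so 00 to 68 are read as 2000 to 2068 and 69 to 99 as 1969 to 1999.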
Format as a date, format as a number, take the difference:
# dat[2:1] puts year2 first, so the result is year2 - year1
do.call(`-`, lapply(dat[2:1], function(x)
as.numeric(format(as.Date(x, format="%y"), "%Y"))))
#[1] 5 5 5 5 5
This can give wrong results if you have old dates from the early 1900s. As per ?strptime:
‘%y’ Year without century (00-99). On input, values 00 to 68 are
prefixed by 20 and 69 to 99 by 19 - that is the behaviour
specified by the 2004 and 2008 POSIX standards, but they do
also say ‘it is expected that in a future version the default
century inferred from a 2-digit year will change’.
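If the fixed 68/69 cutoff does not suit the data, the century can be assigned by hand instead of through %y. The following is only a sketch: the function name expand_year and its pivot argument are hypothetical, and the default of 68 simply mirrors the rule quoted above.

```r
# Hypothetical helper: expand two-digit years to four digits using a
# caller-chosen pivot. Values <= pivot become 20xx, the rest become 19xx.
expand_year <- function(yy, pivot = 68L) {
  yy <- as.integer(as.character(yy))  # also handles factor columns
  ifelse(yy <= pivot, 2000L + yy, 1900L + yy)
}

year1 <- c("98", "99", "00", "01", "02")
year2 <- c("03", "04", "05", "06", "07")

# Same cutoff as %y
age <- expand_year(year2) - expand_year(year1)
age

# A lower pivot treats, for example, "30" as 1930 rather than 2030
expand_year(c("30", "10"), pivot = 25L)
```

Choosing the pivot is a judgment about the data: for birth years, a pivot near the current two-digit year is a common choice.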
df$age <- ifelse(df$year2 < df$year1, df$year2 - df$year1 + 100, df$year2 -df$year1)
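This should work under the assumption that year2 is some kind of current year, year1 is the year of birth, and nobody was born before 1918 (that is, every age is below 100).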
Example:
# No seed is set, so the sampled values differ between runs
df <- data.frame(year1 = sample(18:99, 1000, replace = TRUE),
year2 = sample(1:99, 1000, replace = TRUE))
> head(df)
year1 year2
1 27 88
2 41 55
3 90 36
4 81 93
5 56 60
6 27 61
df$age <- ifelse(df$year2 < df$year1, df$year2 - df$year1 + 100, df$year2 -df$year1)
> head(df)
year1 year2 age
1 27 88 61
2 41 55 14
3 90 36 46
4 81 93 12
5 56 60 4
6 27 61 34
Using your example data:
dat <- data.frame(year1=c("98","99","00","01","02"),
year2=c("03","04","05","06","07"))
dat$age <- ifelse(as.numeric(as.character(dat$year2)) < as.numeric(as.character(dat$year1)),
as.numeric(as.character(dat$year2)) - as.numeric(as.character(dat$year1)) + 100,
as.numeric(as.character(dat$year2)) - as.numeric(as.character(dat$year1)))
> dat
year1 year2 age
1 98 03 5
2 99 04 5
3 00 05 5
4 01 06 5
5 02 07 5
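Under the same assumption (every age is below 100), the ifelse() branch that adds 100 can also be written with the modulo operator, because %% in R returns a non-negative result for a positive divisor. This is a sketch of an equivalent form, not the answerer's own code.

```r
dat <- data.frame(year1 = c("98", "99", "00", "01", "02"),
                  year2 = c("03", "04", "05", "06", "07"))

# as.character() first, so that factor columns convert to the printed digits
y1 <- as.integer(as.character(dat$year1))
y2 <- as.integer(as.character(dat$year2))

# A negative difference wraps around the century: (3 - 98) %% 100 is 5
dat$age <- (y2 - y1) %% 100
dat
```

As with the ifelse() version, equal years give an age of 0, never 100.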
One approach is to use as.Date in a dplyr chain:
library(dplyr) # provides mutate() and %>%
dat %>%
mutate(year1 = as.Date(year1, format = "%y"),
year2 = as.Date(year2, format = "%y")) %>%
mutate(age = year2 - year1)
Returns:
year1 year2 age
1 1998-10-26 2003-10-26 1826 days
2 1999-10-26 2004-10-26 1827 days
3 2000-10-26 2005-10-26 1826 days
4 2001-10-26 2006-10-26 1826 days
5 2002-10-26 2007-10-26 1826 days
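P.S. as.Date fills in a default day and month for both columns, but because it uses the same values for both, this does not affect the difference. Note that the resulting age is in days, not years.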
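Since the difference above comes back in days, a further step is needed to get ages in years. A minimal base R sketch, assuming that rounding the day count divided by 365.25 is acceptable for whole-year inputs, and pinning each date to 1 January (an assumption made here so the output does not depend on the current date):

```r
year1 <- c("98", "99", "00", "01", "02")
year2 <- c("03", "04", "05", "06", "07")

# Parse with a fixed month and day rather than today's
d1 <- as.Date(paste0(year1, "-01-01"), format = "%y-%m-%d")
d2 <- as.Date(paste0(year2, "-01-01"), format = "%y-%m-%d")

age_days <- d2 - d1  # a difftime in days

# Convert to a plain number of days, then to rounded years
age_years <- round(as.numeric(age_days, units = "days") / 365.25)
age_years
```

The rounding absorbs the one-day variation caused by leap years. For dates with real months and days, a proper calendar calculation would be safer than dividing by 365.25.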