
New variable with as.date in R

This is head(both$stterm):

 stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01

As I said, this is just part of a dataset with 4021 observations. I want to create a new column in which each date is represented by a value, as listed below.

The variable should be continuous.

I have tried as.date, but that only gave me a column full of NULL.

It is important that 2008-09-01 = 8 and not 08.

"2007-09-01"=7,
"2008-09-01"=8,
"2009-01-19"=9,
"2009-09-01"=9,
"2010-01-19"=10,
"2010-09-01"=10,
"2011-01-19"=11,
"2011-09-01"=11,
"2012-01-19"=12,
"2012-09-01"=12,
"2013-01-19"=13,
"2013-09-01"=13,
"2014-01-19"=14)

So what I want to do is simply create a column with these digits instead of the actual dates. The new variable will be called calenderyear.

I need tips on how to write this in R.

You can do this as follows (note that the function is as.Date, with a capital D; R is case-sensitive):

require(lubridate)
dat$year <- year(as.Date(dat$stterm))-2000

Result:

> dat
      stterm year
1 2011-01-19   11
2 2012-01-19   12
3 2007-09-01    7
4 2011-09-01   11
5 2008-09-01    8
6 2013-09-01   13

Data:

dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = " stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01")

Try the lubridate package:

install.packages("lubridate")  # the package name must be a quoted string
library(lubridate)
both$calenderyear <- year(ymd(both$stterm)) - 2000

You can try this:

d <- as.Date(c("2007-09-01", "2008-09-01", "2009-01-19", "2009-09-01", "2010-01-19", "2010-09-01", "2011-01-19", "2011-09-01", "2012-01-19", "2012-09-01", "2013-01-19", "2013-09-01", "2014-01-19"), format="%Y-%m-%d")
sub("^0", "", sub("[[:digit:]]{2}([[:digit:]]{2}).*", "\\1", d))
 [1] "7"  "8"  "9"  "9"  "10" "10" "11" "11" "12" "12" "13" "13" "14"
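
Note that the result above is a character vector, while the question asks for a continuous variable. A minimal sketch of the extra conversion step follows; the names yy_chr and yy_num are made up for illustration, and only a few of the dates are used:

```r
d <- as.Date(c("2007-09-01", "2008-09-01", "2010-01-19", "2014-01-19"))

# Same regular expressions as above; the result is still character.
yy_chr <- sub("^0", "", sub("[[:digit:]]{2}([[:digit:]]{2}).*", "\\1", d))

# as.numeric() converts the strings to numbers.
yy_num <- as.numeric(yy_chr)
```

Since as.numeric("07") is already 7, the outer sub("^0", ...) is probably not needed once the values are converted to numbers.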

You can do this using base R. First, reproduce a subset of your dataset:

both <- data.frame( stterm=as.Date(c('2011-01-19','2012-01-19', '2007-09-01','2011-09-01','2008-09-01','2013-09-01')))

both
      stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01

both$calenderyear <- as.numeric(format(both$stterm,"%y"))
both
      stterm calenderyear
1 2011-01-19           11
2 2012-01-19           12
3 2007-09-01            7
4 2011-09-01           11
5 2008-09-01            8
6 2013-09-01           13
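
As a quick sanity check, the same format() call can be compared against the full list of values given in the question. This is only a sketch; the names expected and computed are made up for illustration:

```r
# Expected values copied from the list in the question.
expected <- c(
  "2007-09-01" = 7,  "2008-09-01" = 8,  "2009-01-19" = 9,
  "2009-09-01" = 9,  "2010-01-19" = 10, "2010-09-01" = 10,
  "2011-01-19" = 11, "2011-09-01" = 11, "2012-01-19" = 12,
  "2012-09-01" = 12, "2013-01-19" = 13, "2013-09-01" = 13,
  "2014-01-19" = 14
)

# "%y" gives the two-digit year as text; as.numeric() drops the leading zero.
computed <- as.numeric(format(as.Date(names(expected)), "%y"))

# TRUE if every computed value matches the requested one.
all(computed == unname(expected))
```

One caveat: "%y" only keeps the last two digits, so a date in 1999 would give 99 rather than -1. If the data could contain years outside 2000 to 2099, as.numeric(format(x, "%Y")) - 2000 may be the safer choice.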
