New variable with as.date in R
This is head(both$stterm):
stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01
As I said, this is just part of the dataset; I have 4021 observations. I want to create a new column where each date is instead represented by a value, as shown below.
The variable should be continuous.
I tried as.date, but I just got a column full of NULL.
It is important that 2008-09-01 = 8 and not 08:
"2007-09-01"=7,
"2008-09-01"=8,
"2009-01-19"=9,
"2009-09-01"=9,
"2010-01-19"=10,
"2010-09-01"=10,
"2011-01-19"=11,
"2011-09-01"=11,
"2012-01-19"=12,
"2012-09-01"=12,
"2013-01-19"=13,
"2013-09-01"=13,
"2014-01-19"=14)
So what I want to do is simply create a column with these numbers instead of the actual dates. The new variable will be called calenderyear.
I need tips on how to write this in R.
You can do this as follows:
require(lubridate)
dat$year <- year(as.Date(dat$stterm))-2000
Result:
> dat
stterm year
1 2011-01-19 11
2 2012-01-19 12
3 2007-09-01 7
4 2011-09-01 11
5 2008-09-01 8
6 2013-09-01 13
Data:
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = " stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01")
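A side note on the original attempt: R is case-sensitive, and the base R function is as.Date (capital D). The sketch below is a minimal base R version of the same idea. It assumes the column is stored as a factor or character vector, which the question does not confirm; dat_chk is a hypothetical name used only for illustration.

```r
# Hypothetical example data; the column is deliberately stored as a factor
dat_chk <- data.frame(stterm = factor(c("2011-01-19", "2008-09-01", "2007-09-01")))

# Convert via character first, then parse with as.Date (capital D)
dat_chk$stterm <- as.Date(as.character(dat_chk$stterm), format = "%Y-%m-%d")

# Four-digit year minus 2000 gives a plain number (8, not "08")
dat_chk$calenderyear <- as.numeric(format(dat_chk$stterm, "%Y")) - 2000

class(dat_chk$stterm)   # "Date"
dat_chk$calenderyear    # 11 8 7
```

Running str() on the real column first would show whether the conversion step is needed at all.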
Try the lubridate library:
install.packages("lubridate")
library(lubridate)
year(ymd(both$stterm))-2000
You can try this:
d <- as.Date(c("2007-09-01", "2008-09-01", "2009-01-19", "2009-09-01", "2010-01-19", "2010-09-01", "2011-01-19", "2011-09-01", "2012-01-19", "2012-09-01", "2013-01-19", "2013-09-01", "2014-01-19"), format="%Y-%m-%d")
sub("^0", "", sub("[[:digit:]]{2}([[:digit:]]{2}).*", "\\1", d))
[1] "7" "8" "9" "9" "10" "10" "11" "11" "12" "12" "13" "13" "14"
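Note that sub() returns character strings, while the question asks for a continuous variable. A minimal sketch of the extra conversion step follows; the names dates_chk, yy_chr and yy_num are hypothetical and not part of the answer above.

```r
# A few of the dates from the question
dates_chk <- as.Date(c("2007-09-01", "2008-09-01", "2010-01-19", "2014-01-19"))

# Same regular expressions as above: keep the last two year digits,
# then drop a leading zero; the result is a character vector
yy_chr <- sub("^0", "", sub("[[:digit:]]{2}([[:digit:]]{2}).*", "\\1", dates_chk))

# Convert the strings to numbers so the variable can be treated as continuous
yy_num <- as.numeric(yy_chr)

yy_chr   # "7" "8" "10" "14"
yy_num   # 7 8 10 14
```

Once as.numeric() is applied, the inner sub("^0", ...) is no longer strictly needed, since as.numeric("07") is already 7.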
You can do this using base R. First, reproduce a subset of your dataset:
both <- data.frame( stterm=as.Date(c('2011-01-19','2012-01-19', '2007-09-01','2011-09-01','2008-09-01','2013-09-01')))
both
stterm
1 2011-01-19
2 2012-01-19
3 2007-09-01
4 2011-09-01
5 2008-09-01
6 2013-09-01
both$calenderyear <- as.numeric(format(both$stterm,"%y"))
both
stterm calenderyear
1 2011-01-19 11
2 2012-01-19 12
3 2007-09-01 7
4 2011-09-01 11
5 2008-09-01 8
6 2013-09-01 13
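As a quick check, the "%y" approach can be compared against the full mapping listed in the question. This is only a sketch; expected and computed are hypothetical names introduced for the comparison.

```r
# Date-to-value mapping exactly as listed in the question
expected <- c("2007-09-01" = 7,  "2008-09-01" = 8,  "2009-01-19" = 9,
              "2009-09-01" = 9,  "2010-01-19" = 10, "2010-09-01" = 10,
              "2011-01-19" = 11, "2011-09-01" = 11, "2012-01-19" = 12,
              "2012-09-01" = 12, "2013-01-19" = 13, "2013-09-01" = 13,
              "2014-01-19" = 14)

# "%y" gives the two-digit year as text; as.numeric() drops the leading zero
computed <- as.numeric(format(as.Date(names(expected)), "%y"))

all(computed == expected)   # TRUE
```

One caveat: "%y" keeps only the last two digits of the year, so a date in 1999 would give 99. That is not a problem for the 2007 to 2014 range shown here, but format(x, "%Y") minus 2000 may be safer if earlier dates can occur.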