Date format conversion in R
I was using the syntax
df$new_column_name <- format(as.Date(df$original_column_name, format = "%d/%m/%Y"),"%m/%d/%Y")
to convert the date format of a column in a data frame called dailyactivity_Records. The original format of the column ActivityDate is mm/dd/YYYY, stored as chr, as can be seen in the console.
> dailyactivity_Records <- read.csv("dailyActivity_calories_intensities_steps.csv")
> str(dailyactivity_Records)
'data.frame': 940 obs. of 15 variables:
$ Id : num 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ ActivityDate : chr "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
$ TotalSteps : int 13162 10735 10460 9762 12669 9705 13019 15506 10544 9819 ...
$ TotalDistance : num 8.5 6.97 6.74 6.28 8.16 6.48 8.59 9.88 6.68 6.34 ...
$ TrackerDistance : num 8.5 6.97 6.74 6.28 8.16 6.48 8.59 9.88 6.68 6.34 ...
$ LoggedActivitiesDistance: num 0 0 0 0 0 0 0 0 0 0 ...
$ VeryActiveDistance : num 1.88 1.57 2.44 2.14 2.71 3.19 3.25 3.53 1.96 1.34 ...
$ ModeratelyActiveDistance: num 0.55 0.69 0.4 1.26 0.41 0.78 0.64 1.32 0.48 0.35 ...
$ LightActiveDistance : num 6.06 4.71 3.91 2.83 5.04 2.51 4.71 5.03 4.24 4.65 ...
$ SedentaryActiveDistance : num 0 0 0 0 0 0 0 0 0 0 ...
$ VeryActiveMinutes : int 25 21 30 29 36 38 42 50 28 19 ...
$ FairlyActiveMinutes : int 13 19 11 34 10 20 16 31 12 8 ...
$ LightlyActiveMinutes : int 328 217 181 209 221 164 233 264 205 211 ...
$ SedentaryMinutes : int 728 776 1218 726 773 539 1149 775 818 838 ...
$ Calories : int 1985 1797 1776 1745 1863 1728 1921 2035 1786 1775 ...
The format I requested for the column ActivityDate in the as.Date() call below is "%d/%m", whereas the converted column is displayed as %Y-%m-%d (which I believe is the default format for dates in R). Can someone clarify why?
Please see the console below:
> ## converting column ID to character and ACtivityDate to date format
> dailyactivity_Records$Id <- as.character(dailyactivity_Records$Id)
> dailyactivity_Records$Date_ddmm = as.Date(dailyactivity_Records$ActivityDate, format = "%m/%d/%Y", "%d/%m")
> str(dailyactivity_Records)
'data.frame': 940 obs. of 16 variables:
$ Id : chr "1503960366" "1503960366" "1503960366" "1503960366" ...
$ ActivityDate : chr "04/12/2016" "4/13/2016" "4/14/2016" "4/15/2016" ...
$ TotalSteps : int 13162 10735 10460 9762 12669 9705 13019 15506 10544 9819 ...
$ TotalDistance : num 8.5 6.97 6.74 6.28 8.16 6.48 8.59 9.88 6.68 6.34 ...
$ TrackerDistance : num 8.5 6.97 6.74 6.28 8.16 6.48 8.59 9.88 6.68 6.34 ...
$ LoggedActivitiesDistance: num 0 0 0 0 0 0 0 0 0 0 ...
$ VeryActiveDistance : num 1.88 1.57 2.44 2.14 2.71 3.19 3.25 3.53 1.96 1.34 ...
$ ModeratelyActiveDistance: num 0.55 0.69 0.4 1.26 0.41 0.78 0.64 1.32 0.48 0.35 ...
$ LightActiveDistance : num 6.06 4.71 3.91 2.83 5.04 2.51 4.71 5.03 4.24 4.65 ...
$ SedentaryActiveDistance : num 0 0 0 0 0 0 0 0 0 0 ...
$ VeryActiveMinutes : int 25 21 30 29 36 38 42 50 28 19 ...
$ FairlyActiveMinutes : int 13 19 11 34 10 20 16 31 12 8 ...
$ LightlyActiveMinutes : int 328 217 181 209 221 164 233 264 205 211 ...
$ SedentaryMinutes : int 728 776 1218 726 773 539 1149 775 818 838 ...
$ Calories : int 1985 1797 1776 1745 1863 1728 1921 2035 1786 1775 ...
$ Date_ddmm : Date, format: "2016-04-12" "2016-04-13" "2016-04-14" "2016-04-15" ...
First, if you read ?as.Date (strongly encouraged), you'll see that your third (unnamed) argument is being interpreted as tryFormats = "%d/%m". However, since the documentation says
tryFormats: 'character' vector of 'format' strings to try if 'format'
is not specified.
and you do include format=, tryFormats is ignored and does nothing.
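A quick way to check this is to compare the call with and without the extra argument. This is a minimal sketch on a two-element vector taken from the question's data; the names with_extra and without_extra are made up for illustration.

```r
# Two sample values in the same mm/dd/YYYY layout as the question's column.
vec <- c("04/12/2016", "4/13/2016")

# Call as written in the question: the unnamed "%d/%m" is not used for parsing
# because format= is supplied.
with_extra <- as.Date(vec, format = "%m/%d/%Y", "%d/%m")

# Same call without the extra argument.
without_extra <- as.Date(vec, format = "%m/%d/%Y")

# Both results are the same Date vector.
identical(with_extra, without_extra)
# [1] TRUE
class(with_extra)
# [1] "Date"
```

Either way the result is a Date, and a Date always prints as YYYY-MM-DD; the display format is not stored in the object.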
Second, what you are trying to do should be done in two steps: first convert to a Date, then convert that from a number-like object to a string in the format you want. Note that once you do this, your dates are no longer dates: you can no longer do date-math on them (e.g., add/subtract/difference).
vec <- c("04/12/2016", "4/13/2016", "4/14/2016", "4/15/2016")
as.Date(vec, format = "%m/%d/%Y")
# [1] "2016-04-12" "2016-04-13" "2016-04-14" "2016-04-15"
as.Date(vec, format = "%m/%d/%Y") + 5
# [1] "2016-04-17" "2016-04-18" "2016-04-19" "2016-04-20"
format(as.Date(vec, format = "%m/%d/%Y"), "%d/%m")
# [1] "12/04" "13/04" "14/04" "15/04"
format(as.Date(vec, format = "%m/%d/%Y"), "%d/%m") + 5
# Error in format(as.Date(vec, format = "%m/%d/%Y"), "%d/%m") + 5 :
# non-numeric argument to binary operator
If you need number-like operations on it (including ranges), you must keep it as a Date. If that's the case, I suggest keeping it as a Date for all of your processing; only when you render your data for visualization (plots, tables, etc.) do you need the string representation in %d/%m. You can add that as another column in addition to the "real" Date object, perhaps
dailyactivity_Records$ActivityDate <- as.Date(dailyactivity_Records$ActivityDate, format = "%m/%d/%Y")
dailyactivity_Records$Date_ddmm <- format(dailyactivity_Records$ActivityDate , "%d/%m")
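To illustrate why keeping both columns is useful, here is a small sketch on a hypothetical four-row data frame (df, recent, and the cutoff date are assumptions for illustration; the values mimic the first rows shown in the question). Filtering and date-math use the real Date column, while the string column is only for display.

```r
# Hypothetical miniature of the question's data frame.
df <- data.frame(
  ActivityDate = c("04/12/2016", "4/13/2016", "4/14/2016", "4/15/2016"),
  TotalSteps   = c(13162, 10735, 10460, 9762)
)

# Step 1: parse the strings into a real Date column.
df$ActivityDate <- as.Date(df$ActivityDate, format = "%m/%d/%Y")

# Step 2: add a display-only string column in dd/mm form.
df$Date_ddmm <- format(df$ActivityDate, "%d/%m")

# Comparisons and ranges work on the Date column...
recent <- df[df$ActivityDate >= as.Date("2016-04-14"), ]

# ...and the string column is there when a label is needed.
recent$Date_ddmm
# [1] "14/04" "15/04"

# Date-math still works on the Date column.
diff(range(df$ActivityDate))
# Time difference of 3 days
```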