How can I use a factor level with a known date format to inform the rest of the data frame?

Question

I have GPS data for multiple individuals, but the date formats are inconsistent. For instance, some are in "%d/%m/%Y %H:%M" format while others are in "%m/%d/%Y %H:%M" format. This is very confusing, but I know the correct order for one of the individuals.

Can I use this to inform the date parsing done by parse_date_time from the lubridate package? Or what is the best way around this ambiguity?

date, id,
"10/01/2014 08:00", A # these are day/month/year format
"10/01/2014 06:00", A
"09/01/2014 18:00", A
"09/01/2014 15:00", A
"09/01/2014 12:00", A
"09/01/2014 10:00", A
"10/01/2014 10:00", B # these are month/day/year format
"10/01/2014 10:00", B
"10/01/2014 10:00", B
"10/01/2014 10:00", B

Answer 1

You can do this with the dplyr functions mutate and case_when, using the id column to decide whether dmy_hm or mdy_hm is applied to each row.

library(dplyr)
library(tibble) # for tribble
library(lubridate)

df <- tribble(~date, ~id,
"10/01/2014 08:00", "A", # these are day/month/year format
"10/01/2014 06:00", "A",
"09/01/2014 18:00", "A",
"09/01/2014 15:00", "A",
"09/01/2014 12:00", "A",
"09/01/2014 10:00", "A",
"10/01/2014 10:00", "B", # these are month/day/year format
"10/01/2014 10:00", "B",
"10/01/2014 10:00", "B",
"10/01/2014 10:00", "B")

mutate(df, date = case_when(id == "A" ~ dmy_hm(date),
                        id == "B" ~ mdy_hm(date)))
#> # A tibble: 10 x 2
#>    date                id   
#>    <dttm>              <chr>
#>  1 2014-01-10 08:00:00 A    
#>  2 2014-01-10 06:00:00 A    
#>  3 2014-01-09 18:00:00 A    
#>  4 2014-01-09 15:00:00 A    
#>  5 2014-01-09 12:00:00 A    
#>  6 2014-01-09 10:00:00 A    
#>  7 2014-10-01 10:00:00 B    
#>  8 2014-10-01 10:00:00 B    
#>  9 2014-10-01 10:00:00 B    
#> 10 2014-10-01 10:00:00 B

Created on 2019-01-18 by the reprex package (v0.2.1)
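Since the question asks about parse_date_time specifically, here is a minimal sketch of the same idea using that function. It assumes you keep a small lookup of the known format per individual; the names orders_by_id and parsed are made up for this illustration, and the data is a shortened copy of the example above.

```r
library(dplyr)
library(lubridate)

# Hypothetical lookup: one lubridate `orders` string per individual.
orders_by_id <- c(A = "dmy HM", B = "mdy HM")

# Shortened copy of the example data.
df <- data.frame(
  date = c("10/01/2014 08:00", "09/01/2014 18:00", "10/01/2014 10:00"),
  id   = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Parse each individual's dates with the order registered for that id.
parsed <- df %>%
  group_by(id) %>%
  mutate(date = parse_date_time(
    date,
    orders = orders_by_id[[as.character(first(id))]]
  )) %>%
  ungroup()

parsed
```

Each group is parsed only with its own format, so adding another individual should just mean adding one entry to the lookup rather than another case_when branch.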

Answer 2

After a few trials, I learnt that base R's ifelse drops the date-time class and returns plain doubles. dplyr's if_else keeps the class, which solves the problem:

library(tidyverse)
library(lubridate) # not attached by tidyverse before 2.0.0

df %>%
  mutate(id = as.factor(id),
         date = if_else(id == "A", dmy_hm(date), mdy_hm(date)))

Result:

   date                id   
   <dttm>              <fct>
 1 2014-01-10 08:00:00 A    
 2 2014-01-10 06:00:00 A    
 3 2014-01-09 18:00:00 A    
 4 2014-01-09 15:00:00 A    
 5 2014-01-09 12:00:00 A    
 6 2014-01-09 10:00:00 A    
 7 2014-10-01 10:00:00 B    
 8 2014-10-01 10:00:00 B    
 9 2014-10-01 10:00:00 B    
10 2014-10-01 10:00:00 B
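To make the difference between the two functions concrete, here is a small sketch on two of the example strings. The variable names (x, is_dmy, base_result, dplyr_result, restored) are only for illustration.

```r
library(dplyr)
library(lubridate)

x      <- c("10/01/2014 08:00", "10/01/2014 10:00")
is_dmy <- c(TRUE, FALSE) # first string is day/month, second is month/day

base_result  <- ifelse(is_dmy, dmy_hm(x), mdy_hm(x))  # base R
dplyr_result <- if_else(is_dmy, dmy_hm(x), mdy_hm(x)) # dplyr

class(base_result)  # "numeric": seconds since 1970-01-01 UTC
class(dplyr_result) # "POSIXct" "POSIXt"

# If base ifelse() has to be used, the class can be restored afterwards.
restored <- as_datetime(base_result)
```

Both functions evaluate dmy_hm and mdy_hm on every element before choosing, so on real data one of the parsers may warn about strings it cannot read (for example a day greater than 12 passed to mdy_hm); the unselected values are discarded.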

2 answers

Answer 1
3 (accepted), 2019-01-18 17:01:23

Answer 2
0, 2019-01-18 17:30:19
