How can I use a factor level with a known date format to inform the rest of my dataframe?
I have GPS data for multiple individuals, but the date formats are inconsistent. For instance, some are in "%d/%m/%Y %H:%M" format but others are in "%m/%d/%Y %H:%M" format. This is very confusing, but I know the correct order for one of the individuals. Can I use this to inform the date parsing done by parse_date_time from the lubridate package? Or what is the best way around this ambiguity?
date, id,
"10/01/2014 08:00", A # these are day/month/year format
"10/01/2014 06:00", A
"09/01/2014 18:00", A
"09/01/2014 15:00", A
"09/01/2014 12:00", A
"09/01/2014 10:00", A
"10/01/2014 10:00", B # these are month/day/year format
"10/01/2014 10:00", B
"10/01/2014 10:00", B
"10/01/2014 10:00", B
You can do this with the dplyr functions mutate and case_when, using the id column to apply the dmy_hm or mdy_hm function accordingly.
library(dplyr)
library(tibble) # for tribble
library(lubridate)
df <- tribble(~date, ~id,
"10/01/2014 08:00", "A", # these are day/month/year format
"10/01/2014 06:00", "A",
"09/01/2014 18:00", "A",
"09/01/2014 15:00", "A",
"09/01/2014 12:00", "A",
"09/01/2014 10:00", "A",
"10/01/2014 10:00", "B", # these are month/day/year format
"10/01/2014 10:00", "B",
"10/01/2014 10:00", "B",
"10/01/2014 10:00", "B")
mutate(df, date = case_when(id == "A" ~ dmy_hm(date),
id == "B" ~ mdy_hm(date)))
#> # A tibble: 10 x 2
#> date id
#> <dttm> <chr>
#> 1 2014-01-10 08:00:00 A
#> 2 2014-01-10 06:00:00 A
#> 3 2014-01-09 18:00:00 A
#> 4 2014-01-09 15:00:00 A
#> 5 2014-01-09 12:00:00 A
#> 6 2014-01-09 10:00:00 A
#> 7 2014-10-01 10:00:00 B
#> 8 2014-10-01 10:00:00 B
#> 9 2014-10-01 10:00:00 B
#> 10 2014-10-01 10:00:00 B
Created on 2019-01-18 by the reprex package (v0.2.1)
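Since the question asks about parse_date_time specifically, here is a rough sketch of the same idea using it. The lookup vector orders_by_id and the loop are my own illustration, not part of the answer above, and it assumes you already know (or have decided) which order each id uses. The sample data is a shortened copy of the question's.

```r
library(lubridate)

# Hypothetical lookup: the day/month order assumed for each individual.
orders_by_id <- c(A = "dmy HM", B = "mdy HM")

df <- data.frame(
  date = c("10/01/2014 08:00", "09/01/2014 18:00", "10/01/2014 10:00"),
  id   = c("A", "A", "B"),
  stringsAsFactors = FALSE
)

# Start with an all-NA date-time vector, then fill it one id at a time.
parsed <- as.POSIXct(rep(NA_real_, nrow(df)), origin = "1970-01-01", tz = "UTC")
for (key in names(orders_by_id)) {
  rows <- df$id == key
  parsed[rows] <- parse_date_time(df$date[rows],
                                  orders = orders_by_id[[key]],
                                  tz = "UTC")
}
df$date <- parsed
df
```

Any id missing from the lookup is left as NA rather than guessed, which makes unhandled individuals easy to spot.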
After a few trials, I learnt that ifelse from base coerces the result to double. if_else from dplyr, however, helps solve the problem:
library(tidyverse)
library(lubridate) # for dmy_hm() and mdy_hm(); not attached by tidyverse before 2.0.0
df %>%
mutate(id=as.factor(id),
date=if_else(id=="A",dmy_hm(date),mdy_hm(date)))
Result:
date id
<dttm> <fct>
1 2014-01-10 08:00:00 A
2 2014-01-10 06:00:00 A
3 2014-01-09 18:00:00 A
4 2014-01-09 15:00:00 A
5 2014-01-09 12:00:00 A
6 2014-01-09 10:00:00 A
7 2014-10-01 10:00:00 B
8 2014-10-01 10:00:00 B
9 2014-10-01 10:00:00 B
10 2014-10-01 10:00:00 B
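To make the ifelse versus if_else difference concrete, here is a small side-by-side check on two of the strings from the question. The variable names (x, id, base_result, tidy_result) are just for illustration.

```r
library(dplyr)
library(lubridate)

x  <- c("10/01/2014 08:00", "10/01/2014 10:00")
id <- c("A", "B")

# Base ifelse() drops the POSIXct class and returns seconds since the epoch.
base_result <- ifelse(id == "A", dmy_hm(x), mdy_hm(x))
class(base_result)

# dplyr's if_else() keeps the date-time class of its inputs.
tidy_result <- if_else(id == "A", dmy_hm(x), mdy_hm(x))
class(tidy_result)

# The numeric result can still be converted back afterwards if needed.
restored <- as_datetime(base_result)
```

So if you are stuck with base ifelse, wrapping the result in as_datetime should get the dates back, but if_else avoids the round trip.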