How can I use a factor level with a known date format to inform the rest of my dataframe?

I have GPS data for multiple individuals, but the date formats are inconsistent. For instance, some are in "%d/%m/%Y %H:%M" format while others are in "%m/%d/%Y %H:%M" format. This is very confusing, but I do know the correct order for one of the individuals.

Can I use this to inform the date parsing done by parse_date_time from the lubridate package? Or what is the best way around this ambiguity?

date, id,
"10/01/2014 08:00", A # these are day/month/year format
"10/01/2014 06:00", A
"09/01/2014 18:00", A
"09/01/2014 15:00", A
"09/01/2014 12:00", A
"09/01/2014 10:00", A
"10/01/2014 10:00", B # these are month/day/year format
"10/01/2014 10:00", B
"10/01/2014 10:00", B
"10/01/2014 10:00", B

You can do this with the dplyr functions mutate and case_when, using the id column to decide whether dmy_hm or mdy_hm from lubridate is applied to each row.

library(dplyr)
library(tibble) # for tribble
library(lubridate)

df <- tribble(~date, ~id,
"10/01/2014 08:00", "A", # these are day/month/year format
"10/01/2014 06:00", "A",
"09/01/2014 18:00", "A",
"09/01/2014 15:00", "A",
"09/01/2014 12:00", "A",
"09/01/2014 10:00", "A",
"10/01/2014 10:00", "B", # these are month/day/year format
"10/01/2014 10:00", "B",
"10/01/2014 10:00", "B",
"10/01/2014 10:00", "B")

mutate(df, date = case_when(id == "A" ~ dmy_hm(date),
                        id == "B" ~ mdy_hm(date)))
#> # A tibble: 10 x 2
#>    date                id   
#>    <dttm>              <chr>
#>  1 2014-01-10 08:00:00 A    
#>  2 2014-01-10 06:00:00 A    
#>  3 2014-01-09 18:00:00 A    
#>  4 2014-01-09 15:00:00 A    
#>  5 2014-01-09 12:00:00 A    
#>  6 2014-01-09 10:00:00 A    
#>  7 2014-10-01 10:00:00 B    
#>  8 2014-10-01 10:00:00 B    
#>  9 2014-10-01 10:00:00 B    
#> 10 2014-10-01 10:00:00 B

Created on 2019-01-18 by the reprex package (v0.2.1)
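Since the question asks about parse_date_time specifically, here is a minimal sketch of the same idea using that function. It assumes the format of every individual is known (or has been worked out) and stored in a lookup vector; the name orders_by_id and the shortened sample data are illustrative and not part of the original post. Parsing group by group also means each format is only applied to the rows it belongs to, whereas case_when evaluates both dmy_hm and mdy_hm on every row, which can emit "failed to parse" warnings when a day value is greater than 12.

```r
library(dplyr)
library(lubridate)

# Shortened sample in the same shape as the question's data
df <- tibble(
  date = c("10/01/2014 08:00", "09/01/2014 18:00", "10/01/2014 10:00"),
  id   = c("A", "A", "B")
)

# Hypothetical lookup: the lubridate parsing order for each individual
orders_by_id <- c(A = "dmy HM", B = "mdy HM")

parsed <- df %>%
  group_by(id) %>%
  # id[1] is the current group's id; use it to pick that group's order
  mutate(date = parse_date_time(date, orders = orders_by_id[[id[1]]])) %>%
  ungroup()

parsed
```

With more individuals, only the lookup vector needs to grow; the pipeline itself stays the same. Note that id must be a character column here, because indexing a named vector with a factor would use the integer codes rather than the names.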

After a few trials, I learnt that base R's ifelse drops the date-time class and returns a plain double. dplyr's if_else, however, preserves the type and solves the problem:

library(tidyverse)
library(lubridate) # attached explicitly; tidyverse versions before 2.0.0 do not load it

df %>%
  mutate(id = as.factor(id),
         date = if_else(id == "A", dmy_hm(date), mdy_hm(date)))

Result:

   date                id   
   <dttm>              <fct>
 1 2014-01-10 08:00:00 A    
 2 2014-01-10 06:00:00 A    
 3 2014-01-09 18:00:00 A    
 4 2014-01-09 15:00:00 A    
 5 2014-01-09 12:00:00 A    
 6 2014-01-09 10:00:00 A    
 7 2014-10-01 10:00:00 B    
 8 2014-10-01 10:00:00 B    
 9 2014-10-01 10:00:00 B    
10 2014-10-01 10:00:00 B    
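The difference between the two functions can be checked directly. The following is a small sketch on two made-up timestamps (the vectors x and id are illustrative, not from the original post): ifelse builds its result from the logical test vector, so the POSIXct attributes are lost and only the underlying number of seconds remains, while if_else keeps the date-time class.

```r
library(dplyr)
library(lubridate)

x  <- c("10/01/2014 08:00", "10/01/2014 10:00")
id <- c("A", "B")

# Base R: the result is a bare numeric vector (seconds since 1970-01-01)
base_result <- ifelse(id == "A", dmy_hm(x), mdy_hm(x))
class(base_result)

# dplyr: the result keeps the POSIXct class
dplyr_result <- if_else(id == "A", dmy_hm(x), mdy_hm(x))
class(dplyr_result)

# If ifelse has already been used, the class can be restored afterwards
restored <- as_datetime(base_result)
```

Restoring with as_datetime works here because dmy_hm and mdy_hm return UTC times by default; with another time zone, the tz argument would need to be supplied as well.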
