
R ifelse condition for a calculation on multiple data frames

I have 3 data frames: df1 = a set of time intervals, df2 = a list of IDs, df3 = a list of IDs with an associated date.

df1 <- structure(list(season = structure(c(2L, 1L), .Label = c("summer", 
    "winter"), class = "factor"), mindate = structure(c(1420088400, 
    1433131200), class = c("POSIXct", "POSIXt")), maxdate = structure(c(1433131140, 
    1448945940), class = c("POSIXct", "POSIXt")), diff = structure(c(150.957638888889, 
    183.040972222222), units = "days", class = "difftime")), .Names = c("season", 
    "mindate", "maxdate", "diff"), row.names = c(NA, -2L), class = "data.frame")

df2 <- structure(list(ID = c(23796, 23796, 23796)), .Names = "ID", row.names = c(NA, 
    -3L), class = "data.frame")

df3 <- structure(list(ID = c("23796", "123456", "12134"), time = structure(c(1420909920, 
1444504500, 1444504500), class = c("POSIXct", "POSIXt"), tzone = "US/Eastern")), .Names = c("ID", 
"time"), row.names = c(NA, -3L), class = "data.frame")

The code should check whether df2$ID == df3$ID. If true, and if df3$time >= df1$mindate and df3$time <= df1$maxdate, the result should be df1$maxdate - df3$time; otherwise it should be df1$maxdate - df1$mindate. I tried using the ifelse function. This works when I manually specify individual cells, but that is not what I want, because each of the data frames has many more rows, and the row counts differ.

df1$result <- ifelse(df2[1,1] == df3[1,1] & df3[1,2] >= df1$mindate & df3[1,2] <= df1$maxdate, 
                     difftime(df1$maxdate,df3[1,2],units="days"),
                     difftime(df1$maxdate,df1$mindate,units="days"))

EDIT: The desired output is (when removing the last row of df2):

 season    mindate             maxdate          diff   result
1 winter 2015-01-01 2015-05-31 23:59:00 150.9576 days 141.9576
2 summer 2015-06-01 2015-11-30 23:59:00 183.0410 days 183.0410

Any ideas? I don't see how I could merge the data frames to make them the same length. Note that df2 can have any number of rows without affecting the code. Issues arise when df1 and df3 differ in number of rows.

The > and < operators are vectorized, so whole columns can be compared at once:

# inside transform(), mindate and maxdate refer to the columns of df1
transform(df1, result = ifelse(df3$ID %in% df2$ID & df3$time > mindate & df3$time < maxdate,
                               difftime(maxdate, df3$time), difftime(maxdate, mindate)))
  season             mindate             maxdate          diff   result
1 winter 2014-12-31 21:00:00 2015-05-31 20:59:00 150.9576 days 141.9576
2 summer 2015-05-31 21:00:00 2015-11-30 20:59:00 183.0410 days 183.0410
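
To make the vectorization point concrete, here is a minimal sketch on toy values (the vectors lo, hi and tm are made up for illustration and are not the question's data). It shows the comparison returning one logical per row, and ifelse() returning a plain numeric vector, which is why the result column above prints without the "days" label.

```r
# Toy interval bounds and event times, all in UTC (hypothetical values)
lo <- as.POSIXct(c("2015-01-01", "2015-06-01"), tz = "UTC")
hi <- as.POSIXct(c("2015-05-31", "2015-11-30"), tz = "UTC")
tm <- as.POSIXct(c("2015-01-10", "2015-01-10"), tz = "UTC")

# Element-wise comparison: one TRUE/FALSE per row
inside <- tm > lo & tm < hi

# ifelse() picks per element; the difftime class is dropped in the result
res <- ifelse(inside,
              difftime(hi, tm, units = "days"),
              difftime(hi, lo, units = "days"))
```

Note that this only lines up row by row when the vectors have the same length (or one of them has length 1); otherwise R recycles the shorter one.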

You can also use the %between% operator from the data.table package:

library(data.table)
transform(df1,result=ifelse(df3$ID%in%df2$ID&df3$time%between%df1[2:3],
               difftime(maxdate,df3$time),difftime(maxdate,mindate)))

  season             mindate             maxdate          diff   result
1 winter 2014-12-31 21:00:00 2015-05-31 20:59:00 150.9576 days 141.9576
2 summer 2015-05-31 21:00:00 2015-11-30 20:59:00 183.0410 days 183.0410
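
Both versions above compare df3$time against df1 row by row, so they rely on the two data frames lining up. When df1 and df3 have different numbers of rows, one possible alternative is to loop over the intervals and test every matching time against each one. The following is only a sketch on toy data in UTC (not the question's exact values), and it assumes that when several matching times fall inside one interval, the earliest one should be used; the question does not say which to pick.

```r
# Toy data (hypothetical values): 2 intervals, 3 dated IDs
df1 <- data.frame(
  season  = c("winter", "summer"),
  mindate = as.POSIXct(c("2015-01-01 00:00:00", "2015-06-01 00:00:00"), tz = "UTC"),
  maxdate = as.POSIXct(c("2015-05-31 23:59:00", "2015-11-30 23:59:00"), tz = "UTC")
)
df2 <- data.frame(ID = c(23796, 23796))
df3 <- data.frame(
  ID   = c("23796", "123456", "12134"),
  time = as.POSIXct(c("2015-01-10 00:00:00", "2015-10-10 12:00:00",
                      "2015-10-10 12:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Times of the df3 rows whose ID also appears in df2
hits <- df3$time[df3$ID %in% df2$ID]

# One result per interval, regardless of how many rows df3 has
df1$result <- vapply(seq_len(nrow(df1)), function(i) {
  lo <- df1$mindate[i]
  hi <- df1$maxdate[i]
  inside <- hits[hits >= lo & hits <= hi]
  # Assumption: use the earliest matching time, else fall back to mindate
  start <- if (length(inside) > 0) min(inside) else lo
  as.numeric(difftime(hi, start, units = "days"))
}, numeric(1))
```

Here only ID 23796 is in df2, and its time falls in the winter interval, so the winter result is measured from that time while the summer result covers the full interval.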
