
In R, how do I calculate difference between dates when condition is met?

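I have two data frames (df1 and df2) that contain the start and end dates of certain events. I have already identified which dates have overlapping events, defined here as a start date in df1 falling between the start and end dates of an event in df2. Rows with an overlap are marked TRUE, and rows without one are marked FALSE. What I would like to know is: when Overlap is TRUE, how do I calculate the difference between the start dates in df2 and df1?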

> df1$aa
    date_start  date_end    Site
1   2002-04-14  2002-04-21  aa
2   2002-06-26  2002-07-05  aa
3   2002-08-15  2002-08-20  aa
4   2004-05-12  2004-05-19  aa
> df2$bb
    date_start  date_end    Site
1   2002-04-13  2002-04-19  bb
2   2002-08-11  2002-08-19  bb
3   2005-06-09  2005-06-14  bb
4   2005-08-10  2005-08-14  bb

This code determines whether there is an overlap:

df1$aa$Overlap <- df1$aa$date_start %in% unlist(Map(':', df2$bb$date_start, df2$bb$date_end))
> df1$aa
    date_start  date_end    Site    Overlap
1   2002-04-14  2002-04-21  aa      TRUE
2   2002-06-26  2002-07-05  aa      FALSE
3   2002-08-15  2002-08-20  aa      TRUE
4   2004-05-12  2004-05-19  aa      FALSE
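
The same overlap test can also be written with explicit date comparisons instead of expanding every interval into a sequence of days. The following is only a minimal sketch: it assumes flat data frames named df1 and df2 (rather than the list elements df1$aa and df2$bb shown above) whose date columns are already of class Date, with the values copied from the question.

```r
# Example data copied from the question; the flat data-frame layout is an assumption
df1 <- data.frame(
  date_start = as.Date(c("2002-04-14", "2002-06-26", "2002-08-15", "2004-05-12")),
  date_end   = as.Date(c("2002-04-21", "2002-07-05", "2002-08-20", "2004-05-19")),
  Site       = "aa"
)
df2 <- data.frame(
  date_start = as.Date(c("2002-04-13", "2002-08-11", "2005-06-09", "2005-08-10")),
  date_end   = as.Date(c("2002-04-19", "2002-08-19", "2005-06-14", "2005-08-14")),
  Site       = "bb"
)

# For each start date in df1, check whether it lies inside any df2 interval
df1$Overlap <- vapply(
  seq_len(nrow(df1)),
  function(i) {
    any(df1$date_start[i] >= df2$date_start &
        df1$date_start[i] <= df2$date_end)
  },
  logical(1)
)

df1$Overlap
```

Comparing Date values directly keeps the test independent of how dates are represented internally, and it avoids building one vector element per day of every interval.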

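You can see two events where Overlap is TRUE (rows 1 and 3). When Overlap equals TRUE, what I would like to do is determine the time difference (Diff) between the date_start values in df1 and df2.

The result I am looking for should look like this.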

    date_start  date_end    Site    Overlap   Diff
1   2002-04-14  2002-04-21  aa      TRUE      1
2   2002-08-15  2002-08-20  aa      TRUE      4

This should do the job with a pair of nested for loops.

# Setup df1
df1 <- read.table(textConnection(
  '    date_start  date_end    Site
1   2002-04-14  2002-04-21  aa
2   2002-06-26  2002-07-05  aa
3   2002-08-15  2002-08-20  aa
4   2004-05-12  2004-05-19  aa'
))
df1$date_start <- as.Date(df1$date_start)
df1$date_end <- as.Date(df1$date_end)

# Setup df2
df2 <- read.table(textConnection(
  '    date_start  date_end    Site
1   2002-04-13  2002-04-19  bb
2   2002-08-11  2002-08-19  bb
3   2005-06-09  2005-06-14  bb
4   2005-08-10  2005-08-14  bb'
))
df2$date_start <- as.Date(df2$date_start)
df2$date_end <- as.Date(df2$date_end)


# Find overlap of dates
df1$Overlap <- df1$date_start %in% unlist(Map(':', df2$date_start, df2$date_end))


# Loop through rows
for (i in 1:nrow(df1)) {

  # Go through only those that overlap
  if (df1[i, "Overlap"]) {

    # Loop through all rows in other data frame
    for (j in 1:nrow(df2)) {

      # Check if within range of df1
      sec_date_range <- df2[j, "date_start"]:df2[j, "date_end"]
      if (df1[i, "date_start"] %in% sec_date_range) {

        # Find absolute difference in start dates
        df1[i, "diff"] <- df1[i, "date_start"] - df2[j, "date_start"]
        df1[i, "diff"] <- abs(df1[i, "diff"])
      }
    }
  }
}

# Filter and print result
df1[df1$Overlap, ]
#>   date_start   date_end Site Overlap    diff
#> 1 2002-04-14 2002-04-21   aa    TRUE  1 days
#> 3 2002-08-15 2002-08-20   aa    TRUE  4 days

Created on 2020-06-15 by the reprex package (v0.3.0)
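
For comparison, the nested loops above could probably be replaced by a single lookup of the matching df2 row for each df1 start date. This is a sketch under stated assumptions, not the approach from the answer: the helper vector match_idx and the column name Diff are illustrative, and when a start date falls inside several df2 intervals this version keeps the first match, whereas the loop above keeps the last one.

```r
# Example data copied from the question
df1 <- data.frame(
  date_start = as.Date(c("2002-04-14", "2002-06-26", "2002-08-15", "2004-05-12")),
  date_end   = as.Date(c("2002-04-21", "2002-07-05", "2002-08-20", "2004-05-19")),
  Site       = "aa"
)
df2 <- data.frame(
  date_start = as.Date(c("2002-04-13", "2002-08-11", "2005-06-09", "2005-08-10")),
  date_end   = as.Date(c("2002-04-19", "2002-08-19", "2005-06-14", "2005-08-14")),
  Site       = "bb"
)

# match_idx (hypothetical helper): row of the first df2 interval that
# contains each df1 start date, or NA when no interval contains it
match_idx <- vapply(
  seq_len(nrow(df1)),
  function(i) {
    hit <- which(df1$date_start[i] >= df2$date_start &
                 df1$date_start[i] <= df2$date_end)
    if (length(hit) > 0) hit[1] else NA_integer_
  },
  integer(1)
)

# A row overlaps exactly when a matching interval was found
df1$Overlap <- !is.na(match_idx)

# Absolute difference in days between the two start dates (NA without a match)
df1$Diff <- as.integer(abs(as.numeric(df1$date_start - df2$date_start[match_idx])))

# Keep only the overlapping rows
result <- df1[df1$Overlap, ]
result
```

Storing Diff as a plain integer number of days is a design choice made here for easy filtering and arithmetic; keeping the difftime value, as the loop version does, would preserve the unit label instead.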
