Nested if-else loop in R with multiple conditions

I need to write a nested loop that goes through IDs year by year and compares multiple variables from dataframes D1 and D2 under an if-else condition.

D1:

ID    year         X1      
 1    2000      34563     
 1    2001      34563     
 1    2002      12367     
 2    2010      14363     
 2    2011      14363     
 2    2012      13312     
 2    2013      13312     
 2    2014      13312     

D2:

year       X1      X2      
2001    34563   12367  
2011    14363   13312  
 

I created X2 in D1 (X2 is the following year's X1 in D1) by duplicating column X1 and shifting it up by 1 row. This is a rough approach: if an ID has no data for the following year, X2 should be NA rather than the X1 of the next ID in the dataframe.

For each ID in D1, I need to loop through each year for that ID, and for a year N, if

  1. D1$X1 == D2$X1
  2. D1$X2 == D2$X2

then D1$G = 1, else D1$G = 0.

If there is no data for year N+1, condition 2 is ignored.

Now I want to compare each row in D1 directly with D2. I tried an if-else statement as follows:

D1$G <- ifelse(D1$X1 == D2$X1 & D1$X2 == D2$X2 & D1$year == D2$year, "1", "0")

However, this is what I end up with:

  ID   year      X1      X2    G
1  1   2000   34563   34563    0
2  1   2001   34563   12367    0
3  1   2002   12367   14363    0
4  2   2010   14363   14363    0
5  2   2011   14363   13312    0
6  2   2012   13312   13312    0
7  2   2013   13312   13312    0
8  2   2014   13312      NA    0

Instead of:

  ID   year      X1      X2    G
1  1   2000   34563   34563    0
2  1   2001   34563   12367    1
3  1   2002   12367   14363    0
4  2   2010   14363   14363    0
5  2   2011   14363   13312    1
6  2   2012   13312   13312    0
7  2   2013   13312   13312    0
8  2   2014   13312      NA    0

I want to understand where I'm going wrong (or whether there are simpler methods). Any help is appreciated.

Reproducible code:

D1 <- data.frame(ID = c(1, 1, 1, 2, 2, 2, 2, 2),
                 year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
                 X1 = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312)
)
D2 <- data.frame(year = c(2001, 2011),
                 X1 = c(34563, 14363),
                 X2 = c(12367, 13312)
)

# creating X2 in D1: shift X1 up by one row (shift() comes from data.table)
library(data.table)
D1$X2 <- shift(D1$X1, 1, type = "lead")

This might be helpful. Add a column G to D2 with the value 1. Then merge the two data.frames and replace the NA left where there was no match with 0.

library(tidyverse)

D2$G <- 1

D1 %>%
  group_by(ID) %>%
  mutate(X2 = lead(X1, 1)) %>%
  left_join(D2, by = c("year", "X1", "X2")) %>%
  replace_na(list(G = 0))

Output

     ID  year    X1    X2     G
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     1  2000 34563 34563     0
2     1  2001 34563 12367     1
3     1  2002 12367    NA     0
4     2  2010 14363 14363     0
5     2  2011 14363 13312     1
6     2  2012 13312 13312     0
7     2  2013 13312 13312     0
8     2  2014 13312    NA     0
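The join above matches on X2 as well, so a row whose X2 is NA can never get G = 1. The question says condition 2 should be ignored when there is no data for year N+1. A minimal sketch of one way to allow for that, assuming dplyr is available: join on year and X1 only, then compare X2 explicitly. The suffix `_D2` and the name `res` are illustrative choices, not part of the original code.

```r
library(dplyr)

D1 <- data.frame(ID = c(1, 1, 1, 2, 2, 2, 2, 2),
                 year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
                 X1 = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312))
D2 <- data.frame(year = c(2001, 2011),
                 X1 = c(34563, 14363),
                 X2 = c(12367, 13312))

res <- D1 %>%
  group_by(ID) %>%
  mutate(X2 = lead(X1, 1)) %>%   # next year's X1 within the same ID, NA at the end
  ungroup() %>%
  # match on year and X1 only; D2's X2 arrives as X2_D2
  left_join(D2, by = c("year", "X1"), suffix = c("", "_D2")) %>%
  # G = 1 when a D2 row matched and X2 is either missing or equal to D2's X2
  mutate(G = as.integer(!is.na(X2_D2) & (is.na(X2) | X2 == X2_D2))) %>%
  select(-X2_D2)

print(res)
```

On the sample data no row with a missing X2 has a matching year in D2, so G comes out the same as in the output above; the two approaches differ only when such a row exists.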

Edit: To explain the problem with the ifelse statement: you are comparing two vectors of different lengths, in a way that is likely not intended.

Consider two vectors from your data.frames:

year1 = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014)
year2 = c(2001, 2011)

If you compare them using the == operator:

year1 == year2

You will get all FALSE:

[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

This is essentially comparing, in order, 2000 with 2001, 2001 with 2011, 2002 with 2001 (year2 is recycled because it is the shorter vector), 2010 with 2011, 2011 with 2001 (again), and so on.
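To make the recycling visible, the shorter vector can be expanded by hand and placed next to the longer one. This is only an illustration of what == does internally; the names `recycled` and `pairs` are made up for the example.

```r
year1 <- c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014)
year2 <- c(2001, 2011)

# Repeat year2 until it is as long as year1, as == does implicitly
recycled <- rep_len(year2, length(year1))

# Each row shows one element-wise comparison
pairs <- data.frame(year1 = year1,
                    year2_recycled = recycled,
                    equal = year1 == recycled)
print(pairs)
```

Every row pairs a year from year1 with whichever element of year2 happens to line up by position, which is why 2001 and 2011 are never compared with themselves.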

Another way to compare the two vectors is using %in%:

year1 %in% year2

[1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE

This returns TRUE for each value in year1 that is contained in the vector year2.
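%in% only says whether a year exists in D2, not which row of D2 it corresponds to. A base R sketch of the full row-wise comparison, assuming each year appears at most once in D2: match() looks up the D2 row for every year in D1, and the X1 and X2 values are then compared against that row. The helper names `idx` and `cond` are illustrative, and ave() is used here as one way to take the next value within each ID.

```r
D1 <- data.frame(ID = c(1, 1, 1, 2, 2, 2, 2, 2),
                 year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
                 X1 = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312))
D2 <- data.frame(year = c(2001, 2011),
                 X1 = c(34563, 14363),
                 X2 = c(12367, 13312))

# Next year's X1 within each ID; the last row of every ID gets NA
D1$X2 <- ave(D1$X1, D1$ID, FUN = function(x) c(x[-1], NA))

# Row of D2 with the same year as each row of D1 (NA when there is none)
idx <- match(D1$year, D2$year)

# Compare against the matched D2 row; NA comparisons count as no match
cond <- D1$X1 == D2$X1[idx] & D1$X2 == D2$X2[idx]
D1$G <- as.integer(!is.na(cond) & cond)

print(D1)
```

With the sample data this sets G to 1 for the years 2001 and 2011 and to 0 everywhere else.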
