Nested if-else loop in R with multiple conditions
I need to write a nested loop that goes through IDs year by year and compares multiple variables from data frames D1 and D2 using an if-else condition.
D1:
ID year X1
1 2000 34563
1 2001 34563
1 2002 12367
2 2010 14363
2 2011 14363
2 2012 13312
2 2013 13312
2 2014 13312
D2:
year X1 X2
2001 34563 12367
2011 14363 13312
I created X2 in D1 (X2 is the following year's X1 in D1) by duplicating column X1 and shifting it up by 1 row. This is a rough approach: if an ID has no data for the following year, X2 should be NA, but the shift fills it with X1 from the next ID in the data frame instead.
For an ID in D1, I need to loop through each year for that ID, and for a year N, if
D1$G = 1 else D1$G = 0.
If there is no data for year N+1, condition 2 is ignored.
Now I want to compare each row in D1 directly with D2. I tried an if-else statement as follows:
D1$G <- ifelse(D1$X1 == D2$X1 & D1$X2 == D2$X2 & D1$year == D2$year, "1", "0")
However, this is what I end up with:
ID year X1 X2 G
1 1 2000 34563 34563 0
2 1 2001 34563 12367 0
3 1 2002 12367 14363 0
4 2 2010 14363 14363 0
5 2 2011 14363 13312 0
6 2 2012 13312 13312 0
7 2 2013 13312 13312 0
8 2 2014 13312 NA 0
Instead of:
ID year X1 X2 G
1 1 2000 34563 34563 0
2 1 2001 34563 12367 1
3 1 2002 12367 14363 0
4 2 2010 14363 14363 0
5 2 2011 14363 13312 1
6 2 2012 13312 13312 0
7 2 2013 13312 13312 0
8 2 2014 13312 NA 0
I want to understand where I'm going wrong (or whether there are simpler methods). Any help is appreciated.
Reproducible code:
D1 <- data.frame(ID = c(1, 1, 1, 2, 2, 2, 2, 2),
year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
X1 = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312)
)
D2 <- data.frame(year = c(2001, 2011),
X1 = c(34563, 14363),
X2 = c(12367, 13312)
)
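# creating X2 in D1: the next row's X1 (X1 shifted up by one row)
# assumes shift() is data.table::shift(); type = "lead" shifts values up
library(data.table)
D1$X2 <- shift(D1$X1, 1, type = "lead")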
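The question notes that X2 should be NA when an ID has no data for the following year. A minimal base R sketch of one way to get that behaviour is to look up the row for "same ID, year + 1" with match(). The helper names this_key, next_key and next_row are illustrative only and not part of the original post.

```r
D1 <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 2),
  year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
  X1   = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312)
)

# Key identifying each row, and key identifying "same ID, following year".
this_key <- paste(D1$ID, D1$year)
next_key <- paste(D1$ID, D1$year + 1)

# match() gives the row holding year N + 1 for the same ID, or NA if absent.
next_row <- match(next_key, this_key)

# Indexing with NA yields NA, so a missing following year becomes NA in X2.
D1$X2 <- D1$X1[next_row]
print(D1)
```

Unlike a plain row shift, this never borrows X1 from the next ID, and it also leaves X2 as NA when there is a gap in the years within one ID.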
Maybe this might be helpful. Add a G column to D2 with the value 1. Then you can merge the two data.frames and replace NA with 0 where there was no match.
library(tidyverse)
D2$G <- 1
D1 %>%
group_by(ID) %>%
mutate(X2 = lead(X1, 1)) %>%
left_join(D2, by = c("year", "X1", "X2")) %>%
replace_na(list(G = 0))
Output
ID year X1 X2 G
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 2000 34563 34563 0
2 1 2001 34563 12367 1
3 1 2002 12367 NA 0
4 2 2010 14363 14363 0
5 2 2011 14363 13312 1
6 2 2012 13312 13312 0
7 2 2013 13312 13312 0
8 2 2014 13312 NA 0
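For readers without the tidyverse, the same merge-and-fill idea can be sketched in base R. This is an illustrative equivalent under the same data, not the answer's own code; the row column and the res object are hypothetical helpers introduced here to restore the original row order after merge().

```r
D1 <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 2),
  year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
  X1   = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312)
)
D2 <- data.frame(
  year = c(2001, 2011),
  X1   = c(34563, 14363),
  X2   = c(12367, 13312)
)

# Within each ID, take the next row's X1; the last row of each ID gets NA.
D1$X2 <- ave(D1$X1, D1$ID, FUN = function(x) c(x[-1], NA))

# Flag every row of D2, then left-join it onto D1 on all three columns.
D2$G <- 1
D1$row <- seq_len(nrow(D1))  # merge() may reorder rows, so remember the order
res <- merge(D1, D2, by = c("year", "X1", "X2"), all.x = TRUE)

# Rows of D1 without a match in D2 have G = NA; turn those into 0.
res$G[is.na(res$G)] <- 0
res <- res[order(res$row), c("ID", "year", "X1", "X2", "G")]
print(res)
```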
Edit: To explain the problem with the ifelse statement, you are comparing two vectors of different lengths, in a way that is likely not intended.
Consider two vectors from your data.frames:
year1 = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014)
year2 = c(2001, 2011)
If you compare using the == operator:
year1 == year2
You will get all FALSE:
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
This compares the elements pairwise, in order: 2000 with 2001, 2001 with 2011, 2002 with 2001 (year2 is recycled because it is the shorter vector), 2010 with 2011, 2011 with 2001 (recycled again), and so on.
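To make the recycling visible, the shorter vector can be expanded by hand with rep() and laid out next to the longer one. This is only a small illustration; the names recycled and comparison are made up for this sketch.

```r
year1 <- c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014)
year2 <- c(2001, 2011)

# Repeat year2 until it is as long as year1, as == does implicitly.
recycled <- rep(year2, length.out = length(year1))

# Side-by-side view of the values that are actually compared.
comparison <- data.frame(
  year1          = year1,
  year2_recycled = recycled,
  equal          = year1 == recycled
)
print(comparison)
```

Each row shows one pairing, which is why 2001 and 2011 in year1 never line up with their counterparts in year2.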
Another way to compare the two vectors is using %in%:
year1 %in% year2
[1] FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE
This gives one logical result for each value in year1, indicating whether that value is contained in the vector year2.
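Note that %in% on year alone checks only one of the three conditions. One possible way to test year, X1 and X2 together, sketched here as an assumption rather than as part of the original answer, is to paste the columns into a single key per row and compare the keys. The names key1 and key2 are illustrative, and X2 is filled in with the values from the answer's output.

```r
D1 <- data.frame(
  ID   = c(1, 1, 1, 2, 2, 2, 2, 2),
  year = c(2000, 2001, 2002, 2010, 2011, 2012, 2013, 2014),
  X1   = c(34563, 34563, 12367, 14363, 14363, 13312, 13312, 13312),
  X2   = c(34563, 12367, NA, 14363, 13312, 13312, 13312, NA)
)
D2 <- data.frame(
  year = c(2001, 2011),
  X1   = c(34563, 14363),
  X2   = c(12367, 13312)
)

# One string key per row; a missing X2 becomes the text "NA" and cannot match.
key1 <- paste(D1$year, D1$X1, D1$X2)
key2 <- paste(D2$year, D2$X1, D2$X2)

# G is 1 only where the full (year, X1, X2) combination appears in D2.
D1$G <- as.integer(key1 %in% key2)
print(D1)
```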