Logical function across multiple columns using “any” function
I would like to run a logical operation (multiple conditions) across many columns. I have written a query that works fine. However, I want to shorten my code because I have to write several such queries.
I have tried shortening the query using "any" and brackets. The second query runs without error but gives me a different answer. Does the "any" function work across multiple columns?
Here are my conditions:
Participate | B1 | B2 | B3 | B4 | B5 | Query1 | Query2
---|---|---|---|---|---|---|---
3 | -1 | -1 | -1 | -1 | -1 | Noissue | Noissue
1 | -1 | 1 | -1 | -1 | 1 | Noissue | Noissue
1 | -1 | -1 | -1 | -1 | -1 | Issue | Noissue
2 | -1 | 1 | 1 | -1 | 1 | Noissue | Noissue
2 | 1 | 1 | 1 | 1 | -1 | Noissue | Noissue
1 | -99 | -99 | -99 | -99 | -99 | Noissue | Noissue
I would appreciate it if anyone could help me reduce the number of code lines using different functions.
mutate(Batch_v1,
case_when (
((Batch_v1$B1 == 1 | Batch_v1$B2 == 1 | Batch_v1$B3 == 1 | Batch_v1$B4 == 1 | Batch_v1$B5 == 1 | Batch_v1$B6 == 1 | Batch_v1$B7 == 1 | Batch_v1$B8 == 1 | Batch_v1$B9 == 1 | Batch_v1$B10 == 1 | Batch_v1$BOth == 1) &
Batch_v1$Participate %in% c(1,2,-99)) ~ "Noissue",
((Batch_v1$B1 == -99 | Batch_v1$B2 == -99 | Batch_v1$B3 == -99 | Batch_v1$B4 == -99 | Batch_v1$B5 == -99 | Batch_v1$B6 == -99 | Batch_v1$B7 == -99 | Batch_v1$B8 == -99 | Batch_v1$B9 == -99 | Batch_v1$B10 == -99 | Batch_v1$BOth == -99) &
Batch_v1$Participate %in% c(1,2,-99)) ~ "Noissue",
Batch_v1$Participate == 3 ~ "Noissue",
TRUE ~ "Issue"))
mutate(Batch_v1,
case_when (
((any(Batch_v1[,2:6] == 1)) & Batch_v1$Participate %in% c(1,2,-99))~ "Noissue",
((any(Batch_v1[,2:6] == -99)) & Batch_v1$Participate %in% c(1,2,-99))~ "Noissue",
Batch_v1$Participate ==3 ~ "Noissue",
TRUE ~ "Issue"))
We could use across with case_when:
library(dplyr)
df %>%
mutate(across(B2:B5, ~case_when(. == 1 & B1 <=2 ~ "Noissue",
. == -99 & B1 <=2 ~ "Noissue",
B1 == 3 ~ "Noissue",
TRUE ~ "issue")
)
)
Output:
Participate B1 B2 B3 B4 B5 Query1 Query2
<dbl> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 3 -1 issue issue issue issue Noissue Noissue
2 1 -1 Noissue issue issue Noissue Noissue Noissue
3 1 -1 issue issue issue issue Issue Noissue
4 2 -1 Noissue Noissue issue Noissue Noissue Noissue
5 2 1 Noissue Noissue Noissue issue Noissue Noissue
6 1 -99 Noissue Noissue Noissue Noissue Noissue Noissue
Data:
df <- structure(list(Participate = c(3, 1, 1, 2, 2, 1), B1 = c(-1,
-1, -1, -1, 1, -99), B2 = c(-1, 1, -1, 1, 1, -99), B3 = c(-1,
-1, -1, 1, 1, -99), B4 = c(-1, -1, -1, -1, 1, -99), B5 = c(-1,
1, -1, 1, -1, -99), Query1 = c("Noissue", "Noissue", "Issue",
"Noissue", "Noissue", "Noissue"), Query2 = c("Noissue", "Noissue",
"Noissue", "Noissue", "Noissue", "Noissue")), problems = structure(list(
row = 6L, col = "Query2", expected = "", actual = "embedded null",
file = "'test'"), row.names = c(NA, -1L), class = c("tbl_df",
"tbl", "data.frame")), class = c("spec_tbl_df", "tbl_df", "tbl",
"data.frame"), row.names = c(NA, -6L))
Whenever we have to apply logical conditions rowwise across many columns, two main approaches should usually be considered. They obviate the need for rowwise(), for Reduce() (or the alternative lapply/map %>% Reduce/reduce), and for complex case_when() statements.
1) rowSums(condition)
2) if_any() / if_all()
This question is best suited to a solution with if_any().
With if_any():
Batch_v1 %>% mutate(query3 = ifelse(if_any(B2:B5, ~.x %in% c(-99, 1)) & B1<=2,
"Noissue",
"Issue"))
With rowSums():
Batch_v1 %>% mutate(query3 = ifelse(rowSums(across(B2:B5, ~.x %in% c(-99, 1)))>0 & B1<=2,
"Noissue",
"Issue"))
Output:
# A tibble: 6 x 9
Participate B1 B2 B3 B4 B5 Query1 Query2 query3
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 3 -1 -1 -1 -1 -1 Noissue Noissue Issue
2 1 -1 1 -1 -1 1 Noissue Noissue Noissue
3 1 -1 -1 -1 -1 -1 Issue Noissue Issue
4 2 -1 1 1 -1 1 Noissue Noissue Noissue
5 2 1 1 1 1 -1 Noissue Noissue Noissue
6 1 -99 -99 -99 -99 -99 Noissue Noissue Noissue
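Note that the two snippets above test B1 <= 2 and scan B2:B5, so row 1 (Participate == 3) comes out as "Issue". A minimal sketch of a variant that instead follows the conditions stated in the question (Participate as the gate, B1:B5 as the scanned columns) could look like this. The column names `any_hit` and `query4` are made up for illustration, and the data frame is retyped from the question's example table.

```r
library(dplyr)

# Example data retyped from the question's table (B1..B5 only).
Batch_v1 <- data.frame(
  Participate = c(3, 1, 1, 2, 2, 1),
  B1 = c(-1, -1, -1, -1, 1, -99),
  B2 = c(-1, 1, -1, 1, 1, -99),
  B3 = c(-1, -1, -1, 1, 1, -99),
  B4 = c(-1, -1, -1, -1, 1, -99),
  B5 = c(-1, 1, -1, 1, -1, -99)
)

result <- Batch_v1 %>%
  mutate(
    # TRUE when at least one of B1..B5 in that row is 1 or -99
    any_hit = if_any(B1:B5, ~ .x %in% c(1, -99)),
    # hypothetical output column mirroring the question's first query
    query4 = case_when(
      Participate == 3 ~ "Noissue",
      Participate %in% c(1, 2, -99) & any_hit ~ "Noissue",
      TRUE ~ "Issue"
    )
  )

result$query4
```

On this data the labels agree with the Query1 column of the question. With more columns (B6, ..., BOth), only the selection inside if_any() should need to change.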
There are some good answers to similar questions here: "Rowwise logical operations with mutate() and filter() in R", and here: "R - Remove rows from dataframe that contain only zeros in numeric columns, base R and pipe-friendly methods?"
You could use
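library(dplyr)
Batch_v1 %>%
  rowwise() %>%
  mutate(
    Query3 = case_when(
      # c_across() collects the values of B1..B5 for the current row;
      # a bare B1:B5 would build the numeric sequence from B1 to B5 instead
      any(c_across(B1:B5) == 1) & Participate %in% c(1,2,-99) ~ "Noissue",
      any(c_across(B1:B5) == -99) & Participate %in% c(1,2,-99) ~ "Noissue",
      Participate == 3 ~ "Noissue",
      TRUE ~ "Issue"
    )
  )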
which returns
# A tibble: 6 x 9
# Rowwise:
Participate B1 B2 B3 B4 B5 Query1 Query2 Query3
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 3 -1 -1 -1 -1 -1 Noissue Noissue Noissue
2 1 -1 1 -1 -1 1 Noissue Noissue Noissue
3 1 -1 -1 -1 -1 -1 Issue Noissue Issue
4 2 -1 1 1 -1 1 Noissue Noissue Noissue
5 2 1 1 1 1 -1 Noissue Noissue Noissue
6 1 -99 -99 -99 -99 -99 Noissue Noissue Noissue
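One detail worth checking with rowwise(): inside mutate(), a bare B1:B5 is the ordinary colon operator applied to the two values in that row, not a column range. The following sketch uses a hypothetical one-row data frame (not taken from the question) in which only B3 is 1, to show how that differs from c_across().

```r
library(dplyr)

# Hypothetical single row: only B3 equals 1; B1 and B5 are both -1.
one_row <- data.frame(Participate = 1, B1 = -1, B2 = -1, B3 = 1, B4 = -1, B5 = -1)

checked <- one_row %>%
  rowwise() %>%
  mutate(
    # B1:B5 evaluates to (-1):(-1), i.e. the single number -1
    colon_only = any(B1:B5 == 1),
    # c_across() gathers the actual values of columns B1..B5
    with_c_across = any(c_across(B1:B5) == 1)
  ) %>%
  ungroup()

checked$colon_only     # FALSE: the 1 in B3 is missed
checked$with_c_across  # TRUE
```

On the six example rows both forms happen to give the same labels, which is why the difference is easy to overlook.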
The main problem with your second code is the function
any(Batch_v1[,2:6] == 1)
Let's take a look at
Batch_v1[,2:6] == 1
#> B1 B2 B3 B4 B5
#> [1,] FALSE FALSE FALSE FALSE FALSE
#> [2,] FALSE TRUE FALSE FALSE TRUE
#> [3,] FALSE FALSE FALSE FALSE FALSE
#> [4,] FALSE TRUE TRUE FALSE TRUE
#> [5,] TRUE TRUE TRUE TRUE FALSE
#> [6,] FALSE FALSE FALSE FALSE FALSE
So Batch_v1[,2:6] == 1 returns a logical matrix with one entry per cell. Applying any to this matrix returns a single TRUE if any of the values anywhere inside it is TRUE, and that single value is then recycled for every row. This is why row 3 is labelled "Noissue" by your second query. That's clearly not your desired behaviour. Using rowwise() forces any to be applied... well... per row.
Note: Inside a tidyverse pipe, you don't want to use Batch_v1$B1 if you are referring to the current object you are working with. Batch_v1$B1, for example, refers to the original Batch_v1, without any of the transformations done earlier in the pipe. In this case, there is no real difference, but you shouldn't rely on this in general.
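A small sketch of that pitfall, using a hypothetical three-row data frame and made-up column names `via_dollar` and `via_name`:

```r
library(dplyr)

# Hypothetical toy data, not the question's full table.
Batch_v1 <- data.frame(Participate = c(3, 1, 2), B1 = c(-1, 1, -99))

compared <- Batch_v1 %>%
  mutate(B1 = B1 * 10) %>%      # transform B1 inside the pipe
  mutate(
    via_dollar = Batch_v1$B1,   # still the untouched original values
    via_name   = B1             # the transformed column from the pipe
  )

compared$via_dollar  # -1 1 -99
compared$via_name    # -10 10 -990
```

If an earlier step had dropped rows with filter(), the `$` version would no longer even match the current number of rows.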