I am trying to summarize a dataframe to create two summaries:
- A count of orders where only QUOT or QUOG appear
- A count of orders where QUOT or QUOG appear and where there are other Holds appearing too

Below is the start of the code:
library(dplyr)
dat <- data.frame(
  Order = c(123,123,123,145,145,189,210,210,123,123,164),
  Location = c("Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Charlotte","Charlotte","Charlotte"),
  Hold = c("QUOT","ENGR","VEND","QUOG","ENGR","QUOT","ENGR","VEND","QUOT","CUST","QUOT")
)
test <- dat %>%
  group_by(Order, Location) %>%
  .....
I get stuck trying to find out whether a particular order has only QUOT or QUOG, and then whether it has QUOT or QUOG and others.
Expected output:
   Location Only Multiple
1   Chicago    1        2
2 Charlotte    1        1
So for the expected output:
- Order 123 in Chicago has QUOT in it and other holds (ENGR & VEND), so this would be considered a multiple for Chicago
- Order 145 has QUOG in it and another hold (ENGR), so this would be considered a multiple for Chicago
- Order 189 has QUOT in it and no other holds, so this would be considered an only for Chicago
- Order 210 does not have QUOT or QUOG, so this order gets excluded from the count
- Order 123 in Charlotte has QUOT in it and another hold (CUST), so this would be considered a multiple for Charlotte
- Order 164 has QUOT in it and no other holds, so this would be considered an only for Charlotte

I think this should work, but you may want to test it with a few other Orders:
library(dplyr)
library(tidyr)
dat <- data.frame(
  Order = c(123,123,123,145,145,189,210,210,123,123,164),
  Location = c("Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Charlotte","Charlotte","Charlotte"),
  Hold = c("QUOT","ENGR","VEND","QUOG","ENGR","QUOT","ENGR","VEND","QUOT","CUST","QUOT")
)
dat %>%
  group_by(Order, Location) %>%
  mutate(
    # TRUE for rows whose hold is QUOT or QUOG
    quot_or_quog = Hold %in% c("QUOT", "QUOG"),
    # 1 if the order has only one kind of row, 2 if it mixes both kinds
    distinct_quot_or_quog = n_distinct(quot_or_quog)
  ) %>%
  # Remove those that do not have "QUOT" or "QUOG"
  filter(quot_or_quog) %>%
  mutate(
    label = if_else(distinct_quot_or_quog == 1, "Only", "Multiple")
  ) %>%
  # dplyr >= 1.0.0 spells this argument `.add`; older versions used `add`
  group_by(label, .add = TRUE) %>%
  summarise(num_label = n_distinct(label)) %>%
  group_by(Location, label) %>%
  count(num_label) %>%
  pivot_wider(
    names_from = label,
    values_from = n
  ) %>%
  select(-num_label)
#> # A tibble: 2 x 3
#> # Groups:   Location [2]
#>   Location  Multiple  Only
#>   <fct>        <int> <int>
#> 1 Charlotte        1     1
#> 2 Chicago          2     1
Created on 2020-02-24 by the reprex package (v0.3.0)
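
As a cross-check on the result above, the same rule can be sketched more directly: reduce each Order/Location pair to two flags, drop the orders without QUOT or QUOG, and count per Location. This is only a minimal sketch, not the answerer's method. The names `target`, `has_target`, `has_other`, and `res` are made up for illustration, and the `.groups` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

dat <- data.frame(
  Order = c(123,123,123,145,145,189,210,210,123,123,164),
  Location = c("Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Charlotte","Charlotte","Charlotte"),
  Hold = c("QUOT","ENGR","VEND","QUOG","ENGR","QUOT","ENGR","VEND","QUOT","CUST","QUOT")
)

# Hypothetical helper vector: the holds we are interested in
target <- c("QUOT", "QUOG")

res <- dat %>%
  group_by(Location, Order) %>%
  summarise(
    has_target = any(Hold %in% target),     # order contains QUOT or QUOG
    has_other  = any(!(Hold %in% target)),  # order contains any other hold
    .groups = "drop"
  ) %>%
  filter(has_target) %>%                    # orders without QUOT/QUOG are excluded
  group_by(Location) %>%
  summarise(
    Only     = sum(!has_other),
    Multiple = sum(has_other),
    .groups = "drop"
  )

res
```

Because the flags are computed once per order, adding a new hold type to the data does not require touching the code.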
Here is another solution using dplyr and tidyr. This time the pivoting happens first, and the filtering and summarising are done afterward to get to your solution.
library(dplyr)
library(tidyr)
dat.summary <- dat %>%
  # One column per hold type; holds an order does not have become NA
  mutate(hold_count = 1) %>%
  pivot_wider(names_from = Hold, values_from = hold_count) %>%
  mutate(
    only = if_else((QUOT == 1 | QUOG == 1) & is.na(ENGR) & is.na(VEND) & is.na(CUST), 1, 0),
    multiple = if_else((QUOT == 1 | QUOG == 1) & (ENGR == 1 | VEND == 1 | CUST == 1), 1, 0)
  ) %>%
  group_by(Location) %>%
  # na.rm = TRUE drops the NA flags produced by comparisons against missing holds
  summarise(only = sum(only, na.rm = TRUE), multiple = sum(multiple, na.rm = TRUE))
dat.summary
gives you:
# A tibble: 2 x 3
  Location   only multiple
  <fct>     <dbl>    <dbl>
1 Charlotte     1        1
2 Chicago       1        2
DATA
dat <- data.frame(
  Order = c(123,123,123,145,145,189,210,210,123,123,164),
  Location = c("Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Chicago","Charlotte","Charlotte","Charlotte"),
  Hold = c("QUOT","ENGR","VEND","QUOG","ENGR","QUOT","ENGR","VEND","QUOT","CUST","QUOT")
)
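
The conditions in the pivot-first solution depend on how R treats NA in logical expressions, which may not be obvious at first glance. The sketch below walks through one pivoted row by hand, for an order that has QUOT and nothing else. It uses base `ifelse()` as a stand-in for `dplyr::if_else()`, which likewise returns NA when the condition is NA. The variable names `has_target`, `has_other`, `only_flag`, and `multiple_flag` are illustrative only.

```r
# Pivoted row for an order that has QUOT only: every other hold column is NA.
QUOT <- 1; QUOG <- NA; ENGR <- NA; VEND <- NA; CUST <- NA

has_target <- QUOT == 1 | QUOG == 1              # TRUE | NA is TRUE
has_other  <- ENGR == 1 | VEND == 1 | CUST == 1  # NA | NA | NA is NA

only_flag     <- ifelse(has_target & is.na(ENGR) & is.na(VEND) & is.na(CUST), 1, 0)
multiple_flag <- ifelse(has_target & has_other, 1, 0)  # TRUE & NA is NA, so the flag is NA

print(only_flag)      # 1
print(multiple_flag)  # NA

# Summing with na.rm = TRUE counts the NA flag as "not a multiple".
total <- sum(c(1, multiple_flag, 1), na.rm = TRUE)
print(total)          # 2
```

This is why the final `summarise()` needs `na.rm = TRUE`: without it, a single order lacking the other holds would turn the whole Location total into NA.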