I enjoy the syntax of dplyr, but I'm struggling to get a contingency table as easily as I can with the base R table() function. table() is OK, but I can't figure out how to incorporate it into the dplyr pipe syntax.
Thank you for your help.
Here is some example data, along with the output I'm trying to get.
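library(dplyr)  # provides tibble(), mutate() and %>%
# 300 random values in three groups, cut into 5 equal-width bins
df <- tibble(id = c(rep("A", 100), rep("B", 100), rep("C", 100)),
             val = rnorm(300, mean = 500, sd = 100)) %>%
  mutate(val_bin = cut(val, breaks = 5))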
table(df$id,df$val_bin)
Output:
    (210,325] (325,440] (440,554] (554,669] (669,784]
  A         4        22        55        18         1
  B         6        19        46        24         5
  C         3        23        44        22         8
One option is to use with():
df %>%
with(., table(id, val_bin))
#      val_bin
# id  (228,327] (327,426] (426,525] (525,624] (624,723]
#   A         4        19        39        22        16
#   B         5        15        41        32         7
#   C         5        14        44        25        12
Technically, the . is not required,
df %>%
with(table(id, val_bin))
but I find it a little clearer in situations where it might be easy to confuse where the data is going (into with or into table). (Hint: it's just about always the first function, with here.)
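To see why the dot is optional, here is a minimal sketch on a small made-up data set (toy and its values are hypothetical, not the question's random data). with() evaluates its expression using the columns of the data it receives, and %>% places the left-hand side in the first argument position by default, so both spellings should build the same table.

```r
library(dplyr)

# Hypothetical toy data, small enough to count by hand
toy <- tibble(
  id      = c("A", "A", "B", "B", "B", "C"),
  val_bin = factor(c("low", "high", "low", "low", "high", "high"),
                   levels = c("low", "high"))
)

# Explicit dot: the data frame is passed as with()'s first argument
explicit <- toy %>% with(., table(id, val_bin))

# No dot: %>% inserts the data frame as the first argument on its own
implicit <- toy %>% with(table(id, val_bin))

# Compare the two tables cell by cell
all(explicit == implicit)
```

Because table() is called with bare column names here, the result keeps the id and val_bin labels on its dimensions.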
We can select the columns of interest and apply table:
library(dplyr)
df %>%
select(id, val_bin) %>%
  table()
Another option is to wrap the call in {}:
df %>%
{table(.$id, .$val_bin)}
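The braces matter here. When the dot appears only inside nested expressions such as .$id, magrittr still inserts the data frame as the first argument, so without braces the call would become table(df, df$id, df$val_bin) rather than the intended two-way table. A minimal sketch on hypothetical toy data (the name toy and its values are made up for illustration):

```r
library(dplyr)

# Hypothetical toy data, small enough to count by hand
toy <- tibble(
  id      = c("A", "A", "B", "B", "B", "C"),
  val_bin = factor(c("low", "high", "low", "low", "high", "high"),
                   levels = c("low", "high"))
)

# Braces stop %>% from also inserting toy as the first argument
braced <- toy %>% {table(.$id, .$val_bin)}

# Naming the arguments labels the two dimensions of the table
labelled <- toy %>% {table(id = .$id, val_bin = .$val_bin)}

dim(braced)
names(dimnames(labelled))
```

The unnamed version gives the same counts but leaves the dimensions unlabelled, which is why naming the arguments can be worth the extra typing.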
In tidyverse, it is a bit more convoluted to get the required output:
library(dplyr)
library(tidyr)
library(tibble)  # column_to_rownames() comes from tibble
df %>%
count(id, val_bin) %>%
pivot_wider(names_from = val_bin, values_from = n, values_fill = list(n = 0)) %>%
column_to_rownames('id')
#   (214,338] (338,461] (461,584] (584,707] (707,831]
# A         5        30        44        20         1
# B         9        30        34        27         0
# C         8        28        43        20         1
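As a sanity check, the count() and pivot_wider() route can be compared against table() on a small made-up data set. This is only a sketch: toy and its values are hypothetical, chosen so that one combination (C, low) never occurs and values_fill has something to fill.

```r
library(dplyr)
library(tidyr)
library(tibble)  # column_to_rownames() comes from tibble

# Hypothetical toy data; group C has no "low" rows
toy <- tibble(
  id      = c("A", "A", "B", "B", "B", "C"),
  val_bin = factor(c("low", "high", "low", "low", "high", "high"),
                   levels = c("low", "high"))
)

# Count each (id, val_bin) pair, spread the bins into columns,
# and fill combinations that never occurred with 0
wide <- toy %>%
  count(id, val_bin) %>%
  pivot_wider(names_from = val_bin, values_from = n,
              values_fill = list(n = 0)) %>%
  column_to_rownames("id")

# Compare with base R, matching rows and columns by name
base_tab <- table(toy$id, toy$val_bin)
all(as.matrix(wide)[rownames(base_tab), colnames(base_tab)] == base_tab)
```

Note that the result is a plain data frame with row names, not a table object, so functions that expect a table may need a conversion first.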
I know the question was about the pipe %>%, but have you ever heard of the exposition pipe (%$%)? It's also from the magrittr package (just like %>%) and is meant for exactly what you want to do:
library(magrittr)  # %$% is not attached by dplyr
df %$%
  table(id, val_bin)
Help page: https://magrittr.tidyverse.org/reference/exposition.html
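For a self-contained illustration, here is a small sketch on hypothetical toy data (the name toy and its values are made up). Since dplyr re-exports %>% but not %$%, magrittr has to be attached explicitly.

```r
library(magrittr)  # needed for %$%

# Hypothetical toy data, small enough to count by hand
toy <- data.frame(
  id      = c("A", "A", "B", "B", "B", "C"),
  val_bin = factor(c("low", "high", "low", "low", "high", "high"),
                   levels = c("low", "high"))
)

# %$% exposes the columns of toy to the right-hand expression,
# so table() can refer to them by bare name
expo <- toy %$% table(id, val_bin)

# Same counts as the base R call on the columns
all(expo == table(toy$id, toy$val_bin))
```

This behaves much like the with() approach, just spelled as an operator.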