dplyr: apply function table() to each column of a data.frame
I often apply the table() function to each column of a data frame using plyr, like this:
library(plyr)
ldply( mtcars, function(x) data.frame( table(x), prop.table( table(x) ) ) )
Is it possible to do this in dplyr as well?
My attempts fail:
mtcars %>% do( table %>% data.frame() )
melt( mtcars ) %>% do( table %>% data.frame() )
You can try the following, which does not rely on the tidyr package.
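library(dplyr)
mtcars %>%
  lapply(table) %>%                          # one frequency table per column
  lapply(as.data.frame) %>%                  # each table becomes a Var1/Freq data frame
  Map(cbind, var = names(mtcars), .) %>%     # tag each data frame with its column name
  bind_rows() %>%                            # rbind_all() is no longer available in current dplyr; bind_rows() replaces it
  group_by(var) %>%
  mutate(pct = Freq / sum(Freq))             # proportion within each original column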
Using tidyverse (dplyr and purrr):
library(tidyverse)
mtcars %>%
map( function(x) table(x) )
Or simply:
library(tidyverse)
mtcars %>%
map( table )
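map(table) returns a list of table objects rather than one data frame like the plyr call in the question. A minimal sketch of how that list could be stacked into a single data frame with proportions, assuming current dplyr and purrr; the names freq, var, x and pct are my own choices, not from the answer above:

```r
library(dplyr)
library(purrr)

freq <- mtcars %>%
  # one data frame per column; values are kept as character so they can be stacked
  map(~ as.data.frame(table(x = .x), stringsAsFactors = FALSE)) %>%
  # stack them, storing the originating column name in `var`
  bind_rows(.id = "var") %>%
  group_by(var) %>%
  mutate(pct = Freq / sum(Freq)) %>%
  ungroup()

head(freq)
```

Each row then holds one value of one column, its count (Freq), and its share within that column (pct).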
In general you probably would not want to run table() on every column of a data frame, because a data frame often has at least one variable whose values are all unique (such as an id field), and that produces very long output. However, you can use group_by() and tally() to obtain frequency tables in a dplyr chain. Or you can use count(), which does the group_by() for you.
> mtcars %>%
group_by(cyl) %>%
tally()
> # mtcars %>% count(cyl)
Source: local data frame [3 x 2]
cyl n
1 4 11
2 6 7
3 8 14
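The question also asked for proportions (the prop.table() part). A short sketch of one way to add them to the count() output, assuming current dplyr; the column name prop is my own choice:

```r
library(dplyr)

cyl_freq <- mtcars %>%
  count(cyl) %>%                # same counts as group_by(cyl) %>% tally()
  mutate(prop = n / sum(n))     # share of all rows, like prop.table()

cyl_freq
```

Because count() returns an ungrouped result here, sum(n) is the total number of rows, so the prop column sums to 1.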
If you want to do a two-way frequency table, group by more than one variable.
> mtcars %>%
group_by(gear, cyl) %>%
tally()
> # mtcars %>% count(gear, cyl)
You can use spread() from the tidyr package to turn that two-way output into the layout one is used to getting from table() when two variables are supplied.
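A sketch of that reshaping step, assuming dplyr and tidyr are installed; the object name two_way is mine. Note that spread() is marked as superseded in recent tidyr releases, where pivot_wider() covers the same job, but spread() still works:

```r
library(dplyr)
library(tidyr)

two_way <- mtcars %>%
  count(gear, cyl) %>%          # long format: one row per gear/cyl combination
  spread(cyl, n, fill = 0L)     # one column per cyl value; missing combinations become 0

two_way
```

The result has one row per gear and one column per cyl value, which is the same layout that table(mtcars$gear, mtcars$cyl) prints.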
The solution by Caner did not work for me, but this one from commenter akrun (credit goes to him) worked great. I am also using a much larger tibble to demo it, and I added ordering by percent, descending.
library(nycflights13); dim(flights)
library(dplyr); library(tidyr)
tte <- gather(flights, Var, Val) %>%
  group_by(Var) %>% dplyr::mutate(n = n()) %>%
  group_by(Var, Val) %>% dplyr::mutate(n1 = n(), Percent = n1 / n) %>%
  arrange(Var, desc(n1)) %>% unique()   # closing parenthesis of arrange() was missing
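For a quicker check of the same idea, here is a sketch on the small mtcars data set. It swaps in pivot_longer() and count() for gather() and the two mutate() steps, so it is a variation rather than akrun's exact code; freq_long is a name I made up:

```r
library(dplyr)
library(tidyr)

freq_long <- mtcars %>%
  # reshape to two columns: the variable name and its value
  pivot_longer(everything(), names_to = "Var", values_to = "Val") %>%
  count(Var, Val, name = "n1") %>%       # one row per distinct value of each variable
  group_by(Var) %>%
  mutate(Percent = n1 / sum(n1)) %>%     # share within each variable
  ungroup() %>%
  arrange(Var, desc(n1))                 # most frequent values first

head(freq_long)
```

This works directly because every mtcars column is numeric; with mixed column types the values would first need converting to a common type such as character.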