dplyr: how to group_by(all)
As I saw somewhere, when there are multiple layers of group_by(), each summarise() peels off one layer of grouping. SQL has "group by all". I wonder if there is a way to cancel all grouping in dplyr (so that, e.g., we can get the max over all rows rather than within each group).
Example:
library(dplyr)
mtcars %>%
select(cyl, gear, carb) %>%
group_by(cyl, gear) %>%
summarise(count = n()) %>%
arrange(desc(count))
Output:
Source: local data frame [8 x 3]
Groups: cyl
cyl gear count
1 4 4 8
2 4 5 2
3 4 3 1
4 6 4 4
5 6 3 2
6 6 5 1
7 8 3 12
8 8 5 2
So the data was grouped with group_by(cyl, gear): two layers of grouping. The summarise() counted how many cars are in each (cyl, gear) group and then peeled off the gear layer, so the data is now grouped by cyl only. As you can see, the descending order only applies within each cyl (descending in lines 1-3 for cyl == 4, in lines 4-6 for cyl == 6, ...). How can we sort all 8 lines in descending order? (Line 7 should be the first line.)
Another example of how summarise() peels off grouping:
mtcars %>%
select(cyl, gear, carb) %>%
group_by(cyl, gear) %>%
summarise(count = n())
Output:
Source: local data frame [8 x 3]
Groups: cyl
cyl gear count
1 4 3 1
2 4 4 8
3 4 5 2
4 6 3 2
5 6 4 4
6 6 5 1
7 8 3 12
8 8 5 2
---
mtcars %>%
select(cyl, gear, carb) %>%
group_by(cyl, gear) %>%
summarise(count = n()) %>%
summarise(count1 = max(count))
Output:
Source: local data frame [3 x 2]
cyl count1
1 4 8
2 6 4
3 8 12
---
mtcars %>%
select(cyl, gear, carb) %>%
group_by(cyl, gear) %>%
summarise(count = n()) %>%
summarise(count1 = max(count)) %>%
summarise(max(count1))
Output:
Source: local data frame [1 x 1]
max(count1)
1 12
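The peeling shown in these three pipelines can also be checked directly rather than read off the printed header. A minimal sketch, assuming a dplyr version that provides group_vars() (a helper not used in the original post); the intermediate names by_cyl_gear, once and twice are made up for illustration:

```r
library(dplyr)

# Two layers of grouping: cyl, then gear
by_cyl_gear <- mtcars %>% group_by(cyl, gear)
group_vars(by_cyl_gear)   # both grouping variables are still present

# First summarise(): the last grouping variable (gear) is dropped
once <- by_cyl_gear %>% summarise(count = n())
group_vars(once)          # only cyl remains

# Second summarise(): cyl is dropped too, leaving an ungrouped result
twice <- once %>% summarise(count1 = max(count))
group_vars(twice)         # no grouping variables left
```

Recent dplyr versions may also print a message about how the result is grouped after each summarise(); that message is informational and does not change the result.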
Try:
mtcars %>%
count(cyl, gear, name = "count") %>%
arrange(desc(count))
You will get:
#Source: local data frame [8 x 3]
#
# cyl gear count
#1 8 3 12
#2 4 4 8
#3 6 4 4
#4 4 5 2
#5 6 3 2
#6 8 5 2
#7 4 3 1
#8 6 5 1
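If you want to keep the group_by() / summarise() pipeline from the question, the remaining grouping can also be removed explicitly. A minimal sketch of two options, not necessarily what the answer above intended: ungroup() drops all grouping at once, and the .groups = "drop" argument of summarise() does the same in one step (assuming dplyr 1.0.0 or later for .groups). The names counts and counts2 are made up for illustration:

```r
library(dplyr)

# Option 1: drop every remaining grouping layer with ungroup()
counts <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(count = n()) %>%
  ungroup() %>%
  arrange(desc(count))

# Option 2: ask summarise() to return an ungrouped result directly
# (assumes dplyr >= 1.0.0, where the .groups argument is available)
counts2 <- mtcars %>%
  group_by(cyl, gear) %>%
  summarise(count = n(), .groups = "drop") %>%
  arrange(desc(count))

counts
```

Both variants sort all 8 rows together, so the (cyl = 8, gear = 3) row with count 12 comes first. Note that in current dplyr releases arrange() ignores grouping unless .by_group = TRUE is passed, so the per-group ordering shown in the question is likely specific to older versions.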