dplyr summarize grouped data with another column
I have a data frame, pop.subset:
state location pop
WA Seattle 100
WA Kent 20
OR foo 30
CA foo2 80
I need the least-populated city in each state, stored in a data.frame. So far I have:
result <- pop.subset %>%
group_by(state) %>%
summarise(min = min(pop))
This returns the data.frame:
state min
WA 20
... .... etc
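But I also need the city. I tried including location in the group_by call, e.g. group_by(state, location), but that returns the minimum for every city paired with its state, rather than a single city per state, like this: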
state location pop
WA Seattle 100
WA Kent 20
foo foo foo
Is there a simple solution? I would like my result to look like this:
state location pop
WA Kent 20
... ... ... etc.
Have you tried something like this?
result <- pop.subset %>%
group_by(state, location) %>%
summarise(min = min(both_sexes_2012))
I think you want to group by state and then filter for the rows where pop equals min(pop):
pop.subset %>%
group_by(state) %>%
filter(pop == min(pop)) %>%
ungroup()
# A tibble: 3 x 3
state location pop
<chr> <chr> <int>
1 WA Kent 20
2 OR foo 30
3 CA foo2 80
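One thing worth checking with the filter approach is what happens when two cities in a state share the minimum. The sketch below is only an illustration: it assumes dplyr 1.0.0 or later (for slice_min), and the Tacoma row is made up to force a tie; it is not part of the original data.

```r
library(dplyr)
library(tibble)

# Hypothetical data: the question's table plus an invented tie in WA
# (the Tacoma row does not appear in the original question).
pop.subset <- tribble(
  ~state, ~location, ~pop,
  "WA", "Seattle", 100,
  "WA", "Kent",     20,
  "WA", "Tacoma",   20,
  "OR", "foo",      30,
  "CA", "foo2",     80
)

# filter() keeps every row equal to the group minimum,
# so a tied state contributes more than one row.
with_ties <- pop.subset %>%
  group_by(state) %>%
  filter(pop == min(pop)) %>%
  ungroup()

# slice_min() with with_ties = FALSE keeps exactly one row per state.
one_per_state <- pop.subset %>%
  group_by(state) %>%
  slice_min(pop, n = 1, with_ties = FALSE) %>%
  ungroup()
```

If every tied city should be reported, the filter version is probably what you want; if exactly one row per state is required, slice_min with with_ties = FALSE is one way to get it.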
As I understand it, this should solve it:
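library(tibble)
library(dplyr)

# Rebuild the example data from the question
data <- tribble(
  ~state, ~location, ~pop,
  "WA", "Seattle", 100,
  "WA", "Kent",     20,
  "OR", "foo",      30,
  "CA", "foo2",     80
)

# For each state, take the location on the row where pop is smallest
data %>%
  group_by(state) %>%
  summarise(location = location[which.min(pop)],
            min = min(pop))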
# A tibble: 3 x 3
state location min
<chr> <chr> <dbl>
1 CA foo2 80
2 OR foo 30
3 WA Kent 20