Using Group By to Return Multiple Variables and Summarise with dplyr

Question

I'm trying to create a new column in my 2016 election dataset that shows whether a candidate won or lost each county.

 Democrat %>%
  group_by(county) %>%
  summarise(winningvote = max(fraction_votes))

This code only returns the maximum vote share per county. Can I also return the candidate variable? Adding:

 select(county, fraction_votes, candidate)

doesn't return anything different.

I'll attempt to create an "outcome" variable using mutate as the last line of the code. I was thinking the apply family might be another way to solve this.

Thanks

Answer 1

If candidate is a column of the Democrat data frame, the simplest way to keep it in the output is to group by both columns, which gives one row per county and candidate pair:

Democrat %>%
  group_by(county, candidate) %>%
  summarise(winningvote = max(fraction_votes))
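If each candidate appears only once per county, grouping by both columns returns every candidate's share rather than just the top one. A minimal sketch of one alternative, assuming dplyr 1.0.0 or later (for slice_max) and a made-up toy data frame standing in for the asker's Democrat data, keeps only the highest-share row per county with the candidate column intact:

```r
library(dplyr)

# Hypothetical toy data; the real Democrat data frame is not shown in the question
Democrat <- tibble(
  county = c("Adams", "Adams", "Baker", "Baker"),
  candidate = c("Clinton", "Sanders", "Clinton", "Sanders"),
  fraction_votes = c(0.60, 0.40, 0.45, 0.55)
)

# Keep the single highest-share row per county; other columns are preserved
winners <- Democrat %>%
  group_by(county) %>%
  slice_max(fraction_votes, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  rename(winningvote = fraction_votes)

print(winners)
```

With with_ties = FALSE a tied county yields exactly one row; dropping that argument would return all tied candidates instead.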

Answer 2

I'm pretty confident there's a more succinct way to do this, but the code below gives you a winning-vote flag of 1. Then you simply replace NA with 0 (second block of code).

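# Flag each county's top vote share, then join the flag back onto every row
Democrat <- left_join(Democrat, (Democrat %>%
  group_by(county) %>%
  summarise(fraction_votes = max(fraction_votes)) %>%
  mutate(Winning_Vote = 1)),
  by = c("county", "fraction_votes"))

# Rows that did not match the county maximum have NA in the flag; set those to 0
Democrat$Winning_Vote[is.na(Democrat$Winning_Vote)] <- 0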
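As for a more succinct route, one possibility is a grouped mutate, which avoids the join entirely. This is only a sketch under assumptions: the toy data and the "won"/"lost" labels are made up for illustration, and the column name outcome is taken from the question's wording.

```r
library(dplyr)

# Hypothetical toy data; the real Democrat data frame is not shown in the question
Democrat <- tibble(
  county = c("Adams", "Adams", "Baker", "Baker"),
  candidate = c("Clinton", "Sanders", "Clinton", "Sanders"),
  fraction_votes = c(0.60, 0.40, 0.45, 0.55)
)

# Within each county, compare every row to that county's maximum share
flagged <- Democrat %>%
  group_by(county) %>%
  mutate(outcome = if_else(fraction_votes == max(fraction_votes), "won", "lost")) %>%
  ungroup()

print(flagged)
```

Because mutate keeps every row, each candidate gets a label in each county. Candidates tied for the top share in a county would all be labelled "won" here.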

Answer 1
1 2017-02-21 20:48:13

Answer 2
0 2017-02-21 20:57:35