
R Troubleshooting: Sum values of one column in a data frame based on values in another column of the data frame

I am working on my final project for my data visualization class in R, and I am trying to identify minority-majority counties in the US using CDC data. For every county, I want to sum only the values in the Population column that correspond to the minority groups in the Race column.

I wrote the following code to sum the population counts of minority groups, and it worked fine at first. But now the minPop column just outputs 64590855 for every row. Any help would be greatly appreciated!

races <- c("American Indian or Alaska Native", "Asian or Pacific Islander", "Black or African American")

group_by(County.Code) %>%
  mutate(minPop = sum(subset(Population, Race %in% races))) %>%
  ungroup()

Screenshot of my data frame

The following seems to be what the question asks for. It is untested, since the question provides no data in a usable format.

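df_dem1 %>%
  # keep only the rows for the minority groups listed in races
  filter(Race %in% races) %>%
  group_by(County.Code) %>%
  # per-county total of the remaining (minority) rows
  mutate(minPop = sum(Population)) %>%
  ungroup()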
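
Note that the pipeline above drops the non-minority rows. If every row should be kept, one variant is to sum a logical subset inside mutate(). The following is a minimal sketch on invented toy data: df_toy and its values are hypothetical, and only the column names come from the question. Writing dplyr::mutate explicitly guards against one possible cause of the same total appearing on every row, namely plyr being attached after dplyr so that plyr's mutate() masks dplyr's and ignores the grouping. The question does not confirm that this is what happened.

```r
library(dplyr)

# Hypothetical toy data standing in for the data frame in the screenshot;
# the column names follow the question, the values are invented.
df_toy <- data.frame(
  County.Code = c(1001, 1001, 1001, 1001, 1003, 1003, 1003, 1003),
  Race = rep(c("American Indian or Alaska Native",
               "Asian or Pacific Islander",
               "Black or African American",
               "White"), times = 2),
  Population = c(10, 20, 30, 100, 1, 2, 3, 50)
)

races <- c("American Indian or Alaska Native",
           "Asian or Pacific Islander",
           "Black or African American")

result <- df_toy %>%
  group_by(County.Code) %>%
  # dplyr::mutate is spelled out in case another package masks mutate();
  # the logical index keeps only the minority rows inside each county's sum
  dplyr::mutate(minPop = sum(Population[Race %in% races])) %>%
  ungroup()

print(result)
```

Here all eight rows survive, and each row carries the minority total of its own county, which is what a later comparison against the county's total population would need.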
