How to sum and weight certain rows in a dataframe in R?
I currently have a data.frame which is as follows:
State Area_name LessHSD HSD SomeCAD BDorMore P_LessHSD P_HSD ZIP
1 US United States 26,948,057 59,265,308 63,365,655 68,867,051 12.3 27.1 1009
1913 NY Richmond County 37,675 101,738 81,014 108,326 11.5 30.9 36085
2 AL Alabama 470,043 1,020,172 987,148 822,595 14.2 30.9 1020
3 AL Autauga County 4,204 12,119 10,552 10,291 11.3 32.6 7080
1873 NY Bronx County 258,956 255,427 226,620 183,134 28 27.6 36005
1911 NY Queens County 303,881 454,105 369,271 518,999 18.5 27.6 36081
4 AL Baldwin County 14,310 40,579 46,025 46,075 9.7 27.6 1088
1901 NY New York County 162,237 155,048 171,461 758,325 13 12.4 36061
5 AL Barbour County 4,901 6,486 4,566 2,220 27.0 35.7 20012
1894 NY Kings County 326,469 455,299 347,052 648,461 18.4 25.6 36047
6 AL Bibb County 2,650 7,471 3,846 1,813 16.8 47.3 9012
I would like to sum the data for the 5 New York City boroughs (ZIP 36005, 36047, 36061, 36081, 36085) in the columns LessHSD, HSD, SomeCAD, and BDorMore, and create a new row with these sums and Area_name = New York Proper (see output below).
For the columns P_LessHSD and P_HSD, I would like to weight these variables by population into the new row. I have already calculated the weights myself from another data set. I would like to multiply Richmond County by 0.05669632, Bronx County by 0.17051732, Queens County by 0.27133878, New York County by 0.19392188, and Kings County by 0.3075256.
Concretely, for the column P_LessHSD, this would look like:
11.5*0.05669632 + 28*0.17051732 + 18.5*0.27133878 + 13*0.19392188 + 18.4*0.3075256
giving 18.6 (when rounded to the tenths place). The same would be done for P_HSD. I would like the ZIP of the new row to be 55555. I would also like to delete all 5 rows for the boroughs.
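As a quick check of the arithmetic described above, here is a minimal base R sketch. The vector names (`p_less_hsd`, `p_hsd`, `w`) are made up for illustration; the values are copied from the table and the weights listed in the question.

```r
# County percentages copied from the table (names are illustrative only)
p_less_hsd <- c(Richmond = 11.5, Bronx = 28, Queens = 18.5, NewYork = 13, Kings = 18.4)
p_hsd      <- c(Richmond = 30.9, Bronx = 27.6, Queens = 27.6, NewYork = 12.4, Kings = 25.6)

# Population weights as given in the question, in the same county order
w <- c(Richmond = 0.05669632, Bronx = 0.17051732, Queens = 0.27133878,
       NewYork = 0.19392188, Kings = 0.3075256)

# Weighted sums, rounded to one decimal place
new_p_less_hsd <- round(sum(p_less_hsd * w), 1)
new_p_hsd      <- round(sum(p_hsd * w), 1)

print(new_p_less_hsd)  # 18.6
print(new_p_hsd)       # 24.2
```

The weights sum to almost exactly 1, so `weighted.mean(p_less_hsd, w)` should give essentially the same value.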
Output should be:
State Area_name LessHSD HSD SomeCAD BDorMore P_LessHSD P_HSD ZIP
1 US United States 26,948,057 59,265,308 63,365,655 68,867,051 12.3 27.1 1009
2 AL Alabama 470,043 1,020,172 987,148 822,595 14.2 30.9 1020
3 AL Autauga County 4,204 12,119 10,552 10,291 11.3 32.6 7080
4 AL Baldwin County 14,310 40,579 46,025 46,075 9.7 27.6 1088
5 AL Barbour County 4,901 6,486 4,566 2,220 27.0 35.7 20012
6 AL Bibb County 2,650 7,471 3,846 1,813 16.8 47.3 9012
7 NY New York Proper 1089218 1421617 895418 2217245 18.6 24.2 55555
This might help. It uses the dplyr package, which you need to install first:
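install.packages("dplyr")  # only needed once
library(dplyr)

# Assumes the count and percentage columns in DF are already numeric
# (no thousands separators).
DF %>%
  # keep every row except the five boroughs
  filter(!(ZIP %in% c(36005, 36047, 36061, 36081, 36085))) %>%
  bind_rows(
    DF %>%
      # take only the five boroughs
      filter(ZIP %in% c(36005, 36047, 36061, 36081, 36085)) %>%
      # attach each borough's population weight and scale the percentages
      mutate(wg = case_when(Area_name == "Richmond County" ~ 0.05669632,
                            Area_name == "Bronx County" ~ 0.17051732,
                            Area_name == "Queens County" ~ 0.27133878,
                            Area_name == "New York County" ~ 0.19392188,
                            Area_name == "Kings County" ~ 0.3075256,
                            TRUE ~ 0),
             P_LessHSD = wg * P_LessHSD,
             P_HSD = wg * P_HSD,
             Area_name = "New York Proper") %>%
      # collapse the five rows into one: counts are summed, and the
      # pre-weighted percentages add up to the weighted averages
      group_by(State, Area_name) %>%
      summarize_at(vars(LessHSD:P_HSD), sum) %>%
      mutate(ZIP = 55555)
  )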
# # A tibble: 7 x 9
# State Area_name LessHSD HSD SomeCAD BDorMore P_LessHSD P_HSD ZIP
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 US United States 26948057 59265308 63365655 68867051 12.3 27.1 1009
# 2 AL Alabama 470043 1020172 987148 822595 14.2 30.9 1020
# 3 AL Autauga County 4204 12119 10552 10291 11.3 32.6 7080
# 4 AL Baldwin County 14310 40579 46025 46075 9.7 27.6 1088
# 5 AL Barbour County 4901 6486 4566 2220 27 35.7 20012
# 6 AL Bibb County 2650 7471 3846 1813 16.8 47.3 9012
# 7 NY New York Proper 1089218 1421617 1195418 2217245 18.6 24.2 55555
PS. This gives a different result for SomeCAD than your expected output (1195418 rather than 895418).
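One practical note: the pipeline above sums the count columns, so they have to be numeric, while the table in the question shows them with thousands separators. If they were read in as text, something like the following base R sketch could convert them first. `to_number` and `DF_small` are hypothetical names used only for this illustration, and the two rows are copied from the question's table.

```r
# Hypothetical helper (not part of the original answer): strip thousands
# separators and convert to numeric
to_number <- function(x) as.numeric(gsub(",", "", as.character(x), fixed = TRUE))

# Tiny stand-in for DF, with counts stored as text
DF_small <- data.frame(
  Area_name = c("Richmond County", "Bronx County"),
  LessHSD   = c("37,675", "258,956"),
  HSD       = c("101,738", "255,427"),
  stringsAsFactors = FALSE
)

# Convert only the count columns, leaving the other columns untouched
count_cols <- c("LessHSD", "HSD")
DF_small[count_cols] <- lapply(DF_small[count_cols], to_number)

str(DF_small)
```

For the full data, `count_cols` would presumably also include SomeCAD and BDorMore.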