Weighted join or match in R
I am working with election data from California's Statewide Database (https://statewidedatabase.org/election.html). I am trying to convert their precinct-level election results to 2010 census block-level results. I have the precinct-level election results:
> sov_results
# A tibble: 20,744 x 136
COUNTY FIPS SRPREC_KEY SRPREC ADDIST CDDIST SDDIST BEDIST TOTREG DEMREG REPREG AIPREG GRNREG LIBREG NLPREG REFREG DCLREG MSCREG TOTVOTE
<dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 49 06097 060971002 1002 2 5 2 2 29 0 0 0 0 0 0 0 0 0 18
2 49 06097 060971003 1003 2 2 2 2 1 0 0 0 0 0 0 0 0 0 0
3 49 06097 060971005 1005 2 2 2 2 106 0 0 0 0 0 0 0 0 0 67
4 49 06097 060971006 1006 2 5 2 2 2 0 0 0 0 0 0 0 0 0 2
5 49 06097 060971007 1007 2 2 2 2 56 0 0 0 0 0 0 0 0 0 42
6 49 06097 060971008 1008 2 5 2 2 148 0 0 0 0 0 0 0 0 0 109
7 49 06097 060971009 1009 2 5 2 2 137 0 0 0 0 0 0 0 0 0 97
8 49 06097 060971012 1012 2 5 2 2 21 0 0 0 0 0 0 0 0 0 16
9 49 06097 060971017 1017 4 5 2 2 723 0 0 0 0 0 0 0 0 0 591
10 49 06097 060971018 1018 2 2 2 2 14 0 0 0 0 0 0 0 0 0 10
# ... with 20,734 more rows, and 117 more variables: DEMVOTE <dbl>, REPVOTE <dbl>, AIPVOTE <dbl>, GRNVOTE <dbl>, LIBVOTE <dbl>,
# NLPVOTE <dbl>, REFVOTE <dbl>, DCLVOTE <dbl>, MSCVOTE <dbl>, PRCVOTE <dbl>, ABSVOTE <dbl>, ASSDEM01 <dbl>, ASSDEM02 <dbl>,
# ASSDEM03 <dbl>, ASSDEM04 <dbl>, ASSDEM05 <dbl>, ASSDEM06 <dbl>, ASSDEM07 <dbl>, ASSDEM08 <dbl>, ASSGRN01 <dbl>, ASSIND01 <dbl>,
# ASSLIB01 <dbl>, ASSPAF01 <dbl>, ASSREP01 <dbl>, ASSREP02 <dbl>, ASSREP03 <dbl>, ASSREP04 <dbl>, CNGAIP01 <dbl>, CNGDEM01 <dbl>,
# CNGDEM02 <dbl>, CNGDEM03 <dbl>, CNGDEM04 <dbl>, CNGDEM05 <dbl>, CNGDEM06 <dbl>, CNGDEM07 <dbl>, CNGDEM08 <dbl>, CNGDEM09 <dbl>,
I also have the conversion key with the weights:
> conversion
# A tibble: 398,299 x 13
SRPREC FIPS ELECTION TYPE SRPREC_KEY BLOCK_KEY TRACT BLOCK BLKREG SRTOTREG PCTSRPREC BLKTOTREG PCTBLK
<dbl> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 NA 06097 p20 sr_blk 06097nan 060970000000000 0 0 1 NA NA 1 100
2 1002 06097 p20 sr_blk 060971002 060971525011014 152501 1014 26 29 89.7 26 100
3 1002 06097 p20 sr_blk 060971002 060971525013008 152501 3008 3 29 10.3 3 100
4 1003 06097 p20 sr_blk 060971003 060971526005068 152600 5068 1 1 100 1 100
5 1005 06097 p20 sr_blk 060971005 060971526005000 152600 5000 14 106 13.2 43 32.6
6 1005 06097 p20 sr_blk 060971005 060971526005003 152600 5003 12 106 11.3 12 100
7 1005 06097 p20 sr_blk 060971005 060971526005004 152600 5004 12 106 11.3 20 60
8 1005 06097 p20 sr_blk 060971005 060971526005006 152600 5006 5 106 4.72 5 100
9 1005 06097 p20 sr_blk 060971005 060971526005008 152600 5008 24 106 22.6 24 100
10 1005 06097 p20 sr_blk 060971005 060971526005020 152600 5020 28 106 26.4 28 100
I want to know how to match these precinct results to the census blocks so that each census block receives the right share of votes from the precinct results, based on the PCTSRPREC column, which indicates what percentage of the precinct belongs in the census block.
For example, I would want to join so that 13.2% of SRPREC_KEY 060971005 is assigned to BLOCK 5000. That would be 13.2% of TOTVOTE (rounded to a whole number), 13.2% of DEMVOTE, 13.2% of ASSDEM03, etc. Is there a function or way to do this in R?
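To make the target concrete, here is the arithmetic for that one block and one column, using the TOTVOTE value of 67 shown for precinct 060971005 in the output above. The variable names are only for illustration.

```r
# Values taken from the rows printed above.
totvote_prec <- 67    # TOTVOTE for SRPREC_KEY 060971005
pct_srprec   <- 13.2  # PCTSRPREC for BLOCK 5000 (a percentage, not a fraction)

# Share of the precinct's votes assigned to the block, rounded to a whole number.
block_votes <- round(totvote_prec * pct_srprec / 100)
block_votes
```

The unrounded value is 8.844, so the block would be assigned 9 votes. The same calculation would need to be repeated for every count column and every precinct/block pair.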
I think you're looking for a join/merge operation followed by a simple multiplication.
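library(dplyr)

# Keep only the join key, the block identifier, and the weight.
select(conversion, SRPREC_KEY, BLOCK, PCTSRPREC) %>%
  # Attach each precinct's results to every block it overlaps.
  left_join(sov_results, by = "SRPREC_KEY") %>%
  # Scale the count columns by the share of the precinct in the block.
  # PCTSRPREC is a percentage, hence the division by 100.
  # With the full data, extend the range to cover the remaining vote columns.
  mutate(across(TOTREG:TOTVOTE, ~ .x * PCTSRPREC / 100))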
# SRPREC_KEY BLOCK PCTSRPREC COUNTY FIPS SRPREC ADDIST CDDIST SDDIST BEDIST TOTREG DEMREG REPREG AIPREG GRNREG LIBREG NLPREG REFREG DCLREG MSCREG TOTVOTE
# 1 06097nan 0 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
# 2 060971002 1014 89.70 49 6097 1002 2 5 2 2 26.0130 0 0 0 0 0 0 0 0 0 16.1460
# 3 060971002 3008 10.30 49 6097 1002 2 5 2 2 2.9870 0 0 0 0 0 0 0 0 0 1.8540
# 4 060971003 5068 100.00 49 6097 1003 2 2 2 2 1.0000 0 0 0 0 0 0 0 0 0 0.0000
# 5 060971005 5000 13.20 49 6097 1005 2 2 2 2 13.9920 0 0 0 0 0 0 0 0 0 8.8440
# 6 060971005 5003 11.30 49 6097 1005 2 2 2 2 11.9780 0 0 0 0 0 0 0 0 0 7.5710
# 7 060971005 5004 11.30 49 6097 1005 2 2 2 2 11.9780 0 0 0 0 0 0 0 0 0 7.5710
# 8 060971005 5006 4.72 49 6097 1005 2 2 2 2 5.0032 0 0 0 0 0 0 0 0 0 3.1624
# 9 060971005 5008 22.60 49 6097 1005 2 2 2 2 23.9560 0 0 0 0 0 0 0 0 0 15.1420
# 10 060971005 5020 26.40 49 6097 1005 2 2 2 2 27.9840 0 0 0 0 0 0 0 0 0 17.6880
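The output above is still fractional and has one row per precinct/block pair. If whole numbers per block are needed, one possible follow-up is to sum the weighted values by BLOCK_KEY and round at the end. The sketch below is not part of the original answer and makes several assumptions: it uses small toy tables (`sov_toy`, `conv_toy`) rather than the real data, it includes a made-up precinct `060979999` to show a block that draws from two precincts, and it drops rows whose weight is NA.

```r
library(dplyr)

# Toy stand-in for sov_results. The first two rows copy values shown above;
# precinct "060979999" is hypothetical and exists only to illustrate summing.
sov_toy <- tibble(
  SRPREC_KEY = c("060971002", "060971005", "060979999"),
  TOTREG     = c(29, 106, 100),
  TOTVOTE    = c(18, 67, 50)
)

# Toy stand-in for conversion. The last row is hypothetical: it assigns 20%
# of the made-up precinct to a block that precinct 060971005 also touches.
conv_toy <- tibble(
  SRPREC_KEY = c("060971002", "060971002", "060971005", "060971005", "060979999"),
  BLOCK_KEY  = c("060971525011014", "060971525013008",
                 "060971526005000", "060971526005003", "060971526005000"),
  PCTSRPREC  = c(89.7, 10.3, 13.2, 11.3, 20)
)

block_results <- conv_toy %>%
  # Rows without a weight cannot be allocated, so drop them (an assumption).
  filter(!is.na(PCTSRPREC)) %>%
  inner_join(sov_toy, by = "SRPREC_KEY") %>%
  # Weight each count column by the precinct's share in the block.
  mutate(across(c(TOTREG, TOTVOTE), ~ .x * PCTSRPREC / 100)) %>%
  # A block can receive pieces from several precincts, so add them up.
  group_by(BLOCK_KEY) %>%
  summarise(across(c(TOTREG, TOTVOTE), sum), .groups = "drop") %>%
  # Round once, after summing, to limit accumulated rounding error.
  mutate(across(c(TOTREG, TOTVOTE), round))

block_results
```

In this toy data, block 060971526005000 receives 8.844 votes from precinct 060971005 and 10 from the hypothetical precinct, which rounds to 19. Grouping uses BLOCK_KEY rather than BLOCK because the short block number alone may repeat across tracts. Rounded block values will not necessarily add back up to the precinct totals, so keep the fractional values if exact totals matter.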