The ICPSR database is a massive ASCII-coded dataset of US elections that was created before modern coding standards were put in place. I have written a script that extracts all the data into a data.frame in long format with columns for county, vote total, and "party-year" (e.g., Democrat_1992, Republican_1992, Reform_1992, etc.).
The problem is that these data were created in a patchwork by multiple authors over multiple years, so there are numerous duplicates and inefficiencies. For example, in the Arizona returns, you will encounter the following:
county | votes | header |
---|---|---|
PIMA | 0 | DEMOCRAT_1944 |
PIMA | 13,006 | DEMOCRAT_1944 |
MARICOPA | 32,197 | DEMOCRAT_1944 |
MARICOPA | 0 | DEMOCRAT_1944 |
PIMA | 3,392 | REPUBLICAN_1944 |
MARICOPA | 24,853 | REPUBLICAN_1944 |
The problem is that when you shift this to wide format, pivot_wider() creates a list-column for "DEMOCRAT_1944" where, for example, the Maricopa entry would be c(32197, 0). Making it worse, this is inconsistent; most data are entered correctly (e.g., the data for REPUBLICAN_1944 only appear once, and so those data convert to wide nicely).
I am at a bit of a loss on how to fix this. Obviously it would be easy in this table to do it by brute force, but we're talking about 503,371 observations in the overall data.frame. It isn't consistent which party or year is redundant, so any solution would have to be very general. Also, some counties will have "legitimate" zeroes in them, so simply eliminating those rows containing zero can't be the solution.
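To gauge how widespread the duplication is, one can list the county/header pairs that occur more than once. The following is only a minimal sketch on toy data copied from the table above (with votes already numeric); the name `dupes` is made up for illustration.

```r
library(dplyr)

# Toy data mirroring the Arizona rows shown above (assumption: votes are numeric).
state_df <- data.frame(
  county = c("PIMA", "PIMA", "MARICOPA", "MARICOPA", "PIMA", "MARICOPA"),
  votes  = c(0, 13006, 32197, 0, 3392, 24853),
  header = c(rep("DEMOCRAT_1944", 4), rep("REPUBLICAN_1944", 2)),
  stringsAsFactors = FALSE
)

# Keep only the rows whose county/header pair appears more than once.
dupes <- state_df %>%
  group_by(county, header) %>%
  filter(n() > 1) %>%
  ungroup()

dupes
```

Here only the four DEMOCRAT_1944 rows are returned, since each REPUBLICAN_1944 pair appears once.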
I used the following code to convert from long to wide:
state_df2 <- state_df %>%
  pivot_wider(names_from = new_header, values_from = value)
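You could convert the votes to numbers and let pivot_wider() sum the repeated entries, so that a spurious 0 adds nothing to the real total while a lone, legitimate 0 stays 0:
library(tidyverse)

state_df %>%
  # remove every thousands separator, not just the first, before converting
  mutate(votes = as.numeric(str_remove_all(votes, ','))) %>%
  pivot_wider(names_from = header, values_from = votes, values_fn = sum)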
county DEMOCRAT_1944 REPUBLICAN_1944
<chr> <dbl> <dbl>
1 PIMA 13006 3392
2 MARICOPA 32197 24853
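One caveat: summing gives the right answer when the extra rows are zeros, but it would double-count a total that was entered twice. If it is safe to assume that the largest value within each county/header pair is the real one (an assumption about the data, not something the dataset guarantees), `values_fn = max` may be a safer choice. A minimal sketch follows; the repeated MARICOPA REPUBLICAN_1944 row is hypothetical, added only to illustrate an exact duplicate.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy data from the question, plus one hypothetical exact duplicate (last row).
state_df <- data.frame(
  county = c("PIMA", "PIMA", "MARICOPA", "MARICOPA", "PIMA", "MARICOPA", "MARICOPA"),
  votes  = c("0", "13,006", "32,197", "0", "3,392", "24,853", "24,853"),
  header = c(rep("DEMOCRAT_1944", 4), rep("REPUBLICAN_1944", 3)),
  stringsAsFactors = FALSE
)

wide <- state_df %>%
  # strip all commas, then convert to numeric
  mutate(votes = as.numeric(str_remove_all(votes, ","))) %>%
  # keep the largest value per county/header pair instead of adding them up
  pivot_wider(names_from = header, values_from = votes, values_fn = max)

wide
```

A county whose only entry is 0 still comes out as 0, so legitimate zeroes are preserved. This would not help if two different non-zero values were recorded for the same pair; those cases would need to be inspected by hand.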