If I have a data set similar to the following:
# State Ben.Carson.Number.of.Votes Ben.Carson.Party Ben.Carson.Percent Bernie.Sanders.Votes Bernie.Sanders.Percent Bernie.Sanders.Party
# OH    305                        Republican       8.3                500                  12.30                  Democrat
# FL    20                         Republican       3.0                700                  11.00                  Democrat
# TX    400                        Republican       5.0                50                   1.00                   Democrat
How do I create four unified columns (Candidate Name, Votes, Percent, and Party) from the separate columns currently in the data set? That is, I want to gather all three types of columns together based on the candidate name embedded in each column name.
I tried the following but to no avail:
tidyElectionData %>%
  gather(key, value, -c(County, Location.State, State)) %>%
  separate(key, into = c("Candidate", "Party"), sep = "(^[^.]+[.][^.]+)(.+$)") %>%
  spread(Party, value)
A tidyverse-based solution could look as follows.
library(dplyr)
library(tidyr)
library(stringr)
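df %>%
  # Convert every column to character so numeric and text values can share one column
  mutate(across(everything(), as.character)) %>%
  pivot_longer(-State) %>%
  # Split each original column name into a measure (names) and a candidate (name)
  mutate(names = str_extract(name, 'Votes|Party|Percent'),
         name = str_extract(name, 'Ben.Carson|Bernie.Sanders')) %>%
  pivot_wider(names_from = names, values_from = value)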
#   State name           Votes Party      Percent
#   <chr> <chr>          <chr> <chr>      <chr>
# 1 OH    Ben.Carson     305   Republican 8.3
# 2 OH    Bernie.Sanders 500   Democrat   12.3
# 3 FL    Ben.Carson     20    Republican 3
# 4 FL    Bernie.Sanders 700   Democrat   11
# 5 TX    Ben.Carson     400   Republican 5
# 6 TX    Bernie.Sanders 50    Democrat   1
Data
df <- structure(list(State = c("OH", "FL", "TX"), Ben.Carson.Number.of.Votes = c(305,
20, 400), Ben.Carson.Party = c("Republican", "Republican", "Republican"
), Ben.Carson.Percent = c(8.3, 3, 5), Bernie.Sanders.Votes = c(500,
700, 50), Bernie.Sanders.Percent = c(12.3, 11, 1), Bernie.Sanders.Party = c("Democrat",
"Democrat", "Democrat")), row.names = c(NA, -3L), class = c("tbl_df",
"tbl", "data.frame"))
In base R you could do:
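# Candidate prefix of each value column, e.g. "Ben.Carson"
candidates <- unique(sub("(\\w+[.]\\w+).*", "\\1", names(df)[-1]))
# Group the value columns by their last name component: Party, Percent, Votes
columns <- split(names(df)[-1], sub(".*[.]", "", names(df)[-1]))
# as.data.frame() guards against df being a tibble, which reshape() may not handle
df1 <- reshape(as.data.frame(df), columns, direction = "long", times = candidates, idvar = "State")
names(df1)[-1] <- c("candidate", names(columns))
rownames(df1) <- NULL
df1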
  State      candidate      Party Percent Votes
1    OH     Ben.Carson Republican     8.3   305
2    FL     Ben.Carson Republican       3    20
3    TX     Ben.Carson Republican       5   400
4    OH Bernie.Sanders   Democrat   12.30   500
5    FL Bernie.Sanders   Democrat   11.00   700
6    TX Bernie.Sanders   Democrat    1.00    50
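If `reshape()` feels opaque, the same idea can be sketched in base R with an explicit loop over candidates. This is an alternative to the approach above, not a restatement of it: each candidate's columns are pulled out, renamed to their last name component, and the per-candidate blocks are stacked with `rbind()`. The names `value_cols`, `blocks`, and `long` are illustrative, and the sketch assumes every candidate has exactly one `Votes`, one `Percent`, and one `Party` column.

```r
# Sample data mirroring the question's layout
df <- data.frame(
  State = c("OH", "FL", "TX"),
  Ben.Carson.Number.of.Votes = c(305, 20, 400),
  Ben.Carson.Party = c("Republican", "Republican", "Republican"),
  Ben.Carson.Percent = c(8.3, 3, 5),
  Bernie.Sanders.Votes = c(500, 700, 50),
  Bernie.Sanders.Percent = c(12.3, 11, 1),
  Bernie.Sanders.Party = c("Democrat", "Democrat", "Democrat")
)

value_cols <- names(df)[-1]
# Candidate prefix of each column, e.g. "Ben.Carson"
candidates <- unique(sub("^(\\w+[.]\\w+).*", "\\1", value_cols))

# Build one block of rows per candidate
blocks <- lapply(candidates, function(cand) {
  cols <- value_cols[startsWith(value_cols, cand)]
  block <- df[cols]
  # Keep only the last name component: Votes, Party, or Percent
  names(block) <- sub(".*[.]", "", cols)
  data.frame(State = df$State, candidate = cand,
             block[c("Votes", "Percent", "Party")])
})

# Stack the blocks and reset the row names
long <- do.call(rbind, blocks)
rownames(long) <- NULL
long
```

Selecting `block[c("Votes", "Percent", "Party")]` by name puts the columns in the same order for every candidate, so `rbind()` works even though the source columns are ordered differently for each candidate.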