Recently, I scraped data from a website, and it resembles the data frame stored in the
input variable below.
input <- data.frame(
"Date" = sprintf("%02d-Jan", 1:15),
"Type_event_1" = c(rep("Skiing", 3), rep("Marathon", 7), rep("Skating", 5)),
"sport_event_1"= c(rep("Alpine skiing",4), rep("Biathlon",6), rep("Curling",3), rep("Figure skating",2)),
"Type_event_2" = c(rep("Skiing", 4), rep("Marathon", 6),rep("Ice-Hockey", 3), rep("Skating", 2)),
"sport_event_2"= c(rep("Skeleton",4), rep("Luge",6), rep("Hockey",3), rep("Ski Jumping",2))
)
I want to rbind the columns that share a common suffix ("event_1", "event_2") one below the other, keeping the 'Date' column alongside them. In this case I have just 4 columns, i.e. 2 events, but what if I had 40 columns, i.e. 20 such events? How can I do this using a for loop?
The expected output looks like this:
expected_output <- data.frame(
"Date" = rep(sprintf("%02d-Jan", 1:15),2),
"Type_event_1" = c(rep("Skiing", 3), rep("Marathon", 7), rep("Skating", 5),rep("Skiing", 4), rep("Marathon", 6),rep("Ice-Hockey", 3), rep("Skating", 2)),
"sport_event_1"= c(rep("Alpine skiing",4), rep("Biathlon",6), rep("Curling",3), rep("Figure skating",2),rep("Skeleton",4), rep("Luge",6), rep("Hockey",3), rep("Ski Jumping",2))
)
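Try:
library(data.table)
dt <- as.data.table(input)  # data.table's melt() expects a data.table, not a plain data.frame
# Stack every Type_event_* column; keep only Date (column 1) and the melted value (column 3)
out1 <- melt(dt, id.vars = "Date", measure.vars = grep("^Type_event_", names(dt)))[, c(1, 3)]
# Same for every sport_event_* column
out2 <- melt(dt, id.vars = "Date", measure.vars = grep("^sport_event_", names(dt)))[, c(1, 3)]
# Both results share the same row order, so the value columns can be bound side by side
final <- cbind(out1, out2[, -1])
names(final) <- c("Date", "Type_event", "sport_event")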
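With many event pairs, one possible variation on the melt approach is to let data.table's patterns() helper collect both column groups in a single call, so no loop is needed. This is only a minimal sketch, not the answerer's method: it assumes the Type_event_* and sport_event_* columns appear in the same event order, and the names dt, long and event are illustrative.

```r
library(data.table)

# Toy data from the question, rebuilt here so the sketch runs on its own
input <- data.frame(
  Date          = sprintf("%02d-Jan", 1:15),
  Type_event_1  = c(rep("Skiing", 3), rep("Marathon", 7), rep("Skating", 5)),
  sport_event_1 = c(rep("Alpine skiing", 4), rep("Biathlon", 6), rep("Curling", 3), rep("Figure skating", 2)),
  Type_event_2  = c(rep("Skiing", 4), rep("Marathon", 6), rep("Ice-Hockey", 3), rep("Skating", 2)),
  sport_event_2 = c(rep("Skeleton", 4), rep("Luge", 6), rep("Hockey", 3), rep("Ski Jumping", 2))
)

dt <- as.data.table(input)

# Each regex passed to patterns() defines one group of measure columns;
# value.name supplies one output column name per group.
long <- melt(
  dt,
  id.vars       = "Date",
  measure.vars  = patterns("^Type_event_", "^sport_event_"),
  value.name    = c("Type_event", "sport_event"),
  variable.name = "event"   # illustrative name for the event index column
)

# Drop the event index to match the three-column layout asked for
long[, event := NULL]
```

The same call should work unchanged for 20 events, provided the column names keep the Type_event_N / sport_event_N pattern.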
Or, with the tidyverse:
library(tidyverse)
tbl_df(input) %>%
unite(v1, Type_event_1, sport_event_1) %>%
unite(v2, Type_event_2, sport_event_2) %>%
gather(v1,v2, -Date) %>%
separate(v2, c("Type_event","sport_event"), sep = "_") %>%
select(-v1)
# # A tibble: 30 x 3
#    Date   Type_event sport_event
#    <fct>  <chr>      <chr>
#  1 01-Jan Skiing     Alpine skiing
#  2 02-Jan Skiing     Alpine skiing
#  3 03-Jan Skiing     Alpine skiing
#  4 04-Jan Marathon   Alpine skiing
#  5 05-Jan Marathon   Biathlon
#  6 06-Jan Marathon   Biathlon
#  7 07-Jan Marathon   Biathlon
#  8 08-Jan Marathon   Biathlon
#  9 09-Jan Marathon   Biathlon
# 10 10-Jan Marathon   Biathlon
# # ... with 20 more rows
Note: I'm using tbl_df(input) only for display purposes (in current dplyr, tbl_df() is deprecated and as_tibble(input) is the equivalent). You can use just input %>% ... instead.
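The unite/gather/separate chain above names each event pair by hand, which gets tedious with 20 events. A possible alternative, sketched here under the assumption that every column other than Date follows the name_event_N pattern, is tidyr's pivot_longer() with the special ".value" name. The regex and the names long and event are illustrative and not part of the original answer.

```r
library(tidyr)

# Toy data from the question, rebuilt here so the sketch runs on its own
input <- data.frame(
  Date          = sprintf("%02d-Jan", 1:15),
  Type_event_1  = c(rep("Skiing", 3), rep("Marathon", 7), rep("Skating", 5)),
  sport_event_1 = c(rep("Alpine skiing", 4), rep("Biathlon", 6), rep("Curling", 3), rep("Figure skating", 2)),
  Type_event_2  = c(rep("Skiing", 4), rep("Marathon", 6), rep("Ice-Hockey", 3), rep("Skating", 2)),
  sport_event_2 = c(rep("Skeleton", 4), rep("Luge", 6), rep("Hockey", 3), rep("Ski Jumping", 2))
)

# ".value" tells pivot_longer() that the first captured group of each column
# name (Type_event / sport_event) becomes an output column; the second
# captured group (the event number) goes into the "event" column.
long <- pivot_longer(
  input,
  cols          = -Date,
  names_to      = c(".value", "event"),
  names_pattern = "(.*_event)_(\\d+)"
)

# pivot_longer() interleaves the events row by row; reorder so that all
# rows of event 1 come first, then event 2, and drop the helper column.
long <- long[order(as.integer(long$event)), ]
long$event <- NULL
```

Sorting on the integer event number keeps event 10 after event 9 when there are more than nine events.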