
R: from long to wide data frame with real-valued columns

I have a quick question about reshaping a data frame in which the rows are grouped by ID. The df has the following schema (shown with 2 example rows that I wish to widen; in total I have >5000 instances):

   id                  solver   scoreA  scoreB  group   size 
   <chr>               <chr>    <dbl>   <dbl>   <chr>   <dbl>
 1 instance_1          s1       1        0.5    g1      1000                     
 2 instance_1          s2       100      50     g1      1000

... what I want to obtain is:

   id           solver.best  scoreA.s1  scoreA.s2  scoreB.s1   scoreB.s2  group   size 
   <chr>        <chr>        <dbl>      <dbl>      <dbl>       <dbl>      <chr>   <dbl>
 1 instance_1   s1           1          100        0.5         50         g1      1000                     

Appreciate your help. BR

Maybe you can try the code below

reshape(within(df, Q <- ave(seq(nrow(df)), id, FUN = seq_along)),
  direction = "wide",
  idvar = "id", 
  timevar = "Q"
)

which gives

> reshape(within(df, Q <- ave(seq(nrow(df)), id, FUN = seq_along)), direction = "wide", idvar = "id", timevar = "Q")
          id solver.1 scoreA.1 scoreB.1 group.1 size.1 solver.2 scoreA.2
1 instance 1       s1        1      0.5      g1   1000       s2      100
  scoreB.2 group.2 size.2
1       50      g1   1000

Data

> dput(df)
structure(list(id = c("instance 1", "instance 1"), solver = c("s1", 
"s2"), scoreA = c(1L, 100L), scoreB = c(0.5, 50), group = c("g1",
"g1"), size = c(1000L, 1000L)), class = "data.frame", row.names = c("1",
"2"))
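The output above repeats solver, group and size once per row index, whereas the question asks for columns suffixed by solver name and a single group and size. A minimal base R sketch of that layout follows. It assumes that group and size are constant within each id and that the "best" solver is the one with the lowest scoreA; the question does not state that rule, so treat it as an assumption. The helper name `best` is made up for illustration.

```r
# Example data following the schema in the question
df <- data.frame(
  id     = c("instance_1", "instance_1"),
  solver = c("s1", "s2"),
  scoreA = c(1, 100),
  scoreB = c(0.5, 50),
  group  = c("g1", "g1"),
  size   = c(1000, 1000)
)

# ASSUMPTION: best solver = lowest scoreA within each id
# (which.min() returns the first row on ties)
best <- sapply(split(df, df$id), function(d) d$solver[which.min(d$scoreA)])
df$solver.best <- unname(best[as.character(df$id)])

# Widen only the score columns, using the solver name as the suffix;
# the remaining columns are treated as constant per id and kept once
wide <- reshape(df,
  direction = "wide",
  idvar     = "id",
  timevar   = "solver",
  v.names   = c("scoreA", "scoreB"),
  sep       = "."
)
```

Because only scoreA and scoreB are listed in `v.names`, `reshape()` keeps group, size and solver.best as single columns instead of duplicating them.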

Although I would still like a handy best-practice solution (e.g. tidyverse), I want to share a practical loop-based approach, which works just as well conceptually :):

# create empty (wide) target df
wide_df <- data.frame(matrix(ncol = 8, nrow = 0))

names <- c("id", "best_solver", "scoreA_s1", "scoreA_s2",
           "scoreB_s1", "scoreB_s2", "group", "size")
colnames(wide_df) <- names


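# traverse the original (long) df; it must be sorted so that each id
# occupies two consecutive rows (s1 first, then s2)
for(i in seq(2, length(long_df$group), by = 2)){
  wide_df[i/2, "id"] <- long_df[i, "id"]
  # best solver = lower scoreA; which.min() is relative to the pair, hence the
  # (i - 2) offset, and it picks the first row on ties
  wide_df[i/2, "best_solver"] <- long_df$solver[(i - 2) +
                                 which.min(long_df$scoreA[(i-1):i])]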
  wide_df[i/2, "scoreA_s1"] <- long_df[i-1, "scoreA"]
  wide_df[i/2, "scoreA_s2"] <- long_df[i, "scoreA"]
  wide_df[i/2, "scoreB_s1"] <- long_df[i-1, "scoreB"]
  wide_df[i/2, "scoreB_s2"] <- long_df[i, "scoreB"]
  wide_df[i/2, "group"] <- long_df[i, "group"]
  wide_df[i/2, "size"] <- long_df[i, "size"]
}
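For the tidyverse route mentioned above, a minimal sketch with `tidyr::pivot_wider()` might look like this. It assumes dplyr and tidyr are installed, that `long_df` has the columns shown in the question, and that the best solver is the one with the lowest scoreA, as in the loop. The example data is made up to match the question's schema.

```r
library(dplyr)
library(tidyr)

# Example data following the schema in the question
long_df <- tibble(
  id     = c("instance_1", "instance_1"),
  solver = c("s1", "s2"),
  scoreA = c(1, 100),
  scoreB = c(0.5, 50),
  group  = c("g1", "g1"),
  size   = c(1000, 1000)
)

wide_df <- long_df %>%
  group_by(id) %>%
  # ASSUMPTION: best solver = lowest scoreA (first row wins on ties)
  mutate(solver.best = solver[which.min(scoreA)]) %>%
  ungroup() %>%
  # every column not named here (id, group, size, solver.best) identifies a row
  pivot_wider(
    names_from  = solver,
    values_from = c(scoreA, scoreB),
    names_sep   = "."
  )
```

This does not depend on the row order or on every id having exactly two solvers; with more solvers, `pivot_wider()` simply adds more scoreA.* and scoreB.* columns.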

