Pivot all columns wider (except ID columns) using pivot_wider() in R/tidyverse

I'm trying to transform a data frame from long to wide in R. I want to pivot all columns wider (except the columns that uniquely identify observations) using pivot_wider(). Here is a minimal working example:

library("tidyr")

set.seed(12345)

sampleSize <- 10
timepoints <- 3
raters <- 2

data_long <- data.frame(ID = rep(1:sampleSize, each = timepoints * raters),
                        time = rep(1:timepoints, each = raters, times = sampleSize),
                        rater = rep(c("a","b"), times = sampleSize * timepoints),
                        v1 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v2 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v3 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        v100 = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vA = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vB = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vC = sample.int(99, sampleSize * timepoints * raters, replace = TRUE),
                        vZZ = sample.int(99, sampleSize * timepoints * raters, replace = TRUE))

Here are the data:

> tibble(data_long)
# A tibble: 60 x 11
      ID  time rater    v1    v2    v3  v100    vA    vB    vC   vZZ
   <int> <int> <chr> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1 a        14    56    30    75    66    22     8    73
 2     1     1 b        90    44    99     8    36    72     1    78
 3     1     2 a        92    35    93    46     4    68    39    52
 4     1     2 b        51    91    50    67    43    72    99    74
 5     1     3 a        80    34    31    31    21    52     7    23
 6     1     3 b        24    86    25    86    20    43    74    89
 7     2     1 a        58    51    48    60     6    56    66    37
 8     2     1 b        96    95    76     1    78     2    65     3
 9     2     2 a        88    26    92    86     7    37    84    15
10     2     2 b        93    55    25    62    27    39    73    85
# ... with 50 more rows

In this example, three columns uniquely identify each observation: ID, time, and rater. I'd like to widen every other column by rater (i.e., every column except ID and time). My expected output is:

# A tibble: 30 x 18
      ID  time  v1_a  v1_b  v2_a  v2_b  v3_a  v3_b v100_a v100_b  vA_a  vA_b  vB_a  vB_b  vC_a  vC_b vZZ_a vZZ_b
   <int> <int> <int> <int> <int> <int> <int> <int>  <int>  <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1     1     1    14    90    56    44    30    99     75      8    66    36    22    72     8     1    73    78
 2     1     2    92    51    35    91    93    50     46     67     4    43    68    72    39    99    52    74
 3     1     3    80    24    34    86    31    25     31     86    21    20    52    43     7    74    23    89
 4     2     1    58    96    51    95    48    76     60      1     6    78    56     2    66    65    37     3
 5     2     2    88    93    26    55    92    25     86     62     7    27    37    39    84    73    15    85
 6     2     3    75     2    23    55    28     8     66     74    65    92    58    10    91    65     7    44
 7     3     1    86    94     7    87    78    85     38     87    36    49    89    83    33    34    32    38
 8     3     2    10    75    12    15    21    18     56     77    54    17    61    92    18    50    98    27
 9     3     3    38    81    46    90    20    47     88     15    33    95    66    19    12    27    84    52
10     4     1    32    38    88    68    77    71     10     81    21    54    33    16    90    41    29    72
# ... with 20 more rows

I can widen any given columns using the following syntax:

data_long %>% 
  pivot_wider(names_from = rater, values_from = c(v1, v2))

Thus, I could widen all columns by entering all of them manually in a vector:

data_long %>% 
  pivot_wider(names_from = rater, values_from = c(v1, v2, v3, v100, vA, vB, vC, vZZ))

However, this becomes unwieldy if I have many columns. Another approach is to widen columns by specifying a range of columns:

data_long %>% 
  pivot_wider(names_from = rater, values_from = v1:vZZ)

However, this approach does not work well if the columns to be widened are not all in a single range, for instance if the ID columns are interspersed throughout the data frame (though it would be possible to specify multiple ranges).
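To illustrate the multiple-ranges option, here is a minimal sketch. The data frame mixed_long and its column layout are made up for this example (they are not part of my real data); the ranges are simply combined with c() inside values_from.

```r
library(tidyr)

# Hypothetical layout: the rater column sits between two blocks of value columns
mixed_long <- data.frame(
  ID    = rep(1:2, each = 2),
  v1    = 1:4,
  v2    = 5:8,
  rater = rep(c("a", "b"), times = 2),
  vA    = 9:12,
  vB    = 13:16
)

# Combine two column ranges with c(); ID is left over and acts as the id column
mixed_wide <- pivot_wider(
  mixed_long,
  names_from  = rater,
  values_from = c(v1:v2, vA:vB)
)

names(mixed_wide)
```

This works, but every additional block of columns needs its own range, which is exactly what I would like to avoid.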

Is there a way to use pivot_wider() to widen ALL columns except those I specify via id_cols as uniquely identifying each observation (i.e., ID and time)? I'd like the solution to extend to the case where I have many columns (and thus do not want to specify variable names or ranges for the variables to be widened).

Since we know the first 3 columns should stay fixed, use - on those column names in values_from

library(dplyr)
library(tidyr)
data_long %>% 
   pivot_wider(names_from = rater, values_from = -names(.)[1:3])

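Or, if the ID columns are already stored in an object, exclude them together with the names_from column (otherwise rater itself would be selected as a value column)

id_cols <- c("ID", "time")
data_long %>%
    pivot_wider(names_from = rater, values_from = -c(all_of(id_cols), rater))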
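To check this kind of negative selection on something small, here is a minimal sketch. The data frame toy_long and the variables id_vars and names_var are made up for illustration and are not from the question; the names_from column is excluded from values_from explicitly so that it is not treated as a value column.

```r
library(tidyr)

# Hypothetical toy data: 2 IDs x 2 timepoints x 2 raters
toy_long <- data.frame(
  ID    = rep(1:2, each = 4),
  time  = rep(1:2, each = 2, times = 2),
  rater = rep(c("a", "b"), times = 4),
  v1    = 1:8,
  vZZ   = 11:18
)

id_vars   <- c("ID", "time")  # columns identifying one row of the wide result
names_var <- "rater"          # column whose values become the name suffixes

# Widen everything that is neither an id column nor the names column
toy_wide <- pivot_wider(
  toy_long,
  id_cols     = all_of(id_vars),
  names_from  = all_of(names_var),
  values_from = -all_of(c(id_vars, names_var))
)

names(toy_wide)
```

Keeping the column names in character vectors means the same call can be reused when the data frame has many more value columns, without listing them or relying on their position.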

