How can I create a new data frame with several rows for each observation based on a string column?

Question

I have a data frame in R where each row is one observation. One column holds several data points per observation, recorded as a single long string with separators. I would like to restructure the data so that each observation spans several rows, one per data point, as in the example below.

The data right now looks like this:

df <- data.frame(matrix(c("A", "B",
                          "X", "Y",
                          "{data1},{data2}", "{data1}"),
                 nrow = 2,
                 ncol = 3,
                 byrow = FALSE))
names(df) <- c("key", "info", "more_info")

I would like it to look like this:

df <- data.frame(matrix(c("A", "A", "B",
                          "X", "X", "Y",
                          "{data1}", "{data2}", "{data1}"),
                 nrow = 3,
                 ncol = 3,
                 byrow = FALSE))
names(df) <- c("key", "info", "more_info")

My first idea was to use separate() followed by pivot_longer(), but this ran into issues because the number of values in the last column differs between observations. For some observations it may run to hundreds of records.

Answer 1

You can use separate_rows from tidyr:

> library(tidyr)
> separate_rows(df, more_info, sep=",")
# A tibble: 3 x 3
  key   info  more_info
  <fct> <fct> <chr>    
1 A     X     {data1}  
2 A     X     {data2}  
3 B     Y     {data1}

Answer 2

Another option is to split the string with strsplit() and then unnest() the resulting list column:

library(dplyr)
library(tidyr)
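df %>% 
    # split each string on commas into a list column;
    # as.character() guards against factor columns, which strsplit() rejects
    mutate(more_info = strsplit(as.character(more_info), ",")) %>% 
    # expand the list column to one row per element
    unnest(c(more_info))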
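
As a quick sanity check, the two approaches above can be run side by side. This is only a minimal sketch: it assumes the columns are plain character vectors (hence the explicit stringsAsFactors = FALSE), and the names res1 and res2 are illustrative and not part of the original posts.

```r
library(dplyr)
library(tidyr)

# Input from the question, built directly with character columns (assumption)
df <- data.frame(
  key = c("A", "B"),
  info = c("X", "Y"),
  more_info = c("{data1},{data2}", "{data1}"),
  stringsAsFactors = FALSE
)

# Approach 1: split on commas, one row per piece
res1 <- separate_rows(df, more_info, sep = ",")

# Approach 2: split into a list column, then expand it
res2 <- df %>%
  mutate(more_info = strsplit(more_info, ",")) %>%
  unnest(more_info)

# Both results should have one row per data point
print(res1)
print(res2)
```

Both approaches work no matter how many values each string contains, so observations with hundreds of records need no special handling.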

solution1
1 ACCEPTED 2021-02-12 15:11:34

solution2
0 2021-02-12 23:01:21
