简体   繁体   中英

How can I expand grouped data in R using dplyr?

Is it possible to achieve the result of the following code in a more efficient and concise way? Due to the nature of my task, I cannot use base or tidyr functions, because the dplyr code needs to be translated to SQL and executed on a database.

library(dplyr)
library(dbplyr)
library(RSQLite)
library(DBI)

# Create example data set
id <- c("a", "b", "c")
df <- data.frame(id)

# Treat it as a data base table
con <- DBI::dbConnect(RSQLite::SQLite(), path = ":memory:")
copy_to(con, df, "data", temporary = FALSE)

# Expand to data set and create a variable for four quarters
n <- 4

data <- tbl(con, "data") %>%
    mutate(quarter = 1)

for (i in 2:n) {
    data <- data %>%
        mutate(quarter = i) %>%
        union(data, data) %>%
        show_query()
}

data <- collect(data)

My goal in my real life example is to query a list of IDs and to expand it to a data set with a variable "quarter". I want to use that list as a basis to successively join more information later.

It sounds like you want the Cartesian product of id = c('a', 'b', 'c') and quarters = c(1, 2, 3, 4) , which would give you id_quarter = c(('a',1), ('a',2), ('a',3), ..., ('c',4)) .

This can be done using a join on a dummy variable like so:

id <- c("a", "b", "c")
df <- data.frame(id)
quarter <- c(1, 2, 3, 4)
df_q <- data.frame(quarter)

# Treat it as a data base table
con <- DBI::dbConnect(RSQLite::SQLite(), path = ":memory:")
copy_to(con, df, "data", temporary = FALSE)
copy_to(con, df_q, "quarter", temporary = FALSE)

# create placeholder column
data <- tbl(con, "data") %>%
    mutate(dummy_placeholder = 1)
quarters <- tbl(con, "quarter") %>%
    mutate(dummy_placeholder = 1)

# join and collect
result <- data %>%
    inner_join(quarter, by = "dummy_placeholder") %>%
    collect()

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM