
Map columns from a list of data frames into another data frame in R

An example list of data frames is given below. listofDataFrames contains multiple data frames, each with a column lev that serves as the key in the mapping; the remaining columns hold the values. New columns should be generated in DF by mapping from listofDataFrames.

To be more concrete, consider colors from listofDataFrames: it has two value columns, "colors number 3" and "colors number 10", whose entries are drawn from "r", "l" and "?". In DF we should create two new columns with the same names, filled by looking up each row's colors value in the lev column of colors. For example, if a row of DF has "orange" in column "colors", the new column "colors number 3" should get "r" for that row. The expected output is given below.

# Create an example list of dataframes and populate it
listofDataFrames <- list() 

genres <- data.frame("genres number 12" =  c("r","l","?","r","r"),
           "genres number 17" =  c("l","r","?","l","?"),
           lev = c("pop","rock","jazz","blues","r&b"),
           check.names = FALSE)

colors <- data.frame("colors number 3" =  c("l","r","?","r"),
                     "colors number 10" =  c("l","r","l","r"),
                     lev = c("red","blue","green","orange"),
                     check.names = FALSE)

listofDataFrames[["genres"]] <- genres
listofDataFrames[["colors"]] <- colors

## DF

DF <-data.frame("genres" = c("pop", "pop","jazz","rock","jazz","blues","rock","pop","blues","pop"),
           "colors" = c("orange","red","red","orange","green","blue","orange","red","blue","green"),
           "values" = c(12, 15, 24, 33 ,47, 2 , 9 ,6, 89, 75))


## EXPECTED OUTPUT

expectedOutput <- 
  data.frame("genres" = c("pop", "pop","jazz","rock","jazz","blues","rock","pop","blues","pop"),
           "colors" = c("orange","red","red","orange","green","blue","orange","red","blue","green"),
           "values" = c(12, 15, 24, 33 ,47, 2 , 9 ,6, 89, 75),
           "genres number 12" = c("r","r","?","l","?","r","l","r","r","r"),
           "genres number 17" = c("l","l","?","r","?","l","r","l","l","l"),
           "colors number 3" = c("r","l","l","r","?","r","r","l","r","?"),
           "colors number 10" = c("r","l","l","r","l","r","r","l","r","l"),
           check.names = FALSE
           )

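Here, we could use a nested merge: first join 'DF' to the 'genres' list element on its 'genres' column, then join the result to the 'colors' list element on its 'colors' column. In both cases the key on the list side is 'lev'.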

merge(
  merge(DF, listofDataFrames[['genres']], all.x = TRUE,
        by.x = 'genres', by.y = 'lev'),
  listofDataFrames[['colors']], all.x = TRUE,
  by.x = 'colors', by.y = 'lev'
)
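
One caveat with merge(): by default it sorts the result by the key and moves the key column to the front, so the rows and columns may not come back in the same order as in the expected output. A minimal sketch of one way to restore the original order is shown here, assuming a temporary helper column (row_id, a name made up for this example) and small stand-in data (df and lookup are hypothetical, shaped like DF and listofDataFrames):

```r
# Small stand-in data with the same shape as the question's objects
lookup <- list(
  genres = data.frame("genres number 12" = c("r", "l"),
                      lev = c("pop", "rock"),
                      check.names = FALSE),
  colors = data.frame("colors number 3" = c("l", "r"),
                      lev = c("red", "orange"),
                      check.names = FALSE)
)
df <- data.frame(genres = c("rock", "pop", "rock"),
                 colors = c("orange", "red", "red"),
                 values = c(1, 2, 3))

# Tag each row with its original position before merging
df$row_id <- seq_len(nrow(df))

res <- merge(df, lookup[["genres"]], all.x = TRUE,
             by.x = "genres", by.y = "lev")
res <- merge(res, lookup[["colors"]], all.x = TRUE,
             by.x = "colors", by.y = "lev")

# Restore the original row order, then put the original columns first
# and drop the helper column
res <- res[order(res$row_id), ]
new_cols <- setdiff(names(res), names(df))
res <- res[, c(setdiff(names(df), "row_id"), new_cols)]
rownames(res) <- NULL
```

The same idea should carry over to DF and listofDataFrames unchanged; only the object names differ.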

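Or we can loop over the names of the list, which works for any number of lookup data frames as long as each list name matches a column of 'DF'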

nm1 <- names(listofDataFrames)
out <- DF
for(i in seq_along(nm1)) {
     out <- merge(out, listofDataFrames[[nm1[i]]], all.x = TRUE,
       by.x = nm1[i], by.y = 'lev')
}
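
If keeping the row order of the original data frame matters, a lookup with match() is an alternative to merge(). The sketch below is not part of the original answer; map_from_list is a hypothetical helper name, and df and lookup are small stand-in objects shaped like DF and listofDataFrames. It assumes each list name is also a column name in the data frame and that the key values in lev are unique.

```r
# Hypothetical helper: for each lookup table, find the matching row by key
# and copy every non-key column into the data frame
map_from_list <- function(df, lookup, key = "lev") {
  for (nm in names(lookup)) {
    tab <- lookup[[nm]]
    # Position in `tab` of each row's key; NA when there is no match
    idx <- match(df[[nm]], tab[[key]])
    for (col in setdiff(names(tab), key)) {
      df[[col]] <- tab[[col]][idx]
    }
  }
  df
}

# Small stand-in data; "jazz" has no entry in the genres lookup
lookup <- list(
  genres = data.frame("genres number 12" = c("r", "l"),
                      lev = c("pop", "rock"),
                      check.names = FALSE),
  colors = data.frame("colors number 3" = c("l", "r"),
                      lev = c("red", "orange"),
                      check.names = FALSE)
)
df <- data.frame(genres = c("rock", "pop", "jazz"),
                 colors = c("orange", "red", "red"),
                 values = c(1, 2, 3))

mapped <- map_from_list(df, lookup)
```

Because match() returns positions in the order of the input rows, mapped keeps the rows and the original columns of df in place and appends the new columns at the end. Keys without a match yield NA, similar to all.x = TRUE in merge().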


 