
Create sublist based on another list in R

I have two lists whose elements correspond by position. List 1 contains 4 data frames, and List 2 contains 4 datasets.

The output should be two new lists, one built from list 1 and one built from list 2.

Condition:

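If any data frames in list 1 share the same title, they should be grouped into a sublist, and the elements of list 2 at the same positions should be grouped in the same way.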

This is List 1

df1 <- data.frame(name = "AAA", title = "Test", stringsAsFactors = FALSE)
df2 <- data.frame(name = "BBB", title = "Test 1", stringsAsFactors = FALSE)
df3 <- data.frame(name = "CCC", title = "Test 2", stringsAsFactors = FALSE)
df4 <- data.frame(name = "DDD", title = "Test 1", stringsAsFactors = FALSE)

check <- list(df1, df2, df3, df4)
names(check) <- c("df1", "df2","df3","df4")

This is List 2

df11 <- mtcars
df22 <- iris
df33 <- PlantGrowth
df44 <- ToothGrowth

df <- list(df11, df22, df33, df44)
names(df) <- c("df11", "df22","df33","df44")

Structure of the lists

> str(check)
List of 4
 $ df1:'data.frame':    1 obs. of  2 variables:
  ..$ name : chr "AAA"
  ..$ title: chr "Test"
 $ df2:'data.frame':    1 obs. of  2 variables:
  ..$ name : chr "BBB"
  ..$ title: chr "Test 1"
 $ df3:'data.frame':    1 obs. of  2 variables:
  ..$ name : chr "CCC"
  ..$ title: chr "Test 2"
 $ df4:'data.frame':    1 obs. of  2 variables:
  ..$ name : chr "DDD"
  ..$ title: chr "Test 1"
> str(df)
List of 4
 $ df11:'data.frame':   32 obs. of  11 variables:
  ..$ mpg : num [1:32] 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
  ..$ cyl : num [1:32] 6 6 4 6 8 6 8 4 4 6 ...
  ..$ disp: num [1:32] 160 160 108 258 360 ...
  ..$ hp  : num [1:32] 110 110 93 110 175 105 245 62 95 123 ...
  ..$ drat: num [1:32] 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
  ..$ wt  : num [1:32] 2.62 2.88 2.32 3.21 3.44 ...
  ..$ qsec: num [1:32] 16.5 17 18.6 19.4 17 ...
  ..$ vs  : num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
  ..$ am  : num [1:32] 1 1 1 0 0 0 0 0 0 0 ...
  ..$ gear: num [1:32] 4 4 4 3 3 3 3 4 4 4 ...
  ..$ carb: num [1:32] 4 4 1 1 2 1 4 2 2 4 ...
 $ df22:'data.frame':   150 obs. of  5 variables:
  ..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
  ..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
  ..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
  ..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
  ..$ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ df33:'data.frame':   30 obs. of  2 variables:
  ..$ weight: num [1:30] 4.17 5.58 5.18 6.11 4.5 4.61 5.17 4.53 5.33 5.14 ...
  ..$ group : Factor w/ 3 levels "ctrl","trt1",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ df44:'data.frame':   60 obs. of  3 variables:
  ..$ len : num [1:60] 4.2 11.5 7.3 5.8 6.4 10 11.2 11.2 5.2 7 ...
  ..$ supp: Factor w/ 2 levels "OJ","VC": 2 2 2 2 2 2 2 2 2 2 ...
  ..$ dose: num [1:60] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
> 

df2 and df4 have the same title, so the desired output is:

new_check <- list(df1, list(df2, df4), df3)

new_df <- list(df11, list(df22, df44), df33)

Maybe this is what you want?

df1 <- data.frame(name = "AAA", title = "Test", stringsAsFactors = FALSE)
df2 <- data.frame(name = "BBB", title = "Test 1", stringsAsFactors = FALSE)
df3 <- data.frame(name = "CCC", title = "Test 2", stringsAsFactors = FALSE)
df4 <- data.frame(name = "DDD", title = "Test 1", stringsAsFactors = FALSE)

check <- list(df1, df2, df3, df4)
names(check) <- c("df1", "df2","df3","df4")

df11 <- mtcars
df22 <- iris
df33 <- PlantGrowth
df44 <- ToothGrowth

df <- list(df11, df22, df33, df44)
names(df) <- c("df11", "df22","df33","df44")

library(purrr)
unique_title <- unique(map(check, pluck, "title"))
# Return the elements of x_list at the positions where the title in
# `check` equals x_title (note: reads `check` from the enclosing environment)
group_same_title <- function(x_title, x_list) {
  as.list(x_list[which(map(check, pluck, "title") == x_title)])
}

new_check <- map(unique_title, group_same_title, x_list = check)
new_df <- map(unique_title, group_same_title, x_list = df)
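For comparison, the same grouping can probably be done without purrr. The following is only a sketch in base R, not part of the original answer: it assumes every data frame in `check` has exactly one row with a `title` column, and the names `titles`, `grp`, `new_check_base` and `new_df_base` are made up for illustration.

```r
# Self-contained setup mirroring the two lists from the question
check <- list(
  df1 = data.frame(name = "AAA", title = "Test",   stringsAsFactors = FALSE),
  df2 = data.frame(name = "BBB", title = "Test 1", stringsAsFactors = FALSE),
  df3 = data.frame(name = "CCC", title = "Test 2", stringsAsFactors = FALSE),
  df4 = data.frame(name = "DDD", title = "Test 1", stringsAsFactors = FALSE)
)
df <- list(df11 = mtcars, df22 = iris, df33 = PlantGrowth, df44 = ToothGrowth)

# One title per element of `check` (assumes single-row data frames)
titles <- vapply(check, function(x) x$title, character(1))

# Factor levels in order of first appearance, so groups keep the original order
grp <- factor(titles, levels = unique(titles))

# split() on a list returns one sublist per level; the same grouping factor
# is applied to both lists because they correspond by position
new_check_base <- unname(split(check, grp))
new_df_base    <- unname(split(df, grp))
```

Because a single grouping factor drives both calls to `split()`, the two results stay aligned by construction.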

Here is the new_check

new_check

Output

[[1]]
[[1]]$df1
  name title
1  AAA  Test


[[2]]
[[2]]$df2
  name  title
1  BBB Test 1

[[2]]$df4
  name  title
1  DDD Test 1


[[3]]
[[3]]$df3
  name  title
1  CCC Test 2

Here is the new_df summary

lapply(new_df, summary)

Output

[[1]]
     Length Class      Mode
df11 11     data.frame list

[[2]]
     Length Class      Mode
df22 5      data.frame list
df44 3      data.frame list

[[3]]
     Length Class      Mode
df33 2      data.frame list
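Note that in the output above every group is a list, including groups with a single element, whereas the desired output in the question keeps ungrouped data frames at the top level. If that exact shape matters, a small post-processing step along these lines might help. This is a sketch only; `unwrap_singletons` is a hypothetical helper name, and the `groups` list is built by hand here to keep the example self-contained.

```r
# Hypothetical helper: replace each one-element group with its single element
unwrap_singletons <- function(groups) {
  lapply(groups, function(g) if (length(g) == 1) g[[1]] else g)
}

# Hand-built stand-in for a grouped result such as new_df
groups <- list(
  list(df11 = mtcars),
  list(df22 = iris, df44 = ToothGrowth),
  list(df33 = PlantGrowth)
)

flat <- unwrap_singletons(groups)
is.data.frame(flat[[1]])  # TRUE: the single-element group is now a bare data frame
length(flat[[2]])         # 2: the two-element group is left as a sublist
```

One trade-off: the inner name of an unwrapped element (for example `df11`) is dropped, so keep the wrapped form if those names are needed later.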
