
Subsetting columns within complex data frames within multiple nested lists in R

I am trying to extract specific columns from a large list that contains multiple nested data frames. Here is the structure of my data:

str(ls1)
List of 2
 $ CAT1:'data.frame':   603 obs. of  2 variables:
  ..$ M12:'data.frame': 603 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 598 levels "chr1-105554500-105557462",..: 44 45 46 47 48 49 50 51 52 53 ...
  .. ..$ gene.name  : Factor w/ 551 levels "ENSMUST00000000028-Cdc45",..: 214 184 309 271 267 102 50 315 348 220 ...
  .. ..$ gene.length: int [1:603] 4380 4842 4278 406 357 610 1439 2081 1123 2200 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 1 2 1 1 1 2 2 1 2 1 ...
  .. ..$ read.ct    : int [1:603] 307 91 89 84 204 36 10 37 102 77 ...
  ..$ M14:'data.frame': 603 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 596 levels "chr1-105554500-105557462",..: 45 46 47 48 49 50 51 52 53 54 ...
  .. ..$ gene.name  : Factor w/ 549 levels "ENSMUST00000000028-Cdc45",..: 215 184 312 274 270 103 52 318 351 221 ...
  .. ..$ gene.length: int [1:603] 4380 4842 4278 406 357 610 1439 2081 1123 2200 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 1 2 1 1 1 2 2 1 2 1 ...
  .. ..$ read.ct    : int [1:603] 370 104 112 89 139 45 12 60 93 70 ...
 $ CAT2:'data.frame':   109 obs. of  2 variables:
  ..$ M12:'data.frame': 109 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 80 levels "chr1-121307307-121312200",..: 6 7 8 1 9 10 2 3 11 12 ...
  .. ..$ gene.name  : Factor w/ 80 levels "ENSMUST00000000365-Mcts1",..: 9 69 71 7 44 58 63 17 32 12 ...
  .. ..$ gene.length: int [1:109] 4205 3229 32462 4894 2048 9952 1334 3698 1787 11235 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 2 1 1 1 1 1 1 2 2 2 ...
  .. ..$ read.ct    : int [1:109] 4 2 1 12 18 1 3 1 3 3 ...
  ..$ M14:'data.frame': 109 obs. of  5 variables:
  .. ..$ chr        : Factor w/ 85 levels "chr1-121307307-121312200",..: 7 8 1 9 10 2 11 12 13 3 ...
  .. ..$ gene.name  : Factor w/ 85 levels "ENSMUST00000002291-Paxip1",..: 6 71 4 45 61 65 59 8 9 15 ...
  .. ..$ gene.length: int [1:109] 4205 3229 4894 2048 9952 1334 780 569 11235 1348 ...
  .. ..$ dir        : Factor w/ 2 levels "-","+": 2 1 1 1 1 1 2 2 2 1 ...
  .. ..$ read.ct    : int [1:109] 21 3 6 22 5 2 3 1 1 1 ...

What I want is to extract the gene.name and read.ct columns from each nested data frame (i.e. M12 and M14). I would like the result to look like this:

List of 2
 $ CAT1:'data.frame':   603 obs. of  2 variables:
  ..$ M12:'data.frame': 603 obs. of  2 variables:
  .. ..$ gene.name  : Factor w/ 551 levels "ENSMUST00000000028-Cdc45",..: 214 184 309 271 267 102 50 315 348 220 ...
  .. ..$ read.ct    : int [1:603] 307 91 89 84 204 36 10 37 102 77 ...
  ..$ M14:'data.frame': 603 obs. of  2 variables:
  .. ..$ gene.name  : Factor w/ 549 levels "ENSMUST00000000028-Cdc45",..: 215 184 312 274 270 103 52 318 351 221 ...
  .. ..$ read.ct    : int [1:603] 370 104 112 89 139 45 12 60 93 70 ...
 $ CAT2:'data.frame':   109 obs. of  2 variables:
  ..$ M12:'data.frame': 109 obs. of  2 variables:
  .. ..$ gene.name  : Factor w/ 80 levels "ENSMUST00000000365-Mcts1",..: 9 69 71 7 44 58 63 17 32 12 ...
  .. ..$ read.ct    : int [1:109] 4 2 1 12 18 1 3 1 3 3 ...
  ..$ M14:'data.frame': 109 obs. of  2 variables:
  .. ..$ gene.name  : Factor w/ 85 levels "ENSMUST00000002291-Paxip1",..: 6 71 4 45 61 65 59 8 9 15 ...
  .. ..$ read.ct    : int [1:109] 21 3 6 22 5 2 3 1 1 1 ...

How should I write the code to get the desired output above? I tried the following:

ls2 <- lapply(ls1, function(x) {
  y <- x[c(1:2)][c("gene.name", "read.ct")]
  return(y)
})

But I get this error:

Error in `[.data.frame`(x[c(1:2)], c("gene.name", "read.ct")) : 
  undefined columns selected 

Any help would be greatly appreciated! Thank you.

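It looks like the inner data.frames are stored as columns of the outer data frames. In your attempt, x[c(1:2)] still returns the outer data frame (columns M12 and M14), so the second [ looks for gene.name and read.ct there and fails with "undefined columns selected". Apply the column selection to each inner data frame instead: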

lapply(ls1, function(x) lapply(x, `[`, c("gene.name", "read.ct")))
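
To check this without the original data, here is a minimal sketch on a toy stand-in for ls1. The values, the row counts, and the helpers make_inner, make_outer and keep_cols are all made up for illustration; only the nesting (a list of data frames whose columns M12 and M14 are themselves data frames) is meant to mirror the question. It also shows one possible variant that keeps each CAT element as a data frame, since the nested lapply returns plain lists at that level.

```r
# Toy stand-in for ls1 (made-up values; only the nesting mirrors the question).
make_inner <- function(n) {
  data.frame(
    chr         = paste0("chr", seq_len(n)),
    gene.name   = paste0("gene", seq_len(n)),
    gene.length = seq_len(n) * 100L,
    dir         = rep_len(c("-", "+"), n),
    read.ct     = seq_len(n)
  )
}

make_outer <- function(n) {
  wrapper <- data.frame(row.names = seq_len(n))  # 0 columns, n rows
  wrapper$M12 <- make_inner(n)                   # data frame stored as a column
  wrapper$M14 <- make_inner(n)
  wrapper
}

ls1  <- list(CAT1 = make_outer(3), CAT2 = make_outer(2))
cols <- c("gene.name", "read.ct")

# Nested lapply from the answer: each CAT element becomes a plain list
# of two-column data frames.
ls2 <- lapply(ls1, function(x) lapply(x, `[`, cols))

# Hypothetical helper: same column selection, but each CAT element
# stays a data frame with M12 and M14 as data frame columns.
keep_cols <- function(wrapper, cols) {
  for (nm in names(wrapper)) {
    wrapper[[nm]] <- wrapper[[nm]][cols]
  }
  wrapper
}
ls3 <- lapply(ls1, keep_cols, cols = cols)

str(ls3)
```

If the outer data frame structure does not matter for later steps, the one-liner is enough; the loop variant is only worth it when you want str() to look like the desired output in the question.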

