I have used lapply with biomaRt to extract the homologues for 3 different species. I also need to extract the target IDs for all of the homologues, and I was hoping to use lapply for that as well to make my code more efficient. The code I have so far is below:
Load biomaRt:
library(biomaRt)
Set the species vector
species <- c("hsapiens", "mmusculus", "ggallus")
Make a connection to ensembl for all species
ensembl_hsapiens <- useMart("ensembl",
dataset = "hsapiens_gene_ensembl")
ensembl_mmusculus <- useMart("ensembl",
dataset = "mmusculus_gene_ensembl")
ensembl_ggallus <- useMart("ensembl",
dataset = "ggallus_gene_ensembl")
Get the human genes
hsapien_PC_genes <- getBM(attributes = c("ensembl_gene_id", "external_gene_name"),
filters = "biotype",
values = "protein_coding",
mart = ensembl_hsapiens)
ensembl_gene_ID <- hsapien_PC_genes$ensembl_gene_id
Get the homologues, excluding human (already retrieved) by using species[-1]. The result is named by species so it can be indexed later, e.g. all_homologues[["mmusculus"]]
all_homologues <- lapply(setNames(species[-1], species[-1]), function(s) getBM(attributes = c("ensembl_gene_id",
"external_gene_name",
paste0(s, c("_homolog_ensembl_gene",
"_homolog_associated_gene_name"))),
filters = "ensembl_gene_id",
values = ensembl_gene_ID,
mart = ensembl_hsapiens))
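Indexing the result by species name, as in all_homologues[["mmusculus"]], only works if the list carries names. lapply keeps the names of its input but does not create them from the values. A minimal offline sketch of that behaviour, where fake_homologues is a hypothetical stand-in for the getBM() call and the gene IDs are made up:

```r
species <- c("hsapiens", "mmusculus", "ggallus")
targets <- species[-1]  # drop human, which is the query species

# Hypothetical stand-in for getBM(): returns a small data frame per species
fake_homologues <- function(s) {
  data.frame(ensembl_gene_id = c("ENSG_A", "ENSG_B"),
             homolog = paste0(s, "_gene_", 1:2),
             stringsAsFactors = FALSE)
}

# Naming the input vector makes lapply return a list named by species
all_homologues <- lapply(setNames(targets, targets), fake_homologues)

names(all_homologues)                # "mmusculus" "ggallus"
all_homologues[["ggallus"]]$homolog  # "ggallus_gene_1" "ggallus_gene_2"
```

Without setNames, the same call returns an unnamed list and all_homologues[["ggallus"]] is NULL.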
This is where I run into problems: I don't know how to subset the ensembl_gene_id for each species and use lapply to run it. What I have tried so far is below:
target_id <- list()
target_id <- lapply(species, function(s) getBM(attributes = c("ensembl_gene_id",
"external_gene_name",
"hsapiens_homolog_associated_gene_name",
"hsapiens_homolog_perc_id"),
filters = "ensembl_gene_id",
values = c(all_homologues[[]][["ensembl_gene_id"]]),
mart = get(paste0("ensembl_", s))))
I can get it to work the normal way like this:
target_id[["mmusculus"]] <- getBM(attributes = c("ensembl_gene_id",
"external_gene_name",
"hsapiens_homolog_associated_gene_name",
"hsapiens_homolog_perc_id"),
filters = "ensembl_gene_id",
values = c(all_homologues[["mmusculus"]]$ensembl_gene_id),
mart = ensembl_mmusculus)
target_id[["ggallus"]] <- getBM(attributes = c("ensembl_gene_id",
"external_gene_name",
"hsapiens_homolog_associated_gene_name",
"hsapiens_homolog_perc_id"),
filters = "ensembl_gene_id",
values = c(all_homologues[["ggallus"]]$ensembl_gene_id),
mart = ensembl_ggallus)
But this is not as efficient as getting R to change the species for me automatically.
I have found a solution. It assumes all_homologues is a list named by species, and it looks up each species' mart object (ensembl_mmusculus, ensembl_ggallus) by name:
target_id <- lapply(species[-1], function(s) getBM(attributes = c("ensembl_gene_id",
"external_gene_name",
"hsapiens_homolog_associated_gene_name",
"hsapiens_homolog_perc_id"),
filters = "ensembl_gene_id",
values = all_homologues[[s]][[paste0(s, "_homolog_ensembl_gene")]],
mart = get(paste0("ensembl_", s))))
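When pulling the homologue IDs out of each data frame, note that single brackets return a one-column data frame while double brackets return a plain character vector. It may also be worth dropping empty and duplicate IDs before using them as filter values, since genes without a homologue can come back as blank entries. A small offline sketch with made-up IDs (the data frame below is illustrative, not real Ensembl output):

```r
s <- "mmusculus"

# Illustrative stand-in for one element of all_homologues
homologues <- data.frame(
  ensembl_gene_id = c("ENSG1", "ENSG2", "ENSG3", "ENSG4"),
  mmusculus_homolog_ensembl_gene = c("ENSMUSG1", "", "ENSMUSG1", NA),
  stringsAsFactors = FALSE
)

col <- paste0(s, "_homolog_ensembl_gene")

class(homologues[col])    # "data.frame"
class(homologues[[col]])  # "character"

# Keep only non-missing, non-empty, unique IDs
ids <- homologues[[col]]
ids <- unique(ids[!is.na(ids) & nzchar(ids)])
ids                       # "ENSMUSG1"
```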