
How do I convert a list of Agilent probe IDs to gene symbols using biomaRt and keep NA values?

I'm trying to use biomaRt to convert a list of more than 90k probe IDs to gene symbols, but I'm having problems. Using the getBM function, I can see that only 22k of them have corresponding gene symbols, but the output is a vector of length 22k, and I can't see how it corresponds to the initial probe ID list. Using getBMlist, I can get an output with NA values for the probes that don't match, but the function gives a warning that getBMlist isn't meant for large lists. How do I get an output of 90k gene symbols and NA values?

To get the mapping between probe ID and gene symbol, you need to include the probe ID in the biomaRt attributes.

Here's how I did it for some of my own work with Agilent microarrays:

library(biomaRt)

# Probe IDs to annotate
genes <- c("A_23_P10060", "A_23_P10091", "A_23_P103951", "A_23_P10525", "A_23_P105732", "A_23_P10605", "NM_005325")

# Connect to the Ensembl human gene dataset
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Request the probe ID alongside the symbol so each symbol can be traced back to its probe
agilent.df <- getBM(attributes = c("hgnc_symbol", "efg_agilent_wholegenome_4x44k_v1"),
                    filters = "efg_agilent_wholegenome_4x44k_v1",
                    values = genes, mart = ensembl)

# Left join on the full probe list: probes with no match get NA for hgnc_symbol
genes <- merge(x = data.frame(genes = genes), y = agilent.df,
               by.x = "genes", by.y = "efg_agilent_wholegenome_4x44k_v1", all.x = TRUE)

There is a very good biomaRt tutorial that walks you through the same process. If you run this code, you'll notice that one probe has "" as its hgnc_symbol; that's because it exists in the Ensembl mart but has no designated gene symbol.
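If you want a vector that lines up one-to-one with the original probe list, note that merge() sorts its result by the key column by default and returns extra rows when a probe maps to more than one symbol. A minimal base R sketch of one alternative, using match(), is below. The `mapping` data frame is a hypothetical stand-in for the getBM() result so the example runs without a network connection, and the symbols in it (GENE_A, GENE_B) are placeholders, not real annotations.

```r
# Probe IDs in their original order (a small stand-in for the 90k list).
probes <- c("A_23_P10060", "A_23_P10091", "A_23_P103951", "NM_005325")

# Hypothetical stand-in for a getBM() result: matched probes only, in
# arbitrary order, with "" where no HGNC symbol is assigned.
mapping <- data.frame(
  hgnc_symbol = c("GENE_B", "", "GENE_A"),
  efg_agilent_wholegenome_4x44k_v1 = c("A_23_P10091", "A_23_P103951", "A_23_P10060"),
  stringsAsFactors = FALSE
)

# Treat empty symbols as missing.
mapping$hgnc_symbol[mapping$hgnc_symbol == ""] <- NA

# match() gives, for each probe, the row of its first hit in the mapping
# (NA when there is none), so the result has exactly one entry per probe.
idx <- match(probes, mapping$efg_agilent_wholegenome_4x44k_v1)
symbols <- mapping$hgnc_symbol[idx]

result <- data.frame(probe = probes, hgnc_symbol = symbols, stringsAsFactors = FALSE)
print(result)
```

Because match() keeps only the first hit, a probe that maps to several symbols keeps just one of them here; if you need all of them, the merge() approach is the better fit.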
