I have a list of gene symbols as the rownames of my eset and I want to convert them to Ensembl gene IDs. I used getGene from the biomaRt package, but it returned the same symbol twice for some genes. Here is a small example of my code:
library(biomaRt)
rownames(eset)
[1] "EPC1" "MYO3A" "PARD3" "ATRNL1" "GDF2" "IL10RA" "GAD2" "CCDC6"
getGene(rownames(eset),type='hgnc_symbol',mart)[c(1,9)]
# column 1 is the hgnc_symbol, kept to recheck the matched data
# column 9 is the ensembl_gene_id
hgnc_symbol ensembl_gene_id
1 ATRNL1 ENSG00000107518
2 CCDC6 ENSG00000108091
3 EPC1 ENSG00000120616
4 GAD2 ENSG00000136750
5 GDF2 ENSG00000263761
6 IL10RA ENSG00000110324
7 IL10RA LRG_151
8 MYO3A ENSG00000095777
9 PARD3 ENSG00000148498
As you can see, there are two entries for "IL10RA" in the hgnc_symbol column, but I only had one "IL10RA" in rownames(eset). This causes a problem at the end, when I want to add the Ensembl IDs to fData(eset). How can I solve this problem and get a result like this:
hgnc_symbol ensembl_gene_id
1 ATRNL1 ENSG00000107518
2 CCDC6 ENSG00000108091
3 EPC1 ENSG00000120616
4 GAD2 ENSG00000136750
5 GDF2 ENSG00000263761
6 IL10RA ENSG00000110324
7 MYO3A ENSG00000095777
8 PARD3 ENSG00000148498
Thanks in advance,
I found a solution by using !duplicated on the getGene result. Something like this:
# query the mart, then keep only the first row for each hgnc_symbol
g_All <- getGene(id = rownames(eset), type = 'hgnc_symbol', mart = mart)
g_All <- g_All[!duplicated(g_All[, 1]), ]
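Note that !duplicated keeps whichever row comes first for each symbol, so the result depends on the order in which the rows are returned, and the table still follows the order of the query result rather than the order of rownames(eset). The sketch below is one possible variation, not the only way to do it: it prefers IDs that start with "ENSG" over other entries such as LRG_151 (an LRG record rather than a regular gene ID), and then uses match() to line the IDs up with the original symbols. To keep it runnable without a mart connection, the mapping table is typed in from the output shown in the question; the names genes, g_unique and ensembl_id are made up for this example.

```r
# Mapping table copied from the getGene() output in the question
# (in real use this would be the data frame returned by getGene).
g_All <- data.frame(
  hgnc_symbol = c("ATRNL1", "CCDC6", "EPC1", "GAD2", "GDF2",
                  "IL10RA", "IL10RA", "MYO3A", "PARD3"),
  ensembl_gene_id = c("ENSG00000107518", "ENSG00000108091", "ENSG00000120616",
                      "ENSG00000136750", "ENSG00000263761", "ENSG00000110324",
                      "LRG_151", "ENSG00000095777", "ENSG00000148498"),
  stringsAsFactors = FALSE
)

# Stand-in for rownames(eset), in the order shown in the question.
genes <- c("EPC1", "MYO3A", "PARD3", "ATRNL1", "GDF2", "IL10RA", "GAD2", "CCDC6")

# Put ENSG rows first so that !duplicated() keeps them in preference
# to non-ENSG rows, regardless of the order the rows came back in.
is_ensg  <- grepl("^ENSG", g_All$ensembl_gene_id)
g_sorted <- g_All[order(!is_ensg), ]
g_unique <- g_sorted[!duplicated(g_sorted$hgnc_symbol), ]

# Align the IDs with the original symbol order; symbols with no
# match would come out as NA instead of silently shifting rows.
ensembl_id <- g_unique$ensembl_gene_id[match(genes, g_unique$hgnc_symbol)]
names(ensembl_id) <- genes
print(ensembl_id)
```

With a real eset, the aligned vector could then be attached with something like fData(eset)$ensembl_gene_id <- ensembl_id, since it has one entry per row name and is in the same order.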