goseq package in R: "missing value where TRUE/FALSE needed" error

Question

I am attempting to run a GO analysis in R (I have never done this analysis, so I am trying different packages), and I am struggling to find the problem with my code in the goseq package.

I start with this code, which produces a list of the differentially expressed gene names:

 de.genes <- rownames(res)[ which(res$padj < fdr.threshold & !is.na(res$padj)) ]

Then I try to run this code (based on page 7 of the vignette, https://bioconductor.org/packages/devel/bioc/vignettes/goseq/inst/doc/goseq.pdf ):

 pwf <- nullp(de.genes, "hg38","geneSymbol")

but I get the following error:

 Can't find hg38/geneSymbol length data in genLenDataBase...
 Found the annotation package, TxDb.Hsapiens.UCSC.hg38.knownGene
 Trying to get the gene lengths from it.
 Error in if (matched_frac == 0) { : missing value where TRUE/FALSE needed
 In addition: Warning message:
 In grep(txdbPattern, installedPackages) :
   argument 'pattern' has length > 1 and only the first element will be used

I found this forum thread: https://support.bioconductor.org/p/38580/ that says I need an "indicator variable", but I do not know what this is.

Any help with this error would be greatly appreciated, as would suggestions for other GO packages that are easy to learn. Thanks!

Answer 1

You can check the supported databases. hg38 is listed, but with no gene ID type and no length data, so it is effectively not one of them:

library(org.Hs.eg.db)
library(goseq)

# table of genomes / gene ID types that goseq knows about
supported = supportedOrganisms()
supported[grep("hg38|hg19",supported$Genome),]

   Genome         Id  Id Description Lengths in geneLeneDataBase GO Annotation Available
4    hg19  knownGene  Entrez Gene ID                        TRUE                    TRUE
36   hg19    ensGene Ensembl gene ID                        TRUE                    TRUE
81   hg19 geneSymbol     Gene Symbol                        TRUE                    TRUE
98   hg38                                                  FALSE                    TRUE

You can get a rough idea of the result by using hg19 instead; some genes will be missing or unmatched, but that should be OK. The input also has to be a named binary vector (1 = differentially expressed, 0 = not), with the gene identifiers as its names, for example:

set.seed(111)
allgenes = keys(org.Hs.eg.db,keytype="SYMBOL")
de.genes = rbinom(100,1,0.3)
names(de.genes) = sample(allgenes,100)

It looks like this:

      GALNT5        TPRKB         CD48       OR52R1 LOC105372708 LOC112163649
           0            1            0            0            0            0
LOC105369203 LOC110121115 LOC105377654 LOC105371502 LOC101929964        HPC14
           0            0            0            0            0            0
    IGHD4-17 LOC101927993        HINT1         BCC3      RPL18P3 LOC108281192
           0            0            0            0            0            1
   RNU6-793P          JUN
           0            0
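For contrast, the `de.genes` in the question is a plain character vector of gene names: it has no 0/1 values and no names attribute. The base-R sketch below uses toy gene names and does not touch goseq internals. It only shows the difference in shape between the two inputs, and that an `if()` on a missing value produces this kind of error in general. Where exactly goseq ends up with a missing value is an assumption on my part, not something verified here.

```r
# Shape of the question's input: gene names are the values, there are no names
de.char <- c("JUN", "TPRKB")
is.null(names(de.char))        # TRUE

# Shape described in this answer: 0/1 values, gene names as the names
de.named <- c(JUN = 1, TPRKB = 1, CD48 = 0)
all(de.named %in% c(0, 1))     # TRUE

# Base R behaviour: `if` cannot branch on NA, so it stops with an error
msg <- tryCatch(
  if (NA == 0) "zero" else "nonzero",
  error = function(e) conditionMessage(e)
)
msg
```

So whenever a comparison inside a function evaluates to `NA`, this is the message you get, regardless of the package.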

With the named 0/1 vector `de.genes` built above, this call works:

pwf = nullp(de.genes,"hg19","geneSymbol")
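To tie this back to the question, here is a minimal sketch of building such a vector from a results table instead of random data. It assumes `res` is the DESeq2-style table from the question, with a `padj` column and gene symbols as row names. The small data frame and the `gene.vector` name are made up for illustration. The vector covers every gene in the table, not only the significant ones.

```r
# Toy stand-in for the results table in the question (hypothetical values)
res <- data.frame(
  padj = c(0.001, 0.20, NA, 0.03, 0.80),
  row.names = c("JUN", "HINT1", "CD48", "TPRKB", "GALNT5")
)
fdr.threshold <- 0.05

# 1 = differentially expressed, 0 = not; genes with NA padj count as not DE
gene.vector <- as.integer(!is.na(res$padj) & res$padj < fdr.threshold)
names(gene.vector) <- rownames(res)

gene.vector
table(gene.vector)
```

Passing this vector to `nullp(gene.vector, "hg19", "geneSymbol")` then follows the same pattern as the call above.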
