
Normalize a data.frame using Bioconductor and R

I see several methods to normalize data in the form of AffyBatch objects. Some of the methods used are: threestep, mas5calls, mascallsfilter, justMAS and rma. However, my data is in data.frame format, because I read my expression data from a .txt file.

Can you please let me know which normalization and filtering methods I could use on a data.frame? Or is it possible to convert a data.frame into an AffyBatch object?

When I tried some of the normalization methods, I got the following errors:

dat.eset <- threestep(dat.fp, background.method="RMA.2",
                      normalize.method="quantile", summary.method="median.polish")
Error in threestep(dat, background.method = "RMA.2", normalize.method = "quantile", :
  argument is data.frame threestep requires AffyBatch

dat.mas5 <- mas5calls(dat)
Error in (function (classes, fdef, mtable) :
  unable to find an inherited method for function ‘mas5calls’ for signature ‘"data.frame"’

It is better to ask Bioconductor questions on the Bioconductor mailing list. Many of the methods you mention are not commonly used and are present more or less for historical reasons. Your data.frame has likely already been summarized in some way, which rules out those additional methods. You'll really need to provide more information (about what is actually in your data frame) to get a useful answer; you might check out the extensive vignette for the limma package and the microarray workflow as a starting point.
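If the data frame already holds one summarized value per probe set and sample, as this answer suspects, one possible starting point is to convert it to a numeric matrix, quantile-normalize the columns, and apply a simple variance filter. The following is only a base-R sketch on invented data: the column names (probe_id, s1, s2, s3), the values, and the helper quantile_normalize are all assumptions made for illustration. limma's normalizeBetweenArrays(mat, method = "quantile") offers a maintained implementation of the same idea; the hand-written function here does no special handling of ties or missing values.

```r
# Hypothetical expression table as it might look after read.table();
# the column names and values are invented for illustration.
dat <- data.frame(
  probe_id = c("p1", "p2", "p3", "p4"),
  s1 = c(5, 2, 3, 4),
  s2 = c(4, 1, 4, 2),
  s3 = c(3, 4, 6, 8)
)

# Move the identifiers into row names and keep only the numeric columns.
mat <- as.matrix(dat[, -1])
rownames(mat) <- dat$probe_id

# Quantile normalization sketch: give every column the same distribution,
# namely the row-wise mean of the column-sorted values.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")
  sorted <- apply(x, 2, sort)
  target <- rowMeans(sorted)
  out <- apply(ranks, 2, function(r) target[r])
  dimnames(out) <- dimnames(x)
  out
}

norm <- quantile_normalize(mat)

# Simple non-specific filter: keep the rows whose variance is at least
# the median row variance.
row_var  <- apply(norm, 1, var)
filtered <- norm[row_var >= median(row_var), , drop = FALSE]
```

Whether quantile normalization is appropriate at all depends on what the values in the table represent (raw intensities, log-scale summaries, already normalized data), which is exactly the information the answer asks for.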

Create a "fake" AffyBatch from CEL files of the same array type as in your experiment (get one from GEO and replicate it by copying), then

my.AB <- ReadAffy(filenames = filenames)

or use read.affy from the simpleaffy package, together with a covdesc file.

Then replace the data with

exprs(my.AB) <- my.numeric.data

where my.numeric.data is your data frame, previously converted to a numeric matrix.

Then run MAS5, RMA, etc. as needed.
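The "previously converted to a numeric matrix" step could look roughly like the base-R sketch below. The data frame my.data, its column names, and expected_dim are hypothetical stand-ins; in a real session expected_dim would be dim(exprs(my.AB)).

```r
# Hypothetical data frame read from a .txt file; the first column holds identifiers.
my.data <- data.frame(
  id      = c("a", "b", "c"),
  sample1 = c("10.5", "200", "33"),  # numbers that were read in as text
  sample2 = c(11, 180, 40),
  stringsAsFactors = FALSE
)

# Convert every non-identifier column to numeric; sapply() binds the
# resulting vectors into a matrix with one column per sample.
my.numeric.data <- sapply(my.data[, -1], as.numeric)
rownames(my.numeric.data) <- my.data$id

# Sanity checks to run before an assignment such as
# exprs(my.AB) <- my.numeric.data
# (expected_dim is an assumed placeholder for dim(exprs(my.AB))).
expected_dim <- c(3L, 2L)
stopifnot(is.numeric(my.numeric.data),
          !anyNA(my.numeric.data),
          all(dim(my.numeric.data) == expected_dim))
```

The dimension check matters because, as far as I know, an AffyBatch stores one row per cell on the array, so a table of already summarized probe-set values will generally not have the right shape for this replacement.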
