Variable screening: continuous outcome, categorical predictors, negative p-values

Question

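I am trying to find a good set of categorical variables for predicting a binary outcome, using a large expression dataset (all categorical variables, one per column). Each subject was measured at several, but not all, time points (T1-T7 in the study), and each subject has a unique ID. For this I decided to use MXM::MMPC.timeclass(). However, it produces negative p-values. As far as I know, p-values are probabilities and by definition cannot be negative.

I have tried MMPC.timeclass() and searched the literature extensively for another suitable method, but so far I have found nothing.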

library(MXM)  ## provides MMPC.timeclass()

set.seed(5)
## assume these are longitudinal data, each column is a variable (or feature)
dataset <- matrix( rnorm(400 * 100), ncol = 100 )
id <- rep(1:80, each = 5)  ## 80 subjects
reps <- rep( seq(4, 12, by = 2), 80)  ## 5 time points for each subject

## the dataset contains the regression coefficients of each subject's values on
## reps (which is assumed to be time in this example)
target <- rep(0:1, each = 200)
a <- MMPC.timeclass(target, reps, id, dataset)
summary(a@pvalues)  ## base R call, so the %>% pipe is not needed

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-4.01762 -1.39835 -0.68720 -0.98512 -0.37326 -0.01365

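The expected result would be p-values in the 0-1 range or, even better, some kind of ranking of each variable from the screening procedure. I have previously used VariableScreening::ScreenLD(), but that is for a categorical outcome, so it does not suit these data.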

Answer 1

The answer is that they are log p-values. The documentation will be updated accordingly. For the package author's reply, see https://github.com/mensxmachina/MXM-R-Package/issues/2 .
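
Given that the returned values are log p-values, here is a minimal sketch of how one might get back to the 0-1 scale and obtain a ranking. It assumes the logs are natural logarithms, so that exp() inverts them; this is not confirmed above. It uses the quantiles printed in the question as stand-in data rather than a real a@pvalues vector. The names log_p, p and ranking are illustrative only.

```r
# Illustrative values: the quantiles from the summary() output in the question,
# standing in for a@pvalues (one log p-value per variable).
log_p <- c(-4.01762, -1.39835, -0.68720, -0.37326, -0.01365)

# Assumption: natural logarithms, so exp() maps them back into (0, 1].
p <- exp(log_p)

# Ranking does not need the back-transformation: log is monotone, so the
# most negative log p-value belongs to the most significant variable.
ranking <- order(log_p)

print(round(p, 4))
print(ranking)
```

With real output, log_p would be replaced by a@pvalues. Ranking on the log scale also avoids underflow when some p-values are extremely small.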

Accepted answer, 2019-05-07 02:24:14