R: Using the log-rank test (survdiff)

Question

OK, so I have a dataframe that looks like this:

head(exprs, 21)

   sample expr              ID X_OS
1     BIX high TCGA_DM_A28E_01   26
2     BIX high TCGA_AY_6197_01   88
3     BIX high TCGA_HB_KH8H_01  553
4     BIX  low TCGA_K4_6303_01  256
5     BIX  low TCGA_F4_6703_01  491
6     BIX  low TCGA_Y7_PIK2_01  177
7     BIX  low TCGA_A6_5657_01  732
8     HEF high TCGA_DM_A28E_01   26
9     HEF high TCGA_AY_6197_01   88
10    HEF high TCGA_F4_6703_01  491
11    HEF high TCGA_HB_KH8H_01  553
12    HEF  low TCGA_K4_6303_01  256
13    HEF  low TCGA_Y7_PIK2_01  177
14    HEF  low TCGA_A6_5657_01  732
15    TUR high TCGA_DM_A28E_01   26
16    TUR high TCGA_F4_6703_01  491
17    TUR high TCGA_Y7_PIK2_01  177
18    TUR  low TCGA_K4_6303_01  256
19    TUR  low TCGA_AY_6197_01   88
20    TUR  low TCGA_HB_KH8H_01  553
21    TUR  low TCGA_A6_5657_01  732

Simply put, for each sample there are 7 patients, each with a survival time (X_OS) and an expression level of high or low (expr). In the code below, I wish to take the first sample and run it through the survdiff function, with the outputs going to dfx. However, I'm new to survival analysis and I'm not sure how to use the parameters of the survdiff function. I wish to compare the high and low expression groups for each sample. How can I edit the function expfun to yield the survdiff output I need? In addition, ideally I'd love to get the p-values out of it, but I can work on that in a later step. Thank you!

expfun = function(x) {
  survdiff(Surv(x$X_OS, x$expr))
}

dfx <- pblapply(split(exprs[c("expr", "X_OS")], exprs$sample), expfun)

Answer 1

Try this. I added a proper Surv() call, because you supplied only times and no status argument, and I turned it into a formula with the predictor on the other side of the tilde. Surv expects the status as its second argument, and survdiff expects a formula as its first argument. That means you need to use the regular R regression calling convention, where column names are used as the formula tokens and the dataframe is passed to the data argument. If you had a censoring variable, it would go in as the second Surv argument in place of the 1's that I have in there now.

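library(survival)

expfun = function(x) {
  # No censoring variable is available, so every patient gets status 1 (event observed)
  survdiff(Surv(X_OS, rep(1, nrow(x))) ~ expr, data = x)
}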

dfx <- lapply(split(exprs[c("expr", "X_OS")], exprs$sample), expfun)
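As a rough sketch of the censoring case mentioned above: if the data did carry a status column, it would replace the rep(1, nrow(x)) term. The column name X_EVENT and its values below are made up for illustration (1 = death observed, 0 = censored); nothing like it appears in the question's data.

```r
library(survival)

# Toy data shaped like one sample of exprs.
# X_EVENT is a hypothetical status column: 1 = death observed, 0 = censored.
one_sample <- data.frame(
  expr    = c("high", "high", "high", "low", "low", "low", "low"),
  X_OS    = c(26, 88, 553, 256, 491, 177, 732),
  X_EVENT = c(1, 1, 0, 1, 0, 1, 1)
)

# The status column takes the place of rep(1, nrow(x)) in the Surv() call.
fit <- survdiff(Surv(X_OS, X_EVENT) ~ expr, data = one_sample)
fit
```

With a real status column, the Observed counts in the printed table would count only the actual events, not every patient.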

This is the result from print.survdiff:

> dfx
$BIX
Call:
survdiff(formula = Surv(X_OS, rep(1, nrow(x))) ~ expr, data = x)

          N Observed Expected (O-E)^2/E (O-E)^2/V
expr=high 3        3     2.05     0.446     0.708
expr=low  4        4     4.95     0.184     0.708

 Chisq= 0.7  on 1 degrees of freedom, p= 0.4 

$HEF
Call:
survdiff(formula = Surv(X_OS, rep(1, nrow(x))) ~ expr, data = x)

          N Observed Expected (O-E)^2/E (O-E)^2/V
expr=high 4        4     3.14     0.237      0.51
expr=low  3        3     3.86     0.192      0.51

 Chisq= 0.5  on 1 degrees of freedom, p= 0.475 

$TUR
Call:
survdiff(formula = Surv(X_OS, rep(1, nrow(x))) ~ expr, data = x)

          N Observed Expected (O-E)^2/E (O-E)^2/V
expr=high 3        3     1.75     0.902      1.41
expr=low  4        4     5.25     0.300      1.41

 Chisq= 1.4  on 1 degrees of freedom, p= 0.235

Note that you can view the code that produces the print output with:

getAnywhere(print.survdiff)
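Since the question also asks about p-values, here is one possible way to get them, offered as a sketch and not as the only approach. A survdiff object stores the chi-square statistic in its chisq component, so the p-value can be recomputed with pchisq. The degrees of freedom are taken here as the number of groups minus one, which assumes no strata and that every group has a nonzero expected count. The bix dataframe just re-enters the first sample from the question.

```r
library(survival)

# First sample (BIX) from the question; no status column, so all statuses are 1.
bix <- data.frame(
  expr = c("high", "high", "high", "low", "low", "low", "low"),
  X_OS = c(26, 88, 553, 256, 491, 177, 732)
)
fit <- survdiff(Surv(X_OS, rep(1, nrow(bix))) ~ expr, data = bix)

# The chi-square statistic is stored in fit$chisq; df = number of groups - 1.
df_groups <- length(fit$n) - 1
pval <- pchisq(fit$chisq, df = df_groups, lower.tail = FALSE)
pval
```

The same calculation could be applied to every element of dfx, for example with sapply(dfx, function(f) pchisq(f$chisq, length(f$n) - 1, lower.tail = FALSE)).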

Answer 1: 2, accepted, 2015-03-03 20:31:34
