
How to calculate p value and correlation coefficient for Spearman’s correlation of differential expression data with 40000 permutations?

I have 3 groups; let's call them g1, g2, and g3. Each is the result of an analysis comparing groups of conditions, and g1 looks like this:

                   geneSymbol      logFC         t      P.Value   adj.P.Val         Beta
EXykpF1BRREdXnv9Xk      MKI67 -0.3115880 -5.521186 5.772137e-07 0.008986062 4.3106665
0Tm7hdRJxd9zoevPlA     CCL3L3  0.1708020  4.162115 9.109798e-05 0.508784638 0.6630544
u_M5UdFdhg3lZ.qe64     UBE2G1 -0.1528149 -4.031466 1.430822e-04 0.508784638 0.3354065
lkkLCXcnzL9NXFXTl4     SEL1L3 -0.2138729 -3.977482 1.720517e-04 0.508784638 0.2015945
0Uu3XrB6Bd14qoNeuc      ZFP36  0.1667330  3.944917 1.921715e-04 0.508784638 0.1213335
3h7Sgq2i3sAUkxL_n8      ITGB5  0.3419488  3.938960 1.960886e-04 0.508784638 0.1066896

g2 and g3 look the same, and each has 15568 entries (genes).

How do I calculate the p-value and correlation coefficient for Spearman's correlation on this data using 40000 permutations?

I joined all 3 groups (g1, g2, g3) and extracted only Beta (B).

This gave me the following data frame, with 15568 matching entries:

                     Beta1       Beta2    Beta3
EXykpF1BRREdXnv9Xk -4.970533 -4.752771 -5.404054
0Tm7hdRJxd9zoevPlA -4.862168 -5.147294 -3.909654
u_M5UdFdhg3lZ.qe64 -5.368846 -5.396183 -5.405330
lkkLCXcnzL9NXFXTl4 -4.367704 -4.847795 -5.148524
0Uu3XrB6Bd14qoNeuc -5.286592 -4.949305 -5.278798
3h7Sgq2i3sAUkxL_n8 -4.579528 -2.403240 -4.710600

To calculate Spearman's correlation in R, I could use:

> cor(d,use="pairwise.complete.obs",method="spearman")
        Beta1          Beta2        Beta3
Beta1 1.000000000  0.234171932  0.002474729
Beta2 0.234171932  1.000000000 -0.005469126
Beta3 0.002474729 -0.005469126  1.000000000

Can someone please tell me which method to use to get the correlation coefficient and p-value while taking the number of permutations into account? And am I correct to use Beta for the correlation between these 3 groups?


Here is a hint for getting the correlation coefficient and p-value using the psych package. I'm going to use the mtcars dataset instead of retyping your dataset, as it was not posted in an easy copy-paste (dput(df)) format.

library(psych)

corr.test.col.1to4 <- corr.test(mtcars[1:4], method = "spearman", use = "complete.obs")
names(corr.test.col.1to4)
#  [1] "r"      "n"      "t"      "p"      "se"     "sef"    "adjust" "sym"    "ci"     "ci.adj"
# [11] "Call"

# -------------------------------------------------------------------------
# in your case you probably want to do

#cor.test.beta <- corr.test(d[c("Beta1","Beta2", "Beta3")], method = "spearman", use = "complete.obs")

# -------------------------------------------------------------------------

As you can see from the output of names(corr.test.col.1to4), the result contains, among others:

r: correlation coefficients

n: number of observations

p: p-values

se: standard errors

ci: confidence intervals

So, if you want the correlation coefficients, you can pull the values out using

corr.test.col.1to4$r

#             mpg        cyl       disp         hp
# mpg   1.0000000 -0.9108013 -0.9088824 -0.8946646
# cyl  -0.9108013  1.0000000  0.9276516  0.9017909
# disp -0.9088824  0.9276516  1.0000000  0.8510426
# hp   -0.8946646  0.9017909  0.8510426  1.0000000

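The p-values (entries above the diagonal are adjusted for multiple tests, entries below it are unadjusted)

corr.test.col.1to4$p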

#               mpg          cyl         disp           hp
# mpg  0.000000e+00 2.345144e-12 2.548135e-12 1.017194e-11
# cyl  4.690287e-13 0.000000e+00 1.365266e-13 5.603057e-12
# disp 6.370336e-13 2.275443e-14 0.000000e+00 6.791338e-10
# hp   5.085969e-12 1.867686e-12 6.791338e-10 0.000000e+00

The standard errors

corr.test.col.1to4$se

#             mpg        cyl       disp         hp
# mpg  0.00000000 0.07537483 0.07614303 0.08156289
# cyl  0.07537483 0.00000000 0.06818175 0.07890355
# disp 0.07614303 0.06818175 0.00000000 0.09586909
# hp   0.08156289 0.07890355 0.09586909 0.00000000

The confidence intervals

corr.test.col.1to4$ci

#               lower          r      upper            p
# mpg-cyl  -0.9559077 -0.9108013 -0.8237102 4.690287e-13
# mpg-disp -0.9549362 -0.9088824 -0.8200941 6.370336e-13
# mpg-hp   -0.9477078 -0.8946646 -0.7935207 5.085969e-12
# cyl-disp  0.8557708  0.9276516  0.9643958 2.275443e-14
# cyl-hp    0.8067919  0.9017909  0.9513377 1.867686e-12
# disp-hp   0.7143279  0.8510426  0.9251848 6.791338e-10

You can save the output in a variable and format it further for reporting.
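As a rough sketch of that kind of formatting, here is one way to collect the coefficient and p-value for every pair of columns into a single table using only base R's cor.test. The helper name spearman_table is made up for this example, and the Holm adjustment is my assumption about what you would want to report. Note that cor.test drops missing values per pair, so on data with NAs the numbers may differ slightly from corr.test.

```r
# Pairwise Spearman correlations collected into one long-format table.
# spearman_table is a hypothetical helper name used only for this sketch.
spearman_table <- function(df) {
  pairs <- combn(names(df), 2, simplify = FALSE)   # all unordered column pairs
  rows <- lapply(pairs, function(p) {
    # exact = FALSE: use the asymptotic approximation, since ties rule out exact p-values
    ct <- cor.test(df[[p[1]]], df[[p[2]]], method = "spearman", exact = FALSE)
    data.frame(var1 = p[1], var2 = p[2],
               rho = unname(ct$estimate), p.value = ct$p.value)
  })
  out <- do.call(rbind, rows)
  # Holm correction across all pairs (an assumption; pick what suits your report)
  out$p.adj <- p.adjust(out$p.value, method = "holm")
  out
}

tab <- spearman_table(mtcars[1:4])
print(tab, digits = 3)
```

In your case the call would presumably be spearman_table(d[c("Beta1", "Beta2", "Beta3")]).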

Your second question, "Am I correct to use Beta for the correlation between these 3 groups?", is a valid one. The answer depends on the question you are trying to address. Whatever you decide, state in your report that the correlation was computed on the Beta variable and justify that choice.
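On the 40000 permutations: corr.test reports analytic p-values and does not permute anything. A minimal sketch of a permutation p-value for one pair of columns might look like the following. The helper name perm_spearman, the seed, and the two-sided definition of the p-value are my assumptions, not something taken from your setup. Shuffling one column treats the genes as exchangeable, which may or may not be reasonable for expression data.

```r
# Permutation test for Spearman's rho between two numeric vectors.
# perm_spearman is a hypothetical helper name, not part of any package.
perm_spearman <- function(x, y, n_perm = 40000, seed = 1) {
  ok <- complete.cases(x, y)            # keep only pairs without missing values
  rx <- rank(x[ok])                     # Spearman's rho is Pearson's r on ranks
  ry <- rank(y[ok])
  rho_obs <- cor(rx, ry)                # observed coefficient
  set.seed(seed)                        # reproducible shuffles (resets the global RNG)
  # null distribution: rho after randomly re-pairing the two rank vectors
  rho_perm <- replicate(n_perm, cor(rx, sample(ry)))
  # two-sided p-value; the +1 keeps the estimate from being exactly zero
  p <- (sum(abs(rho_perm) >= abs(rho_obs)) + 1) / (n_perm + 1)
  list(rho = rho_obs, p.value = p, n = length(rx))
}

# Small run on mtcars; raise n_perm to 40000 for the real analysis
res <- perm_spearman(mtcars$mpg, mtcars$cyl, n_perm = 2000)
res$rho
res$p.value
```

With your data this would be something like perm_spearman(d$Beta1, d$Beta2), repeated for each pair. The smallest p-value this can return is 1 / (n_perm + 1), so 40000 permutations cannot resolve anything below about 2.5e-05.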

Hope that helps.
