
Combining and appending columns of different lengths, by row number, R

I'm working with biochemical data from subjects, analysing the results by sex. I have 19 biochemical tests to analyse for each sex, for each of two drugs (haematology and anatomy tests coming later).

For reasons of reproducibility of results and for preventing transcription errors, I am trying to summarise each test into one table. Included in the table output, I need a column for the Dunnett post hoc comparison p-values. Because the Dunnett test compares to the control results, with a control and 3 drug levels I only get 3 p-values. However, I have 4 mean and sd values.

Using ddply to get the mean and sd results (having limited the number of significant figures), I get a dataset that looks like this:

 Sex <- c(rep("F",4), rep("M",4))
 Druglevel <- c(rep(0:3,2))
 Sample <- c(rep(10,8))
 Mean <- c(0.44, 0.50, 0.46, 0.49, 0.48, 0.55, 0.47, 0.57)
 sd <- c(0.07, 0.07, 0.09, 0.12, 0.18, 0.19, 0.13, 0.41)
 Drug1Biochem1 <- data.frame(Sex, Druglevel, Sample, Mean, sd)

I have used glht from the multcomp package to perform the Dunnett tests on the aov object from a standard aov fit. I've extracted the p-values from the glht summary and rounded them to three decimal places. The male and female analyses were run as separate ANOVAs, so I have one set of output for each sex. The female results are:

femaleR <- c(0.371, 0.973, 0.490) 

and the male results are:

 maleR <- c(0.862, 0.999, 0.738)

How can I append a column for the p-values to my original data frame (Drug1Biochem1) so that both femaleR and maleR are in that final column, with row 1 and row 5 of that column empty (i.e. no p-values for the control)?

I wish to output the resulting combination to html, which can be inserted into a Word document so no transcription errors occur. I have set a seed value so that the results of the program are reproducible (when I finally stop debugging).

In summary, I would like a data frame (or table, or whatever I can output to html) that has the following format:

 Sex       Druglevel       Sample     Mean     sd     p-value
 F         0               10         0.44     0.07   
 F         1               10         0.50     0.07   0.371
 F         2               10         0.46     0.09   0.973
 F         3               10         0.49     0.12   0.490
 M         0               10         0.48     0.18   
 M         1               10         0.55     0.19   0.862
 M         2               10         0.47     0.13   0.999
 M         3               10         0.57     0.41   0.738

For each test, I wish to reproduce this exact table. There will always be 4 groups per sex, and there will never be a p-value for the control, which will always be summarised in row 1 (F) and row 5 (M).

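You could try merge. Put the p-values in their own data frame keyed by Sex and Druglevel, then merge it with the original using all=TRUE so that the control rows are kept and get NA: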

dN <- data.frame(Sex=rep(c('M', 'F'), each=3), Druglevel=1:3, 
                 pval=c(maleR, femaleR))

merge(Drug1Biochem1, dN, by=c('Sex', 'Druglevel'), all=TRUE)
#   Sex Druglevel Sample Mean   sd  pval
#1   F         0     10 0.44 0.07    NA
#2   F         1     10 0.50 0.07 0.371
#3   F         2     10 0.46 0.09 0.973
#4   F         3     10 0.49 0.12 0.490
#5   M         0     10 0.48 0.18    NA
#6   M         1     10 0.55 0.19 0.862
#7   M         2     10 0.47 0.13 0.999
#8   M         3     10 0.57 0.41 0.738
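Since the question says the layout is always the same (control first, then three drug levels, females before males), a base R alternative is to fill the column by position and then blank out the NA values for display. This is only a sketch under that assumption: the names `treated` and `display` are made up for illustration, and it relies on the rows staying in the order shown in the question.

```r
# Rebuild the example data from the question.
Drug1Biochem1 <- data.frame(
  Sex       = rep(c("F", "M"), each = 4),
  Druglevel = rep(0:3, 2),
  Sample    = rep(10, 8),
  Mean      = c(0.44, 0.50, 0.46, 0.49, 0.48, 0.55, 0.47, 0.57),
  sd        = c(0.07, 0.07, 0.09, 0.12, 0.18, 0.19, 0.13, 0.41)
)
femaleR <- c(0.371, 0.973, 0.490)
maleR   <- c(0.862, 0.999, 0.738)

# Start with NA everywhere, then fill the non-control rows in row order
# (females first, then males, matching the order of the data frame).
Drug1Biochem1$pval <- NA_real_
treated <- Drug1Biochem1$Druglevel != 0
Drug1Biochem1$pval[treated] <- c(femaleR, maleR)

# Hypothetical display copy: p-values as text, with the control rows blank.
display <- Drug1Biochem1
display$pval <- ifelse(is.na(display$pval), "",
                       formatC(display$pval, format = "f", digits = 3))
print(display, row.names = FALSE)
```

The merge approach is safer if the row order might ever change, because it matches on Sex and Druglevel rather than on position. The `display` data frame can then be handed to whichever HTML table writer you use, for example `knitr::kable(display, format = "html")` or `print(xtable::xtable(display), type = "html")`, assuming one of those packages is installed.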
