How do I arrange the results from the TukeyHSD into a table?
I have the following TukeyHSD output (converted to a data frame):
df <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
comp diff lwr upr p_adj
duration:5-duration:5 2.125000e-01 -0.13653578 0.5615358 0.4873403
speed:5-probability:5 2.250000e-01 -0.12403578 0.5740358 0.4219408
probability:10-probability:5 3.875000e-01 0.03846422 0.7365358 0.0206341
duration:10-duration:5 6.875000e-01 0.33846422 1.0365358 0.0000020
speed:10-probability:5 2.250000e-01 -0.12403578 0.5740358 0.4219408
probability:60-probability:5 1.250000e-02 -0.31064434 0.3356443 0.9999974
probability:10-probability:60 1.250000e-02 -0.31064434 0.3356443 0.9999974
probability:10-speed:5 1.250000e-02 -0.31064434 0.3356443 0.9999974
")
I would like something like the following:
               duration5  probability5  speed5  duration10  probability10  speed10
duration5
probability5   p
speed5         p          p
duration10     p          p             p
probability10  p          p             p       p
speed10        p          p             p       p           p
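I have already tried a similar solution proposed here.
I changed the code to split on the hyphen "-", but it does not work (see below). Why does it fail? Is there an alternative approach?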
transformTable <- function(tbl, metric) {
  # Takes a table of TukeyHSD output metrics
  # and transforms them into a pairwise comparison matrix.
  # tbl is assumed to be a data.frame or tibble,
  # metric is assumed to be a character string
  # giving the variable name of the metric in question
  # (here: "diff", "lwr", "upr", or "p_adj")
  tbl <- tbl %>%
    # Split comparison into individual variables
    mutate(
      Var1 = sub("\\-.*", "", comp), # before hyphen
      Var2 = sub(".*-", "", comp)) # after hyphen%>%
    # Only keep relevant fields
    select(Var1, Var2, matches(metric)) %>%
    # Filter out NA's
    filter(!is.na(metric)) %>%
    # Make into "wide" format using Var2
    spread_(key = 'Var2', value = metric, fill = '')
  # Let's change the row names to Var1
  row.names(tbl) <- tbl$Var1
  # And drop the Var1 column
  tbl <- select(tbl, -Var1)
  return(tbl)
}
transformTable(df,'p_adj')
A quick and dirty approach:
library(dplyr)   # provides %>%, mutate(), filter() and select() used below
library(tidyr)
library(stringr)
df <- df %>%
separate(col = comp, into = c('x', 'y'), sep = '-') %>%
mutate(x = str_remove(x, ":")) %>%
mutate(y = str_remove(y, ":")) %>%
select(x, y, p_adj)
# All group labels that appear on either side of a comparison
lv <- unique(c(df$x, df$y))
# Empty square table with one row and one column per group
df1 <- data.frame(matrix(nrow = length(lv), ncol = length(lv)))
colnames(df1) <- lv
rownames(df1) <- lv
for (i in seq_along(lv)) {
  for (j in seq_along(lv)) {
    # Look up the adjusted p-value for the pair (row i, column j), if present
    value <- (df %>% filter(x == lv[i]) %>% filter(y == lv[j]) %>% select(p_adj))$p_adj
    if (length(value) != 0) {
      # Fill both triangles so the table is symmetric
      df1[i, j] <- value
      df1[j, i] <- value
    }
  }
}
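The nested loops above can also be replaced by matrix indexing, which needs no extra packages. What follows is only a minimal base-R sketch, not a definitive implementation: `pairwise_matrix` and `tukey_df` are hypothetical names introduced for illustration, the three rows are copied from the question's data, and group names are assumed to contain no hyphen themselves.

```r
# Hypothetical helper (not from the original posts): builds a symmetric
# matrix of `value` from comparison labels of the form "a-b".
pairwise_matrix <- function(comp, value) {
  # Split each label at the hyphen; assumes group names are hyphen-free,
  # so every label yields exactly two parts.
  parts <- strsplit(comp, "-", fixed = TRUE)
  # Drop the ":" so "speed:5" becomes "speed5", as in the desired table.
  x <- gsub(":", "", vapply(parts, `[`, character(1), 1), fixed = TRUE)
  y <- gsub(":", "", vapply(parts, `[`, character(1), 2), fixed = TRUE)
  lv <- unique(c(x, y))
  m <- matrix(NA_real_, nrow = length(lv), ncol = length(lv),
              dimnames = list(lv, lv))
  # A two-column character matrix indexes cells by (row name, column name).
  m[cbind(x, y)] <- value
  m[cbind(y, x)] <- value  # mirror into the other triangle
  m
}

# Three rows taken from the question's data, for illustration only.
tukey_df <- data.frame(
  comp  = c("speed:5-probability:5",
            "probability:10-probability:5",
            "probability:10-speed:5"),
  p_adj = c(0.4219408, 0.0206341, 0.9999974),
  stringsAsFactors = FALSE
)

p_mat <- pairwise_matrix(tukey_df$comp, tukey_df$p_adj)
print(p_mat)
```

Pairs that never appear in the Tukey output stay `NA`, and if the same pair is listed twice the later row wins. The same call works for the other columns, for example `pairwise_matrix(tukey_df$comp, tukey_df$diff)` on a data frame that has a `diff` column.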