简体   繁体   English

使用 ggplot2/plotly 绘制“楼梯”

[英]Plotting “staircases” using ggplot2/plotly

I am trying to follow this tutorial over here:https://rviews.rstudio.com/2017/09/25/survival-analysis-with-r/ (bottom of the page).我正在尝试在此处遵循本教程:https://rviews.rstudio.com/2017/09/25/survival-analysis-with-r/ (页面底部)。

I have slightly modified the code for this tutorial and have plotted the "staircases" (ie "survival functions", in the below picture "red", "blue", "green") corresponding to 3 of the observations in the data:我稍微修改了本教程的代码,并绘制了对应于数据中 3 个观察值的“楼梯”(即“生存函数”,在下图中的“红色”、“蓝色”、“绿色”):

 library(survival)
    library(dplyr)
    library(ranger)
    library(data.table)
library(ggplot2)
library(plotly)
    
a = na.omit(lung)
a$ID <- seq_along(a[,1])

r_fit <- ranger(Surv(time,status) ~ age + sex + ph.ecog + ph.karno + pat.karno + meal.cal + wt.loss, data = a, mtry = 4, 
importance = "permutation", splitrule = "extratrees", verbose = TRUE)

death_times <- r_fit$unique.death.times
surv_prob  <-data.frame(r_fit$survival)
avg_prob <- sapply(surv_prob, mean)

plot(r_fit$unique.death.times, r_fit$survival[1,], type = "l", ylim = c(0,1), col = "red", xlab = "Days", ylab = "survival", main = "Survival Curves")

new = a[1:3,]

pred <- predict(r_fit, new, type = 'response')$survival
pred <- data.table(pred)
colnames(pred) <- as.character(r_fit$unique.death.times)

plot(r_fit$unique.death.times, pred[1,], type = "l", col = "red")

lines(r_fit$unique.death.times, r_fit$survival[2,], type = "l", col = "green")

lines(r_fit$unique.death.times, r_fit$survival[3,], type = "l", col = "blue")

在此处输入图像描述

From here, I want to make the above plot as "interactive".从这里开始,我想将上述 plot 设置为“交互式”。 I want to make so that when you move the mouse over one of the curves:我想这样当您将鼠标移到其中一条曲线上时:

  1. The "properties" belonging to that curve (from object "a") hover (eg ID, age, sex, ph.ecog, etc.)属于该曲线的“属性”(来自 object “a”)hover(例如 ID、年龄、性别、ph.ecog 等)

  2. In the same "hover box" from 1), also show the x-coordinate (r_fit$unique) and the y-coordinate (from "pred") for each position the mouse is hovering over (for a given curve)在来自 1) 的同一个“悬停框”中,还显示鼠标悬停的每个 position 的 x 坐标(r_fit$unique)和 y 坐标(来自“pred”)(对于给定曲线)

My plan was first to take the "grob" object and convert it to a "ggplot" object, and then to convert the "ggplot" object into a "plotly" object: My plan was first to take the "grob" object and convert it to a "ggplot" object, and then to convert the "ggplot" object into a "plotly" object:

 grob= plot(r_fit$unique.death.times, pred[1,], type = "l", col = "red")
basic_plot = ggpubr::as_ggplot(grob)

But when I try to inspect the "basic_plot", it shows up as "NULL".但是当我尝试检查“basic_plot”时,它显示为“NULL”。

 ggplot(f)
Error: `data` must be a data frame, or other object coercible by `fortify()`, not an S3 object with class gg/ggplot

Had this worked, I would have eventually converted the ggplot object to plotly:如果这可行,我最终会将 ggplot object 转换为 plotly:

plotly_plot = ggplotly(final_plot)

How to make this interactive plot?如何制作这个交互式 plot?

I am trying to achieve something similar to this: https://plotly.com/python/v3/ipython-notebooks/survival-analysis-r-vs-python/ (towards the bottom of the page, plot with the title "lifespan of different tumor DNA profiles")我正在尝试实现与此类似的目标: https://plotly.com/python/v3/ipython-notebooks/survival-analysis-r-vs-python/ (靠近页面底部,plot,标题为“lifespan”不同的肿瘤 DNA 图谱")

在此处输入图像描述

(Please Note: I am working with a computer that has no USB port or Internet connection, only R with a few pre-installed libraries... I do not have "ggplotify" or "survminer") (请注意:我正在使用没有 USB 端口或 Internet 连接的计算机,只有 R 和一些预安装的库......我没有“ggplotify”或“survminer”)

The issue is that when you draw a plot in base graphics draw directly on a device .问题是,当您在base图形中绘制 plot 时,直接在设备上绘制 The line of your code grob= plot(r_fit$unique.death.times, pred[1,], type = "l", col = "red") creates a NULL object (unlike ggplot which would return a plot object). The line of your code grob= plot(r_fit$unique.death.times, pred[1,], type = "l", col = "red") creates a NULL object (unlike ggplot which would return a plot object).

You can make the plot directly in ggplot (there are a few ways of doing this but I've done a simple example bolow) and convert it with ggplotly :您可以直接在 ggplot 中制作ggplot (有几种方法可以做到这一点,但我已经做了一个简单的示例)并使用ggplotly进行转换:

fig_dat <- data.frame(time = r_fit$unique.death.times,
                      pred_1 = t(pred[1,]),
                      fit_1 = r_fit$survival[2,],
                      fit_2 = r_fit$survival[3,])

fig_dat_long <- fig_dat %>% pivot_longer(-time, names_to = "pred_fit", values_to = "pred_fit_values")

gg_p <- ggplot(fig_dat_long, aes(x = time, y = pred_fit_values, colour = pred_fit)) +
  geom_line()

ggplotly(gg_p)

Alternatively you could also plot in plotly directly:或者,您也可以直接在 plotly 中的plotly

fig_dat <- data.frame(time = r_fit$unique.death.times,
                      pred_1 = t(pred[1,]),
                      fit_1 <- r_fit$survival[2,],
                      fit_2 <- r_fit$survival[3,])


fig <- plot_ly(fig_dat, x = ~time, y = ~pred_1, name = 'pred1', type = 'scatter', mode = 'lines')
fig <- fig %>% add_trace(y = ~fit_1, name = 'fit 1', mode = 'lines') 
fig <- fig %>% add_trace(y = ~fit_2, name = 'fit 2', mode = 'lines')

Happy Christmas:)圣诞快乐:)

UPDATE:更新:

## make dataframe of variables to plot:
fig_dat <- data.frame(time = r_fit$unique.death.times,
                      pred_1 = t(pred[1,]),
                      fit_1 = r_fit$survival[2,],
                      fit_2 = r_fit$survival[3,])

# to include the variables from 'a' we need to put them in the same dataframe for plotting
# Trouble is they are different lengths the predicted data are a little shorter
dim(fig_dat)
dim(a)
# We can join the two with inner join: https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right
fig_dat_join <- inner_join(fig_dat, a, by = "time")
dim(fig_dat_join)
# now they are equal dimensions and joined together but we have a slight issue with duplicate values:
sort(a$time) # we can see here that time 53 appears twice for example
a$time[duplicated(a$time)] # this tells us which values in time are duplicated
sort(death_times) # some
death_times[duplicated(death_times)] #none
# because of the duplicates some combinations are returned: see rows 9 and 10
fig_dat_join 

# I'm not familiar with the analysis so I'm not sure what the correct way in this case is to remove the duplicates in 'a' so that the dimentions of 'a' match the output of 'r-fit'
# You might need to look into that but it might not make much difference to the visualisation

# I've not used plotly a great deal so there is probably a better way of doing this but I've done my best and included the links as comments: https://plotly-r.com/overview.html
# labels: https://plotly.com/r/figure-labels/
x_labs <- list(
  title = "Time")

y_labs <- list(
  title = "y axis")

# T include extra info in hovertext: I https://stackoverflow.com/questions/49901771/how-to-set-different-text-and-hoverinfo-text

p1 <- plot_ly(data = fig_dat_join,
              x = ~time,
              # text = ~n,
              # textposition = "auto",
              # hoverinfo = "text",
              hovertext = paste("Time :", fig_dat_join$time,
                                "<br> Sex :", fig_dat_join$sex,
                                "<br> Inst :", fig_dat_join$inst,
                                "<br> ID :", fig_dat_join$ID,
                                "<br> Age :", fig_dat_join$age
                                )) %>% 
  add_trace(y = ~pred_1,
            type = 'scatter',
            name = 'Predictor 1',
            mode = 'lines') %>% 
  add_trace( y = ~fit_1,
            type = 'scatter',
            name = 'Fit 1',
            mode = 'lines') %>% 
  add_trace( y = ~fit_2,
             type = 'scatter',
             name = 'Fit 2',
             mode = 'lines') %>% 
  layout(xaxis = x_labs, yaxis = y_labs)

p1


UPDATE 2:更新 2:

I was making dataframe a match up with the unique.death.times by using left_join() above.我通过使用上面的left_join()使 dataframe a unique.death.times匹配。 If you don't need that then we could just move the hovertext code into each add_trace ?如果您不需要,那么我们可以将hovertext代码移动到每个add_trace吗?

fig_dat <- data.frame(time = r_fit$unique.death.times,
                      pred_1 = t(pred[1,]),
                      fit_1 = r_fit$survival[2,],
                      fit_2 = r_fit$survival[3,])


p2 <- plot_ly(data = fig_dat,
              x = ~time,
              # text = ~n,
              # textposition = "auto",
              hoverinfo = "text"
) %>% 
  add_trace(y = ~pred_1,
            type = 'scatter',
            name = 'Predictor 1',
            mode = 'lines',
            hovertext = paste("Time :", fig_dat$time,
                              "<br> y axis :", fig_dat$pred_1,
                              "<br> Sex :", a$sex[1],
                              "<br> Inst :", a$inst[1],
                              "<br> ID :", a$ID[1],
                              "<br> Age :", a$age[1]
            )) %>% 
  add_trace( y = ~fit_1,
             type = 'scatter',
             name = 'Fit 1',
             mode = 'lines',
             hovertext = paste("Time :", fig_dat$time,
                               "<br> y axis :", fig_dat$fit_1,
                               "<br> Sex :", a$sex[2],
                               "<br> Inst :", a$inst[2],
                               "<br> ID :", a$ID[2],
                               "<br> Age :", a$age[2]
             )) %>% 
  add_trace( y = ~fit_2,
             type = 'scatter',
             name = 'Fit 2',
             mode = 'lines',
             hovertext = paste("Time :", fig_dat$time,
                               "<br> y axis :", fig_dat$fit_2,
                               "<br> Sex :", a$sex[3],
                               "<br> Inst :", a$inst[3],
                               "<br> ID :", a$ID[3],
                               "<br> Age :", a$age[3]
             )) %>% 
  layout(xaxis = x_labs, yaxis = y_labs)

p2

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM