Create secondary axis for stacked barplot with line trace in R plotly
I am implementing a function that builds a stacked percentage bar plot with an additional line trace. I can create the stacked bar plot using either a wide-form or a long-form dataframe. The two code sections below produce plots that look essentially the same:
Using a wide-form dataframe:
library(dplyr)
library(tidyr)  # provides pivot_longer(), used in the long-form code
library(plotly) # install.packages("plotly")
# simple example data for SO post
some_dates = c(as.Date('2021-01-01'), as.Date('2021-02-01'),
as.Date('2021-03-01'), as.Date('2021-04-01'))
bar1 = c(0.25,0.45,0.65,0.75)
bar2 = c(0.60,0.40,0.20,0.10)
bar3 = c(0.15,0.15,0.15,0.15)
line_data = c(0,1,2,3)
# wide form dataframe
df_bars = data.frame("db" = some_dates, "b1" = bar1,
"b2" = bar2, "b3" = bar3)
df_line = data.frame("line_dates" = some_dates, "line" = line_data)
plot_so1 = plot_ly(x = df_bars$db,
y = df_bars[[colnames(df_bars)[2]]],
type = 'bar',
name = colnames(df_bars)[2]) %>%
layout(title = 'My plot title',
xaxis = list(title = 'db'),
yaxis = list(title = 'CatProportions'),
barmode = 'stack',
showlegend = TRUE)
# Now loop through the rest of the columns except for the ones already used.
# This is done because "in the wild", the plot is being built in a function
# that has data which is passed to it so the number and names of the columns
# that are used to build the plot are not known in advance.
# Loop over the columns of df_bars (not the number of dates) so every category is added.
for (col_index in 3:ncol(df_bars)) {
plot_so1 =
add_trace(plot_so1,
x = df_bars$db,
y = df_bars[[colnames(df_bars)[col_index]]],
name = colnames(df_bars)[col_index])
}
Using a long-form dataframe:
## long form of dataframe #########################
df_bars_long = df_bars %>%
pivot_longer(!db, names_to = "Categories", values_to = "CatProportions")
# build same plot from long form dataframe
plot_so2 = plot_ly(data = df_bars_long,
x = ~db, y = ~CatProportions,
color = ~Categories,
type = "bar") %>%
layout(barmode = "stack")
## above works, now try to add the line trace #####
plot_so2 = plot_ly(data = df_bars_long,
x = ~db, y = ~CatProportions,
color = ~Categories,
type = "bar") %>%
# add_trace(x = df_line$line_dates,
# y = df_line$line,
# type = 'scatter', mode = 'lines', name = 'my line',
# line = list(color = '#000000')) %>%
layout(title = 'My plot title',
xaxis = list(title = 'db'),
yaxis = list(title = 'CatProportions'),
barmode = 'stack',
showlegend = TRUE)
I understand that building plots like this from the long form is best practice. I show both methods because I want to add a line trace using data from another dataframe, which has one column for the x values and one column for the y values. So far I have only been able to add this trace with the wide form, by appending the following code segment to the wide-form code:
plot_so1 = add_trace(plot_so1,
x = df_line$line_dates,
y = df_line$line,
type = 'scatter', mode = 'lines', name = 'my line',
line = list(color = '#000000'))
This produces the plot with the line trace, drawn against the same y-axis as the bars.
My primary question is: how do I create a secondary y-axis for the line trace in the wide-form dataframe code? My secondary question is: can the final plot I'm looking for be built from the long-form dataframe and, if so, how?
This post got me started on this problem:
Stacked Bar Chart with Line Chart not working in R with plotly
but it didn't involve a stacked barplot, which seems to make life more interesting.
This part is unchanged:
some_dates = c(as.Date('2021-01-01'), as.Date('2021-02-01'),
as.Date('2021-03-01'), as.Date('2021-04-01'))
bar1 = c(0.25,0.45,0.65,0.75)
bar2 = c(0.60,0.40,0.20,0.10)
bar3 = c(0.15,0.15,0.15,0.15)
line_data = c(0,1,2,3)
# wide form dataframe
df_bars = data.frame("db" = some_dates, "b1" = bar1,
"b2" = bar2, "b3" = bar3)
df_line = data.frame("line_dates" = some_dates, "line" = line_data)
To include a line in the plot, we need the y value of the line at each date, so we merge df_line with df_bars_long using merge(). After the merge, each line value is repeated once per category, so within each date all but the first copy are set to NA:
df_bars_long = df_bars %>%
pivot_longer(!db, names_to = "Categories", values_to = "CatProportions") %>%
merge(df_line, by.y = "line_dates", by.x = "db") %>%
group_by(db) %>%
dplyr::mutate(line = ifelse(duplicated(line), NA, line))
> df_bars_long
db Categories CatProportions line
1 2021-01-01 b1 0.25 0
2 2021-01-01 b2 0.60 NA
3 2021-01-01 b3 0.15 NA
4 2021-02-01 b1 0.45 1
.. .. .. .. ..
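As a quick sanity check on the merge step, here is a small sketch that rebuilds the merged frame from the example data and counts the non-NA line values per date. The names `merged` and `per_date` are made up for this illustration; the expectation is that every date keeps exactly one line value, which is what lets the line trace draw a single point per date.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
some_dates <- as.Date(c("2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01"))
df_bars <- data.frame(db = some_dates,
                      b1 = c(0.25, 0.45, 0.65, 0.75),
                      b2 = c(0.60, 0.40, 0.20, 0.10),
                      b3 = c(0.15, 0.15, 0.15, 0.15))
df_line <- data.frame(line_dates = some_dates, line = c(0, 1, 2, 3))

# Same pipeline as above: reshape, merge, blank out repeated line values per date
merged <- df_bars %>%
  pivot_longer(!db, names_to = "Categories", values_to = "CatProportions") %>%
  merge(df_line, by.x = "db", by.y = "line_dates") %>%
  group_by(db) %>%
  mutate(line = ifelse(duplicated(line), NA, line)) %>%
  ungroup()

# Count how many line values survive for each date
per_date <- merged %>%
  group_by(db) %>%
  summarise(n_line_values = sum(!is.na(line)))

print(per_date)
```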
Then, the plot:
plot_so2 <- plot_ly(data = df_bars_long,
x = ~db, y = ~CatProportions,
color = ~Categories,
type = "bar") %>%
add_lines(y = ~line,
name = "my line",
line = list(color = '#000000'),
showlegend = TRUE,
yaxis = "y2") %>%
layout(title = 'My plot title',
xaxis = list(title = 'db'),
yaxis = list(title = 'CatProportions'),
barmode = 'stack',
yaxis2 = list(overlaying = "y",
side = "right", range = range(na.omit(df_bars_long$line))))
> plot_so2
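The code above covers the long form. For the wide-form code from the question, the same pairing of `yaxis = "y2"` on the line trace and a `yaxis2` entry in `layout()` should work as well. What follows is only a minimal sketch under that assumption, using the example data from the question; `plot_wide` is a name invented here, and the `yaxis2` title is an illustrative choice.

```r
library(plotly)

# Example data copied from the question
some_dates <- as.Date(c("2021-01-01", "2021-02-01", "2021-03-01", "2021-04-01"))
df_bars <- data.frame(db = some_dates,
                      b1 = c(0.25, 0.45, 0.65, 0.75),
                      b2 = c(0.60, 0.40, 0.20, 0.10),
                      b3 = c(0.15, 0.15, 0.15, 0.15))
df_line <- data.frame(line_dates = some_dates, line = c(0, 1, 2, 3))

# Start from an empty plot and add one bar trace per category column,
# whatever the number and names of those columns are
plot_wide <- plot_ly()
for (col_name in setdiff(colnames(df_bars), "db")) {
  plot_wide <- add_trace(plot_wide,
                         x = df_bars$db,
                         y = df_bars[[col_name]],
                         type = "bar",
                         name = col_name)
}

# Add the line trace and assign it to the secondary y-axis ("y2"),
# then describe that axis in the layout as an overlay on the right side
plot_wide <- plot_wide %>%
  add_trace(x = df_line$line_dates,
            y = df_line$line,
            type = "scatter", mode = "lines", name = "my line",
            line = list(color = "#000000"),
            yaxis = "y2") %>%
  layout(title = "My plot title",
         barmode = "stack",
         showlegend = TRUE,
         xaxis = list(title = "db"),
         yaxis = list(title = "CatProportions"),
         yaxis2 = list(title = "line", overlaying = "y", side = "right"))

plot_wide
```

Because the bar traces are added by column name, this should carry over to a function that receives a dataframe with an unknown number of category columns, as long as the date column name is known.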