How to plot categorical data with Bokeh using boxplots?
I have a simple dataframe with 5 columns and 10 rows of data that I would like to display as a single diagram using boxplots.
df_times[:5]
0 1 2 3 4
0 211.635771 45.411404 17.134416 20.062214 185.544333
1 234.500053 49.166657 17.052492 17.056290 205.531887
2 234.224389 49.342572 17.868082 15.981429 193.489293
3 221.880990 47.189842 17.054071 17.052869 198.318657
4 223.811611 49.991753 17.052005 17.590466 219.541593
The drawing code is as follows:
colors = ["red", "olive", "darkred", "goldenrod", "skyblue", "orange", "salmon"]
charts_times = figure(plot_width=900, plot_height=350, title='Query Runtime')
base, lower, upper, stds = [], [], [], []
for i, table_name in enumerate([x[26:] for x in tables_to_be_analyzed]):
    run_time = df_times[i]
    run_time_mean = run_time.mean()
    run_time_std = run_time.std()
    stds.append(run_time_std)
    lower.append(run_time_mean - run_time_std)
    upper.append(run_time_mean + run_time_std)
    base.append(table_name)
    color = colors[i % len(colors)]
    charts_times.circle(x=table_name, y=run_time, color=color, size=6)
charts_times.title.text = 'Query time ' + ' std: ' + str([round(x,1) for x in stds])
source_times = ColumnDataSource(data=dict(base=base, lower=lower, upper=upper))
charts_times.add_layout(
    Whisker(
        source=source_times, base="base", upper="upper",
        lower="lower", line_width=1.5))
show(charts_times)
The only problem is that I am not sure how to use categorical data with boxplots, because as it stands Bokeh just throws an error:
ERROR:bokeh.core.validation.check:E-1001 (BAD_COLUMN_NAME): Glyph refers to nonexistent column name: orig_parquet_0 [renderer: GlyphRenderer(id='f3703748-b7f9-43f6-807c-a8a24bdaab32', ...)]
ERROR:bokeh.core.validation.check:E-1001 (BAD_COLUMN_NAME): Glyph refers to nonexistent column name: test_orc_opt_3 [renderer: GlyphRenderer(id='b86d8724-ff66-45ac-8a09-eaa70cadf348', ...)]
ERROR:bokeh.core.validation.check:E-1001 (BAD_COLUMN_NAME): Glyph refers to nonexistent column name: test_orc_opt_4 [renderer: GlyphRenderer(id='ee5f644e-b334-4bce-8184-9f7d8da8bba1', ...)]
ERROR:bokeh.core.validation.check:E-1001 (BAD_COLUMN_NAME): Glyph refers to nonexistent column name: test_orc_opt_6 [renderer: GlyphRenderer(id='55eb8c19-344f-4010-b254-222330b76203', ...)]
ERROR:bokeh.core.validation.check:E-1001 (BAD_COLUMN_NAME): Glyph refers to nonexistent column name: test_parquet_opt_7 [renderer: GlyphRenderer(id='81ad57e4-4e01-4762-9a6e-0ce57cb51dd7', ...)]
You pass the name of a column for x:
charts_times.circle(x=table_name, ...)
But you have not created a ColumnDataSource and passed it as the source argument to circle, so Bokeh treats the string given for x as a column name and cannot find it. You need to create a CDS with column names like "orig_parquet_0".
Note that you cannot mix and match passing column names and literal lists to the same glyph at the same time, i.e.
p.circle(x="colname", y=[1,2,3,...])  # not possible
If any of the data for a glyph is referred to "by name" from a CDS, then all the data has to be in a CDS:
p.circle(x="xname", y="yname", source=source)
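Applied to data shaped like the question's, the all-in-a-CDS form could look like the sketch below. It is a minimal sketch rather than the asker's actual code: the run times are rounded sample values, `long_format` and `make_scatter_plot` are hypothetical helper names, and passing the table names as `x_range` assumes the x-axis is meant to be categorical. It uses `scatter` instead of `circle` on the assumption that a recent Bokeh release is installed.

```python
def long_format(times_by_table):
    """Flatten {table: [run times]} into two equal-length columns.

    Each run time gets its table name repeated next to it, so both
    columns can live in a single ColumnDataSource.
    """
    tables, run_times = [], []
    for name, times in times_by_table.items():
        tables.extend([name] * len(times))
        run_times.extend(times)
    return {"table": tables, "run_time": run_times}


def make_scatter_plot(times_by_table):
    # Bokeh is imported here so the data helper above has no dependencies.
    from bokeh.models import ColumnDataSource
    from bokeh.plotting import figure

    source = ColumnDataSource(data=long_format(times_by_table))
    # Passing the category names as x_range makes the x-axis categorical.
    p = figure(x_range=list(times_by_table), title="Query Runtime")
    # Both x and y are column names, and the source is passed explicitly.
    p.scatter(x="table", y="run_time", source=source, size=6)
    return p


# Illustrative sample values, not the asker's full measurements.
sample = {
    "orig_parquet_0": [211.6, 234.5, 234.2],
    "test_orc_opt_3": [45.4, 49.2, 49.3],
}
columns = long_format(sample)
```

Passing the result of `make_scatter_plot(sample)` to `show` (imported from `bokeh.plotting`) should then draw one column of points per table name.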
Alternatively, you can pass all the data as literal arrays or lists and not pass a source at all (you just can't mix and match):
p.circle(x=[1,2,...], y=[4,5,...])  # also OK
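For the loop in the question, the literal-list route means repeating the table name once per run time instead of passing it as a bare string. The following is a hedged sketch under assumptions: `whisker_columns` and `make_whisker_plot` are hypothetical names, the input is a plain dict rather than the asker's dataframe, and `statistics.stdev` (sample standard deviation) stands in for the pandas `.std()` call.

```python
from statistics import mean, stdev


def whisker_columns(times_by_table):
    """Compute mean +/- one sample standard deviation per table."""
    base, lower, upper = [], [], []
    for name, times in times_by_table.items():
        m, s = mean(times), stdev(times)
        base.append(name)
        lower.append(m - s)
        upper.append(m + s)
    return {"base": base, "lower": lower, "upper": upper}


def make_whisker_plot(times_by_table):
    # Imported lazily so whisker_columns can be used without Bokeh.
    from bokeh.models import ColumnDataSource, Whisker
    from bokeh.plotting import figure

    p = figure(x_range=list(times_by_table), title="Query Runtime")
    for name, times in times_by_table.items():
        # Literal lists for both x and y: one category label per point.
        p.scatter(x=[name] * len(times), y=list(times), size=6)
    # The whisker reads its columns by name, so it gets its own source.
    source = ColumnDataSource(data=whisker_columns(times_by_table))
    p.add_layout(
        Whisker(source=source, base="base", upper="upper", lower="lower"))
    return p


# Tiny made-up input, chosen so the statistics are easy to check by hand.
sample = {"a": [1.0, 2.0, 3.0], "b": [10.0, 10.0, 10.0]}
stats = whisker_columns(sample)
```

Here the scatter glyphs use only literal lists and the whisker uses only column names from its own source, so no single glyph or annotation mixes the two styles.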