Altair: apply selection and ~selection within LayerChart

The use case here is analyzing the results of clustering. We'd like to choose a cluster and a feature (so two dropdowns) and display a layered histogram: the distribution of the chosen feature for points in the chosen cluster, layered with the distribution of the same feature for points outside that cluster. An additional constraint is that we need standalone HTML, so we can't use anything that requires a Python kernel.

After also exploring plotly and bokeh, Altair gave me the closest solution (below).

Input data is like this:

import pandas as pd
import numpy as np
import altair as alt

df = pd.DataFrame({"cluster": np.random.choice([1, 2], size=100)})  # cluster labels
df["feature1"] = np.random.normal(loc=1, scale=0.7, size=100) + df["cluster"]  # a feature column
df["feature2"] = np.random.normal(loc=10, scale=1, size=100) - 3*df["cluster"]  # a second feature column

   cluster  feature1  feature2
0        2       3.4       4.5
1        2       2.4       3.9
2        1       1.6       5.7
3        1       1.6       5.7
4        2       3.3       4.5

I first melted this to long format:

dfm = pd.melt(df.set_index('cluster', drop=True), ignore_index=False, var_name='feature')
dfm.reset_index(inplace=True)  # Get cluster labels as a column again

   cluster   feature  value
0        2  feature1    3.4
1        2  feature1    2.4
2        1  feature1    1.6
3        1  feature1    1.6
4        2  feature1    3.3

My code works, except that I have to use a VConcatChart rather than a LayerChart: for the cluster selection I need to apply selection to one histogram (points in the cluster) and ~selection to the other (points outside the cluster).

input_dropdown = alt.binding_select(options=[1,2], name='cluster  ')
selection = alt.selection_single(fields=['cluster'], bind=input_dropdown, name='filter_cluster', init={'cluster': 1})
input_dropdown2 = alt.binding_select(options=['feature1', 'feature2'], name='feature  ')
selection2 = alt.selection_single(fields=['feature'], bind=input_dropdown2, name='filter_feature', init={'feature': 'feature1'})

x_domain = alt.selection_interval(bind='scales', encodings=['x'])

hist1 = alt.Chart(dfm).transform_filter(
    selection & selection2
).transform_joinaggregate(  # to achieve normed histogram
    total='count(*)'
).transform_calculate(  # to achieve normed histogram
    pct='1 / datum.total'
).mark_bar(
    opacity=0.3, interpolate='step', color='green'
).encode(
    alt.X('value:Q', bin=alt.BinParams(maxbins=50), title="feature value", axis=None),
    alt.Y('sum(pct):Q', title="frequency"),
)

hist2 = alt.Chart(dfm).transform_filter(
    ~selection & selection2
).transform_joinaggregate(  # to achieve normed histogram
    total='count(*)'
).transform_calculate(  # to achieve normed histogram
    pct='1 / datum.total'
).mark_bar(
    opacity=0.3, interpolate='step', color='black'
).encode(
    alt.X('value:Q', bin=alt.BinParams(maxbins=50), title="feature value"),
    alt.Y('sum(pct):Q', title="frequency")
)

conc = alt.vconcat(
    hist1,
    hist2
).add_selection(
    selection
).add_selection(
    selection2
).configure_concat(
    spacing=0
).add_selection(
    x_domain
).resolve_scale(
    x='shared'
)

conc

Here is an image of the resulting viz, not the interactive form.
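For reference, the normalization used in the chart code (a joinaggregate row count, then a per-row weight of 1 / total summed within each bin) can be sanity-checked outside Altair. The following is only a rough pandas/NumPy sketch under some assumptions: the names normed_hist, chosen_cluster and chosen_feature are made up for illustration, the random data is seeded, a fixed cluster and feature stand in for the two dropdowns, and NumPy's 50 equal-width bins only approximate what maxbins=50 would pick.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the question, but seeded for reproducibility.
rng = np.random.default_rng(0)
df = pd.DataFrame({"cluster": rng.choice([1, 2], size=100)})
df["feature1"] = rng.normal(loc=1, scale=0.7, size=100) + df["cluster"]
df["feature2"] = rng.normal(loc=10, scale=1, size=100) - 3 * df["cluster"]
dfm = df.melt(id_vars="cluster", var_name="feature")  # long format


def normed_hist(values, bin_edges):
    """Share of rows per bin: every row weighs 1 / total, as in the chart."""
    counts, _ = np.histogram(values, bins=bin_edges)
    return counts / counts.sum()


# Hypothetical stand-ins for the two dropdown selections.
chosen_cluster, chosen_feature = 1, "feature1"
is_feature = dfm["feature"] == chosen_feature
in_cluster = dfm["cluster"] == chosen_cluster

inside = dfm.loc[is_feature & in_cluster, "value"]    # selection & selection2
outside = dfm.loc[is_feature & ~in_cluster, "value"]  # ~selection & selection2

# Shared bin edges so the two distributions are directly comparable.
edges = np.histogram_bin_edges(dfm.loc[is_feature, "value"], bins=50)
inside_freq = normed_hist(inside, edges)
outside_freq = normed_hist(outside, edges)
```

Each of inside_freq and outside_freq sums to 1, which is the property the two transform steps in the chart are meant to give: the histograms stay comparable even when the cluster sizes differ.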

Is there any way to achieve this with the two histograms layered instead?

As per the comments, the solution was to replace alt.vconcat with alt.layer.
