Logarithmic color scale in plotly

I'm trying to visualize data with a few outliers using Plotly and Python 3. The outliers make the color scale legend look bad: there are only a few high data points, yet the space between 2k and 10k takes up most of the legend.

So the question is: how can I change the appearance of the color legend on the right (see the image below) so that it mostly shows the differences between 0 and 2k? Unfortunately, I couldn't get an answer from this doc file.

Sample code (Jupyter notebook):

import numpy as np
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
from plotly.graph_objs import *
init_notebook_mode()

x = np.random.randn(100,1) + 3
y = np.random.randn(100,1) + 10
x = np.reshape(x, 100)
y = np.reshape(y, 100)

color = np.random.randint(0,1000, [100])
color[[1,3,5]] = color[[1,3,5]] + 10000 # create outliers in color var

trace = Scatter(
    x = x,
    y = y,
    mode = 'markers',
    marker=dict(
        color = color,
        showscale=True,
        colorscale = [[0, 'rgb(166,206,227, 0.5)'],
                      [0.05, 'rgb(31,120,180,0.5)'],
                      [0.1, 'rgb(178,223,138,0.5)'],
                      [0.15, 'rgb(51,160,44,0.5)'],
                      [0.2, 'rgb(251,154,153,0.5)'],
                      [1, 'rgb(227,26,28,0.5)']
                     ]
    )
)

fig = Figure(data=[trace])
iplot(fig)

[Image: plot]

You can accomplish what I think you're after by customizing the colorscale, cmin, and cmax properties so that there is a discrete color change at exactly 2000. Then you can customize colorbar.tickvals to label the boundary as 2000. See https://plot.ly/python/reference/#scatter-marker-colorbar .

import numpy as np
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
from plotly.graph_objs import *
init_notebook_mode()

x = np.random.randn(100,1) + 3
y = np.random.randn(100,1) + 10
x = np.reshape(x, 100)
y = np.reshape(y, 100)

color = np.random.randint(0,1000, [100])
color[[1,3,5]] = color[[1,3,5]] + 10000 # create outliers in color var

bar_max = 2000
factor = 0.9  # Normalized location where continuous colorscale should end

trace = Scatter(
    x = x,
    y = y,
    mode = 'markers',
    marker=dict(
        color = color,
        showscale=True,
        cmin=0,
        cmax= bar_max/factor,
        colorscale = [[0, 'rgb(166,206,227, 0.5)'],
                      [0.05, 'rgb(31,120,180,0.5)'],
                      [0.2, 'rgb(178,223,138,0.5)'],
                      [0.5, 'rgb(51,160,44,0.5)'],
                      [factor, 'rgb(251,154,153,0.5)'],
                      [factor, 'rgb(227,26,28,0.5)'],
                      [1, 'rgb(227,26,28,0.5)']
                     ],
        colorbar=dict(
            tickvals = [0, 500, 1000, 1500, 2000],
            ticks='outside'
        )
    )
)

fig = Figure(data=[trace])
iplot(fig)

[Image: new figure result]
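The arithmetic behind cmax = bar_max / factor can be checked on its own. The sketch below is only an illustration: the helper name normalized_position is hypothetical, and it assumes that values outside [cmin, cmax] are clamped to the ends of the colorscale, which is what makes every outlier fall into the flat red band above factor.

```python
bar_max = 2000
factor = 0.9              # normalized position where the continuous part ends
cmin = 0
cmax = bar_max / factor   # about 2222.2, so that bar_max lands exactly at `factor`


def normalized_position(value, cmin, cmax):
    """Map a data value to its [0, 1] position on the colorscale.

    Hypothetical helper: values outside [cmin, cmax] are clamped,
    so outliers all share the color at position 1.
    """
    t = (value - cmin) / (cmax - cmin)
    return min(max(t, 0.0), 1.0)


for value in (0, 1000, 2000, 10500):
    print(value, round(normalized_position(value, cmin, cmax), 3))
```

Because 2000 maps to the position factor, the two colorscale entries at factor produce the abrupt color change exactly at the 2000 tick, and everything above it is drawn in the outlier color.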

Since you asked a precise question, I'll try to give a precise answer, even though I don't think this is the best approach to data visualization. I'll show you why later.

Anyway, you can normalize the color values and "squeeze" your data into a much smaller interval by taking the natural logarithm. It mathematically represents the power to which the number e must be raised to produce the original value. You can use log10 if you're more comfortable with it.

The code is very simple; I attach only the trace definition, as the rest is unchanged. I used a standard colormap for convenience, since the interval of the values is now continuous.

trace = Scatter(
    x = x,
    y = y,
    mode = 'markers',
    marker=dict(
        color = np.log(color),
        showscale=True,
        colorscale = 'RdBu'
    )
)

[Image: resulting plot]
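One side effect of coloring by the logarithm is that the colorbar then shows log values rather than the original ones. A possible refinement, sketched below under stated assumptions, is to place the ticks at log10 positions and label them with the original values through the colorbar's tickvals and ticktext attributes. The helper log_color_marker and its floor parameter are hypothetical and not part of the answer; the floor is there because the sample data can contain 0, whose logarithm is undefined.

```python
import math


def log_color_marker(values, tick_values=(1, 10, 100, 1000, 10000), floor=1):
    """Build a marker dict that colors by log10(value).

    Hypothetical helper. Values below `floor` are raised to `floor`
    before taking the logarithm (an assumption made to avoid log(0)).
    """
    log_values = [math.log10(max(v, floor)) for v in values]
    return dict(
        color=log_values,
        showscale=True,
        colorscale='RdBu',
        colorbar=dict(
            tickmode='array',
            # Ticks sit at log10 positions but are labeled in original units.
            tickvals=[math.log10(t) for t in tick_values],
            ticktext=[str(t) for t in tick_values],
        ),
    )


marker = log_color_marker([0, 5, 410, 950, 10500])
print(marker['colorbar']['ticktext'])
```

The resulting dict can be passed as the marker argument, for example Scatter(x=x, y=y, mode='markers', marker=log_color_marker(color)).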

As I said, transforming the values with log isn't always the best choice. It forces the observer into a rough reading of the graph. For example, the orange markers in my example range between 410 and 950; can you tell the difference?
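To put a rough number on that loss of resolution, the sketch below compares how far apart 410 and 950 sit on a log-scaled colorbar and on the clipped linear scale from the first answer. The data range of 1 to 11000 is an assumption that roughly matches the sample data; it is not read off the figure.

```python
import math

low, high = 410, 950

# Assumed overall data range for the log-scaled colorbar (not from the figure).
data_min, data_max = 1, 11000
log_span = math.log(data_max) - math.log(data_min)
log_gap = (math.log(high) - math.log(low)) / log_span

# Clipped linear scale from the first answer: cmin = 0, cmax = 2000 / 0.9.
cmin, cmax = 0, 2000 / 0.9
linear_gap = (high - low) / (cmax - cmin)

print(f"fraction of colorbar between {low} and {high} (log):    {log_gap:.3f}")
print(f"fraction of colorbar between {low} and {high} (linear): {linear_gap:.3f}")
```

Under these assumptions the two values are separated by a smaller share of the colorbar on the log scale than on the clipped linear one, which is why they end up with nearly the same color.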
