Plotting a logarithmic color scale

Question

I'm trying to visualize data containing some outliers using Plotly and Python 3. The outliers make the color scale legend look bad: there are only a few high data points, yet the space between 2k and 10k takes up most of the legend.

So the question is: how can I change the appearance of the color legend on the right (see image below) so that it mostly shows the differences between 0 and 2k? Unfortunately, I couldn't get an answer from this doc file.

Sample code (Jupyter notebook):

import numpy as np
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
from plotly.graph_objs import *
init_notebook_mode()

x = np.random.randn(100,1) + 3
y = np.random.randn(100,1) + 10
x = np.reshape(x, 100)
y = np.reshape(y, 100)

color = np.random.randint(0,1000, [100])
color[[1,3,5]] = color[[1,3,5]] + 10000 # create outliers in color var

trace = Scatter(
    x = x,
    y = y,
    mode = 'markers',
    marker=dict(
        color = color,
        showscale=True,
        colorscale = [[0, 'rgb(166,206,227, 0.5)'],
                      [0.05, 'rgb(31,120,180,0.5)'],
                      [0.1, 'rgb(178,223,138,0.5)'],
                      [0.15, 'rgb(51,160,44,0.5)'],
                      [0.2, 'rgb(251,154,153,0.5)'],
                      [1, 'rgb(227,26,28,0.5)']
                     ]
    )
)

fig = Figure(data=[trace])
iplot(fig)

Answer 1

You can accomplish what I think you're after by customizing the colorscale, cmin, and cmax properties to have a discrete color change at exactly 2000. Then you can customize colorbar.tickvals to label the boundary as 2000. See https://plot.ly/python/reference/#scatter-marker-colorbar.

import numpy as np
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot
from plotly.graph_objs import *
init_notebook_mode()

x = np.random.randn(100,1) + 3
y = np.random.randn(100,1) + 10
x = np.reshape(x, 100)
y = np.reshape(y, 100)

color = np.random.randint(0,1000, [100])
color[[1,3,5]] = color[[1,3,5]] + 10000 # create outliers in color var

bar_max = 2000
factor = 0.9  # Normalized location where continuous colorscale should end

trace = Scatter(
    x = x,
    y = y,
    mode = 'markers',
    marker=dict(
        color = color,
        showscale=True,
        cmin=0,
        cmax= bar_max/factor,
        colorscale = [[0, 'rgb(166,206,227, 0.5)'],
                      [0.05, 'rgb(31,120,180,0.5)'],
                      [0.2, 'rgb(178,223,138,0.5)'],
                      [0.5, 'rgb(51,160,44,0.5)'],
                      [factor, 'rgb(251,154,153,0.5)'],
                      [factor, 'rgb(227,26,28,0.5)'],
                      [1, 'rgb(227,26,28,0.5)']
                     ],
        colorbar=dict(
            tickvals = [0, 500, 1000, 1500, 2000],
            ticks='outside'
        )
    )
)

fig = Figure(data=[trace])
iplot(fig)

[Figure: new figure result]
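To see why the boundary lands at exactly 2000: the colorscale positions 0 and 1 correspond to cmin and cmax, so a stop at factor corresponds to the data value cmin + factor * (cmax - cmin). The sketch below only checks that arithmetic; the helper name stop_to_value is hypothetical and not part of Plotly.

```python
def stop_to_value(stop, cmin, cmax):
    """Data value that a normalized colorscale stop (0..1) corresponds to."""
    return cmin + stop * (cmax - cmin)

bar_max = 2000
factor = 0.9               # normalized position of the discrete color change
cmin = 0
cmax = bar_max / factor    # about 2222.2, same expression as in the trace

boundary = stop_to_value(factor, cmin, cmax)
print(round(boundary, 6))  # 2000.0
```

With these settings the continuous part of the scale covers 0 to 2000, and everything from 2000 upward, including the outliers near 10000, is drawn in the single color assigned to the last two stops.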

Answer 2

Since you asked a precise question, I'll try to reply with a precise answer, even though I don't think this is the best approach for data visualization. I'll show you why later.

Anyway, you can transform the color values and "squeeze" your data into a much smaller interval by taking the natural logarithm of each value. Mathematically, the natural logarithm is the power to which the number e must be raised to produce the original value. You can use log10 instead if you're more comfortable with it.

The code is very simple; I attach only the trace definition, as the rest is unchanged. I used a standard colormap for convenience, since the interval of the values is continuous.

trace = Scatter(
    x = x,
    y = y,
    mode = 'markers',
    marker=dict(
        color = np.log(color + 1),  # +1 avoids log(0) = -inf when a value is 0
        showscale=True,
        colorscale = 'RdBu'
    )
)

As I said, transforming the values with log isn't always the best choice. It forces the observer into a rough reading of the graph. For example, the orange markers in my plot range between 410 and 950; can you tell the difference?
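One way to soften that readability problem is to keep the log-transformed colors but label the colorbar ticks with the original values, using colorbar.tickvals and colorbar.ticktext. What follows is a minimal sketch, assuming the color values are non-negative. The helper name log_marker, the default tick values, and the + 1 shift (a guard against taking the log of 0) are assumptions made for illustration. It builds a plain dict, which Scatter accepts for its marker argument.

```python
import math

def log_marker(values, ticks=(0, 10, 100, 1000, 10000), colorscale='RdBu'):
    """Build a marker dict colored by log10(value + 1).

    Ticks sit at the transformed positions but are labeled with the
    original, untransformed values.
    """
    return dict(
        color=[math.log10(v + 1) for v in values],
        showscale=True,
        colorscale=colorscale,
        colorbar=dict(
            tickmode='array',
            tickvals=[math.log10(t + 1) for t in ticks],
            ticktext=[str(t) for t in ticks],
            ticks='outside',
        ),
    )

# Small hand-picked sample: three ordinary values and one outlier.
marker = log_marker([0, 9, 999, 10999])
print(marker['colorbar']['ticktext'])  # ['0', '10', '100', '1000', '10000']
```

Passing marker=log_marker(color) in the trace definition would then show a colorbar that reads in the original units, although the colors themselves are still spaced logarithmically.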


Answer 1
1 2018-06-09 20:38:13

Answer 2
0 2018-05-21 15:04:06
