How do you fill or interpolate sparse data empty space (undersampling) in a datashader heatmap?

When plotting a set of data in datashader, if the X-axis has discrete values and is undersampled, gaps are left between the columns where the background shows through.

I have been trying to fix this by setting a larger point size or by using the dynspread transfer function. No luck; it could well be that I just don't know the correct way to apply these.

Here is sample code to reproduce what I mean:

import pandas as pd
import numpy as np

import datashader as ds, colorcet
import holoviews as hv
from holoviews.operation.datashader import datashade
from holoviews import opts

# generate a random dataset: normally distributed values centered on 10000
image = np.random.randn(250, 1024, 1024) + 10000
z, x, y = image.shape
print("z, x, y =", z, x, y)
    
# rearrange data to 'z' + 'value' array and convert to dataframe
arr = np.column_stack((np.repeat(np.arange(z),y*x), image.ravel()))
df = pd.DataFrame(arr, columns = ['X', 'Y'])

### Plot using datashader
# named 'canvas' rather than 'map' to avoid shadowing the Python builtin
canvas = ds.Canvas(plot_width=800, plot_height=800)
agg = canvas.points(df, 'X', 'Y')
pts = ds.tf.shade(agg, cmap=colorcet.fire)
ds.tf.set_background(pts, 'white')

Of course, plotting the same set using bokeh shows the same thing. It only gets worse if you zoom in:

hv.extension("bokeh")
datashade(hv.Points(df), cmap=colorcet.fire).relabel('Value heatmap').opts(height=700, width=800)

Datashader is working as designed in this case. When rendering points into a raster grid, it shows you the actual point data available, up to the limit of what the pixel grid can show. If there are multiple datapoints in a pixel, their counts or values are aggregated. If there is no data in some pixels, no data is shown.
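To see why the gaps appear, here is a minimal sketch that imitates the count-per-pixel aggregation with plain NumPy. It is a stand-in for illustration, not datashader's own implementation, and the dataset size (1000 samples per slice) is an assumption chosen to keep it quick to run. With 250 discrete X values spread over 800 pixel columns, at most 250 columns can ever receive data:

```python
import numpy as np

rng = np.random.default_rng(0)

# 250 discrete X values, 1000 samples each (far smaller than the question's data)
n_slices, per_slice = 250, 1000
x = np.repeat(np.arange(n_slices), per_slice)
y = rng.normal(10000, 1, x.size)

# An 800 x 800 "canvas": count the points that land in each pixel.
# np.histogram2d is used here only to mimic a count aggregation.
width, height = 800, 800
counts, _, _ = np.histogram2d(x, y, bins=(width, height))

# A pixel column is empty when no X value maps onto it
filled_columns = int((counts.sum(axis=1) > 0).sum())
empty_columns = width - filled_columns
print(filled_columns, empty_columns)
```

Each discrete X value lands in exactly one pixel column, so the remaining columns stay empty no matter how many samples each slice contains.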

It sounds like you want a different sort of plot than a datashaded pixel heatmap. Maybe:

  • If your data represent regular samples from an underlying raster or quadmesh grid, use a datashaded hv.Image or hv.Quadmesh plot (or call canvas.raster or canvas.quadmesh directly), not an hv.Points or canvas.points plot.
  • If your data represent arbitrarily located samples from an underlying continuous distribution, you can use a datashaded hv.TriMesh or canvas.trimesh plot to fill in between dots after you compute a Delaunay or other type of triangulation so that it defines a surface.
  • If your data represent arbitrarily located samples from a non-continuous distribution but you still want to approximate it with a continuous function, you can use a (non-datashaded) hv.Bivariate plot, which computes a smooth kernel density estimate that effectively "connects the dots" as you describe but also smooths out local density differences.

None of these options do precisely what you're asking here, but I think the TriMesh will behave the most like you suggest, while still behaving similarly for the zoomed-out case.
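As a rough sketch of the first option, the points can be pre-binned into a regular grid with one column per discrete X value, so that every column holds data before anything is drawn. This uses plain NumPy, and the dataset size, the number of Y bins (n_rows) and the variable names are assumptions made for illustration. Whether a per-slice histogram is the right grid for your data is a judgment call:

```python
import numpy as np

rng = np.random.default_rng(0)

# Same layout as the question: discrete slice index X, continuous value Y
n_slices, per_slice = 250, 1000
x = np.repeat(np.arange(n_slices), per_slice)
y = rng.normal(10000, 1, x.size)

# One bin per discrete X value (edges at k - 0.5 and k + 0.5), n_rows bins in Y
n_rows = 200  # hypothetical vertical resolution
x_edges = np.arange(n_slices + 1) - 0.5
y_edges = np.linspace(y.min(), y.max(), n_rows + 1)
counts, _, _ = np.histogram2d(x, y, bins=(x_edges, y_edges))

# Image convention: rows are Y, columns are X
grid = counts.T
x_centers = np.arange(n_slices)
y_centers = 0.5 * (y_edges[:-1] + y_edges[1:])

# Every column now holds data, so there is no background gap between columns
print(grid.shape, bool((grid.sum(axis=0) > 0).all()))

# Assumption: with holoviews installed, the grid could then be shown as a raster, e.g.
#   hv.Image((x_centers, y_centers, grid), kdims=['X', 'Y'])
```

Because the grid is a regular raster rather than a set of points, scaling it up to 800 pixels wide stretches each column instead of leaving empty pixels between them.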
