How can one create a heatmap from 2D scatterplot data in Python?
How can one create a heatmap from 2D scatterplot data in Python, where each (x, y) point in the scatterplot has a z value associated with it? The z value will be the value used to color the heatmap.
For example, in R, I can use:
# This example is from http://knowledge-forlife.com/r-creating-heatmap-scatterplot-data/
library(akima)   # provides interp()
library(fields)  # provides image.plot()
# I'm just setting the seed so you can see the same example on your computer
set.seed(1)
#Our X data
x <- runif(150)
#Our Y data
y <- runif(150)
#Our Z data
z <- c(rnorm(mean=1,100),rnorm(mean=20,50))
#Store the length of our data
N <- length(x)
# View the scatterplot
plot(x, y)
#Here is the interpolation to give the heatmap effect.
#Use xo and yo to set the output grid you want to use.
#xo and yo are used to change the resolution of the interpolation
#Here, I have included a somewhat standard protocol for these parameters
s <- interp(x,y,z,xo=seq(min(x),max(x),length=N),
            yo=seq(min(y),max(y),length=N),duplicate="mean")
#Here's where the fun happens
#Note you can add your typical plotting parameters here, such as xlab or ylab
image.plot(s,xlim=c(0,1),ylim=c(0,1),zlim=c(-2,25))
Scatterplot (each (x, y) point in this scatterplot has a z value associated with it; the z values aren't visible in the scatterplot):
Corresponding heatmap (the color represents the z values):
Note that this question is different from "Generate a heatmap in MatPlotLib using a scatter data set", where the color in the heatmap represents the density of the (x, y) points.
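To make the distinction concrete, here is a minimal sketch that bins the same points two ways: once by counting points per cell (a density heatmap, which ignores z) and once by averaging z per cell (the kind of heatmap asked about here). The data, the 5x5 bin count, and the variable names are illustrative assumptions, not part of the question.

```python
import numpy as np
from scipy.stats import binned_statistic_2d

# Illustrative data shaped like the R example: 150 points in the unit square,
# with z drawn from two normal distributions (means 1 and 20).
rng = np.random.default_rng(0)
x = rng.uniform(size=150)
y = rng.uniform(size=150)
z = np.concatenate([rng.normal(1, 1, 100), rng.normal(20, 1, 50)])

bins = 5                   # assumed resolution, chosen only for the example
extent = [[0, 1], [0, 1]]  # bin over the unit square

# Density heatmap: each cell holds the number of points that fall in it.
counts = binned_statistic_2d(x, y, z, statistic="count",
                             bins=bins, range=extent).statistic

# Value heatmap: each cell holds the mean z of its points (NaN if the cell is empty).
mean_z = binned_statistic_2d(x, y, z, statistic="mean",
                             bins=bins, range=extent).statistic
```

Both arrays have shape (bins, bins) and can be passed to a plotting function such as pcolormesh; only mean_z depends on the z values.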
I went ahead with Gerges Dib's suggestion. Here is the code. It draws (x, y) points uniformly at random and sets z to the density of a 2D Gaussian evaluated at each point:
import numpy as np
import scipy.interpolate
from scipy.stats import multivariate_normal
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
# Draw (x, y) uniformly at random; z is the 2D Gaussian density at each point
np.random.seed(0)
number_of_samples = 20
x = np.random.rand(number_of_samples)
y = np.random.rand(number_of_samples)
xy = np.column_stack([x.flat, y.flat]) # Create a (N, 2) array of (x, y) pairs.
mu = np.array([0.0, 0.0])
sigma = np.array([.95, 2.5])
covariance = np.diag(sigma**2)
z = multivariate_normal.pdf(xy, mean=mu, cov=covariance)
plt.scatter(x, y)
plt.savefig('scatterplot.png', dpi=300)
plt.tricontourf(x, y, z)
plt.savefig('tricontourf.png', dpi=300)
# Interpolate and generate heatmap:
grid_x, grid_y = np.mgrid[x.min():x.max():1000j, y.min():y.max():1000j]
for method in ['nearest', 'linear', 'cubic']:
    plt.figure()
    grid_z = scipy.interpolate.griddata(xy, z, (grid_x, grid_y), method=method)
    # [pcolormesh with missing values?](https://stackoverflow.com/a/31687006/395857)
    # 'linear' and 'cubic' return NaN outside the convex hull of the points, so mask those cells.
    plt.pcolormesh(grid_x, grid_y, np.ma.masked_invalid(grid_z), cmap='RdBu', vmin=np.nanmin(grid_z), vmax=np.nanmax(grid_z))
    plt.title('{0} interpolation'.format(method))
    plt.colorbar()
    plt.savefig('heatmap_interpolation_{0}.png'.format(method), dpi=300)
    plt.clf()
    plt.close()
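The masking step in the loop matters because griddata with method='linear' or 'cubic' returns NaN for grid nodes outside the convex hull of the input points, while 'nearest' always returns a value. The sketch below shows this on made-up data and, as one possible alternative to masking, fills the NaN cells from the nearest-neighbour result. The fallback is an assumption of this example, not something the code above does.

```python
import numpy as np
from scipy.interpolate import griddata

# Illustrative scattered data: 20 random points with an arbitrary smooth z.
rng = np.random.default_rng(0)
points = rng.uniform(size=(20, 2))
values = np.hypot(points[:, 0], points[:, 1])

# The grid spans the bounding box of the points, which is larger than their convex hull.
grid_x, grid_y = np.mgrid[
    points[:, 0].min():points[:, 0].max():50j,
    points[:, 1].min():points[:, 1].max():50j,
]

linear = griddata(points, values, (grid_x, grid_y), method="linear")
nearest = griddata(points, values, (grid_x, grid_y), method="nearest")

# Grid nodes outside the convex hull come back as NaN from the linear method.
outside = np.isnan(linear)

# Optional fallback (an assumption of this sketch): use nearest-neighbour values there.
filled = np.where(outside, nearest, linear)
```

Whether to mask or fill is a presentation choice: masking leaves the region without data blank, while filling extrapolates into it.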
The script produces the following images: scatterplot.png, tricontourf.png, heatmap_interpolation_nearest.png, heatmap_interpolation_linear.png, and heatmap_interpolation_cubic.png.
Here is a translation of your code into Python, using numpy for vector operations and matplotlib for plotting:
import numpy as np
from matplotlib import pyplot
x = np.random.uniform(size=150)
y = np.random.uniform(size=150)
z = np.concatenate([np.random.randn(100)+1, np.random.randn(50)+20])
pyplot.plot(x, y, 'ok')
pyplot.tricontourf(x, y, z)
pyplot.show()
One difference here is that I did not use interpolation to put x and y on a grid, but rather used matplotlib's tricontourf, which uses triangular tessellation. If you need to put the data onto a rectangular grid, you can use scipy.interpolate.griddata, which works very similarly to the interp function you have in R. Then, for plotting a regular grid, you can use pyplot.pcolormesh.
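The griddata plus pcolormesh route could look roughly like the sketch below. The helper name scatter_to_grid, the 100x100 resolution, and the random data are assumptions made for illustration; the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use pyplot.show()
import numpy as np
from matplotlib import pyplot
from scipy.interpolate import griddata


def scatter_to_grid(x, y, z, resolution=100, method="linear"):
    """Interpolate scattered (x, y, z) onto a regular grid (hypothetical helper)."""
    grid_x, grid_y = np.meshgrid(
        np.linspace(x.min(), x.max(), resolution),
        np.linspace(y.min(), y.max(), resolution),
    )
    grid_z = griddata((x, y), z, (grid_x, grid_y), method=method)
    return grid_x, grid_y, grid_z


# Illustrative data with the same shape as the example above.
rng = np.random.default_rng(1)
x = rng.uniform(size=150)
y = rng.uniform(size=150)
z = np.concatenate([rng.normal(1, 1, 100), rng.normal(20, 1, 50)])

grid_x, grid_y, grid_z = scatter_to_grid(x, y, z)

fig, ax = pyplot.subplots()
# Mask NaN cells (outside the convex hull of the points) so they are left blank.
mesh = ax.pcolormesh(grid_x, grid_y, np.ma.masked_invalid(grid_z), shading="auto")
fig.colorbar(mesh, ax=ax, label="z")
ax.plot(x, y, "ok", markersize=2)  # overlay the original scatter points
```

Call fig.savefig with a file name of your choice to write the figure to disk, or remove the backend line and call pyplot.show() to display it interactively.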