
Sunflower scatter plot using matplotlib

I would like to construct a sunflower scatter plot (as depicted in, for example, http://www.jstatsoft.org/v08/i03/paper [PDF link]). Before I write my own implementation, does anyone know of an existing one? I am aware of the functions in Stata and R, but I am looking for one in matplotlib.

Thank you.

I don't know of any matplotlib implementation, but it's not hard to do. Here I let hexbin do the counting, then go through each cell and add the appropriate number of petals:


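import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors

np.random.seed(0)
n_points = 2000
x = np.random.standard_normal(n_points)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n_points)

# Three colours: white (no sunflower), yellow (1 petal = 1 point),
# orange (1 petal = 5 points).
cmap = colors.ListedColormap(['white', 'yellow', 'orange'])
hb = plt.hexbin(x, y, bins='log', cmap=cmap, gridsize=20, edgecolor='gray')
plt.axis([-2, 2, -12, 12])
plt.title("sunflower plot")

# Per-cell counts and cell centres.
# Assumption: recent matplotlib releases return the raw counts here even
# with bins='log'. Older releases returned log10(count + 1); on those,
# recover the counts with np.rint(10**counts - 1) before the loop.
counts = np.ma.filled(hb.get_array(), 0)
coords = hb.get_offsets()

for (cx, cy), count in zip(coords, counts):
    count = int(count)
    if 3 < count <= 12:
        petals = count        # yellow cell: one petal per point
    elif count > 12:
        petals = count // 5   # orange cell: one petal per five points
    else:
        continue              # too few points, no sunflower
    if petals > 1:
        plt.plot([cx], [cy], 'k.')  # flower centre
        # (numsides, 2) draws an asterisk with `petals` arms
        plt.plot([cx], [cy], marker=(petals, 2), color='k', markersize=18)

plt.show()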

In yellow cells each petal represents 1 point; in orange cells each petal represents 5 points.
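To make the petal rule explicit, here is a small sketch of it as a standalone function. The name `petals_for` and its `low`, `high` and `per_petal` parameters are made up for illustration; the defaults 3, 12 and 5 are the hard-coded values from the code above.

```python
def petals_for(count, low=3, high=12, per_petal=5):
    """Return the number of petals to draw for a cell, or 0 for no sunflower.

    Hypothetical helper that mirrors the rule used in the plotting loop.
    """
    if count <= low:
        return 0                      # white cell: too few points
    if count <= high:
        petals = count                # yellow cell: 1 petal = 1 point
    else:
        petals = count // per_petal   # orange cell: 1 petal = per_petal points
    # A single petal is not drawn, matching the `> 1` check in the loop.
    return petals if petals > 1 else 0
```

Note that the integer division drops the remainder, so an orange cell with 13 or 14 points shows the same two petals as one with 10.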

One obvious place for improvement here is the handling of the colormap. For example, do you want to preset the color boundaries or calculate them from the data? Here I just kludged it a bit: I used bins='log' only to get a reasonable ratio between yellow and orange cells for the particular sample I used, and I hard-coded the borders between white, yellow, and orange cells (3 and 12).
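If you go the preset route, one option (not what the code above does) is to tie the colours directly to the thresholds with `matplotlib.colors.BoundaryNorm`, so that cell colour and petal rule cannot drift apart. This is only a sketch: `preset_norm` and its parameters are hypothetical names, and it assumes the cell values are raw integer counts rather than log-scaled ones.

```python
import numpy as np
from matplotlib import colors

cmap = colors.ListedColormap(['white', 'yellow', 'orange'])

def preset_norm(max_count, low=3, high=12):
    """Hypothetical helper: map counts to the three colours by fixed borders."""
    # BoundaryNorm bins are half-open [lo, hi), so with integer counts:
    #   0..low          -> white
    #   low+1..high     -> yellow
    #   high+1..max     -> orange
    top = max(max_count, high + 1) + 1  # last edge must exceed the largest count
    return colors.BoundaryNorm([0, low + 1, high + 1, top], cmap.N)

norm = preset_norm(max_count=50)
# Colour index (0 = white, 1 = yellow, 2 = orange) for a few sample counts.
color_index = np.asarray(norm([0, 3, 4, 12, 13, 50]))
```

To try it on the plot, you would call hexbin without bins='log' and then apply the norm once the largest count is known, e.g. `hb.set_norm(preset_norm(hb.get_array().max()))`. Calculating the borders from the data instead would just mean replacing `low` and `high` with values derived from the counts.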

Being able to specify the marker as a tuple in matplotlib makes it really easy to draw all the different petal numbers.
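For reference, the tuple form is `(numsides, style, angle)`, where style 2 gives an asterisk, i.e. `numsides` lines radiating from the centre. A quick sketch that builds the marker for a few petal counts without plotting anything; `petal_marker` is just an illustrative name, not a matplotlib function:

```python
from matplotlib.markers import MarkerStyle

def petal_marker(n_petals, angle=0):
    """Hypothetical helper: marker spec for an asterisk with n_petals arms."""
    return (n_petals, 2, angle)

# Constructing a MarkerStyle validates the spec; an invalid one would raise.
markers = {n: MarkerStyle(petal_marker(n)) for n in (2, 3, 5, 8, 12)}
```

The same tuple can be passed straight to plot, e.g. `plt.plot([x], [y], marker=petal_marker(5), color='k', markersize=18)`.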
