Horizontally microadjust points with a "bin" in a scatter plot for clarity

I have a scatter plot that looks like this:

   |             x
 2 |   x
   |        o
   |   o         x
 1 |   x    o
   |   o    x    o
   |   x         x
   |________________
      foo  bar  baz

The code looks like this:

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("data", index_col = [0,1,2,3,4])

variable_x = data.xs("var_x", level = 0)

a_list = ["a1", "a2", "a3", "a4", "a5"]

b_list = variable_x.index.get_level_values(1).unique().to_list()

c_list = variable_x.index.get_level_values(2).unique().to_list()

colours = {"a1" : "r",
           "a2" : "g",
           "a3" : "b",
           "a4" : "c",
           "a5" : "k"
           }

markers = {"b1" : "x",
           "b2" : "o",
           "b3" : "D",
           "b4" : "X",
           "b5" : "*"
           }

fig, axs = plt.subplots(1, 3, sharey = True)

ax = axs[0]

for a in a_list:
    
    color = colours[a]
    
    for b in b_list:
        
        marker = markers[b]
                
        for c in c_list:
            
            vals = variable_x.loc[:, a_list, :, :, :].xs(b, level = 1).xs(c, level = 2)
            
            for val in vals:
                
                ax.scatter(c, val, color = color, marker = marker, s = 5)

Apologies if my pseudocode doesn't quite make sense; I may have transcribed it incorrectly from my actual code.

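The actual data has many more points, so the vertical lines of points are cluttered and hard to tell apart. Is there a way to adjust the horizontal positions of the x's, o's, etc. so that they are slightly separated but still within the correct "bin"?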

I have done something similar before, using the following helper function:

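import numpy as np

def raw_data_scatter(array, xcenter, spread):
    """Return jittered x positions around xcenter for the y values in array."""
    y = array
    # Draw magnitudes in [0, spread/2), then flip half of them to the left side.
    x = np.random.uniform(0, spread / 2, size=len(y))
    half = len(y) // 2
    x[:half] *= -1
    # Shuffle so the side a point lands on is unrelated to its position in y.
    np.random.shuffle(x)
    x += xcenter
    return x, y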

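Given an array of y-values and an x position to center them on, it generates noise in the x-direction for plotting. It's random (so the points are not spread out according to their density), but it's simple and I think it still looks good. Here is an example: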

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'foo':np.random.randint(1,100,20),
                   'bar':np.random.randint(25,125,20),
                   'baz':np.random.randint(10,60,20)})

fig, ax = plt.subplots()
ax.set_xticks(range(len(df.columns)))
ax.set_xticklabels(df.columns)

for i, col in enumerate(df.columns):
    x, y = raw_data_scatter(df[col], xcenter=i, spread=.16)
    ax.scatter(x, y)
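
Because the helper draws from NumPy's global random state, the jitter changes on every run. A minimal sketch of the same idea with a seeded generator is shown here. `jitter_x` and its `seed` parameter are hypothetical names introduced for illustration, not part of the answer above, and the sketch samples symmetric uniform noise instead of flipping exactly half of the points.

```python
import numpy as np

def jitter_x(y, xcenter, spread, seed=0):
    """Return x positions jittered uniformly within xcenter +/- spread/2.

    Hypothetical variant of raw_data_scatter: a seeded Generator makes
    the jitter reproducible between runs.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    x = xcenter + rng.uniform(-spread / 2, spread / 2, size=len(y))
    return x, y

# Four y-values placed around the tick at x = 1
x, y = jitter_x([3, 7, 7, 9], xcenter=1, spread=0.16)
```

Calling `jitter_x` twice with the same `seed` gives identical x positions, which is convenient when a figure has to be regenerated unchanged.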

Taking inspiration from the other answer, my final approach looks like this:

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv("data", index_col = [0,1,2,3,4])

variable_x = data.xs("var_x", level = 0)

a_list = ["a1", "a2", "a3", "a4", "a5"]

b_list = variable_x.index.get_level_values(1).unique().to_list()

c_list = variable_x.index.get_level_values(2).unique().to_list()

colours = {"a1" : "r",
           "a2" : "g",
           "a3" : "b",
           "a4" : "c",
           "a5" : "k"
           }

markers = {"b1" : "x",
           "b2" : "o",
           "b3" : "D",
           "b4" : "X",
           "b5" : "*"
           }

fig, axs = plt.subplots(1, 3, sharey = True)

ax = axs[0]

offset_scale = .14

for a_num, a in enumerate(a_list):
    
    # Centre the offsets on the tick: with 5 series they run from -2 to +2 steps
    offset = (a_num - (len(a_list) - 1) / 2) * offset_scale

    color = colours[a]
    
    for b in b_list:
        
        marker = markers[b]
                
        for c_num, c in enumerate(c_list):
            
            vals = variable_x.loc[:, a_list, :, :, :].xs(b, level = 1).xs(c, level = 2)
            
            for val in vals:
                
                ax.scatter(c_num + offset, val, color = color, marker = marker, s = 5)

ax.set_xticks(range(len(c_list)))
ax.set_xticklabels(c_list)
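
The offset arithmetic used in the loop above can be checked on its own, without the CSV file. The sketch below is a stand-in under stated assumptions, not the code from the post: `centered_offsets` is a hypothetical helper that centres the series symmetrically about a tick by subtracting `(n - 1) / 2` from the series number, and the check assumes ticks sit at the integers 0, 1, 2, ... as set by `ax.set_xticks(range(len(c_list)))`.

```python
import numpy as np

def centered_offsets(n_groups, offset_scale):
    """Offsets for n_groups series, symmetric about 0 (hypothetical helper)."""
    return (np.arange(n_groups) - (n_groups - 1) / 2) * offset_scale

a_list = ["a1", "a2", "a3", "a4", "a5"]
offsets = centered_offsets(len(a_list), offset_scale=0.14)

# Ticks sit at integers, so a series stays in its own bin as long as
# every offset is smaller than 0.5 in absolute value.
stays_in_bin = bool(np.all(np.abs(offsets) < 0.5))

for a, off in zip(a_list, offsets):
    print(f"{a}: {off:+.2f}")
```

With five series and `offset_scale = 0.14`, the outermost series sit 0.28 away from the tick, so neighbouring bins do not overlap. More series or a larger scale would need the same check again.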

My plot area (axes omitted) looks like this:
