
Add colorbar labels as text on scatter plot

I have a scatter plot generated using:

x = list(auto['umap1'])
y = list(auto['umap2'])


final_df2 = pd.DataFrame(list(zip(x,y,communities)), columns =['x', 'y', 'cluster'])
no_clusters = max(communities)
cluster_list = list(range (min(communities), no_clusters+1))
fig2, ax = plt.subplots(figsize = (20,15))
plt.scatter(x,y, c=final_df2['cluster'], cmap=plt.cm.get_cmap('hsv', max(cluster_list)), s = 0.5)
plt.title('Phenograph on UMAP - All Markers (auto)', fontsize=15)
plt.xlabel('umap_1', fontsize=15)
plt.ylabel('umap_2', fontsize=15)
plt.colorbar(extend='both',ticks = range(max(cluster_list)))
plt.show()

I would like to know how to add the colorbar labels (the numbers 1 to 31) as text on the plot, at the clusters they correspond to. It is quite hard to tell the clusters apart from the colours alone, as the colormap loops back to red.

I tried:

n = list(final_df2['cluster'])
for i, txt in enumerate(n):
    ax.annotate(txt, (y[i], x[i]))

But this is giving me no luck.

Your annotation code writes an annotation for each and every dot, which just ends in a sea of numbers.

Instead, find a center for each cluster, for example by averaging all the points that belong to the same cluster.

Then use the coordinates of that center to position the text. You can give the text a background to make it easier to read.

As I don't have your data, the code below simulates some points that are already scattered around known centers.

from matplotlib import pyplot as plt
import pandas as pd
import numpy as np

# calculate some random points to serve as cluster centers;
# run a few steps of a relaxing algorithm to separate them a bit
# (index 0 is unused so that cluster ids can run from 1 to MAX_CLUST)
def random_distributed_centers():
    cx = np.random.uniform(-10, 10, MAX_CLUST + 1)
    cy = np.random.uniform(-10, 10, MAX_CLUST + 1)
    for _ in range(10):
        for i in range(1, MAX_CLUST + 1):
            for j in range(1, MAX_CLUST + 1):
                if i != j:
                    dist = np.linalg.norm([cx[i] - cx[j], cy[i] - cy[j]])
                    if dist < 4:
                        # push center i away from center j
                        cx[i] += 0.4 * (cx[i] - cx[j]) / dist
                        cy[i] += 0.4 * (cy[i] - cy[j]) / dist
    return cx, cy

N = 1000
MAX_CLUST = 31
cx, cy = random_distributed_centers()

# for demonstration purposes, just generate some random points around the centers
x =  np.concatenate( [np.random.normal(cx[i], 2, N) for i in range(1,MAX_CLUST+1)])
y =  np.concatenate( [np.random.normal(cy[i], 2, N) for i in range(1,MAX_CLUST+1)])
communities = np.repeat(range(1,MAX_CLUST+1), N)

final_df2 = pd.DataFrame({'x':x, 'y':y, 'cluster': communities})
no_clusters = max(communities)
cluster_list = list(range (min(communities), no_clusters+1))
fig2, ax = plt.subplots(figsize = (20,15))
plt.scatter(x, y, c=final_df2['cluster'], cmap=plt.get_cmap('hsv', max(cluster_list)), s=0.5)
plt.title('Phenograph on UMAP - All Markers (auto)', fontsize=15)
plt.xlabel('umap_1', fontsize=15)
plt.ylabel('umap_2', fontsize=15)
plt.colorbar(extend='both',ticks = cluster_list)

bbox_props = dict(boxstyle="circle,pad=0.3", fc="white", ec="black", lw=2, alpha=0.9)
for i in range(1,MAX_CLUST+1):
    ax.annotate(i, xy=(cx[i], cy[i]), ha='center', va='center', bbox=bbox_props)
plt.show()
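The simulation above already knows its centers (cx, cy). With real data they would have to be computed first, for instance by averaging the points of each cluster as suggested earlier. The following is only a minimal sketch of that step, assuming a DataFrame with the same 'x', 'y' and 'cluster' columns as final_df2; the helper names cluster_centers and label_clusters and the tiny demo data set are made up for illustration.

```python
import pandas as pd
from matplotlib import pyplot as plt


def cluster_centers(df, label_col="cluster"):
    """Return one row per cluster holding the mean x and y of its points."""
    return df.groupby(label_col)[["x", "y"]].mean()


def label_clusters(ax, centers):
    """Write each cluster id at its center, on a white circular background."""
    bbox_props = dict(boxstyle="circle,pad=0.3", fc="white", ec="black", lw=2, alpha=0.9)
    for cluster_id, row in centers.iterrows():
        ax.annotate(str(cluster_id), xy=(row["x"], row["y"]),
                    ha="center", va="center", bbox=bbox_props)


# Tiny hypothetical data set: two clusters of three points each
demo = pd.DataFrame({
    "x": [0.0, 1.0, 2.0, 10.0, 11.0, 12.0],
    "y": [0.0, 2.0, 4.0, 5.0, 5.0, 5.0],
    "cluster": [1, 1, 1, 2, 2, 2],
})
centers = cluster_centers(demo)

fig, ax = plt.subplots()
ax.scatter(demo["x"], demo["y"], c=demo["cluster"])
label_clusters(ax, centers)
```

On the original data, cluster_centers(final_df2) followed by label_clusters(ax, centers) before plt.show() should place one label per cluster. If a cluster is strongly elongated or split into several blobs, the mean can land outside the visible points; using .median() instead of .mean() is one possible alternative.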

Example plot
