How do I annotate outliers in a scatter plot in Python?

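I am trying to annotate a scatter plot with artist names, but only for certain instances (for example, where 'Principal Component 1' >= 1.5 or 'Principal Component 2' >= 1.5).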

Here is the head of the DataFrame, which has shape (500, 4):

print(pca_df2.head())

       genre      artist      Principal Component 1  Principal Component 2 
1         a          band1             -0.148578               0.138509
2         b          band2             -0.484604               0.290153
3         b          band3             -0.576619              -0.359020
4         a          band4             -0.317572              -0.221687
5         a          band5             -0.536065              -0.404252

My current working scatter plot code, which maps a color to every instance by genre, is as follows:

import matplotlib.pyplot as plt

labels = ['a', 'b', 'c', 'd', 'e']
colors = ['k', 'r', 'y', 'm', 'g']

fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(1, 1, 1)

# Draw one scatter series per genre so each genre gets its own color.
for label, color in zip(labels, colors):
    indicesToKeep = pca_df2['genre'] == label
    # Column names are case-sensitive and must match the DataFrame header.
    ax.scatter(pca_df2.loc[indicesToKeep, 'Principal Component 1'],
               pca_df2.loc[indicesToKeep, 'Principal Component 2'],
               c=color, s=20, alpha=0.8)
ax.legend(labels)
ax.grid()
plt.show()

This can be solved by simply iterating over the DataFrame:

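# Run this after the scatter calls and before plt.show().
for index, row in pca_df2.iterrows():
    artist = row['artist']
    pca1 = row['Principal Component 1']
    pca2 = row['Principal Component 2']
    # Label only the points where either component reaches the threshold.
    if pca1 >= 1.5 or pca2 >= 1.5:
        # Offset the label slightly above the point so it does not cover the marker.
        plt.text(x=pca1, y=pca2 + 0.05, s=artist, fontsize=9)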
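
As a possible variation on the loop above, the outliers can be selected first with a boolean mask, so only the matching rows are iterated. This is a minimal sketch, not the original poster's code: the helper name annotate_outliers, its parameters, and the demo rows are all made up for illustration, and the column names are assumed to match the DataFrame header shown in the question.

```python
import matplotlib.pyplot as plt
import pandas as pd


def annotate_outliers(ax, df, x_col, y_col, label_col, threshold=1.5, dy=0.05):
    """Label every row whose x or y value is at or above `threshold`.

    Hypothetical helper for illustration; returns the selected rows.
    """
    # Boolean mask: True where either component reaches the threshold.
    mask = (df[x_col] >= threshold) | (df[y_col] >= threshold)
    outliers = df.loc[mask]
    # Only the selected rows are visited, not the whole DataFrame.
    for x, y, name in zip(outliers[x_col], outliers[y_col], outliers[label_col]):
        ax.text(x, y + dy, name, fontsize=9)
    return outliers


# Made-up rows for illustration only; these are not the asker's data.
demo = pd.DataFrame({
    "genre": ["a", "b", "b", "a"],
    "artist": ["band1", "band2", "band3", "band4"],
    "Principal Component 1": [-0.15, 1.80, -0.58, 0.20],
    "Principal Component 2": [0.14, 0.29, -0.36, 1.50],
})

fig, ax = plt.subplots(figsize=(8, 8))
ax.scatter(demo["Principal Component 1"], demo["Principal Component 2"],
           s=20, alpha=0.8)
labelled = annotate_outliers(ax, demo, "Principal Component 1",
                             "Principal Component 2", "artist")
```

With the real data, the same call would take pca_df2 in place of demo, followed by plt.show(). For a 500-row frame the speed difference from iterrows is unlikely to matter; the main benefit is that the selection rule lives in one expression.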
