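How do I annotate outliers in a scatter plot in Python?
I am trying to annotate a scatter plot with the artist name for only certain instances (e.g. those where 'Principal component 1' >= 1.5 or 'Principal component 2' >= 1.5).
The head of the DataFrame, which has shape (500, 4):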
print(pca_df2.head())
   genre  artist  Principal component 1  Principal component 2
1      a   band1              -0.148578               0.138509
2      b   band2              -0.484604               0.290153
3      b   band3              -0.576619              -0.359020
4      a   band4              -0.317572              -0.221687
5      a   band5              -0.536065              -0.404252
My current working scatter plot code, which maps a color to each instance by genre, looks like this:
import matplotlib.pyplot as plt

labels = ['a', 'b', 'c', 'd', 'e']
colors = ['k', 'r', 'y', 'm', 'g']
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(1, 1, 1)
# Draw one scatter series per genre so each genre gets its own color.
for label, color in zip(labels, colors):
    indicesToKeep = pca_df2['genre'] == label
    ax.scatter(pca_df2.loc[indicesToKeep, 'Principal component 1'],
               pca_df2.loc[indicesToKeep, 'Principal component 2'],
               c=color, s=20, alpha=0.8)
ax.legend(labels)
ax.grid()
plt.show()
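Solved by simply iterating over the DataFrame:
# Run this before plt.show() so the labels are drawn on the same figure.
for index, row in pca_df2.iterrows():
    artist = row['artist']
    pca1 = row['Principal component 1']
    pca2 = row['Principal component 2']
    if pca1 >= 1.5 or pca2 >= 1.5:
        # Place the label slightly above the point.
        plt.text(x=pca1, y=pca2 + 0.05, s=artist, fontsize=9)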
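As an alternative to testing every row inside the loop, the outliers can be selected first with a boolean mask and then labelled with ax.annotate, which lets the label offset be given in points rather than data units. The following is only a minimal sketch: demo_df is a hypothetical stand-in for pca_df2 filled with random values, and annotate_outliers is an illustrative helper name, not something from the original post.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for pca_df2: same column names, made-up values.
rng = np.random.default_rng(0)
n = 500
demo_df = pd.DataFrame({
    "genre": rng.choice(list("abcde"), size=n),
    "artist": [f"band{i}" for i in range(1, n + 1)],
    "Principal component 1": rng.normal(0.0, 0.8, size=n),
    "Principal component 2": rng.normal(0.0, 0.8, size=n),
})


def annotate_outliers(ax, df, x_col, y_col, label_col, threshold=1.5):
    """Label every point whose x or y value is >= threshold.

    Returns the rows that were labelled.
    """
    mask = (df[x_col] >= threshold) | (df[y_col] >= threshold)
    outliers = df.loc[mask]
    for x, y, name in zip(outliers[x_col], outliers[y_col], outliers[label_col]):
        # xytext is an offset in points, so it does not depend on the axis scale.
        ax.annotate(name, xy=(x, y), xytext=(3, 3),
                    textcoords="offset points", fontsize=9)
    return outliers


fig, ax = plt.subplots(figsize=(8, 8))
ax.scatter(demo_df["Principal component 1"],
           demo_df["Principal component 2"], s=20, alpha=0.8)
labelled = annotate_outliers(ax, demo_df,
                             "Principal component 1",
                             "Principal component 2",
                             "artist")
ax.grid()
```

With the real data, the same helper could be called on the existing ax and pca_df2 after the per-genre scatter loop and before plt.show(). Filtering first keeps the labelling loop limited to the outlier rows, which may matter for frames much larger than 500 rows.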