
Adding labels in x y scatter plot with seaborn

I've spent hours trying to do what I thought was a simple task: adding labels to an XY plot while using seaborn.

Here's my code:

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

df_iris=sns.load_dataset("iris") 

sns.lmplot(x='sepal_length',  # Horizontal axis
           y='sepal_width',   # Vertical axis
           data=df_iris,      # Data source
           fit_reg=False,     # Don't fit a regression line
           height=8,          # 'size' was renamed to 'height' in seaborn 0.9
           aspect=2)          # Width = aspect * height

plt.title('Example Plot')
# Set x-axis label
plt.xlabel('Sepal Length')
# Set y-axis label
plt.ylabel('Sepal Width')

I would like to add the text in the "species" column to each dot on the plot.

I've seen many examples using matplotlib but none using seaborn.

Any ideas? Thank you.

One way you can do this is as follows:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline

df_iris=sns.load_dataset("iris") 

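# lmplot returns a FacetGrid, not an Axes; the labels below are drawn on plt.gca()
g = sns.lmplot(x='sepal_length',  # Horizontal axis
               y='sepal_width',   # Vertical axis
               data=df_iris,      # Data source
               fit_reg=False,     # Don't fit a regression line
               height=10,         # 'size' was renamed to 'height' in seaborn 0.9
               aspect=2)          # Width = aspect * height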

plt.title('Example Plot')
# Set x-axis label
plt.xlabel('Sepal Length')
# Set y-axis label
plt.ylabel('Sepal Width')


def label_point(x, y, val, ax):
    a = pd.concat({'x': x, 'y': y, 'val': val}, axis=1)
    for i, point in a.iterrows():
        ax.text(point['x']+.02, point['y'], str(point['val']))

label_point(df_iris.sepal_length, df_iris.sepal_width, df_iris.species, plt.gca())  


Here's a more up-to-date answer that doesn't suffer from the string issue described in the comments.

import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

df_iris=sns.load_dataset("iris") 

plt.figure(figsize=(20,10))
p1 = sns.scatterplot(x='sepal_length',  # Horizontal axis
                     y='sepal_width',   # Vertical axis
                     data=df_iris,      # Data source
                     size=8,
                     legend=False)

# Add the species name next to each point
for line in range(0, df_iris.shape[0]):
    p1.text(df_iris.sepal_length[line] + 0.01, df_iris.sepal_width[line],
            df_iris.species[line], horizontalalignment='left',
            size='medium', color='black', weight='semibold')

plt.title('Example Plot')
# Set x-axis label
plt.xlabel('Sepal Length')
# Set y-axis label
plt.ylabel('Sepal Width')
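
One caveat with the loop above: df_iris.sepal_length[line] looks rows up by index label, so it depends on the data frame having the default 0..n-1 index and can fail or mislabel points after filtering or sorting. A minimal sketch of an alternative, assuming the target is a matplotlib Axes such as the one returned by sns.scatterplot, is to walk the three columns in parallel with zip. The helper name label_points and its dx parameter are made up for this illustration and are not part of seaborn or matplotlib.

```python
def label_points(ax, xs, ys, labels, dx=0.01, **text_kwargs):
    """Write one text label next to each (x, y) point.

    `ax` is assumed to be a matplotlib Axes (anything with a compatible
    .text(x, y, s, **kwargs) method). `label_points` and `dx` are
    hypothetical names used only for this sketch.
    """
    texts = []
    # zip pairs values by position, so the data frame index does not matter
    for x, y, label in zip(xs, ys, labels):
        # str() keeps non-string labels (numbers, categories) printable
        texts.append(ax.text(x + dx, y, str(label), **text_kwargs))
    return texts
```

With the objects from the answer above, the call would look like label_points(p1, df_iris.sepal_length, df_iris.sepal_width, df_iris.species, horizontalalignment='left'). Extra keyword arguments are passed straight through to ax.text.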


Thanks to the 2 other answers, here is a function scatter_text that makes it possible to reuse these plots several times.

import seaborn as sns
import matplotlib.pyplot as plt

def scatter_text(x, y, text_column, data, title, xlabel, ylabel):
    """Scatter plot with text labels at the x y coordinates
       Based on this answer: https://stackoverflow.com/a/54789170/2641825"""
    # Create the scatter plot
    p1 = sns.scatterplot(x=x, y=y, data=data, size=8, legend=False)
    # Add text beside each point
    for line in range(0, data.shape[0]):
        p1.text(data[x][line] + 0.01, data[y][line],
                data[text_column][line], horizontalalignment='left',
                size='medium', color='black', weight='semibold')
    # Set title and axis labels
    plt.title(title)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    return p1

Use the function as follows:

df_iris=sns.load_dataset("iris") 
plt.figure(figsize=(20,10))
scatter_text('sepal_length', 'sepal_width', 'species',
             data = df_iris, 
             title = 'Iris sepals', 
             xlabel = 'Sepal Length (cm)',
             ylabel = 'Sepal Width (cm)')

See also this answer on how to have a function that returns a plot: https://stackoverflow.com/a/43926055/2641825

Below is a solution that does not iterate over rows in the data frame using the dreaded for loop.

There are many issues regarding iterating over a data frame.

The answer is don't iterate! See this link.

The solution below relies on a function (plotlabel) within the petalplot function, which is called by df.apply.

Now, I know readers will comment on the fact that I use scatter and not lmplot, but that is a bit beside the point.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

df_iris=sns.load_dataset("iris") 

def petalplot(df): 
    
    def plotlabel(xvar, yvar, label):
        ax.text(xvar+0.002, yvar, label)
        
    fig = plt.figure(figsize=(30,10))
    ax = sns.scatterplot(x = 'sepal_length', y = 'sepal_width', data=df)

    # The magic starts here:
    df.apply(lambda x: plotlabel(x['sepal_length'],  x['sepal_width'], x['species']), axis=1)

    plt.title('Example Plot')
    plt.xlabel('Sepal Length')
    plt.ylabel('Sepal Width')
    
petalplot(df_iris)

Same idea as Scott Boston's answer; however, with Seaborn v0.12+, you can leverage seaborn.FacetGrid.apply to add labels on plots and set up your figure in one go:

import seaborn as sns
import pandas as pd

%matplotlib inline

sns.set_theme()

df_iris = sns.load_dataset("iris")
(
    sns.lmplot(
        data=df_iris,
        x="sepal_length",
        y="sepal_width",
        fit_reg=False,
        height=8,
        aspect=2
    )
    .apply(lambda grid: [
        grid.ax.text(r["sepal_length"]+.02, r["sepal_width"], r["species"])
        for r in df_iris.to_dict(orient="records")
    ])
    .set(title="Example Plot")
    .set_axis_labels("Sepal Length", "Sepal Width")
)

Or, if you don't need to use lmplot, also from v0.12, you can use the seaborn.objects interface. This way we don't need to manually iterate over the Iris dataframe nor refer to df_iris or column names sepal_... multiple times.

import seaborn.objects as so
(
    so.Plot(df_iris, x="sepal_length", y="sepal_width", text="species")
        .add(so.Dot())
        .add(so.Text(halign="left"))
        .label(title="Example plot", x="Sepal Length", y="Sepal Width")
        .layout(size=(20, 10))
)

This produces the figure below:

[image]

Use the powerful declarative API to avoid loops (seaborn>=0.12).

Specifically, put x, y, and annotations into a pandas data frame and call plotting.

Here is an example from my own research work.

import matplotlib.pyplot as plt
import pandas as pd
import seaborn.objects as so

# The data were elided in the original answer
df = pd.DataFrame(..., columns=['phase', 'P(X=1)', 'text'])

fig, ax = plt.subplots()
p = so.Plot(df, x='phase', y='P(X=1)', text='text').add(so.Dot(marker='+')).add(so.Text(halign='left'))
p.on(ax).show()

