Annotate csv column in scatter plot

I have two datasets in CSV format:

df2

type  prediction          100000     155000    
 0           0            2.60994   3.40305
 1           1           10.82100  34.68900
 0           0            4.29470   3.74023
 0           0            7.81339   9.92839
 0           0           28.37480  33.58000

df

 TIMESTEP   id  type         y         z         v_acc
  100000   8054     1     -0.317192 -0.315662   15.54430
  100000    669     0      0.352031 -0.008087   2.60994 
  100000    520     0      0.437786  0.000325   5.28670
  100000   2303     1      0.263105  0.132615   7.81339 
  105000   8055     1      0.113863  0.036407   5.94311

I am trying to match the values of df2['100000'] against df['v_acc']. Where a value matches, I make a scatter plot from df using columns y and z. After that, I want to annotate the scatter point with the matched value.

What I want is:

[image]

(I want all annotations in the same plot.)

I tried to code this in Python, but instead of getting all annotated points in a single plot, I get multiple plots with a single annotation each. I am also getting this error:

TypeError                                 Traceback (most recent call last)
File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/IPython/core/formatters.py:339, in BaseFormatter.__call__(self, obj)
    337     pass
    338 else:
--> 339     return printer(obj)
    340 # Finally look for special method names
    341 method = get_real_method(obj, self.print_method)

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/IPython/core/pylabtools.py:151, in print_figure(fig, fmt, bbox_inches, base64, **kwargs)
    148     from matplotlib.backend_bases import FigureCanvasBase
    149     FigureCanvasBase(fig)
--> 151 fig.canvas.print_figure(bytes_io, **kw)
    152 data = bytes_io.getvalue()
    153 if fmt == 'svg':

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/matplotlib/backend_bases.py:2295, in FigureCanvasBase.print_figure(self, filename, dpi, facecolor, edgecolor, orientation, format, bbox_inches, pad_inches, bbox_extra_artists, backend, **kwargs)
   2289     renderer = _get_renderer(
   2290         self.figure,
   2291         functools.partial(
   2292             print_method, orientation=orientation)
   2293     )
   2294     with getattr(renderer, "_draw_disabled", nullcontext)():
-> 2295         self.figure.draw(renderer)
   2297 if bbox_inches:
...
    189 if len(self) == 1:
    190     return converter(self.iloc[0])
--> 191 raise TypeError(f"cannot convert the series to {converter}")

TypeError: cannot convert the series to <class 'float'>

Can I get some help making the plot I want?

Thank you. My code is here:

df2 = pd.read_csv('./result.csv')
print(df2.columns)
#print(df2.head(10))
df  = pd.read_csv('./main.csv')
df = df[df['TIMESTEP'] == 100000]

for i in df['v_acc']:
    for j in df2['100000']:
        # numbers can be long and differ in later decimals, so match on 2 decimal places only
        if "{0:0.2f}".format(i) == "{0:0.2f}".format(j):
            plt.figure(figsize = (10,8))
            sns.scatterplot(data = df, x = "y", y = "z", hue = "type", palette=['red','dodgerblue'], legend='full')
            plt.annotate(i, (df['y'][df['v_acc'] == i], df['z'][df['v_acc'] == i]))
            plt.grid(False)
            plt.show()
            break

The reason for the multiple plots is that you are calling plt.figure() inside the loop, which creates a new figure on every matching iteration. You need to create the figure once, outside the loop, and do only the per-match plotting and annotation inside it. Here is the updated code, which ran on the data you provided. Other than that, I think your code is fine...

import matplotlib.pyplot as plt
import seaborn as sns

fig, ax = plt.subplots(figsize=(7, 7))  # create the figure once, before the loop
for i in df['v_acc']:
    for j in df2['100000']:  # read_csv yields string column names, as in the question
        # numbers can be long and differ in later decimals, so match on 2 decimal places only
        if "{0:0.2f}".format(i) == "{0:0.2f}".format(j):
            #plt.figure(figsize = (10,8))  # removed: this opened a new figure per match
            sns.scatterplot(data=df, x="y", y="z", hue="type", palette=['red', 'dodgerblue'], legend='full', ax=ax)
            row = df[df['v_acc'] == i].iloc[0]  # first matching row, so the coordinates are scalars
            ax.annotate(i, (row['y'], row['z']))
            break

ax.grid(False)  # keep these two after the loop: one show for one plot
plt.show()

Output plot

[image]
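On the TypeError in the question: the traceback shows pandas refusing to convert a Series to float unless it holds exactly one element (the len(self) == 1 check). A likely cause is that df['y'][df['v_acc'] == i] selects more than one row when several rows share the same v_acc, so the coordinates passed to annotate are Series rather than numbers. The sketch below is one way to avoid that: it draws the points once and then annotates each matching row using scalar coordinates. It is not the answerer's code. The function name annotate_matches and its parameters targets and decimals are illustrative, plain ax.scatter stands in for sns.scatterplot, type is assumed to contain only 0 and 1, and numeric rounding replaces the string comparison (the two may disagree in rare edge cases).

```python
import pandas as pd
import matplotlib.pyplot as plt


def annotate_matches(df, targets, ax, decimals=2):
    """Scatter y vs z once, then label every row whose v_acc matches a target.

    `annotate_matches`, `targets` and `decimals` are illustrative names,
    not part of the original post.
    """
    # Draw all points a single time, colored by type.
    # Assumption: `type` only takes the values 0 and 1.
    colors = df["type"].map({0: "red", 1: "dodgerblue"}).tolist()
    ax.scatter(df["y"], df["z"], c=colors)

    # Compare values rounded to `decimals` places instead of formatted strings.
    matched = df[df["v_acc"].round(decimals).isin(targets.round(decimals))]

    # itertuples yields one row at a time, so the coordinates are plain
    # scalars even when several rows share the same v_acc.
    for row in matched.itertuples(index=False):
        ax.annotate(f"{row.v_acc:.2f}", (row.y, row.z))
    return matched


# Sample rows copied from the question (TIMESTEP == 100000 only).
df = pd.DataFrame({
    "type": [1, 0, 0, 1],
    "y": [-0.317192, 0.352031, 0.437786, 0.263105],
    "z": [-0.315662, -0.008087, 0.000325, 0.132615],
    "v_acc": [15.54430, 2.60994, 5.28670, 7.81339],
})
df2 = pd.DataFrame({"100000": [2.60994, 10.82100, 4.29470, 7.81339, 28.37480]})

fig, ax = plt.subplots(figsize=(7, 7))
matched = annotate_matches(df, df2["100000"], ax)
```

Calling plt.show() once after this displays the single figure with all annotations. If the seaborn styling and legend are wanted, sns.scatterplot(..., ax=ax) can replace the ax.scatter call, as long as it is still called only once.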
