[英]Annotate csv column in scatter plot
我有兩個 csv 格式的數據集:
df2
type prediction 100000 155000
0 0 2.60994 3.40305
1 1 10.82100 34.68900
0 0 4.29470 3.74023
0 0 7.81339 9.92839
0 0 28.37480 33.58000
df
TIMESTEP id type y z v_acc
100000 8054 1 -0.317192 -0.315662 15.54430
100000 669 0 0.352031 -0.008087 2.60994
100000 520 0 0.437786 0.000325 5.28670
100000 2303 1 0.263105 0.132615 7.81339
105000 8055 1 0.113863 0.036407 5.94311
我正在嘗試將df2[100000]
的值與df1[v_acc]
匹配。 如果值匹配,我將從df
與列y
和z
制作散點圖。 之后,我想用匹配值注釋散點。
我想要的是:
(我想要在同一個情節中的所有注釋)。
我嘗試在 python 中針對這種情況進行編碼,但我沒有在一個圖中獲得所有注釋點,而是我得到了帶有單個注釋的多個圖。 我也收到此錯誤:
TypeError Traceback (most recent call last)
File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/IPython/core/formatters.py:339, in BaseFormatter.__call__(self, obj)
337 pass
338 else:
--> 339 return printer(obj)
340 # Finally look for special method names
341 method = get_real_method(obj, self.print_method)
File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/IPython/core/pylabtools.py:151, in print_figure(fig, fmt, bbox_inches, base64, **kwargs)
148 from matplotlib.backend_bases import FigureCanvasBase
149 FigureCanvasBase(fig)
--> 151 fig.canvas.print_figure(bytes_io, **kw)
152 data = bytes_io.getvalue()
153 if fmt == 'svg':
File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/matplotlib/backend_bases.py:2295, in FigureCanvasBase.print_figure(self, filename, dpi, facecolor, edgecolor, orientation, format, bbox_inches, pad_inches, bbox_extra_artists, backend, **kwargs)
2289 renderer = _get_renderer(
2290 self.figure,
2291 functools.partial(
2292 print_method, orientation=orientation)
2293 )
2294 with getattr(renderer, "_draw_disabled", nullcontext)():
-> 2295 self.figure.draw(renderer)
2297 if bbox_inches:
...
189 if len(self) == 1:
190 return converter(self.iloc[0])
--> 191 raise TypeError(f"cannot convert the series to {converter}")
TypeError: cannot convert the series to <class 'float'>
我可以得到一些幫助來制作我想要的情節嗎?
謝謝你。 我的代碼在這里:
df2 = pd.read_csv('./result.csv')
print(df2.columns)
#print(df2.head(10))
df = pd.read_csv('./main.csv')
df = df[df['TIMESTEP'] == 100000]
for i in df['v_acc']:
for j in df2['100000']:
# sometimes numbers are long and different after decimals.So mathing 0.2f only
if "{0:0.2f}".format(i) == "{0:0.2f}".format(j):
plt.figure(figsize = (10,8))
sns.scatterplot(data = df, x = "y", y = "z", hue = "type", palette=['red','dodgerblue'], legend='full')
plt.annotate(i, (df['y'][df['v_acc'] == i], df['z'][df['v_acc'] == i]))
plt.grid(False)
plt.show()
break
多個圖的原因是因為您在循環中使用plt.figure()
。 這將為每個循環創建一個圖形。 您需要在外部創建,並且僅在循環內創建單個分散和注釋。 這是為您提供的數據運行的更新代碼。 除此之外,認為你的代碼很好......
fig, ax=plt.subplots(figsize = (7,7)) ### Keep this before the loop and call it as subplot
for i in df['v_acc']:
for j in df2[100000]:
# sometimes numbers are long and different after decimals.So mathing 0.2f only
if "{0:0.2f}".format(i) == "{0:0.2f}".format(j):
#plt.figure(figsize = (10,8))
ax=sns.scatterplot(data = df, x = "y", y = "z", hue = "type", palette=['red','dodgerblue'], legend='full')
ax.annotate(i, (df['y'][df['v_acc'] == i], df['z'][df['v_acc'] == i]))
break
plt.grid(False) ### Keep these two after the loop, just one show for one plot
plt.show()
輸出圖
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.