Scatter and curve plot using matplotlib

I am trying to plot how the accuracy of NN models has evolved over time. I have an Excel file with data like the following: [image] and I wrote the following code to extract the data and draw the scatter plot:

import pandas as pd
import matplotlib.pyplot as plt

# load the spreadsheet and pull out the columns of interest
data = pd.read_excel(r'SOTA DNN.xlsx')
acc1 = pd.DataFrame(data, columns=['Top-1-Acc'])
para = pd.DataFrame(data, columns=['Parameters'])
dates = pd.to_datetime(data['Date'], format='%Y-%m-%d')

# scatter plot of top-1 accuracy over time
plt.grid(True)
plt.ylim(40, 100)
plt.scatter(dates, acc1)
plt.show()

[image]

Is there a way to draw a line in the same figure that connects only the models achieving a new maximum, and to print their names at the same time, as in this example: [image] Is it also possible to stretch the x-axis (for the dates)?

It is still not clear what you mean by "stretch the x-axis", and you did not provide your dataset, but here is a possible general approach:

import matplotlib.pyplot as plt
import pandas as pd

# fake data generation; substitute your own Excel import routine here
import numpy as np
np.random.seed(1234)
n = 30
acc = np.concatenate([np.random.randint(0, 10, 10), np.random.randint(0, 30, 10), np.random.randint(0, 100, n-20)])
date_range = pd.date_range("20190101", periods=n)
# random five-letter model names
letters = list("abcdefghijklmnopqrstuvwxyz")
model = ["".join(np.random.choice(letters, 5)) for _ in range(n)]
df = pd.DataFrame({"Model": model, "Date": date_range, "TopAcc": acc})
# shuffle the rows to mimic an unsorted spreadsheet
df = df.sample(frac=1).reset_index(drop=True)


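# now to the actual data modification
# first, sort by date and keep only the rows that set a new maximum
df = df.sort_values("Date").reset_index(drop=True)
# cummax() is the best accuracy so far; diff() is positive only where it increases
df["Max"] = df.TopAcc.cummax().diff()
# the first row has no predecessor (its diff is NaN), so mark it manually
df.loc[0, "Max"] = 1
dfmax = df[df.Max > 0]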

#then, we plot all data, followed by the best performers
fig, ax = plt.subplots(figsize=(10, 5))
ax.scatter(df.Date, df.TopAcc, c="grey")
ax.plot(dfmax.Date, dfmax.TopAcc, marker="x", c="blue")

# finally, we annotate the best performers
for _, xylabel in dfmax.iterrows():
    ax.text(xylabel.Date, xylabel.TopAcc, xylabel.Model, color="blue",
            horizontalalignment="right", verticalalignment="bottom")

plt.show()

Sample output:

[image]
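
On the x-axis part of the question: if "stretch" means giving the dates more horizontal room, one possible reading is to widen the figure and space the date ticks explicitly. The following is only a sketch under that assumption. The helper name `plot_stretched`, the figure size, and the tick interval are illustrative choices, not values from the question, and the column names follow the fake data used in this answer.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_stretched(df, width=16, height=5, day_interval=5):
    """Scatter TopAcc over Date on a wide figure with evenly spaced date ticks.

    The function name and all default values are illustrative assumptions.
    """
    # a larger figure width gives the date axis more room
    fig, ax = plt.subplots(figsize=(width, height))
    ax.scatter(df["Date"], df["TopAcc"], c="grey")
    # one major tick every `day_interval` days, labelled as YYYY-MM-DD
    ax.xaxis.set_major_locator(mdates.DayLocator(interval=day_interval))
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
    # pad the limits by one day so points at the edges are not clipped
    ax.set_xlim(df["Date"].min() - pd.Timedelta(days=1),
                df["Date"].max() + pd.Timedelta(days=1))
    # rotate the tick labels so they do not overlap
    fig.autofmt_xdate()
    return fig, ax


# fake data in the same shape as the example in this answer
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Date": pd.date_range("2019-01-01", periods=30),
    "TopAcc": rng.integers(0, 100, 30),
})
fig, ax = plot_stretched(demo)
```

Call `plt.show()` or `fig.savefig(...)` afterwards to view the result. For data spanning years rather than days, `mdates.MonthLocator` or `mdates.YearLocator` would probably be a better fit than `DayLocator`.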

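To see why the `cummax().diff()` step keeps only the record-setting rows, it can help to run it on a handful of values. The sketch below uses made-up model names and accuracies (not the asker's data) and wraps the step in a hypothetical helper, `running_best`.

```python
import pandas as pd


def running_best(df, date_col="Date", value_col="TopAcc"):
    """Return the rows that set a new strict maximum of value_col over time."""
    out = df.sort_values(date_col).reset_index(drop=True)
    # cummax() is the best value seen so far; diff() is positive only where it rises
    improved = out[value_col].cummax().diff() > 0
    # the earliest row has nothing to compare against, so it counts as the first record
    improved.iloc[0] = True
    return out[improved]


# made-up data, deliberately not in date order
demo = pd.DataFrame({
    "Model": ["A", "B", "C", "D", "E"],
    "Date": pd.to_datetime(["2019-01-03", "2019-01-01", "2019-01-02",
                            "2019-01-04", "2019-01-05"]),
    "TopAcc": [50, 60, 55, 72, 72],
})
best = running_best(demo)
print(best["Model"].tolist())  # prints ['B', 'D']
```

Model E ties the best accuracy but does not exceed it, so it is left out. Changing `> 0` to `>= 0` would keep every row, because the running maximum never decreases, so keeping ties would need a different test, for example comparing each value against `cummax()` directly.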