繁体   English   中英

我可以为每个 x 刻度设置 matplotlib 和 pandas 的默认值吗?

[英]Can I set default values with matplotlib and pandas for each x tick?

我有以下代码:

# Ratings by day, divided by Staff member

from datetime import datetime as dt

by_staff = df.groupby('User ID')

plt.figure(figsize=(15,8))

# Those are used to calculate xticks and yticks
xmin, xmax = pd.to_datetime(dt.now()), pd.to_datetime(0)
ymin, ymax = 0, 0

for index, data in by_staff:

    by_day = data.groupby('Date')
    
    x = pd.to_datetime(by_day.count().index)
    y = by_day.count()['Value']
    
    xmin = min(xmin, x.min())
    xmax = max(xmax, x.max())
    
    ymin = min(ymin, min(y))
    ymax = max(ymax, max(y))

    plt.plot_date(x, y, marker='o', label=index, markersize=12)

plt.title('Ratings by day, by Staff member', fontdict = {'fontsize': 25})
plt.xlabel('Day', fontsize=15)
plt.ylabel('n° of ratings for that day', fontsize=15)

ticks = pd.date_range(xmin, xmax, freq='D')

plt.xticks(ticks, rotation=60)
plt.yticks(range(ymin, ymax + 1))

plt.gcf().autofmt_xdate()

plt.grid()

plt.legend([a for a, b in by_staff],
          title="Ratings given",
          loc="center left",
          bbox_to_anchor=(1, 0, 0.5, 1))

plt.show()

如果当天没有数据,我想将特定 xtick 处显示的值设置为 0。 目前,这是 plot 所示:

显示的情节

我尝试了一些谷歌搜索,但我似乎无法正确解释我的问题。 我怎么能解决这个问题?

我的数据集: https://cdn.discordapp.com/attachments/311932890017693700/800789506328100934/sample-ratings.csv

让我们尝试通过让 pandas 聚合数据来简化任务。 我们同时按日期和用户 ID 分组,然后取消堆叠dataframe。 这允许我们用一个像 0 这样的预设值来填充缺失的数据点。形式x = df.groupby(["Date",'User ID']).count().Value.unstack(fill_value=0)是紧凑的链接a= df.groupby(["Date",'User ID'])b=a.count()c=b.Valuex=c.unstack(fill_value=0) 您可以打印出这些链式 pandas 操作的每个中间结果,看看它做了什么。

from matplotlib import pyplot as plt
import pandas as pd

df = pd.read_csv("test.csv", sep=",", parse_dates=["Date"])

#by_staff = df.groupby(["Date",'User ID']) - group entries by date and ID
#.count - count identical date-ID pairs
#.Value - use only this column
#.unstack(fill_value=0) bring resulting data from long to wide form
#and fill missing data with zero
by_staff = df.groupby(["Date",'User ID']).count().Value.unstack(fill_value=0)

ax = by_staff.plot(marker='o', markersize=12, linestyle="None", figsize=(15,8))

plt.title('Ratings by day, by Staff member', fontdict = {'fontsize': 25})
plt.xlabel('Day', fontsize=15)
plt.ylabel('n° of ratings for that day', fontsize=15)

#labeling only the actual rating values shown in the grid
plt.yticks(range(df.Value.max() + 1))
#this is not really necessary, it just labels zero differently
#labels = ["No rating"] + [str(i) for i in range(1, df.Value.max() + 1)]
#ax.set_yticklabels(labels)

plt.gcf().autofmt_xdate()
plt.grid()

plt.show()

样品 output: 在此处输入图像描述

显然,您不会看到多个条目。

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM