Matplotlib axis with only the values present in a Pandas DataFrame

Question

I have been maintaining a backlog chart since last year, and with the start of the new year I ran into this issue:

I had to multiply the year by 100 and add the week number so that the X axis keeps advancing to the right. After that, however, I got a blank interval on the X axis from 202052 (the year concatenated with the week-of-year number) up to roughly 202099.

My indexes don't contain these values, as shown below:

(Int64Index([202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202035, 202036, 202037, 202038, 202040, 202041, 202043, 202044,
             202045, 202046, 202047, 202048, 202049, 202050, 202051, 202052,
             202101, 202102],
            dtype='int64'),
 Int64Index([202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102],
            dtype='int64'),
 Int64Index([202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102],
            dtype='int64'))

How can I drop these values?

Thank you!

EDIT: Adding full code


import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime, timedelta
from matplotlib.lines import Line2D
import matplotlib.dates as mdates
import matplotlib.cbook as cbook
from matplotlib.ticker import MaxNLocator

%matplotlib inline

df = pd.read_csv(
    "/home/eklon/Downloads/Venturus/NetSuite/Acompanhamento/130121/MelhoriasNetSuite130121.csv", delimiter=';')


df.columns = df.columns.str.replace(' ', '')    

df['CreatedDate'] = pd.to_datetime(df['CreatedDate'])
df['CompletedDate'] = pd.to_datetime(df['CompletedDate'])
df['DayCompleted'] = df['CompletedDate'].dt.dayofweek
df['DayCreated'] = df['CreatedDate'].dt.dayofweek
df['WeekCreated'] = df['CreatedDate'].dt.isocalendar().week
df['WeekCompleted'] = df['CompletedDate'].dt.isocalendar().week
df['YearCreated'] = df['CreatedDate'].dt.year
df['YearCompleted'] = df['CompletedDate'].dt.year
df['firstCompletedDate'] = df.CompletedDate - df.DayCompleted * timedelta(days=1)
df['firstCreatedDate'] = df.CreatedDate - df.DayCreated * timedelta(days=1)

df['YearWeekCreated'] = df['YearCreated']*100 + df['WeekCreated']
df['YearWeekCompleted'] = df['YearCompleted']*100 + df['WeekCompleted']


df_done = df[df['Progress'] == 'Completed']
df_open = df[df['Progress'] != 'Completed']
df_todo = df[df['BucketName'] == 'To do']
df_doing = df[df['BucketName'] == 'Doing']
df_consult = df[df['BucketName'] == 'Em andamento RSM']
df_open['Priority'].value_counts().sort_index()
df['Priority'].sort_index()

df_backlog_created = df['YearWeekCreated'].value_counts().sort_index()
df_backlog_completed = df['YearWeekCompleted'].value_counts().sort_index()
df_backlog = df_backlog_created.cumsum() - df_backlog_completed.cumsum()




#============================================================================


qtd_created = df['YearWeekCreated'].value_counts().sort_index()
idx_created = qtd_created.index
qtd_completed = df['YearWeekCompleted'].value_counts().sort_index()
idx_completed = qtd_completed.index 
qtd_backlog = df_backlog
idx_backlog = qtd_backlog.index

idx_completed = idx_completed.astype(int)


fig, ax = plt.subplots(figsize=(14,10))



#plt.figure(figsize=(14,10))
ax.plot(idx_created, list(qtd_created), label="Iniciadas", color="r")
ax.plot(idx_completed, list(qtd_completed), label="Completadas", color="y", linewidth=3)
ax.bar(idx_backlog, qtd_backlog, label="Backlog", color="b")
ax.legend(['Novas', 'Fechadas', 'Backlog'])



x=[1,2,3]
y=[9,8,7]


for a,b in zip(idx_created, qtd_created): 
    plt.text(a, b, str(b), fontsize=12, color='w', bbox=dict(facecolor='red', alpha=0.5), horizontalalignment='center')




for a,b in zip(idx_backlog, qtd_backlog): 
    plt.text(a, b, str(b), fontsize=12, color='w', bbox=dict(facecolor='blue', alpha=0.5), horizontalalignment='center')



for a,b in zip(idx_completed, qtd_completed): 
    plt.text(a, b, str(b), fontsize=12, color='black', bbox=dict(facecolor='yellow', alpha=0.5))


plt.title('Backlog', fontsize= 20)

Answer 1

This is not a direct fix for your code, but the principle should be the same. I will create some fake data to illustrate the problem and a solution.

Current empty space problem:

import numpy as np
import matplotlib.pyplot as plt

labels = [202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102]
y = np.random.rand(len(labels))

# old approach, will have empty space
_, ax = plt.subplots(1,1)
ax.plot(labels, y)
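
The empty space follows from the arithmetic of the labels: matplotlib treats 202052 and 202101 as ordinary integers, so two consecutive weeks end up 49 units apart on the axis. A quick check, using only the labels listed above:

```python
labels = [202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
          202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
          202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
          202052, 202101, 202102]

# Numeric distance between each pair of neighbouring labels
steps = [b - a for a, b in zip(labels, labels[1:])]

# The largest step is where the blank stretch appears on a numeric axis
largest = max(steps)
where = steps.index(largest)
print(labels[where], "->", labels[where + 1], "gap of", largest)
```

Every other step is 1 or 2, which is why only the year change shows up as a visible hole.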

Suggested solution:

import numpy as np
import matplotlib.pyplot as plt

labels = [202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202034, 202035, 202036, 202037, 202038, 202040, 202041, 202043,
             202044, 202045, 202046, 202047, 202048, 202049, 202050, 202051,
             202052, 202101, 202102]
y = np.random.rand(len(labels))

# suggested: plot against a dummy index (0, 1, 2, ...) and relabel the ticks
x_idx = range(len(labels))
_, ax = plt.subplots(1,1)
ax.plot(x_idx, y)
ax.set_xticks(x_idx[::5])
ax.set_xticklabels(labels[::5])
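
The question plots several series whose indexes differ slightly (one of them lacks 202034), so a single `range(len(labels))` would not line them up. One possible way to extend the dummy-index idea, sketched here with small made-up series standing in for the question's `qtd_created` and `qtd_completed`, is to build one shared list of labels and look up each series' positions in it. The series values and the `Agg` backend line are assumptions for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for the question's qtd_created / qtd_completed
created = pd.Series([3, 5, 2, 4], index=[202051, 202052, 202101, 202102])
completed = pd.Series([1, 2, 6], index=[202051, 202101, 202102])

# One shared, sorted index of every year-week label that actually occurs
all_labels = created.index.union(completed.index)

# Position (0, 1, 2, ...) of each series' labels within the shared index
pos_created = all_labels.get_indexer(created.index)
pos_completed = all_labels.get_indexer(completed.index)

fig, ax = plt.subplots()
ax.plot(pos_created, created.to_numpy(), color="r", label="created")
ax.plot(pos_completed, completed.to_numpy(), color="y", label="completed")

# One tick per existing label, so no empty 202053..202100 stretch
ax.set_xticks(range(len(all_labels)))
ax.set_xticklabels([str(v) for v in all_labels])
ax.legend()
```

Because both series share the same positions, a week that is missing from one of them simply has no point there instead of shifting the rest of that line.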

Hope this works for you. Kr.

Answer 2

What you want to do is called index plotting (just pass the y values to plot, no x values), so you should use an IndexLocator. The following example sets a tick every 4th row:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mt

np.random.seed(0)
idx = [202026, 202027, 202028, 202029, 202030, 202031, 202032, 202033,
             202035, 202036, 202037, 202038, 202040, 202041, 202043, 202044,
             202045, 202046, 202047, 202048, 202049, 202050, 202051, 202052,
             202101, 202102]
df = pd.DataFrame(np.random.rand(len(idx)), index=idx, columns=['col1'])

fig,ax = plt.subplots()
ax.plot(df.col1.to_numpy())
ax.xaxis.set_major_locator(mt.IndexLocator(4,0))
# get_xticks() returns floats; cast to int before using them as row positions
ax.xaxis.set_ticklabels(df.index[ax.get_xticks().astype(int)])

Another possibility is to use a FuncFormatter, especially if you want to zoom your chart, since it dynamically formats the ticks chosen by the automatic locator:

ax.xaxis.set_major_formatter(mt.FuncFormatter(lambda x,_: f'{df.index[int(x)]}' if x in range(len(df)) else ''))
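
The same formatter can also be written as a named function, which makes the bounds check easier to read and to test. The sketch below is only one way to do it: the helper name `label_for_position`, the sample data, the `Agg` backend line and the use of `MaxNLocator(integer=True)` to keep ticks on whole positions are all assumptions, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as mt

# Made-up sample data: year-week labels and one value per label
labels = [202050, 202051, 202052, 202101, 202102]
values = [4, 7, 5, 6, 3]

def label_for_position(x, pos=None):
    """Return the label for a whole-number tick position, else an empty string."""
    i = int(round(x))
    if abs(x - i) < 1e-9 and 0 <= i < len(labels):
        return str(labels[i])
    return ""

fig, ax = plt.subplots()
ax.plot(values)  # index plotting: x is implicitly 0, 1, 2, ...
ax.xaxis.set_major_locator(mt.MaxNLocator(integer=True))
ax.xaxis.set_major_formatter(mt.FuncFormatter(label_for_position))
```

Ticks that fall outside the data or between two rows get an empty label, so zooming or panning never raises an index error.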
