
How to remove weekends from a plot using Plotly?

I'm trying to remove the weekend gaps from this time series plot. The x-axis is a date-time stamp. I've tried the code on this site, but can't get it to work. See the sample file used.

The data looks like this:

+-----------------------+---------------------+-------------+-------------+
|          asof         |    INSERTED_TIME    | DATA_SOURCE |    PRICE    |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:00:15 | DB          | 170.4261757 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:06:10 | DB          | 168.9348656 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:06:29 | DB          | 168.8412129 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:07:27 | DB          | 169.878796  |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:10:28 | DB          | 169.3685879 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:12:14 | DB          | 169.0787045 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17   00:00:00 | 2020-06-17 12:12:33 | DB          | 169.7561092 |
+-----------------------+---------------------+-------------+-------------+

Plot including weekend breaks

Using the line function, I get a plot with straight lines going from Friday end of day to Monday morning. Using px.scatter, I don't get the line, but I still get the gap.

import plotly.express as px
import pandas as pd

sampledf = pd.read_excel('sample.xlsx')

fig_sample = px.line(sampledf, x = 'INSERTED_TIME', y= 'PRICE', color = 'DATA_SOURCE')
fig_sample.show()

Attempt with no weekend breaks

fig_sample = px.line(sampledf, x = 'INSERTED_TIME', y= 'PRICE', color = 'DATA_SOURCE')
fig_sample.update_xaxes(
    rangebreaks=[
        dict(bounds=["sat", "mon"]) #hide weekends
    ]
)
fig_sample.show()

Using rangebreaks results in a blank plot.

Any help is appreciated. Thanks.

rangebreaks stop working once a Plotly Express figure has more than 1000 rows: above that size px renders with WebGL by default, and WebGL traces do not support rangebreaks. When working with more than 1000 rows, add the parameter render_mode='svg'.

In the code below I've used the scatter function, and the large weekend gaps are no longer there. Additionally, I've excluded the times between 11 PM and 11 AM.

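import plotly.express as px
import pandas as pd

sampledf = pd.read_excel('sample.xlsx')

# render_mode='svg' is needed for rangebreaks when there are more than 1000 rows
fig_sample = px.scatter(sampledf, x='INSERTED_TIME', y='PRICE', color='DATA_SOURCE', render_mode='svg')
fig_sample.update_xaxes(
    rangebreaks=[
        {'pattern': 'day of week', 'bounds': [6, 1]},  # hide Saturday up to Monday
        {'pattern': 'hour', 'bounds': [23, 11]},  # hide 11 PM to 11 AM
    ]
)
fig_sample.show()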

The values in the plot are different from the original data set, but the code will work with the data in the original post. Found help here.

It looks like the x axis on the blank plot does not even have the right range, since it begins in a different year. It's hard to explain the behavior without looking at the exact data input, but you can start with a simpler, working dataset and check for differences (for example, plot a filtered version of the data with selected points, or compare the dtypes of the DataFrame).

You will see the expected behavior with a simpler dataset:
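The dtype and filtering checks suggested above could be sketched roughly as follows. The three rows are hypothetical values made up for illustration, with timestamps deliberately stored as strings; they are not taken from the asker's file.

```python
import pandas as pd

# Hypothetical rows: a Friday, a Saturday and a Monday, timestamps as strings.
raw = pd.DataFrame({
    "INSERTED_TIME": ["2020-06-19 12:00:15", "2020-06-20 12:06:10", "2020-06-22 12:06:29"],
    "PRICE": [1.0, 2.0, 3.0],
})
# A non-datetime dtype on the x column may mean the dates were not parsed.
print(raw.dtypes)

clean = raw.copy()
clean["INSERTED_TIME"] = pd.to_datetime(clean["INSERTED_TIME"])

# Keep Monday (0) through Friday (4) only, as a small filtered set to plot.
weekdays = clean[clean["INSERTED_TIME"].dt.dayofweek < 5]
print(weekdays)
```

Plotting a small frame like weekdays first, then adding rows back, can help narrow down which part of the input makes the axis misbehave.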

import plotly.express as px
import pandas as pd
from datetime import datetime

# May 2020 started on a Friday, so (day + 3) % 7 is 5 on Saturdays and 6 on Sundays
days = range(1, 30)
data = {'col1': [datetime(2020, 5, day) for day in days],
        'col2': [day if (day + 3) % 7 not in (5, 6) else 0 for day in days]}
df = pd.DataFrame(data=data)

# Same data with the weekend rows dropped (Monday=0 ... Friday=4)
df_weekdays = df[df['col1'].dt.dayofweek.isin([0, 1, 2, 3, 4])]

f = px.line(df, x='col1', y='col2')
f.update_xaxes(
    rangebreaks=[
        dict(bounds=["sat", "mon"]),  # hide weekends
    ]
)
f.show()

[Image: with breaks]

For the DataFrame without weekends, df_weekdays, the plot looks similar.
