Plotly with datetime.time() in the x-axis and missing values
I have two pandas dataframes, df1 and df2, which both contain data from two different days between 21:00 and 08:00. There should be one data point per minute, but some values are missing, e.g.
location time Data
0 1 21:00:00 8
1 1 21:02:00 6
The data point for 21:01:00 does not exist. The missing data points occur at different times in each dataframe, so when I try to plot both of them on the same plot this happens:
If I plot them individually, they are both correct. I think the horizontal red lines are caused by time values that exist in the red dataframe but not in the blue one.
Has anyone encountered this before? I want to plot both of them on the same axis, starting at 21:00 and finishing at 08:00.
Here is the code I'm using:
import datetime
import pandas as pd
import plotly.express as px
df1 = pd.DataFrame({'location': 1,
                    'data': ['3', '4', '5'],
                    'time': [datetime.datetime(2022,7,16,21,0,0).time(),
                             datetime.datetime(2022,7,16,21,1,0).time(),
                             datetime.datetime(2022,7,16,21,3,0).time()]})
df2 = pd.DataFrame({'location': 2,
                    'data': ['8', '6', '7'],
                    'time': [datetime.datetime(2022,7,17,21,0,0).time(),
                             datetime.datetime(2022,7,17,21,2,0).time(),
                             datetime.datetime(2022,7,17,21,3,0).time()]})
df = pd.concat([df1,df2], axis=0)
fig = px.line(df, x="time", y="data", color='location')
fig.show()
Thanks!
The problem is with the time column. Because you convert the values with time(), the column ends up with object dtype in the combined dataframe (check df.info()). To avoid this, leave the data in datetime format and use update_xaxes() to let px format the time. Updated code below...
import datetime
import pandas as pd
import plotly.express as px
df1 = pd.DataFrame({'location': 1,
                    'data': ['3', '4', '5'],
                    'time': [datetime.datetime(2022,7,16,21,0,0),
                             datetime.datetime(2022,7,16,21,1,0),
                             datetime.datetime(2022,7,16,21,3,0)]})
df2 = pd.DataFrame({'location': 2,
                    'data': ['8', '6', '7'],
                    'time': [datetime.datetime(2022,7,16,21,0,0),
                             datetime.datetime(2022,7,16,21,2,0),
                             datetime.datetime(2022,7,16,21,3,0)]})
df = pd.concat([df1,df2], axis=0)
fig = px.line(df, x="time", y="data", color='location')
# Show only the time of day on the tick labels
fig.update_xaxes(tickformat="%H:%M:%S")
fig.show()
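To see the dtype difference described above without plotting anything, a minimal check along these lines may help. The variable names `stamps` and `times_only` are made up for illustration; only standard pandas calls are used.

```python
import datetime
import pandas as pd

# Full datetimes are stored in a native datetime64 column.
stamps = pd.Series([datetime.datetime(2022, 7, 16, 21, 0),
                    datetime.datetime(2022, 7, 16, 21, 1)])

# Keeping only the time of day yields plain Python datetime.time objects,
# which pandas can only hold in a generic object column.
times_only = stamps.dt.time

print(pd.api.types.is_datetime64_any_dtype(stamps))      # datetime column?
print(pd.api.types.is_datetime64_any_dtype(times_only))  # time-only column?
print(times_only.dtype)
```

An object column gives the plotting library no numeric time scale to work with, which is presumably why the x-axis no longer behaves like a continuous time axis.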
Thank you for your help @Redox. It was very helpful, but unfortunately it doesn't work as I want it to when using the full datasets. This is the result for the equivalent of this:
## Note that you need to use .time()
df1 = pd.DataFrame({'location': 1, 'data': ['3', '4', '5'],
                    'time': [datetime.datetime(2022,7,17,21,0,0).time(),
                             datetime.datetime(2022,7,17,21,1,0).time(),
                             datetime.datetime(2022,7,17,21,3,0).time()]})
df2 = pd.DataFrame({'location': 2, 'data': ['8', '6', '7'],
                    'time': [datetime.datetime(2022,7,16,21,0,0).time(),
                             datetime.datetime(2022,7,16,21,2,0).time(),
                             datetime.datetime(2022,7,16,21,3,0).time()]})
df = pd.concat([df1,df2], axis=0)
date = str(datetime.datetime.strptime('2022-01-01', '%Y-%m-%d').date())  ## Random dummy date
df['time'] = pd.to_datetime(date + " " + df['time'].astype(str))  ## Convert back to datetime
fig = px.line(df, x="time", y="data", color='location')
fig.update_xaxes(tickformat="%H:%M")
fig.show()
When I try this:
dt = datetime.datetime.strptime('2022-01-01', '%Y-%m-%d')
starttime = dt.replace(hour=21, minute=0) ## Start time is 9PM
dt = datetime.datetime.strptime('2022-01-02', '%Y-%m-%d')
endtime = dt.replace(hour=8, minute=0) ## End time is 8AM next day
fig = px.line(df, x="time", y="data", color='location', range_x=[starttime, endtime])
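One likely reason the `range_x` attempt falls short is that every row carries the same dummy date, so times after midnight land before 21:00 rather than after it. A minimal sketch of one way around this, assuming that any time earlier than noon belongs to the following morning: the helper `to_overnight_datetime` and its `cutoff_hour` parameter are hypothetical names introduced here, not part of pandas or plotly.

```python
import datetime
import pandas as pd

def to_overnight_datetime(times, base_date="2022-01-01", cutoff_hour=12):
    """Attach a dummy date to time-of-day values so that an overnight
    window (e.g. 21:00 -> 08:00) stays in chronological order."""
    stamps = pd.to_datetime(base_date + " " + times.astype(str))
    # Assumption: times earlier than the cutoff belong to the next morning.
    next_day = stamps.dt.hour < cutoff_hour
    return stamps + pd.to_timedelta(next_day.astype(int), unit="D")

times = pd.Series([datetime.time(21, 0), datetime.time(23, 59),
                   datetime.time(0, 1), datetime.time(8, 0)])
overnight = to_overnight_datetime(times)
print(overnight)
```

With a column built this way, a `range_x` running from 21:00 on the dummy date to 08:00 on the next day should cover all of the points, and `tickformat="%H:%M"` hides the dummy date.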
Here is what worked for me eventually:
def time_num(t):
    # Time of day as decimal hours, e.g. 21:03 -> 21.05
    return t.hour + t.minute / 60

df1 = pd.DataFrame({'location': 1, 'data': ['3', '4', '5'],
                    'time_num': [time_num(datetime.datetime(2022,7,17,21,0,0).time()),
                                 time_num(datetime.datetime(2022,7,17,21,1,0).time()),
                                 time_num(datetime.datetime(2022,7,17,21,3,0).time())]})
df2 = pd.DataFrame({'location': 2, 'data': ['8', '6', '7'],
                    'time_num': [time_num(datetime.datetime(2022,7,16,21,0,0).time()),
                                 time_num(datetime.datetime(2022,7,16,21,2,0).time()),
                                 time_num(datetime.datetime(2022,7,16,21,3,0).time())]})
# Skeleton with one row per minute over the whole window
df_skeleton = pd.DataFrame()
df_skeleton['date'] = pd.date_range(datetime.datetime(2022,7,16,20,0,0), datetime.datetime(2022,7,17,8,0,0), freq='1min')
df_skeleton['time'] = df_skeleton['date'].dt.strftime('%H:%M:%S')
df_skeleton['hour'] = df_skeleton['date'].dt.hour
df_skeleton['min'] = df_skeleton['date'].dt.minute
df_skeleton['time_num'] = df_skeleton['hour'] + df_skeleton['min']/60
# Left-merge so that minutes without data become NaN rows
result_1 = pd.merge(df_skeleton, df1, how="left", on="time_num")
result_2 = pd.merge(df_skeleton, df2, how="left", on="time_num")
result_1['location'] = '1'
fig = px.line(result_1, x='time', y='data', color='location')
fig.add_scatter(x=result_2['time'], y=result_2['data'], mode='lines', name='2')
fig.update_traces(connectgaps=True)
fig.show()
I'm not overly pleased with it, but it works with both the dummy dataframes and the full dataframes.
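For a single dataframe whose time column is a real datetime column, the skeleton-and-merge step could perhaps be written more compactly with `reindex`. This is only a sketch under that assumption, using a toy frame; `full_range` and `filled` are illustrative names.

```python
import pandas as pd

# Toy data with a hole at 21:02 (assumes a real datetime column, not time()).
df = pd.DataFrame({
    "time": pd.to_datetime(["2022-07-16 21:00", "2022-07-16 21:01",
                            "2022-07-16 21:03"]),
    "data": [3, 4, 5],
})

# One entry per minute between the first and last observation.
full_range = pd.date_range(df["time"].min(), df["time"].max(), freq="1min")

# Reindexing onto the full range inserts NaN rows for the missing minutes.
filled = (df.set_index("time")
            .reindex(full_range)
            .rename_axis("time")
            .reset_index())
print(filled)
```

The NaN rows show up as gaps in a line trace unless `connectgaps=True` is set.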
This gives a dataframe with a different structure:
| | location_x | time_x | Data_x | t | location_y | time_y | Data_y |
|---|---|---|---|---|---|---|---|
| 0 | 1 | 2022-09-01 21:00:00 | 0 | 21:00:00 | 2 | 2022-09-04 21:00:00 | 1 |
| 1 | 1 | 2022-09-01 21:01:00 | 0.0302984 | 21:01:00 | 2 | 2022-09-04 21:01:00 | 0.999541 |
| 2 | 1 | 2022-09-01 21:02:00 | 0.060569 | 21:02:00 | 2 | 2022-09-04 21:02:00 | 0.998164 |
| 3 | 1 | 2022-09-01 21:03:00 | 0.0907839 | 21:03:00 | 2 | 2022-09-04 21:03:00 | 0.995871 |
| 4 | 1 | 2022-09-01 21:04:00 | 0.120916 | 21:04:00 | 2 | 2022-09-04 21:04:00 | nan |
It is then simple to generate a px.line() figure from this. The traces are Data_x and Data_y. The datetime column time_x is used for the x-axis, which works well because datetime and continuous axes are well integrated. The tickformat is updated so that the date part of the axis is not displayed.
import pandas as pd
import numpy as np
import plotly.express as px
dr = pd.date_range("2022-09-01 21:00", "2022-09-02 08:00", freq="1Min")
# data to match question, two dataframes from 21:00 to 08:00, different dates with some holes
# with different dates
dfs = [
    pd.DataFrame(
        {
            "location": np.full(len(dr), l),
            "time": dr + pd.DateOffset(days=o),
            "Data": f(np.linspace(0, 20, len(dr))),
        }
    )
    .sample(frac=0.95)
    .sort_index()
    for l, o, f in zip([1, 2], [0, 3], [np.sin, np.cos])
]
df1 = dfs[0]
df2 = dfs[1]
# let's integrate the dataframes
# 1. fill the holes in each dataframe by doing an outer join to all times
# 2. outer join the two dataframes on just the time
df = pd.merge(
    *[
        pd.merge(
            d,
            pd.DataFrame(
                {"time": pd.date_range(d["time"].min(), d["time"].max(), freq="1min")}
            ),
            on="time",
            how="outer",
        )
        .fillna({"location": l})
        .assign(t=lambda d: d["time"].dt.time)
        for d, l in zip([df1, df2], [1, 2])
    ],
    on="t",
    how="outer",
)
# finally generate plotly line chart using columns created by merging the data
# it's clearly observed there are gaps in both traces
px.line(
    df.sort_values("time_x"), x="time_x", y=["Data_x", "Data_y"], hover_data=["time_y"]
).update_layout({"xaxis": {"tickformat": "%H:%M"}})
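The `_x` and `_y` column names in the table come from pandas' default merge suffixes. A stripped-down sketch of the second join, using two tiny made-up frames `a` and `b` in place of the generated data, shows how joining on the time of day lines up rows from different dates:

```python
import pandas as pd

# Two toy frames on different dates, each missing a different minute.
a = pd.DataFrame({"time": pd.to_datetime(["2022-09-01 21:00", "2022-09-01 21:01"]),
                  "Data": [0.0, 0.03]})
b = pd.DataFrame({"time": pd.to_datetime(["2022-09-04 21:00", "2022-09-04 21:02"]),
                  "Data": [1.0, 0.99]})

# The join key is the time of day only, so the differing dates do not matter.
a["t"] = a["time"].dt.time
b["t"] = b["time"].dt.time

# Overlapping column names get the default suffixes _x (left) and _y (right).
merged = pd.merge(a, b, on="t", how="outer")
print(merged)
```

Rows that exist on only one side keep NaN (or NaT) in the other side's columns, which is what produces the gaps in the traces.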