Plotting two datasets with incomplete data in same graph

My dataframe has two columns: the share price (close) and the EPS. The share price is only available on workdays, while the EPS is only available quarterly, on a Saturday. I want to plot both series in the same figure, with two y-axes.

            close   eps
date
...         
2020-04-01  240.91  NaN
2020-03-31  254.29  NaN
2020-03-30  254.81  NaN
2020-03-28     NaN  2.59
2020-03-27  247.74  NaN
2020-03-26  258.44  NaN
...
2019-12-28     NaN  5.04
2019-12-27  289.80  NaN
...   

My approach so far is using plotly:

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["close"],
        name="Price",
    ),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(
        x=df.dropna(subset=["eps"]),
        y=df["eps"],
        name="EPS",
    ),
    secondary_y=True,
)

fig.update_yaxes(
    title_text="Price",
    secondary_y=False,
)
fig.update_yaxes(
    title_text="EPS",
    secondary_y=True,
)

fig.show()

This produces a graph, but the EPS values are not shown. I want eps to appear as a line of connected dots that bridges all the missing data points in the eps column.


I'm not sure whether you want a stepwise plot or just want to join the dots with a line. In the first case you can use df["eps"].fillna(method="ffill") (equivalently, df["eps"].ffill()), and in the second df["eps"].interpolate().
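To make the difference between the two concrete, here is a minimal sketch on a made-up series. The dates and values are illustrative assumptions, not data from the question. ffill() carries the last reported value forward, which gives a stepwise line, while interpolate() (linear by default) ramps evenly between reported values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy series: EPS is known on the first and last day only.
idx = pd.date_range("2020-01-01", periods=5, freq="D")
eps = pd.Series([2.0, np.nan, np.nan, np.nan, 4.0], index=idx)

stepwise = eps.ffill()      # carry the last known value forward
linear = eps.interpolate()  # evenly spaced values between known points

print(stepwise.tolist())  # [2.0, 2.0, 2.0, 2.0, 4.0]
print(linear.tolist())    # [2.0, 2.5, 3.0, 3.5, 4.0]
```

Both methods work in row order, so with the descending dates shown in the question it is probably worth calling df.sort_index() first.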

Generate data

import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

df = pd.DataFrame({"date":pd.date_range('2019-01-01', '2020-12-31')})

df["close"] = np.abs(np.random.randn(len(df))) * 300
df["eps"] = np.abs(np.random.randn(len(df))) * 10

df["close"] = np.where(df["date"].dt.weekday>=5,
                       np.nan,
                       df["close"])

df["eps"] = np.where((df["date"].dt.month%4==0) & 
                     (df["date"].dt.weekday==5),
                     df["eps"],
                     np.nan)

grp = df.set_index("date").groupby(pd.Grouper(freq="M"))["eps"].last().reset_index()

df = df.drop("eps", axis=1)
df = pd.merge(df, grp, how="left", on="date")

df = df.set_index("date")

Using fillna(method="ffill")

df["eps_fillna"] = df["eps"].fillna(method="ffill")

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["close"],
        name="Price",
    ),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["eps_fillna"],
        name="EPS",
    ),
    secondary_y=True,
)

fig.update_yaxes(
    title_text="Price",
    secondary_y=False,
)
fig.update_yaxes(
    title_text="EPS",
    secondary_y=True,
)

fig.show()


Using interpolate()

df["eps_interpolate"] = df["eps"].interpolate()

fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["close"],
        name="Price",
    ),
    secondary_y=False,
)
fig.add_trace(
    go.Scatter(
        x=df.index,
        y=df["eps_interpolate"],
        name="EPS",
    ),
    secondary_y=True,
)

fig.update_yaxes(
    title_text="Price",
    secondary_y=False,
)
fig.update_yaxes(
    title_text="EPS",
    secondary_y=True,
)

fig.show()
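
If you would rather not fill in values for days without a report, another option is to plot only the rows where eps exists. This is not part of the answer above, so treat it as a sketch. In the question's code the EPS trace passes a whole filtered DataFrame as x and the unfiltered column as y; taking x and y from the same filtered series keeps them aligned. The helper names eps_points and build_figure are hypothetical.

```python
import numpy as np
import pandas as pd


def eps_points(df):
    """Return only the rows where eps was reported, in date order."""
    return df["eps"].dropna().sort_index()


def build_figure(df):
    """Price on the left axis, reported EPS points on the right axis."""
    # Imported here so eps_points can be used without plotly installed.
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    price = df["close"].dropna().sort_index()
    eps = eps_points(df)

    fig = make_subplots(specs=[[{"secondary_y": True}]])
    fig.add_trace(
        go.Scatter(x=price.index, y=price, name="Price"),
        secondary_y=False,
    )
    fig.add_trace(
        # lines+markers joins the quarterly points and marks each report.
        go.Scatter(x=eps.index, y=eps, name="EPS", mode="lines+markers"),
        secondary_y=True,
    )
    fig.update_yaxes(title_text="Price", secondary_y=False)
    fig.update_yaxes(title_text="EPS", secondary_y=True)
    return fig


# Toy frame shaped like the question's sample (values copied from it).
sample = pd.DataFrame(
    {
        "close": [247.74, np.nan, 254.81, 254.29],
        "eps": [np.nan, 2.59, np.nan, np.nan],
    },
    index=pd.to_datetime(
        ["2020-03-27", "2020-03-28", "2020-03-30", "2020-03-31"]
    ),
)
print(eps_points(sample))
```

Calling build_figure(df).show() on the full dataframe should then draw both traces. Passing line_shape="hv" to the EPS trace would give a stepwise line instead of straight segments, and connectgaps=True on go.Scatter is another way to bridge NaN values without dropping rows.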

