Seaborn lineplot unexpected behaviour

I am hoping to understand why the following Seaborn lineplot behaviour occurs.

Spikes are occurring through the time-series and additional data has been added to the left of the actual data.

How can I prevent this unexpected behaviour in Seaborn?

Regular plot of data:

import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns

aussie_property[['Sydney(SYDD)']].plot();

[Image: pandas plot of Sydney(SYDD)]

Seaborn plot of data:

sns.lineplot(data=aussie_property, x='date', y='Sydney(SYDD)');

[Image: seaborn lineplot of Sydney(SYDD) showing the spikes]

This is not a seaborn problem but a question of ambiguous datetimes.

Convert date to a datetime object with the following code:

aussie_property['date'] = pd.to_datetime(aussie_property['Date'], dayfirst=True)

and seaborn gives you the plot you expected.

[Image: seaborn lineplot after the date conversion]
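A quick way to confirm that the conversion worked is to check that the parsed dates are in increasing order. This is only a sketch: the small `sample` frame below is made-up data standing in for the real `Date` column, and it assumes the rows in the source file are already chronological.

```python
import pandas as pd

# Hypothetical sample standing in for the real 'Date' column (day/month/year strings).
sample = pd.DataFrame(
    {"Date": ["30/11/2020", "01/12/2020", "10/12/2020", "13/12/2020"]}
)

# Same conversion as above, applied to the sample data.
parsed = pd.to_datetime(sample["Date"], dayfirst=True)

# If the rows are chronological, correctly parsed dates must be increasing.
print(parsed.is_monotonic_increasing)
```

If this check fails on your own data, some dates were probably read with day and month swapped.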

Generally, it is advisable to provide the format explicitly when converting to datetimes, e.g.,

aussie_property['date'] = pd.to_datetime(aussie_property['Date'], format="%d/%m/%Y")

because, as we have seen here, dates like 10/12/2020 are ambiguous. Without an explicit format, the parser read such values as month/day/year wherever that produced a valid date and fell back to day/month/year only where it did not (for example, when the first number is greater than 12). The result is a mix of correctly and incorrectly parsed dates, which gives rise to the time-travelling spikes in your seaborn graph. Why didn't they show up in the pandas plot? That plot is drawn against the index rather than the date column, so the conversion problem never becomes visible there.
More information on the format codes can be found in the Python documentation.
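To see why a per-value guess scrambles the order, here is a small standard-library sketch. The helper `guess_parse` is hypothetical: it only imitates the "month first, fall back to day first" behaviour described above and is not how pandas is implemented. The input strings are made-up sample data.

```python
from datetime import datetime


def guess_parse(text):
    """Hypothetical helper: try month/day/year, fall back to day/month/year."""
    try:
        return datetime.strptime(text, "%m/%d/%Y")
    except ValueError:
        # Month > 12 is impossible, so the value must be day/month/year.
        return datetime.strptime(text, "%d/%m/%Y")


# Made-up day/month/year strings in chronological order.
raw = ["30/11/2020", "01/12/2020", "02/12/2020", "13/12/2020"]

guessed = [guess_parse(s) for s in raw]
explicit = [datetime.strptime(s, "%d/%m/%Y") for s in raw]

print([d.strftime("%Y-%m-%d") for d in guessed])
print([d.strftime("%Y-%m-%d") for d in explicit])
```

With guessing, 01/12/2020 and 02/12/2020 are read as 12 January and 12 February, so they land far to the left of their neighbours. A line drawn through the rows then jumps back and forth in time, which matches the spikes and the extra data on the left of the seaborn plot. With the explicit format the dates stay in order.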
