I want to plot some data over time. My dataframe has one column, Date, with values like 2015-11-25 10:00:00 (datetime64). The other column, Data, holds values like 1.53 (just a series of float64 numbers).
Where it gets tricky is that the samples were taken in separate series, e.g.:
2015-11-20 00:00:00 till 2015-11-21 00:00:00
2015-11-22 00:00:00 till 2015-11-23 00:00:00
2015-11-24 00:00:00 till 2015-11-25 00:00:00
In the dataframe the rows sit directly one below the other, so there are no empty rows between the series. When I execute my code:
ax = df.plot(x='Date', y='Data')
fig = ax.get_figure()
I get a graph that also fills in the dates on which I never measured. All I want is a graph that shows the data on the ACTUAL dates I measured. I don't understand why Python extrapolates these data points. How can I turn off this feature?
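Pandas' plot() function creates a line plot by default, and a line plot joins consecutive points with straight segments; that is what fills in the dates you never measured. If you only want to plot the data points you have, create a scatter plot instead.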
ax = df.plot(kind='scatter', x='Date', y='Data')
See: http://pandas.pydata.org/pandas-docs/stable/visualization.html#visualization-scatter
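If you would rather keep a line within each measurement series and only drop the segments that bridge unmeasured dates, one option is to split the data wherever the time step jumps and draw each series separately. This is only a sketch: the sample data is made up, the two-hour gap threshold is an assumption you would tune to your sampling interval, and the Agg backend is selected just so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up sample data: three one-day series sampled hourly, one day apart
starts = pd.to_datetime(["2015-11-20", "2015-11-22", "2015-11-24"])
dates = [s + pd.Timedelta(hours=h) for s in starts for h in range(25)]
df = pd.DataFrame({"Date": dates, "Data": np.linspace(1.0, 2.0, len(dates))})

# A new series starts wherever the step to the previous sample exceeds
# the (assumed) threshold; cumsum turns those flags into a series id
df = df.sort_values("Date").reset_index(drop=True)
is_new_series = df["Date"].diff() > pd.Timedelta(hours=2)
series_id = is_new_series.cumsum()

fig, ax = plt.subplots()
for _, part in df.groupby(series_id):
    # One line per series, so nothing is drawn across the gaps
    ax.plot(part["Date"], part["Data"], color="C0")
fig.autofmt_xdate()
```

Each series gets its own line, so the days without measurements stay empty while the measured stretches remain connected.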
Edit
As pandas' scatter plot function requires numeric columns for both the x and y axes, you'll run into issues with my original answer, because Date is a datetime column. The best way to do this is to plot using matplotlib directly. For what you're trying to do, the sample below should work:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot_date(df['Date'], df['Data'])
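For a version you can run end to end, here is a minimal sketch with made-up sample data in the same Date/Data layout. It uses plain ax.plot with markers and no connecting line, which should give the same picture as plot_date; as far as I know, recent matplotlib releases steer users toward plot for dates anyway. The Agg backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up sample data: three one-day series sampled hourly, one day apart
starts = pd.to_datetime(["2015-11-20", "2015-11-22", "2015-11-24"])
dates = [s + pd.Timedelta(hours=h) for s in starts for h in range(25)]
df = pd.DataFrame({"Date": dates, "Data": np.linspace(1.0, 2.0, len(dates))})

fig, ax = plt.subplots()
# Markers only, no connecting line, so unmeasured dates stay empty
ax.plot(df["Date"], df["Data"], linestyle="none", marker="o")
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
```

Call plt.show() or fig.savefig(...) afterwards, depending on whether you want a window or a file.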