
What is the best way to plot a time series (time on the X axis, a numerical value on the Y axis) for a categorical variable in Python?

My DataFrame is in the following format:

Amount  Category    Transactiondatetime
9445    A16             22-04-2015 19:42
2000    A23             23-04-2015 16:29
1398    A16             02-05-2015 15:17
1995    A7              27-06-2015 13:51
2000    A23             07-08-2015 17:31

Variable Description

Assume the Category variable holds product categories sold on a website. There are around 15-20 categories. Some products were sold 20 times in a year, some 50 times, and so on, each sale for a different amount.

The time series is spread across the year and the data has 6,000,000 rows.

Aim of my task

I am interested in seeing which category brings in the largest amount during which part of the year. This can get messy because the data is huge and the categories will overlap on the time axis.

So what would be the best way to visualize this kind of data? It can be matplotlib, seaborn, bokeh or any other library.

I would appreciate an example with code.


Maybe just use a bar graph with amount on the y-axis and time on the x-axis?

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('something.csv')
# The dates are day-first (e.g. 22-04-2015 19:42), so state the format explicitly
df['Transactiondatetime'] = pd.to_datetime(df['Transactiondatetime'], format='%d-%m-%Y %H:%M')

fig, ax = plt.subplots()
bar_width = 2.0  # on a date axis the bar width is measured in days
# Draw one set of bars per category so each gets its own colour and legend entry
for category, cat_df in df.groupby('Category'):
    ax.bar(cat_df['Transactiondatetime'], cat_df['Amount'], bar_width, label=category)

ax.legend()
plt.xlabel('Transaction Date')
plt.ylabel('Amount')
plt.gcf().autofmt_xdate()

plt.show()
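
With 6,000,000 rows, drawing one bar per transaction will probably be slow and hard to read. One option, sketched below, is to sum the amounts per month and category first and plot the resulting small table as stacked bars. The sample rows are copied from the question; the helper name monthly_totals and the choice of monthly buckets are my own assumptions, so adjust them to your data.

```python
import matplotlib.pyplot as plt
import pandas as pd


def monthly_totals(df):
    """Return a month x category table of summed amounts (hypothetical helper)."""
    df = df.copy()
    # Dates in the question are day-first, e.g. 22-04-2015 19:42
    df["Transactiondatetime"] = pd.to_datetime(
        df["Transactiondatetime"], format="%d-%m-%Y %H:%M"
    )
    month = df["Transactiondatetime"].dt.to_period("M")
    # Sum per (month, category), then spread categories into columns
    return df.groupby([month, "Category"])["Amount"].sum().unstack(fill_value=0)


# Sample rows taken from the question
sample = pd.DataFrame({
    "Amount": [9445, 2000, 1398, 1995, 2000],
    "Category": ["A16", "A23", "A16", "A7", "A23"],
    "Transactiondatetime": [
        "22-04-2015 19:42", "23-04-2015 16:29", "02-05-2015 15:17",
        "27-06-2015 13:51", "07-08-2015 17:31",
    ],
})

totals = monthly_totals(sample)

# One stacked bar per month, one colour per category
ax = totals.plot(kind="bar", stacked=True)
ax.set_xlabel("Month")
ax.set_ylabel("Total amount")
ax.legend(title="Category")
plt.tight_layout()
```

Call plt.show() afterwards to display the chart. Because the aggregated table has at most 12 rows and 15-20 columns, the plot stays readable however many transactions went into it.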

