My DataFrame is in the following format:
Amount  Category  Transactiondatetime
9445    A16       22-04-2015 19:42
2000    A23       23-04-2015 16:29
1398    A16       02-05-2015 15:17
1995    A7        27-06-2015 13:51
2000    A23       07-08-2015 17:31
Variable Description
Assume the Category variable holds product categories sold on a website. There are around 15-20 categories. Some products were sold 20 times in a year, some 50 times, and so on, each for different amounts.
The time series is spread across the year and the data has 6000000 rows.
Aim of my task
I want to see which category brings in the largest amount during each part of the year. This can get messy because the data is huge and the categories will overlap on a time-series scale.
So what would be the best way to visualize this kind of data? It can be matplotlib, seaborn, bokeh, or any other library.
I would appreciate an example with code.
Maybe just use a bar graph with amount on the y-axis and time on the x-axis?
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('something.csv')
# Dates are day-first (e.g. 22-04-2015 19:42), so state the format explicitly
df['Transactiondatetime'] = pd.to_datetime(df['Transactiondatetime'], format='%d-%m-%Y %H:%M')
categories = sorted(df['Category'].unique())

fig, ax = plt.subplots()
bar_width = 2.0  # in x-axis units, which are days on a date axis
# Draw one set of bars per category so each gets its own colour and legend entry
for category in categories:
    cat_df = df[df['Category'] == category]
    times = cat_df['Transactiondatetime'].tolist()
    values = cat_df['Amount'].tolist()
    ax.bar(times, values, bar_width, label=category)

ax.legend()
plt.xlabel('Transaction Date')
plt.ylabel('Amount')
plt.gcf().autofmt_xdate()
plt.show()
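With 6,000,000 rows, one bar per transaction will probably be slow to draw and hard to read, so it may help to aggregate before plotting. Below is a minimal sketch, assuming the column names from the question and assuming that monthly totals are a fine enough granularity for "which part of the year". The helper names monthly_totals and plot_monthly_totals are hypothetical, and the sample rows are just the five from the question.

```python
import pandas as pd


def monthly_totals(df):
    """Sum Amount per calendar month and Category (rows: months, columns: categories)."""
    months = df['Transactiondatetime'].dt.to_period('M')
    return (
        df.groupby([months, 'Category'])['Amount']
        .sum()
        .unstack(fill_value=0)  # months with no sales in a category become 0
    )


def plot_monthly_totals(totals):
    """Draw one stacked bar per month, with one colour per category."""
    # Imported here so the aggregation step can be used without matplotlib
    import matplotlib.pyplot as plt

    ax = totals.plot(kind='bar', stacked=True, figsize=(10, 5))
    ax.set_xlabel('Month')
    ax.set_ylabel('Total amount')
    ax.legend(title='Category')
    plt.tight_layout()
    plt.show()


# The five sample rows from the question, only to show the shape of the result
sample = pd.DataFrame({
    'Amount': [9445, 2000, 1398, 1995, 2000],
    'Category': ['A16', 'A23', 'A16', 'A7', 'A23'],
    'Transactiondatetime': pd.to_datetime(
        ['22-04-2015 19:42', '23-04-2015 16:29', '02-05-2015 15:17',
         '27-06-2015 13:51', '07-08-2015 17:31'],
        format='%d-%m-%Y %H:%M',
    ),
})
totals = monthly_totals(sample)
# plot_monthly_totals(totals)  # uncomment to draw the chart
```

The aggregated table has at most 12 rows and 15-20 columns, so it plots quickly whatever the size of the raw data. The same table could also be drawn as a heatmap (for example with seaborn) if the stacked bars get too crowded.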