In Python pandas I have created a DataFrame with one value for each year and two subclasses, i.e. one metric per (year, Tag_1, Tag_2) triplet:
import pandas, requests, numpy
import matplotlib.pyplot as plt
df
Metric Tag_1 Tag_2 year
0 5770832 FOOBAR1 name1 2008
1 7526436 FOOBAR1 xyz 2008
2 33972652 FOOBAR1 name1 2009
3 17491416 FOOBAR1 xyz 2009
...
16 6602920 baznar2 name1 2008
17 6608 baznar2 xyz 2008
...
30 142102944 baznar2 name1 2015
31 0 baznar2 xyz 2015
I would like to produce a bar plot with the metric as y-values over x = (year, Tag_1, Tag_2), sorted primarily by year and secondarily by Tag_1, with the bars colored according to Tag_1. Something like:
(2008,FOOBAR1,name1) --> 5770832 *RED*
(2008,baznar2,name1) --> 6602920 *BLUE*
(2008,FOOBAR1,xyz) --> 7526436 *RED*
(2008,baznar2,xyz) --> ... *BLUE*
(2009,FOOBAR1,name1) --> ... *RED*
I tried starting with a grouping of columns like
df.plot.bar(x=['year', 'Tag_1', 'Tag_2'])
but have not found a way to separate selections into two bar sets next to each other.
This should get you on your way:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('path_to_file.csv')
# Group by the desired columns
new_df = df.groupby(['year', 'Tag_1', 'Tag_2']).sum()
# Sort by Metric, ascending, so the largest bar ends up at the top of the
# horizontal plot (DataFrame.sort was removed from pandas; use sort_values)
new_df.sort_values('Metric', inplace=True)
# Helper function generating an alternating sequence of 'r' and 'b' colors
def get_color(i):
    if i % 2 == 0:
        return 'r'
    else:
        return 'b'
colors = [get_color(j) for j in range(new_df.shape[0])]
# Make the plot
fig, ax = plt.subplots()
ind = np.arange(new_df.shape[0])
width = 0.65
a = ax.barh(ind, new_df.Metric, width, color=colors)  # plot a vals
ax.set_yticks(ind)  # bars are centered on ind by default in current matplotlib
ax.set_yticklabels([str(key) for key in new_df.index])  # (year, Tag_1, Tag_2) tuples as labels
fig.tight_layout()
plt.show()
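Note that the snippet above alternates colors by bar position after sorting by Metric, so the colors do not necessarily follow Tag_1. If the goal is the layout sketched in the question, one possible variation is to sort by the key columns and look up each bar's color from its Tag_1 value. The following is only a sketch under a few assumptions: the sample frame contains just the rows visible in the question (the elided rows are left out), the sort order year, Tag_2, Tag_1 is inferred from the example listing, and the names `TAG_COLORS`, `prepare` and `plot_bars` are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a window
import matplotlib.pyplot as plt
import pandas as pd

# Only the rows shown in the question; the rows hidden behind "..." are omitted.
sample = pd.DataFrame({
    "Metric": [5770832, 7526436, 33972652, 17491416, 6602920, 6608],
    "Tag_1": ["FOOBAR1", "FOOBAR1", "FOOBAR1", "FOOBAR1", "baznar2", "baznar2"],
    "Tag_2": ["name1", "xyz", "name1", "xyz", "name1", "xyz"],
    "year": [2008, 2008, 2009, 2009, 2008, 2008],
})

# Hypothetical color assignment: one fixed color per Tag_1 value.
TAG_COLORS = {"FOOBAR1": "red", "baznar2": "blue"}


def prepare(df):
    """Sort by year, Tag_2, Tag_1 and attach a color and a tick label per row."""
    out = df.sort_values(["year", "Tag_2", "Tag_1"]).reset_index(drop=True)
    out["color"] = out["Tag_1"].map(TAG_COLORS)
    out["label"] = [
        f"({year}, {tag_1}, {tag_2})"
        for year, tag_1, tag_2 in zip(out["year"], out["Tag_1"], out["Tag_2"])
    ]
    return out


def plot_bars(df):
    """Draw one vertical bar per row, colored by Tag_1, and return the Axes."""
    data = prepare(df)
    positions = list(range(len(data)))
    fig, ax = plt.subplots()
    ax.bar(positions, data["Metric"].tolist(), color=data["color"].tolist())
    ax.set_xticks(positions)
    ax.set_xticklabels(data["label"].tolist(), rotation=90)
    ax.set_ylabel("Metric")
    fig.tight_layout()
    return ax


ax = plot_bars(sample)
# Call plt.show() here when using an interactive backend.
```

With real data you would pass your own DataFrame to `plot_bars` and extend `TAG_COLORS` to cover every Tag_1 value; a value missing from the mapping ends up as NaN in the color column, which matplotlib will most likely reject.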