
How to plot and show the distribution of a dataset in Python?

I have a dataset; a small part of it looks like this:

import pandas as pd

data = [['2018-01-01', 1.323, 'AI', 2000, 'Communications', 'Mothers'],
        ['2018-01-02', 1.525, 'AI', 1500, 'Communications', 'Mothers'],
        ['2018-01-03', 1.045, 'AI', 500, 'Communications', 'Mothers'],
        ['2018-01-04', 1.845, 'AI', 600, 'Communications', 'Mothers'],
        ['2018-01-05', 1.045, 'AI', 500, 'Communications', 'Mothers'],
        ['2018-01-02', 1.446, 'BOC', 550, 'Pharmaceuticals', 'JASDAQ Standard'],
        ['2018-01-03', 2.110, 'BOC', 3201, 'Pharmaceuticals', 'JASDAQ Standard'],
        ['2018-01-04', 2.150, 'BOC', 5200, 'Pharmaceuticals', 'JASDAQ Standard'],
        ['2018-01-05', 2.810, 'BOC', 1980, 'Pharmaceuticals', 'JASDAQ Standard'],
        ['2018-01-03', 5.199, 'CAT', 2000, 'Real Estate', 'Mothers'],
        ['2018-01-06', 4.980, 'CAT', 450, 'Real Estate', 'Mothers'],
        ['2018-01-07', 4.990, 'CAT', 3000, 'Real Estate', 'Mothers']]
df = pd.DataFrame(data, columns=['date', 'price', 'ticker', 'volume', 'Sector', 'Market Division'])

I want to show which market division has more stocks, and which sectors they come from. I tried the treemap below, but it did not work. How can I do this?

import plotly.express as px
import numpy as np

a=df.groupby(['Market Division','Sector']).count()

a["Exchange"] = "Exchange" # in order to have a single root node
fig = px.treemap(a, path=['Exchange', 'Market Division', 'Sector','ticker'], values='ticker')
fig.show()

You could try a stacked bar plot. Here is a dummy example:

import matplotlib.pyplot as plt
import numpy as np

# One bar per market division; sorted() gives ['JASDAQ Standard', 'Mothers']
labels = sorted(df['Market Division'].unique())
sectors = ['Pharmaceuticals', 'Communications', 'Real Estate']
jasdaq = [3434, 5454, 45454]     # dummy values, one per sector
mothers = [35345, 64534, 43543]  # dummy values, one per sector

fig, ax = plt.subplots()
bottom = np.zeros(len(labels))
for i, sector in enumerate(sectors):
    heights = np.array([jasdaq[i], mothers[i]])
    # bottom= stacks this sector on top of the ones already drawn
    ax.bar(labels, heights, bottom=bottom, label=sector)
    bottom += heights

ax.legend()
plt.show()


To get the real plot, first compute the value for each sector within each market division, then use those values in place of the dummy numbers in jasdaq and mothers.
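As a rough sketch of that counting step, the snippet below groups by market division and sector and lets pandas draw the stacked bars directly. It assumes that "more stock" means the number of distinct tickers (summing volume would be another reading), and it uses a shortened copy of the question's sample rows so that it runs on its own.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Shortened copy of the question's sample data (only the columns needed here)
df = pd.DataFrame(
    [
        ['2018-01-01', 'AI', 'Communications', 'Mothers'],
        ['2018-01-02', 'AI', 'Communications', 'Mothers'],
        ['2018-01-02', 'BOC', 'Pharmaceuticals', 'JASDAQ Standard'],
        ['2018-01-03', 'BOC', 'Pharmaceuticals', 'JASDAQ Standard'],
        ['2018-01-03', 'CAT', 'Real Estate', 'Mothers'],
        ['2018-01-06', 'CAT', 'Real Estate', 'Mothers'],
    ],
    columns=['date', 'ticker', 'Sector', 'Market Division'],
)

# Assumption: "more stock" = number of distinct tickers.
# Rows are market divisions, columns are sectors, missing pairs become 0.
counts = (
    df.groupby(['Market Division', 'Sector'])['ticker']
      .nunique()
      .unstack(fill_value=0)
)
print(counts)

# One bar per market division, stacked by sector
ax = counts.plot(kind='bar', stacked=True, rot=0)
ax.set_ylabel('Number of distinct tickers')
ax.legend(title='Sector')
plt.tight_layout()
plt.show()
```

To plot by traded volume instead, replacing `['ticker'].nunique()` with `['volume'].sum()` on the full dataframe should give the same layout. If you still prefer a treemap, the long form of these counts (before `unstack`, with the group keys turned back into columns via `reset_index`) is probably the shape `px.treemap` expects for its `path` and `values` arguments.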
