
Create a multicategory chart with Python

I have data like the following in Excel:

company | month-year | # people got interviewed | # people employed


Link to the data: https://docs.google.com/spreadsheets/d/1DwZt9fpnzR9yUNBMjmqA1hg11d-2dXNs/edit?usp=share_link&ouid=113997824301423906122&rtpof=true&sd=true

When I try to create a multicategory chart (company as the first category and year-month as the second category) with the plotly library in Python, it mixes up the order of the second category for companies y and z. The code and a screenshot of the chart are below.

Code:

import pandas as pd
from helper_functions import get_df
import plotly.graph_objects as go
from datetime import datetime

def multicat_chart(infile=None, sheet_name=None, chart_type = None, chart_title = None):
    
    #chart type must be given
    df=pd.read_excel(infile,sheet_name)
    df = df.fillna(method='ffill')
    cat = df.columns[0]
    sub_cat = df.columns[1]
    cols = df.columns[2:]
    fig = go.Figure()
    cats = []
    sub_cats = []
    
    for c in df[cat].unique():
        new_df = df.loc[df[cat] == c]
        scats = new_df[sub_cat]
        scats = scats.apply(lambda date: datetime.strptime(date, "%b-%Y"))
        scats = list(scats)
        scats.sort()

        scats = [datetime.strftime(element, '%b-%y') for element in scats]
        scats = [str(element) for element in scats]
        for sc in scats:
            cats.append(str(c))
            sub_cats.append(str(sc))
        print(c)
        for i in scats:
            print(i)

    fig.add_trace( go.Bar(x = [cats,sub_cats],y = df[cols[0]], name="# people got interviewed" ))
    fig.add_trace( go.Bar(x = [cats,sub_cats],y = df[cols[1]], name="# people employed" ))
    fig.update_layout(width = 1000, height = 1000)
    return fig
    

fig = multicat_chart(infile = 'data_for_test.xlsx', sheet_name = 'data', chart_type = 'bar')
fig.show()

Chart: [image: screenshot of the resulting chart]

I passed the data to Bar() in sorted order, but it still gets mixed up somehow. I would like the months in ascending order. What I did: I converted the strings to datetime objects, sorted each company's subcategory values with the list sort() method, and converted them back to strings. If you run the script, you can see that it prints the values in the right order, which means they are passed to the function in order, yet the chart mixes them. Can anyone help me understand why it behaves this way?

Once you converted the dates to month-year format, they were object type: character strings, not date-type values. When you sorted, you sorted by calendar month.
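
To illustrate the difference between sorting month-year strings and sorting real dates, here is a minimal sketch. The sample labels are made up for illustration and are not taken from the spreadsheet, and it assumes an English locale so that %b matches abbreviations like "Jan".

```python
from datetime import datetime

# Hypothetical labels for illustration; not taken from the spreadsheet.
labels = ["Jan-2021", "Dec-2020", "Feb-2021", "Nov-2020"]

# Sorting the raw strings compares characters, so the result is alphabetical.
as_strings = sorted(labels)

# Parsing each label to a datetime for the sort key gives chronological order.
as_dates = sorted(labels, key=lambda s: datetime.strptime(s, "%b-%Y"))

print(as_strings)  # ['Dec-2020', 'Feb-2021', 'Jan-2021', 'Nov-2020']
print(as_dates)    # ['Nov-2020', 'Dec-2020', 'Jan-2021', 'Feb-2021']
```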

First, use strptime to make it a date, sort it, then use strftime.

import pandas as pd
import plotly.graph_objects as go
from datetime import datetime as dt

def multicat_chart(infile=None, sheet_name=None, chart_type = None, chart_title = None):
    
    #chart type must be given
    df = pd.read_excel(infile, sheet_name)
    df = df.ffill()                      # forward-fill company names (replaces deprecated fillna(method='ffill'))

    df['month'] = [dt.strptime(x, '%b-%Y') for x in df['month']] # date for ordering
    df.sort_values(by = ['month', 'company'], inplace = True)    # appearance order
    df['month2'] = [dt.strftime(x, '%b-%Y') for x in df['month']] # visual appearance
    fig = go.Figure()  # plot it
    fig.add_trace( go.Bar(x = [df.iloc[:, 0], df.iloc[:, 4]], 
                          y = df.iloc[:, 2], name="# people got interviewed" ))
    fig.add_trace( go.Bar(x = [df.iloc[:, 0], df.iloc[:, 4]], 
                          y = df.iloc[:, 3], name="# people employed" ))
    fig.update_layout(width = 1000, height = 1000)
    return fig
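
The ordering step can also be tried without the Excel file. The following is a rough sketch on an in-memory DataFrame: the column names (company, month, interviewed, employed) and all values are assumptions made up for illustration, and pd.to_datetime with .dt.strftime is used here as a vectorized stand-in for the strptime/strftime list comprehensions above.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "company": ["x", "x", "y", "y"],
    "month": ["Feb-2021", "Dec-2020", "Jan-2021", "Dec-2020"],
    "interviewed": [5, 3, 4, 2],
    "employed": [2, 1, 1, 1],
})

# Parse the month-year strings into real dates so sorting is chronological.
df["month"] = pd.to_datetime(df["month"], format="%b-%Y")

# Sort by date first, then by company, as in the function above.
df = df.sort_values(by=["month", "company"])

# Convert back to strings only for display on the axis.
df["month2"] = df["month"].dt.strftime("%b-%Y")

# (company, month) pairs in the order they would be passed to go.Bar.
labels = list(zip(df["company"], df["month2"]))
print(labels)
```

The resulting company and month2 columns can then be passed as x = [df["company"], df["month2"]] in the same way the function above passes them by position.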


You had a lot of extra work going on in your function; I cut much of it out because you didn't need it.

If you have any questions, let me know!
