How to select data from Pandas pivot table and fill missing values with 0?

I need some guidance with the pandas pivot table below.

My code:

def update_graph(Manager):
    if Manager == "All Managers":
        df_plot = df.copy()
    else:
        df_plot = df[df['Manager'] == Manager]

    pv = pd.pivot_table(
        df_plot,
        index=['Name'],
        columns=["Status"],
        values=['Quantity'],
        aggfunc=sum,
        fill_value=0)

myData.csv:

Account,Name,Manager,Quantity,Status
123,APAC,John,10,closed
1234,EMEA,Mike,4,open
12345,LATAM,Boris,2,escalated
123456,NAM,Jack,1,pending
123456,NAM,Mike,2,escalated
12345,LATAM,Sam,2,open

Data returned for 'All Managers':

       Quantity                       
Status   closed escalated open pending
Name                                  
APAC         10         0    0       0
EMEA          0         0    4       0
LATAM         0         2    2       0
NAM           0         2    0       1

Data returned if 'Mike' is selected as 'Manager':

        Quantity     
Status escalated open
Name                 
EMEA           0    4
NAM            2    0

The data won't display on my graph unless the 'pending' and 'closed' columns are also present in Mike's case. Could someone help me modify the pd.pivot_table call so that it records Quantity as 0 for every missing Status?

Expected:

       Quantity                       
Status   closed escalated open pending
Name                                  
APAC          0         0    0       0
EMEA          0         0    4       0
LATAM         0         0    0       0
NAM           0         2    0       0

Function with reindex:

def update_graph(Manager):
    if Manager == "All Managers":
        df_plot = df.copy()
    else:
        df_plot = df[df['Manager'] == Manager]

    pv = pd.pivot_table(
        df_plot,
        index='Name',
        columns='Status',
        values='Quantity',
        aggfunc=sum,
        fill_value=0)
    pv = pv.reindex(index=df['Name'].unique(), 
                     columns=df['Quantity'].unique(), 
                     fill_value=0)

Results after reindex:

    Status  10  4   2   1 
Name                  
APAC     0   0   0   0
EMEA     0   0   0   0
LATAM    0   0   0   0
NAM      0   0   0   0
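
The all-zero result above is what reindex produces when none of the requested labels exist. The columns were reindexed with the unique Quantity values (10, 4, 2, 1) rather than with the Status labels, so every requested column is new and is filled with fill_value. A minimal sketch of that behaviour, assuming a small hand-built frame (the names pv, wrong and right are illustrative and not part of the original code):

```python
import pandas as pd

# Small pivot-like frame with Status labels as columns (illustrative data).
pv = pd.DataFrame({"escalated": [0, 2], "open": [4, 0]}, index=["EMEA", "NAM"])

# Labels that are not present in pv.columns create brand-new columns,
# all filled with fill_value; the existing columns are dropped.
wrong = pv.reindex(columns=[10, 4, 2, 1], fill_value=0)

# Reindexing with the Status labels keeps the existing data
# and adds zero-filled columns only for the missing statuses.
right = pv.reindex(columns=["closed", "escalated", "open", "pending"], fill_value=0)

print(wrong)
print(right)
```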

First remove the [] from the one-element lists passed to DataFrame.pivot_table, to avoid a MultiIndex in the columns, and then use DataFrame.reindex with the unique values of the Name and Status columns of the original DataFrame:

def update_graph(Manager):
    if Manager == "All Managers":
        df_plot = df.copy()
    else:
        df_plot = df[df['Manager'] == Manager]

    pv = pd.pivot_table(
        df_plot,
        index='Name',
        columns="Status",
        values='Quantity',
        aggfunc=sum,
        fill_value=0)

    return pv.reindex(index=df['Name'].unique(), 
                       columns=df['Status'].unique(), 
                       fill_value=0)

print (update_graph('All Managers'))
Status  closed  open  escalated  pending
Name                                    
APAC        10     0          0        0
EMEA         0     4          0        0
LATAM        0     2          2        0
NAM          0     0          2        1

print (update_graph('John'))
Status  closed  open  escalated  pending
Name                                    
APAC        10     0          0        0
EMEA         0     0          0        0
LATAM        0     0          0        0
NAM          0     0          0        0
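
For anyone who wants to try this without the CSV file, here is a self-contained sketch of the same approach applied to Mike's case. It assumes the sample rows from myData.csv are loaded from an inline string; the CSV constant and the mike variable are illustrative names. The string 'sum' is used for aggfunc instead of the builtin sum, which newer pandas versions may warn about.

```python
import io

import pandas as pd

# Sample rows from myData.csv, inlined so the example runs on its own.
CSV = """Account,Name,Manager,Quantity,Status
123,APAC,John,10,closed
1234,EMEA,Mike,4,open
12345,LATAM,Boris,2,escalated
123456,NAM,Jack,1,pending
123456,NAM,Mike,2,escalated
12345,LATAM,Sam,2,open
"""
df = pd.read_csv(io.StringIO(CSV))


def update_graph(manager):
    # Keep every row for "All Managers", otherwise only the selected manager's rows.
    df_plot = df if manager == "All Managers" else df[df["Manager"] == manager]

    pv = pd.pivot_table(
        df_plot,
        index="Name",
        columns="Status",
        values="Quantity",
        aggfunc="sum",
        fill_value=0,
    )

    # Reindex against the unfiltered frame so every Name and every Status
    # appears, with 0 wherever the filtered data has no entry.
    return pv.reindex(
        index=df["Name"].unique(),
        columns=df["Status"].unique(),
        fill_value=0,
    )


mike = update_graph("Mike")
print(mike)
```

With this data, the frame for Mike has all four Name rows and all four Status columns; only EMEA/open (4) and NAM/escalated (2) are non-zero.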
