
Pandas pivot table shape without aggregation

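I would like to understand whether it is possible to shape a DataFrame into a multi-index, multi-header / multi-column (pivot) DataFrame without aggregating, since the aggregated values are already present in the columns of my DataFrame.

I have the following DataFrame: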

card_type           payment_status  airbnb                                     paid revenue - sum   revenue - min   debit - sum
American Express    Checked Out     Premium Queen Ensuite                      No   591.49          0.0             2
American Express    Checked Out     Queen Room w. Shared Facilities            No   255.52          0.0             2
American Express    Checked Out     Single Room w. Shared Facilities           No   1602.02         0.0             5
American Express    Confirmed       Compact Double Room w. Shared Facilities   No   189.05          0.0             1
American Express    Confirmed       Premium Queen Ensuite                      No   350.0           0.0             1
American Express    Confirmed       Queen Room w. Shared Facilities            Yes  110.53          0.0             1
American Express    Confirmed       Single Room w. Shared Facilities           No   4258.48         0.0             3
Mastercard          Cancelled       Queen Room w. Shared Facilities            No   28.5            0.0             3
Mastercard          Cancelled       Single Room w. Shared Facilities           Yes  578.55          0.0             2
Mastercard          Checked Out     Compact Double Room w. Shared Facilities   No   4637.71         0.0             22

...

import pandas as pd

df = pd.DataFrame.from_dict({
    'card_type': {0: 'American Express', 1: 'American Express', 2: 'American Express', 3: 'American Express', 4: 'American Express', 5: 'American Express', 6: 'American Express', 7: 'Mastercard', 8: 'Mastercard', 9: 'Mastercard'},
    'payment_status': {0: 'Checked Out', 1: 'Checked Out', 2: 'Checked Out', 3: 'Confirmed', 4: 'Confirmed', 5: 'Confirmed', 6: 'Confirmed', 7: 'Cancelled', 8: 'Cancelled', 9: 'Checked Out'},
    'airbnb': {0: 'Premium Queen Ensuite ', 1: 'Queen Room w. Shared Facilities ', 2: 'Single Room w. Shared Facilities ', 3: 'Compact Double Room w. Shared Facilities ', 4: 'Premium Queen Ensuite ', 5: 'Queen Room w. Shared Facilities ', 6: 'Single Room w. Shared Facilities ', 7: 'Queen Room w. Shared Facilities ', 8: 'Single Room w. Shared Facilities ', 9: 'Compact Double Room w. Shared Facilities '},
    'paid': {0: 'No', 1: 'No', 2: 'No', 3: 'No', 4: 'No', 5: 'Yes', 6: 'No', 7: 'No', 8: 'Yes', 9: 'No'},
    'revenue - sum': {0: 591.49, 1: 255.52, 2: 1602.02, 3: 189.05, 4: 350.0, 5: 110.53, 6: 4258.48,7: 28.5, 8: 578.55, 9: 4637.71},
    'revenue - min': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0},
    'debit - sum': {0: 2, 1: 2, 2: 5, 3: 1, 4: 1, 5: 1, 6: 3, 7: 3, 8: 2, 9: 22}})

I have used this approach (based on "Pandas Pivot table without aggregating") to achieve part of the shape I am looking for. However, I would like to swap the aggfuncs label to the bottom (probably with https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.swaplevel.html ), and it doesn't feel right because my values have already been computed and do not need to be calculated again:

df.pivot_table(index=["card_type", "payment_status"], columns=["airbnb", "paid"], values=["revenue - sum", "revenue - min", "debit - sum"], aggfunc={"revenue - sum": ["sum"], "revenue - min": ["max"], "debit - sum": ["mean"]}, fill_value="-")

What I expect to achieve is a DataFrame similar to this: [image]

Is there any way to solve this? Thanks!

If you have already computed your values, you can use either:

  • pivot_table with aggfunc='first' and fill_value='-'
  • pivot followed by fillna('-')
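
To make the two options concrete, here is a minimal sketch on a small hypothetical frame (the names toy, card and room are made up for illustration and are not part of the question's data). It assumes each index/column combination occurs at most once, which is what "already aggregated" implies.

```python
import pandas as pd

# Hypothetical toy data: one pre-computed value per (card, room) pair.
toy = pd.DataFrame({
    "card": ["Amex", "Amex", "Visa"],
    "room": ["Queen", "Single", "Queen"],
    "revenue - sum": [10.5, 20.25, 30.75],
})

# Option 1: pivot_table with aggfunc="first" keeps the single value in each group.
via_pivot_table = toy.pivot_table(index="card", columns="room",
                                  values="revenue - sum",
                                  aggfunc="first", fill_value="-")

# Option 2: pivot never aggregates; missing combinations become NaN,
# which fillna then replaces with the placeholder.
via_pivot = toy.pivot(index="card", columns="room",
                      values="revenue - sum").fillna("-")

print(via_pivot_table)
print(via_pivot)
```

With pivot, a duplicated index/column pair raises a ValueError instead of being silently combined, so it also serves as a check that the data really is one value per cell.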

For your column levels, use reorder_levels rather than swaplevel. It rearranges the column levels by their input positions, here from [0, 1, 2] to [1, 2, 0]:

out = df.pivot(index=["card_type", "payment_status"],
               columns=["airbnb", "paid"],
               values=["revenue - sum", "revenue - min", "debit - sum"]) \
        .fillna('-').reorder_levels([1, 2, 0], axis=1)

Output:

>>> out
airbnb                          Premium Queen Ensuite  Queen Room w. Shared Facilities  Single Room w. Shared Facilities   ... Compact Double Room w. Shared Facilities  Queen Room w. Shared Facilities  Single Room w. Shared Facilities 
paid                                                No                               No                                No  ...                                        No                              Yes                               Yes
                                         revenue - sum                    revenue - sum                     revenue - sum  ...                               debit - sum                      debit - sum                       debit - sum
card_type        payment_status                                                                                            ...                                                                                                             
American Express Checked Out                    591.49                           255.52                           1602.02  ...                                         -                                -                                 -
                 Confirmed                       350.0                                -                           4258.48  ...                                       1.0                              1.0                                 -
Mastercard       Cancelled                           -                             28.5                                 -  ...                                         -                                -                               2.0
                 Checked Out                         -                                -                                 -  ...                                      22.0                                -                                 -
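
As a rough illustration of why reorder_levels is the more convenient call here, the sketch below builds a small hypothetical column MultiIndex in the same level order that pivot produces (value name, airbnb, paid) and reorders it both ways. The toy frame and its labels are assumptions made for the example. The final sort_index step is optional and is only shown as one way to place columns that share the same airbnb/paid pair next to each other.

```python
import pandas as pd

# Hypothetical 3-level column index in pivot's order: (value, airbnb, paid).
cols = pd.MultiIndex.from_tuples(
    [("revenue - sum", "Queen", "No"), ("debit - sum", "Queen", "No"),
     ("revenue - sum", "Single", "Yes"), ("debit - sum", "Single", "Yes")],
    names=[None, "airbnb", "paid"],
)
toy = pd.DataFrame([[1.5, 2, 3.5, 4]], columns=cols)

# One call: the levels at positions [0, 1, 2] are emitted in the order [1, 2, 0].
reordered = toy.reorder_levels([1, 2, 0], axis=1)

# swaplevel exchanges only two levels at a time, so two calls are needed
# to reach the same arrangement.
swapped = toy.swaplevel(0, 1, axis=1).swaplevel(1, 2, axis=1)

# Optional: sort the columns so each airbnb/paid pair is grouped together.
grouped = reordered.sort_index(axis=1)
print(grouped)
```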

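Update

"I would like to create one more level by splitting the value names on ' - '"

Since you have to split some of the column names into two parts, use a different strategy. First, move some columns into the index of the DataFrame, then split the remaining column names into multiple levels. Finally, unstack your airbnb and paid index levels and reorder your column levels: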

out = df.set_index(['card_type', 'payment_status', 'airbnb', 'paid'])
out.columns = out.columns.str.split(' - ').map(tuple)
out = out.unstack(['airbnb', 'paid'], fill_value='-') \
         .reorder_levels([2, 3, 0, 1], axis=1)

Output:

>>> out
airbnb                          Compact Double Room w. Shared Facilities          Premium Queen Ensuite   ... Queen Room w. Shared Facilities  Single Room w. Shared Facilities       
paid                                                                   No     Yes                     No  ...                              Yes                                No   Yes
                                                                  revenue revenue                revenue  ...                            debit                             debit debit
                                                                      sum     sum                    sum  ...                              sum                               sum   sum
card_type        payment_status                                                                           ...                                                                         
American Express Checked Out                                            -       -                 591.49  ...                                -                                 5     -
                 Confirmed                                         189.05       -                  350.0  ...                                1                                 3     -
Mastercard       Cancelled                                              -       -                      -  ...                                -                                 -     2
                 Checked Out                                      4637.71       -                      -  ...                                -                                 -     -

[4 rows x 24 columns]
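
The column-splitting step can be looked at in isolation. The sketch below uses a standalone hypothetical Index holding the same three labels and assumes every label contains the ' - ' separator exactly once. It also shows str.split with expand=True, which should give the same MultiIndex without the map(tuple) step.

```python
import pandas as pd

# Flat labels of the form "<measure> - <aggregation>".
flat = pd.Index(["revenue - sum", "revenue - min", "debit - sum"])

# str.split gives one list per label; map(tuple) turns each list into a
# tuple, and pandas assembles the tuples into a MultiIndex.
split = flat.str.split(" - ").map(tuple)

# Alternative spelling: expand=True asks str.split for a MultiIndex directly.
split_expand = flat.str.split(" - ", expand=True)

print(split)
print(split_expand)
```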

