
Create a dataframe from the grouped columns of a groupby result

I have a grouped dataframe and want to prepare it for a regression.

>>>ts = sales.groupby(['date_block_num','shop_id'])['item_cnt_day'].sum()
date_block_num  shop_id
0               0          5578.0
                1          2947.0
                2          1146.0
                3           767.0
                4          2114.0
                            ...  
33              1         1972.0
                2         1263.0
                3         2316.0
                4         1446.0
                5          790.0

I would like to get:

shop_id         1    ...    33 
0          5578.0       1972.0
1          2947.0       1263.0
2          1146.0       2316.0
3           767.0       1446.0
4          2114.0        790.0

so that I can create a point cloud like this one:

(image: example point cloud)

So far I have tried:

pd.DataFrame(columns=ts.date_block_num, index= ts.shop_id, data=ts)

But it returns:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-30-82ae77e8cda8> in <module>()
----> 1 pd.DataFrame(columns=ts.date_block_num, index= ts.shop_id, data=ts)

/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in __getattr__(self, name)
   5137             if self._info_axis._can_hold_identifiers_and_holds_name(name):
   5138                 return self[name]
-> 5139             return object.__getattribute__(self, name)
   5140 
   5141     def __setattr__(self, name: str, value) -> None:

AttributeError: 'Series' object has no attribute 'date_block_num'

The original data looks like this:

>>> sales
         date  date_block_num  shop_id  item_id  item_price  item_cnt_day
0  2013-01-02               0       59    22154      999.00           1.0
1  2013-01-03               0       25     2552      899.00           1.0
2  2013-01-05               0       25     2552      899.00          -1.0
3  2013-01-06               0       25     2554     1709.05           1.0
4  2013-01-15               0       25     2555     1099.00           1.0
...

I think you are looking for unstack:

out = (sales.groupby(['date_block_num','shop_id'])
           ['item_cnt_day'].sum()
           .unstack('date_block_num')
      )
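For illustration, here is a minimal sketch of that approach on a tiny made-up stand-in for sales (the values are invented, not taken from the real data). It also shows why the attempt in the question fails: after groupby(...).sum() the group keys are levels of the Series index, not attributes or columns, so they have to be read through ts.index.

```python
import pandas as pd

# Toy stand-in for the `sales` frame; the numbers are made up.
sales = pd.DataFrame({
    "date_block_num": [0, 0, 0, 1, 1, 1],
    "shop_id":        [0, 0, 1, 0, 1, 1],
    "item_cnt_day":   [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Series with a two-level index: (date_block_num, shop_id).
ts = sales.groupby(["date_block_num", "shop_id"])["item_cnt_day"].sum()

# ts.date_block_num raises AttributeError because the keys live in
# the index; the levels can be read like this instead.
months = ts.index.get_level_values("date_block_num")
shops = ts.index.get_level_values("shop_id")

# unstack moves the named index level into the columns, leaving
# one row per shop_id and one column per date_block_num.
out = ts.unstack("date_block_num")
print(out)
```

The result has shop_id as its index and one column per date_block_num, which is the layout asked for in the question.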

You can also do:

out = sales.pivot_table(index='shop_id',
                        columns='date_block_num',
                        values='item_cnt_day',
                        aggfunc='sum')
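As a rough check, the sketch below runs both approaches on a small made-up frame in which one shop has no sales in one month. Both should produce the same table, with NaN for the missing combination. If a regression needs zeros there instead, fill_value=0 is one option, assuming a missing entry really means "no sales".

```python
import pandas as pd

# Made-up data: shop 1 has no rows for date_block_num 1.
sales = pd.DataFrame({
    "date_block_num": [0, 0, 1, 1],
    "shop_id":        [0, 1, 0, 0],
    "item_cnt_day":   [1.0, 2.0, 3.0, 4.0],
})

# Approach 1: group, sum, then move date_block_num into the columns.
via_unstack = (sales.groupby(["date_block_num", "shop_id"])["item_cnt_day"]
                    .sum()
                    .unstack("date_block_num"))

# Approach 2: let pivot_table do the grouping and reshaping in one call.
via_pivot = sales.pivot_table(index="shop_id",
                              columns="date_block_num",
                              values="item_cnt_day",
                              aggfunc="sum")

# Same reshaping, but missing (shop, month) pairs become 0 instead of NaN.
filled = sales.pivot_table(index="shop_id",
                           columns="date_block_num",
                           values="item_cnt_day",
                           aggfunc="sum",
                           fill_value=0)

print(via_unstack)
print(via_pivot)
print(filled)
```

Series.unstack accepts a fill_value argument as well, so the zero-filling can be done with either approach.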
