Pandas Pivot Table with external columns

Question

I have a list of dates, e.g. dates_list=[201701, 201702, 201703, 201704]. The user supplies it to choose the dates for a specific report.

And I have a database with three columns: id, date and value.

Sometimes my database doesn't have records for every date the user asks for (e.g. it only has records for 201701 and 201702). df is my database, and I run this command:

raw = pd.pivot_table(df, index=['id'],
                     columns=['date'], values=['value'],
                     aggfunc=[np.sum], fill_value=0, margins=False)

This, of course, returns a pivot table with only two columns: 201701 and 201702.

I want to know whether dates_list can be used as the column labels when the pivot table is built, so that 201703 and 201704 come back as columns full of zeros. If that is not possible, does anyone know the best approach to this problem?

Thanks in advance

Sample data:

df = pd.DataFrame({'id': [1, 1, 2, 1, 2],
                   'date': [201701, 201701, 201701, 201702, 201702],
                   'value': [0.04, 0.02, 0.07, 0.08, 1.0]})
df

     date  id  value
0  201701   1   0.04
1  201701   1   0.02
2  201701   2   0.07
3  201702   1   0.08
4  201702   2   1.00

raw = pd.pivot_table(df, index=['id'], columns=['date'], values=['value'],
                     aggfunc=[np.sum], fill_value=0, margins=False)

        sum
  value
date 201701 201702
id
1      0.06   0.08
2      0.07   1.00

date_list = [201701, 201702, 201703, 201704]

raw.reindex(columns=date_list, fill_value=0)

But this raises ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'

Answer 1

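You can call reindex after pivot_table. Pass index, columns, values and aggfunc as plain strings rather than lists so that the result has a single-level column index. With list arguments the columns become a MultiIndex, and reindexing that with a flat list of dates is the likely cause of the ValueError in the question.

# yourlist is the flat list of wanted dates, e.g. [201701, 201702, 201703, 201704]
pd.pivot_table(df, index='id',
               columns='date', values='value',
               aggfunc='sum', fill_value=0, margins=False).\
    reindex(columns=yourlist, fill_value=0)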
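If the pivot table has already been built with list arguments, as in the question, one option is to flatten its columns down to the date level before reindexing. The snippet below is only a sketch under that assumption: it uses the sample data from the question, passes 'sum' as a string instead of np.sum (to avoid the NumPy import), and the name flat is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 2, 1, 2],
                   'date': [201701, 201701, 201701, 201702, 201702],
                   'value': [0.04, 0.02, 0.07, 0.08, 1.0]})
date_list = [201701, 201702, 201703, 201704]

# List arguments, as in the question: the columns come back as a MultiIndex.
raw = pd.pivot_table(df, index=['id'], columns=['date'], values=['value'],
                     aggfunc=['sum'], fill_value=0, margins=False)

# Keep only the innermost column level, which holds the dates.
flat = raw.copy()
flat.columns = raw.columns.get_level_values(-1)

# With flat integer labels, reindex can add the missing dates as zero columns.
flat = flat.reindex(columns=date_list, fill_value=0)
print(flat)
```

This drops the 'sum' and 'value' header levels, which is fine when there is a single aggregation and a single value column, as here.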

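Update: with the sample data and scalar arguments the reindex works. Note that fill_value is not passed to reindex here, so the added column comes back as NaN:

pd.pivot_table(df, index='id', columns='date', values='value',
               aggfunc='sum', fill_value=0, margins=False).reindex(columns=[201701,201702,201703])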
Out[115]: 
date  201701  201702  201703
id                          
1       0.06    0.08     NaN
2       0.07    1.00     NaN
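To get zeros instead of NaN for the missing dates, which is what the question asks for, fill_value=0 can also be passed to reindex. A minimal sketch using the sample data from the question (the names table and result are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 2, 1, 2],
                   'date': [201701, 201701, 201701, 201702, 201702],
                   'value': [0.04, 0.02, 0.07, 0.08, 1.0]})
date_list = [201701, 201702, 201703, 201704]

# Scalar arguments give a single-level column index labelled by date.
table = pd.pivot_table(df, index='id', columns='date', values='value',
                       aggfunc='sum', fill_value=0)

# reindex adds the dates that have no records; fill_value=0 fills them with zeros.
result = table.reindex(columns=date_list, fill_value=0)
print(result)
```

The fill_value passed to pivot_table only fills gaps among the dates already present in df, while the one passed to reindex fills the columns that reindex adds.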

1 answer

solution1
2 ACCEPTED 2018-03-15 16:27:55
