
Add a row in a pandas DataFrame for every date in another DataFrame column

I have a dataframe that contains an occasional entry for each symbol, together with a count. I would like to expand it so that every symbol has a row for every date in the dataframe's date range, with a count of 0 wherever a symbol has no entry on a given date.

My dataframe:

import pandas as pd

dates = ['2021-01-01', '2021-01-02', '2021-01-03']
symbol = ['a', 'b', 'a']
count = [1, 2, 3]
df = pd.DataFrame({'Mention Datetime': dates,
                   'Symbol': symbol,
                   'Count': count})


  Mention Datetime Symbol  Count
0       2021-01-01      a      1
1       2021-01-02      b      2
2       2021-01-03      a      3

What I want it to look like:

  Mention Datetime Symbol  Count
0       2021-01-01      a      1
1       2021-01-02      a      0
2       2021-01-03      a      3
3       2021-01-01      b      0
4       2021-01-02      b      2
5       2021-01-03      b      0

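Use pivot_table, then stack. pivot_table spreads the symbols into columns and fills the missing date/symbol combinations with 0; stack folds those columns back into rows: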

df = df.pivot_table(index='Mention Datetime',
                    columns='Symbol', fill_value=0
                    ).stack().reset_index()

Output:

  Mention Datetime Symbol  Count
0       2021-01-01      a      1
1       2021-01-01      b      0
2       2021-01-02      a      0
3       2021-01-02      b      2
4       2021-01-03      a      3
5       2021-01-03      b      0
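
The output above is ordered by date, whereas the question lists all rows for 'a' before 'b'. A minimal sketch of one way to get that ordering, assuming the sample data from the question: it names the value column and the aggregation explicitly (aggfunc='sum' is my choice here, not something the question specifies) and sorts at the end. The variable names wide and result are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Mention Datetime': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'Symbol': ['a', 'b', 'a'],
    'Count': [1, 2, 3],
})

# Name the value column and the aggregation explicitly instead of
# relying on pivot_table's defaults (all remaining columns, mean).
wide = df.pivot_table(index='Mention Datetime', columns='Symbol',
                      values='Count', aggfunc='sum', fill_value=0)

# stack() moves the Symbol columns back into the index, giving a Series;
# reset_index(name=...) turns it back into a flat three-column frame.
result = (wide.stack()
              .reset_index(name='Count')
              .sort_values(['Symbol', 'Mention Datetime'], ignore_index=True))

# Depending on the pandas version the filled counts may come back as
# floats, so cast explicitly to get integers again.
result['Count'] = result['Count'].astype('int64')
print(result)
```

Sorting by Symbol first and then by date reproduces the row order asked for in the question; drop the sort_values call if the by-date order is fine.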

You can reindex with a new MultiIndex built from the unique values of the two columns in question (the Cartesian product of dates and symbols), then fill the missing counts with 0.

import pandas as pd
from io import StringIO

s = '''
Mention Datetime    Symbol  Count
2021-01-01          a       1
2021-01-02          b       2
2021-01-03          a       3
'''

df = pd.read_fwf(StringIO(s), header=1)
df = df.set_index(['Mention Datetime', 'Symbol'])
df
                            Count
Mention Datetime    Symbol  
2021-01-01          a       1
2021-01-02          b       2
2021-01-03          a       3

df = df.reindex(
    pd.MultiIndex.from_product(
        [
        df.index.get_level_values('Mention Datetime').unique(), 
        df.index.get_level_values('Symbol').unique()
        ]
    ) 
).fillna(0)

df
                            Count
Mention Datetime    Symbol  
2021-01-01          a       1.0
                    b       0.0
2021-01-02          a       0.0
                    b       2.0
2021-01-03          a       3.0
                    b       0.0
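
Note that fillna(0) leaves Count as floats, because the reindex step first creates NaN for the missing pairs. If integer counts matter, one possible variation is to pass fill_value=0 to reindex itself. This is a sketch under the assumption that the data is the sample from the question; full_index and filled are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'Mention Datetime': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'Symbol': ['a', 'b', 'a'],
    'Count': [1, 2, 3],
}).set_index(['Mention Datetime', 'Symbol'])

# Every (date, symbol) pair; names= keeps the index level labels.
full_index = pd.MultiIndex.from_product(
    [df.index.get_level_values('Mention Datetime').unique(),
     df.index.get_level_values('Symbol').unique()],
    names=['Mention Datetime', 'Symbol'],
)

# fill_value=0 fills the newly created rows directly, so Count never
# passes through NaN and should keep its integer dtype.
filled = df.reindex(full_index, fill_value=0).reset_index()
print(filled)
```

This only covers dates that already appear somewhere in the data. If the calendar itself has gaps, the first list could be replaced with a pd.date_range spanning the minimum and maximum date, after converting the column with pd.to_datetime; that assumes a daily frequency, which the question does not state.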


 