Add a row in pandas dataframe for every date in another dataframe column
I have a dataframe that occasionally contains an entry for a symbol, along with a count. I would like to expand the dataframe so that every symbol has a row for every date in the dataframe's date range. Where a symbol has no entry on a given date, I want the count to be 0.
My dataframe:
import pandas as pd

dates = ['2021-01-01', '2021-01-02', '2021-01-03']
symbol = ['a', 'b', 'a']
count = [1, 2, 3]
df = pd.DataFrame({'Mention Datetime': dates,
                   'Symbol': symbol,
                   'Count': count})
  Mention Datetime Symbol  Count
0       2021-01-01      a      1
1       2021-01-02      b      2
2       2021-01-03      a      3
What I want it to look like:
  Mention Datetime Symbol  Count
0       2021-01-01      a      1
1       2021-01-02      a      0
2       2021-01-03      a      3
3       2021-01-01      b      0
4       2021-01-02      b      2
5       2021-01-03      b      0
Use pivot_table then stack:
df = df.pivot_table(index='Mention Datetime',
                    columns='Symbol', fill_value=0
                    ).stack().reset_index()
Output:
  Mention Datetime Symbol  Count
0       2021-01-01      a      1
1       2021-01-01      b      0
2       2021-01-02      a      0
3       2021-01-02      b      2
4       2021-01-03      a      3
5       2021-01-03      b      0
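The same pivot-then-stack idea can be spelled out a little more explicitly. The sketch below is only an illustration, not part of the answer above: it assumes that duplicate (date, symbol) rows should be summed (`aggfunc='sum'`), names `values='Count'` explicitly, and sorts by symbol to match the layout asked for in the question. The names `wide` and `tidy` are made up for the example.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Mention Datetime': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'Symbol': ['a', 'b', 'a'],
    'Count': [1, 2, 3],
})

# Wide table: one row per date, one column per symbol, missing cells filled with 0.
# aggfunc='sum' is an assumption about how duplicate (date, symbol) rows should combine.
wide = df.pivot_table(index='Mention Datetime', columns='Symbol',
                      values='Count', aggfunc='sum', fill_value=0)

# Back to long form: stack moves the symbol columns into the index.
tidy = (wide.stack()
            .rename('Count')
            .astype(int)   # force integer counts regardless of pandas version
            .reset_index()
            .sort_values(['Symbol', 'Mention Datetime'])
            .reset_index(drop=True))

print(tidy)
```

Sorting by `Symbol` first is only there to reproduce the grouping shown in the question; drop the `sort_values` call if date order is preferred.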
You can reindex with a new MultiIndex created from the unique values of the two columns in question.
import pandas as pd

# Sample data from the question, indexed by the two key columns
df = pd.DataFrame({'Mention Datetime': ['2021-01-01', '2021-01-02', '2021-01-03'],
                   'Symbol': ['a', 'b', 'a'],
                   'Count': [1, 2, 3]})
df = df.set_index(['Mention Datetime', 'Symbol'])
df
                         Count
Mention Datetime Symbol
2021-01-01       a           1
2021-01-02       b           2
2021-01-03       a           3
df = df.reindex(
    pd.MultiIndex.from_product(
        [
            df.index.get_level_values('Mention Datetime').unique(),
            df.index.get_level_values('Symbol').unique()
        ]
    )
).fillna(0)
df
                         Count
Mention Datetime Symbol
2021-01-01       a         1.0
                 b         0.0
2021-01-02       a         0.0
                 b         2.0
2021-01-03       a         3.0
                 b         0.0
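If the integer dtype matters, or if dates on which no symbol was mentioned at all should also appear, one possible variant is to build the index from `pd.date_range` and pass `fill_value=0` to `reindex`. This is a sketch under assumptions not stated in the answer: daily frequency, at most one row per (symbol, date) pair (`reindex` raises on a duplicated index), and the hypothetical names `all_dates`, `full_index` and `result`.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Mention Datetime': ['2021-01-01', '2021-01-02', '2021-01-03'],
    'Symbol': ['a', 'b', 'a'],
    'Count': [1, 2, 3],
})
df['Mention Datetime'] = pd.to_datetime(df['Mention Datetime'])

# Every calendar day between the first and last mention (daily frequency is an assumption)
all_dates = pd.date_range(df['Mention Datetime'].min(),
                          df['Mention Datetime'].max(), freq='D')

# Cartesian product of symbols and dates
full_index = pd.MultiIndex.from_product(
    [df['Symbol'].unique(), all_dates],
    names=['Symbol', 'Mention Datetime'],
)

# fill_value=0 fills the missing pairs directly, so Count stays an integer column
result = (df.set_index(['Symbol', 'Mention Datetime'])
            .reindex(full_index, fill_value=0)
            .reset_index()[['Mention Datetime', 'Symbol', 'Count']])

print(result)
```

Because the symbols come first in the product, the rows come out grouped by symbol, as in the layout requested in the question.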