
Assign first value in the day to the rest of the rows for that day using Pandas

I have a pandas DataFrame containing intraday data for 2 stocks. The index is a time series sampled by minute (i.e. 1/1/2017 9:30, 1/1/2017 9:31, 1/1/2017 9:32, ...). There are only two columns, "Price A" and "Price B". Total number of rows = 52000.

I need to create a new column that stores the 9:30 am value for every day. Assuming that for 1/1/2017 the 9:30 am "Price A" is 150, I would need to store this value in a new column called "Open A" for every row that falls on the same day. For example:

Sample input:

                     Price A  Price B
date                                 
2017-01-01 09:30:00      150        1
2017-01-01 09:31:00      153        2
2017-01-01 09:31:00      149        3
2017-01-01 09:31:00      151        4
2017-02-01 09:30:00      145        1
2017-02-01 09:31:00      139        2
2017-02-01 09:31:00      142        3
2017-02-01 09:31:00      149        4

I tried to simply use:

for ind in df.index:
    df['Open A'][ind] = 2

just as a test, but this seems to take forever. I also tried to read what's available here: "How to iterate over rows in a DataFrame in Pandas?", but it doesn't seem to be of help. Does anybody have a suggestion? Thanks.

If needed, set your index to datetime:

df.index = pd.to_datetime(df.index, errors='coerce')

df

                     Price A  Price B
date                                 
2017-01-01 09:30:00      150        1
2017-01-01 09:31:00      153        2
2017-01-01 09:31:00      149        3
2017-01-01 09:31:00      151        4
2017-02-01 09:30:00      145        1
2017-02-01 09:31:00      139        2
2017-02-01 09:31:00      142        3
2017-02-01 09:31:00      149        4

An assumption here is that each day's records start at 9:30, which makes the job really easy.
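If that assumption does not hold (say, some days contain rows before 9:30), one possible workaround is to pick the 9:30 rows explicitly and map them back onto every row by calendar day. The following is only a sketch: the sample prices, the extra 09:29 row, and the names `is_open` and `open_by_day` are illustrative assumptions, not part of the original question.

```python
import datetime

import pandas as pd

# Hypothetical data: the first day has an extra row before the 09:30 open.
index = pd.to_datetime([
    "2017-01-01 09:29", "2017-01-01 09:30", "2017-01-01 09:31",
    "2017-02-01 09:30", "2017-02-01 09:31",
])
df = pd.DataFrame({"Price A": [148, 150, 153, 145, 139]}, index=index)

# Boolean mask selecting only the rows stamped exactly 09:30.
is_open = df.index.time == datetime.time(9, 30)

# One 09:30 price per calendar day, keyed by midnight of that day.
open_by_day = (
    df.loc[is_open, "Price A"]
    .groupby(df.index[is_open].normalize())
    .first()
)

# Look up each row's day in that table; a day without a 09:30 row gets NaN.
df["Open A"] = df.index.normalize().map(open_by_day).to_numpy()
print(df)
```

When every day really does start at 9:30, the simpler approach described next gives the same result.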

Use groupby with pd.Grouper + transform + first:

df['Open A'] = df.groupby(pd.Grouper(freq='1D'))['Price A'].transform('first')    
df

                     Price A  Price B  Open A
date                                         
2017-01-01 09:30:00      150        1     150
2017-01-01 09:31:00      153        2     150
2017-01-01 09:31:00      149        3     150
2017-01-01 09:31:00      151        4     150
2017-02-01 09:30:00      145        1     145
2017-02-01 09:31:00      139        2     145
2017-02-01 09:31:00      142        3     145
2017-02-01 09:31:00      149        4     145
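For readers who want to try this end to end, here is a self-contained sketch that rebuilds the sample frame and fills an "Open" column for both price columns. Two things differ from the answer above and are assumptions of this sketch: the minute stamps are made distinct (09:30 to 09:33), and the rows are grouped on `df.index.normalize()` (each timestamp truncated to midnight) rather than on `pd.Grouper(freq='1D')`. "Open B" is a hypothetical column name chosen to mirror "Open A".

```python
import pandas as pd

# Rebuild the sample data; the distinct minute stamps are an assumption.
index = pd.to_datetime([
    "2017-01-01 09:30", "2017-01-01 09:31", "2017-01-01 09:32", "2017-01-01 09:33",
    "2017-02-01 09:30", "2017-02-01 09:31", "2017-02-01 09:32", "2017-02-01 09:33",
])
df = pd.DataFrame(
    {
        "Price A": [150, 153, 149, 151, 145, 139, 142, 149],
        "Price B": [1, 2, 3, 4, 1, 2, 3, 4],
    },
    index=index,
)
df.index.name = "date"

# Group key: the calendar day of each row (timestamp truncated to midnight).
day = df.index.normalize()

# Broadcast each day's first price to every row of that day.
for name in ["A", "B"]:
    df[f"Open {name}"] = df.groupby(day)[f"Price {name}"].transform("first")

print(df)
```

Because `transform` returns a result aligned with the original rows, no explicit Python loop over the 52000 rows is needed. Note that `first` takes the first non-null value in each group in row order, so the frame should be sorted chronologically beforehand.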
