Maintain a default of True/False for each date
During the day, new investment possibilities are registered, but the results (the lay column) are only registered at midnight each day.
So let's assume this CSV:
clock_now,competition,market_name,lay
2022/12/30,A,B,-1
2022/12/31,A,B,1.28
2023/01/01,A,B,-1
2023/01/02,A,B,1
2023/01/03,A,B,1
2023/01/04,A,B,
2023/01/04,A,B,
2023/01/04,A,B,
Until yesterday, 2023/01/03, the sum of the rows that have the value A in competition and B in market_name was +1.28.
I only invest if that sum is above 0, so today, every time this combination of values comes up, the answer to whether to invest will be True.
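The rule described here can be sketched roughly as follows. This is only an illustration under assumptions: the CSV is loaded from an inline string instead of a file, and the names `today`, `total_until_yesterday` and `invest_today` are made up for the example.

```python
import io
import pandas as pd

# The sample CSV from the question; today's lay values are still empty.
csv_text = """clock_now,competition,market_name,lay
2022/12/30,A,B,-1
2022/12/31,A,B,1.28
2023/01/01,A,B,-1
2023/01/02,A,B,1
2023/01/03,A,B,1
2023/01/04,A,B,
2023/01/04,A,B,
2023/01/04,A,B,
"""
df = pd.read_csv(io.StringIO(csv_text))

today = "2023/01/04"  # hypothetical "current day" for this example

# Dates written as YYYY/MM/DD compare correctly as plain strings.
mask = (
    (df["competition"] == "A")
    & (df["market_name"] == "B")
    & (df["clock_now"] < today)
)

# Sum of lay over every earlier day for this combination.
total_until_yesterday = round(float(df.loc[mask, "lay"].sum()), 2)

# One decision for the whole day: invest only if the earlier total is above 0.
invest_today = total_until_yesterday > 0

print(total_until_yesterday, invest_today)
```

Because the decision depends only on rows dated before `today`, every row that arrives during the day gets the same answer.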
At the end of the day, when the lay values are registered, I look at the total result:
clock_now,competition,market_name,lay
2022/12/30,A,B,-1
2022/12/31,A,B,1.28
2023/01/01,A,B,-1
2023/01/02,A,B,1
2023/01/03,A,B,1
2023/01/04,A,B,-1
2023/01/04,A,B,-1
2023/01/04,A,B,-1
End of the day: -1.72
This means that tomorrow, if that same combination of values appears in the columns, I will not invest even once: the total stays negative all day, because it only counts the values recorded up to the previous day.
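The arithmetic behind the end-of-day figure can be checked with a few lines of plain Python. The variable names are illustrative only.

```python
# lay values for A/B up to and including 2023/01/03
lays_until_yesterday = [-1, 1.28, -1, 1, 1]

# lay values registered at midnight for the three rows of 2023/01/04
lays_today = [-1, -1, -1]

# Total after today's results are in (rounded to hide float noise).
end_of_day_total = round(sum(lays_until_yesterday) + sum(lays_today), 2)

# Tomorrow's decision uses this total for the whole day.
invest_tomorrow = end_of_day_total > 0

print(end_of_day_total, invest_tomorrow)
```

The total before today is 1.28 and today's three rows add -3, which gives -1.72 and therefore a False decision for every row tomorrow.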
I'm trying to create a column to show where it was True and where it was False:
import pandas as pd

df = pd.read_csv('example.csv')
combinations = [['market_name', 'competition']]
for cbnt in combinations:
    # group_keys=False keeps the original row index on recent pandas versions
    df['invest'] = (df.groupby(cbnt, group_keys=False)['lay']
                      .apply(lambda s: s.cumsum().shift())
                      .gt(df['lay'])
                    )
    df['cumulative'] = (df.groupby(cbnt, group_keys=False)['lay']
                          .apply(lambda s: s.cumsum().shift())
                        )
print(df[['clock_now','invest','cumulative']])
But the result is this:
clock_now invest cumulative
0 2022/12/30 False NaN
1 2022/12/31 False -1.00
2 2023/01/01 True 0.28
3 2023/01/02 False -0.72
4 2023/01/03 False 0.28
5 2023/01/04 True 1.28
6 2023/01/04 True 0.28
7 2023/01/04 True -0.72
The expected result would be this:
clock_now invest cumulative
0 2022/12/30 False NaN
1 2022/12/31 False -1.00
2 2023/01/01 True 0.28
3 2023/01/02 False -0.72
4 2023/01/03 True 0.28
5 2023/01/04 True 1.28
6 2023/01/04 True 0.28
7 2023/01/04 True -1.72
How should I proceed so that the cumsum takes the date into account, keeping a single decision per day based on the results of the previous days?
Example Two:
clock_now,competition,market_name,lay
2022/08/09,A,B,-1.0
2022/08/12,A,B,1.28
2022/09/07,A,B,-1.0
2022/10/15,A,B,1.0
2022/10/15,A,B,-1.0
2022/11/20,A,B,1.0
Note that on 2022/10/15 it delivers one False and one True, so it is not actually tracking by date, which is how I want it to work:
clock_now invest cumulative
0 2022/08/09 False NaN
1 2022/08/12 False -1.00
2 2022/09/07 True 0.28
3 2022/10/15 False -0.72
4 2022/10/15 True 0.28
5 2022/11/20 False -0.72
The correct result would be either all False or all True for rows that share a date. Like this:
clock_now invest cumulative
0 2022/08/09 False NaN
1 2022/08/12 False -1.00
2 2022/09/07 True 0.28
3 2022/10/15 False -0.72
4 2022/10/15 False 0.28
5 2022/11/20 False -0.72
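The "one answer per date" requirement can be expressed as a small check. This is a sketch, not part of the original question: the helper name `one_decision_per_date` is hypothetical, and the two frames simply retype the invest columns of the two tables above.

```python
import pandas as pd


def one_decision_per_date(frame: pd.DataFrame) -> bool:
    """Return True when every clock_now date carries a single invest value."""
    return bool(frame.groupby("clock_now")["invest"].nunique().le(1).all())


dates = ["2022/08/09", "2022/08/12", "2022/09/07",
         "2022/10/15", "2022/10/15", "2022/11/20"]

# invest column as currently produced (mixed values on 2022/10/15)
actual = pd.DataFrame(
    {"clock_now": dates, "invest": [False, False, True, False, True, False]}
)

# invest column as wanted (one value per date)
wanted = pd.DataFrame(
    {"clock_now": dates, "invest": [False, False, True, False, False, False]}
)

print(one_decision_per_date(actual))
print(one_decision_per_date(wanted))
```

A check like this makes it easy to confirm whether a candidate solution really tracks by date.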
(df.join(
    # Count market&competition specific cumsum for each row
    # and join back with df
    df.groupby(['market_name', 'competition']).lay.cumsum().rename('lay_cumsum') > 0
)
    # Group by market&comp&date to get last cumsum within each day
    .groupby(['market_name', 'competition', 'clock_now'])
    # Get cumsum Series for each group
    .lay_cumsum
    # Getting last cumsum within group
    .last()
    # Group by market&comp
    .groupby(['market_name', 'competition'])
    # Shift by one to assign to each date prev date's cumsum
    .shift(1)
    .rename('lay_cumsum')
    .reset_index()
    # Merge back with original df
    .merge(df, on=['clock_now', 'market_name', 'competition']))
This will output:
market_name competition clock_now lay_cumsum lay
0 B A 2022/12/30 NaN -1.00
1 B A 2022/12/31 False 1.28
2 B A 2023/01/01 True -1.00
3 B A 2023/01/02 False 1.00
4 B A 2023/01/03 True 1.00
5 B A 2023/01/04 True -1.00
6 B A 2023/01/04 True -1.00
7 B A 2023/01/04 True -1.00
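For comparison, here is a minimal sketch of another way to reach one decision per date: sum lay per day, take the running total of those daily sums, shift it by one day, and merge it back onto the rows. It is an alternative illustration rather than the method used above. The column names `day_total` and `running` are assumptions, while `cumulative` and `invest` follow the names used in the question, and the data is the first example with its end-of-day lay values.

```python
import io
import pandas as pd

csv_text = """clock_now,competition,market_name,lay
2022/12/30,A,B,-1
2022/12/31,A,B,1.28
2023/01/01,A,B,-1
2023/01/02,A,B,1
2023/01/03,A,B,1
2023/01/04,A,B,-1
2023/01/04,A,B,-1
2023/01/04,A,B,-1
"""
df = pd.read_csv(io.StringIO(csv_text))
keys = ["market_name", "competition"]

# One row per combination and date, holding that day's total lay.
# YYYY/MM/DD strings sort chronologically, so the default sort is enough.
daily = (
    df.groupby(keys + ["clock_now"])["lay"]
      .sum()
      .reset_index(name="day_total")
)

# Running total through the end of each day, per combination.
daily["running"] = daily.groupby(keys)["day_total"].cumsum()

# Total available at the start of each day, i.e. through the previous day.
daily["cumulative"] = daily.groupby(keys)["running"].shift()

# NaN (no history yet) compares as False, so the first day is not invested.
daily["invest"] = daily["cumulative"].gt(0)

# Attach the per-day decision to every original row, keeping row order.
result = df.merge(
    daily[keys + ["clock_now", "cumulative", "invest"]],
    on=keys + ["clock_now"],
    how="left",
)

print(result[["clock_now", "invest", "cumulative"]])
```

Since the decision is computed once per day and then merged back, all rows sharing a date necessarily receive the same invest value. Note that `cumulative` here is the total through the previous day, so it is identical for every row of a given date.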