Pandas/Python: Set value of one column based on value in another column
Set increasing value of one column based on value in another column in pandas dataframe
I have a pandas DataFrame that looks like this:
import pandas as pd
d = {'date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08', '2021-01-09', '2021-01-10']}
df = pd.DataFrame(data=d)
df
date
0 2021-01-01
1 2021-01-02
2 2021-01-03
3 2021-01-04
4 2021-01-05
5 2021-01-06
6 2021-01-07
7 2021-01-08
8 2021-01-09
9 2021-01-10
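I want to add a new column `out` to this df that marks the day the wedding starts with the value 0. Rows after the start date should count up, and rows before the start date should count down. For example, if the wedding starts on '2021-01-06', my desired output looks like this: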
date out
0 2021-01-01 -5
1 2021-01-02 -4
2 2021-01-03 -3
3 2021-01-04 -2
4 2021-01-05 -1
5 2021-01-06 0
6 2021-01-07 1
7 2021-01-08 2
8 2021-01-09 3
9 2021-01-10 4
Try subtracting the start date and taking the difference in days:
df['out'] = (pd.to_datetime(df.date) - pd.to_datetime('2021-01-06')).dt.days
Out[20]:
0 -5
1 -4
2 -3
3 -2
4 -1
5 0
6 1
7 2
8 3
9 4
Name: date, dtype: int64
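One thing worth noting about the subtraction above: it counts calendar days, not rows. The sketch below uses a hypothetical frame with two dates missing (not the data from the question) to show what that means when the dates are not consecutive.

```python
import pandas as pd

# Hypothetical data with a gap: 2021-01-03 and 2021-01-04 are missing.
df = pd.DataFrame({'date': ['2021-01-01', '2021-01-02', '2021-01-05',
                            '2021-01-06', '2021-01-07']})
start = pd.Timestamp('2021-01-06')

# Difference in calendar days between each row's date and the start date.
df['out'] = (pd.to_datetime(df['date']) - start).dt.days
print(df)
```

Here `out` jumps from -4 to -1 across the gap. If you want the counter to move by exactly 1 per row regardless of gaps, a position-based approach is probably the better fit.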
If your `date` column has no duplicates, you can try:
df['time'] = range(len(df))
df['time'] = df['time'] - df.set_index('date').loc['2021-01-06', 'time']
print(df)
date time
0 2021-01-01 -5
1 2021-01-02 -4
2 2021-01-03 -3
3 2021-01-04 -2
4 2021-01-05 -1
5 2021-01-06 0
6 2021-01-07 1
7 2021-01-08 2
8 2021-01-09 3
9 2021-01-10 4
Or:
df['time'] = df.index.values - df['date'].tolist().index('2021-01-06')
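The one-liner above relies on the frame having the default 0..n-1 index, because it subtracts a list position from the index labels. A minimal sketch of a variant that works from row positions only, assuming the start date appears in the column (the index labels 10, 20, ... are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index.
df = pd.DataFrame({'date': ['2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07']},
                  index=[10, 20, 30, 40])

# Position (not label) of the first row whose date equals the start date.
start_pos = np.flatnonzero(df['date'].to_numpy() == '2021-01-06')[0]

# Row positions 0..n-1, shifted so the start row gets 0.
df['time'] = np.arange(len(df)) - start_pos
print(df)
```

With this index, `df.index.values - 2` would give 8, 18, 28, 38 instead of the intended -2, -1, 0, 1.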
You can use `cumcount()` to get the expected output you posted:
import pandas as pd
d = {'date': ['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08', '2021-01-09', '2021-01-10']}
df = pd.DataFrame(data=d)
df['Control'] = 1
date_lookback_location = df.loc[df['date'] == '2021-01-06'].index.tolist()[0]
df['time'] = df.sort_values(['date'], ascending=True).groupby(['Control']).cumcount() - date_lookback_location
df[['date', 'time']]
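The snippet above takes the offset from the index label of the start row, which only lines up with the sorted count when the frame is already in date order. A sketch of the same `cumcount()` idea for unsorted rows, assuming the date strings sort chronologically (ISO format) and using hypothetical data:

```python
import pandas as pd

# Hypothetical unsorted input.
df = pd.DataFrame({'date': ['2021-01-07', '2021-01-05', '2021-01-06', '2021-01-04']})
df['Control'] = 1

# 0-based rank of each row in date order; the result keeps the original index labels.
rank = df.sort_values('date').groupby('Control').cumcount()

# Look up the rank of the start row by its label, then shift so it becomes 0.
start_label = df.index[df['date'] == '2021-01-06'][0]
df['time'] = rank - rank.loc[start_label]
print(df[['date', 'time']])
```

The assignment aligns on the index, so each row gets its own offset even though the rows stay in their original order.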