
Pandas - create new column based on conditional value of another column

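I want to create a new column df['indexed'] based on the value in the previous row of column df['col2']. The exception: when a row of df['col2'] is not "x" (in this example it is a date string), I want df['indexed'] set to 100. In other words, the "indexed" column should restart at 100 every time df['col2'] is not "x".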

import pandas as pd
d = {'col1': [0.02,0.12,-0.1,0-0.07,0.01,0.02,0.12,-0.1,0-0.07,0.01],
     'col2': ['x','x','x','2021-60-30','x','x','x','x','x','x']}
df = pd.DataFrame(data=d)
df['col1'] = df['col1']+1
df['indexed'] = 0
df.loc[0, 'indexed'] = 100  # set a starting value (avoids chained assignment)

# What I tried:
for index, row in df.iterrows():
    if row['col2'] == 'x':
        df['indexed']= df['col1'] * df['indexed'].shift(1)
    else:
        df['indexed']= 100

What I expect:

[image: expected output]

You can use where:

df['indexed'] = (df['col1'] * df['col1'].shift(1)).where(df['col2']=='x', 100)
df

Output:


   col1        col2   indexed
0  1.02           x       NaN
1  1.12           x    1.1424
2  0.90           x    1.0080
3  0.93  2021-60-30  100.0000
4  1.01           x    0.9393
5  1.02           x    1.0302
6  1.12           x    1.1424
7  0.90           x    1.0080
8  0.93           x    0.8370
9  1.01           x    0.9393
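
As a rough illustration of what the line above relies on, here is a minimal sketch of how Series.where and shift behave. The toy Series `s` and the mask `keep` are made up for this example and are not part of the question. where keeps values where the condition is True and substitutes the second argument elsewhere; shift(1) has no previous value for the first row, which is why row 0 comes out as NaN in the output above.

```python
import pandas as pd

# Toy data (hypothetical, not taken from the question)
s = pd.Series([1.0, 2.0, 3.0, 4.0])
keep = pd.Series([True, True, False, True])

# Keep s where `keep` is True, otherwise use 100
result = s.where(keep, 100)
print(result.tolist())  # [1.0, 2.0, 100.0, 4.0]

# shift(1) moves every value down one row; the first row has no predecessor
shifted = s.shift(1)
print(shifted.isna().tolist())  # [True, False, False, False]
```

If a starting value is wanted in the first row instead of NaN, it would have to be filled in separately.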

Update: if you want to compute a cumulative product that restarts at each non-x value in col2:

g = df.groupby(df['col2'].ne('x').cumsum())['col1']
df['indexed'] = g.cumprod() / g.transform('first') * 100

Output:

   col1        col2     indexed
0  1.02           x  100.000000
1  1.12           x  112.000000
2  0.90           x  100.800000
3  0.93  2021-60-30  100.000000
4  1.01           x  101.000000
5  1.02           x  103.020000
6  1.12           x  115.382400
7  0.90           x  103.844160
8  0.93           x   96.575069
9  1.01           x   97.540819
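
To see why the groupby approach produces this table, the sketch below rebuilds the same data, prints the grouping key, and cross-checks the vectorized result against an explicit loop. The helper `indexed_loop` and the `check` column are hypothetical names used only for this illustration. It assumes the first row is treated as a starting point of 100, as in the output above.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': [0.02, 0.12, -0.1, -0.07, 0.01, 0.02, 0.12, -0.1, -0.07, 0.01],
    'col2': ['x', 'x', 'x', '2021-60-30', 'x', 'x', 'x', 'x', 'x', 'x'],
})
df['col1'] = df['col1'] + 1

# Every non-'x' row increments the counter, so it opens a new group
key = df['col2'].ne('x').cumsum()
print(key.tolist())  # [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

# Cumulative product per group, rescaled so each group starts at 100
g = df.groupby(key)['col1']
df['indexed'] = g.cumprod() / g.transform('first') * 100

def indexed_loop(frame):
    """Hypothetical reference version: restart at 100, else multiply forward."""
    out = []
    for i, (c1, c2) in enumerate(zip(frame['col1'], frame['col2'])):
        if i == 0 or c2 != 'x':
            out.append(100.0)          # first row or a non-'x' row restarts
        else:
            out.append(out[-1] * c1)   # previous indexed value times col1
    return out

df['check'] = indexed_loop(df)
max_diff = (df['indexed'] - df['check']).abs().max()
print(max_diff < 1e-9)
```

Dividing by the first value of each group is what makes the restart row equal to exactly 100 instead of 100 times its own col1.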

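Have you tried the apply method with your own function?

def my_funct(row):
    # With axis=1 a row cannot see its neighbours, so the previous
    # col1 value is read from a precomputed 'col1_prev' column.
    if row['col2'] == 'x':
        row['indexed'] = row['col1'] * row['col1_prev']
    else:
        row['indexed'] = 100
    return row

Then:

df['col1_prev'] = df['col1'].shift(1)
df = df.apply(my_funct, axis=1)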

