New column based on comparing each row's value with the previous row's value

I'm trying to create a new column called move in df that is 1 if the value in x is higher than the previous row's value and 0 if it is lower. The first value in move should therefore be NaN.

import pandas

d = {'x': [1, 0, 2, 5, 4]}
df = pandas.DataFrame(d)

The column should look like this:

df['move'] = pandas.Series([float('nan'), 0, 1, 1, 0])

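You can compare the column with its shifted copy using shift, drop the first row of the result with iloc, and cast the boolean Series to a numeric dtype using astype. Shift the whole column before slicing so that row 1 is compared with row 0 rather than with NaN:

In [82]:
df['move'] = (df['x'] > df['x'].shift()).iloc[1:].astype(int)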
df

Out[82]:
   x  move
0  1   NaN
1  0   0.0
2  2   1.0
3  5   1.0
4  4   0.0

Note that the NaN in the first row forces the dtype of the column to be float here.
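If integer values are preferred, one option is pandas' nullable integer dtype. The following is a minimal sketch, assuming a pandas version that supports the 'Int64' extension dtype and pd.NA; it is an alternative, not what the answer above uses.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 0, 2, 5, 4]})

# Compare each row with the previous one and cast to the nullable integer dtype
move = (df['x'] > df['x'].shift()).astype('Int64')

# Row 0 has no previous value, so mark it as missing
move.iloc[0] = pd.NA

df['move'] = move
print(df)
print(df['move'].dtype)
```

With this dtype the column keeps integer 0/1 values and shows the missing first entry as <NA> instead of NaN.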

I think you need to compare column x with its shifted values, and then you can set the first value to NaN (if necessary):

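# Cast to float so the column can hold NaN
df['move'] = (df.x > df.x.shift()).astype(float)
# .ix was removed from pandas; .loc does the same label-based assignment
df.loc[0, 'move'] = np.nan
print(df)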
   x  move
0  1   NaN
1  0   0.0
2  2   1.0
3  5   1.0
4  4   0.0
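As a variation on the same idea, the difference to the previous row can be computed with diff, which already yields NaN for the first row, so no separate assignment is needed. This is a minimal sketch, not part of the answer above; the variable name delta is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 0, 2, 5, 4]})

# diff() gives x[i] - x[i-1]; the first entry is NaN because there is no previous row
delta = df['x'].diff()

# 1.0 where the value increased, 0.0 otherwise; keep NaN where delta is NaN
df['move'] = (delta > 0).astype(float).where(delta.notna())
print(df)
```

Note that, as with the shift comparison, a value equal to the previous one is mapped to 0, since the question only specifies the higher and lower cases.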

Timings:

len(df) = 50k:

In [82]: %timeit (edch(df1))
100 loops, best of 3: 3.99 ms per loop

In [83]: %timeit (jez(df))
1000 loops, best of 3: 1.44 ms per loop

Code for timings:

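import numpy as np
import pandas as pd

d = {'x': [1, 0, 2, 5, 4]}
df = pd.DataFrame(d)
# Repeat the 5 rows 10000 times to get 50k rows
df = pd.concat([df]*10000).reset_index(drop=True)
df1 = df.copy()

def jez(df):
    df['move'] = (df.x > df.x.shift()).astype(float)
    # .ix was removed from pandas; use .loc
    df.loc[0, 'move'] = np.nan
    return df

def edch(df):
    # Shift the whole column, then drop row 0 so it stays NaN
    df['move'] = (df['x'] > df['x'].shift()).iloc[1:].astype(int)
    return df

print(jez(df))
print(edch(df1))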
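The %timeit magic is only available in IPython. Outside it, a comparable measurement can be sketched with the standard timeit module. The function names with_slice and with_shift are hypothetical, and each call copies the frame first, so the absolute numbers will not match the figures quoted above.

```python
import timeit

import numpy as np
import pandas as pd

# 5 rows repeated 10000 times -> 50k rows
df = pd.concat([pd.DataFrame({'x': [1, 0, 2, 5, 4]})] * 10000).reset_index(drop=True)

def with_slice(frame):
    # Hypothetical name: compare, then drop row 0 so it stays NaN after alignment
    out = frame.copy()
    out['move'] = (out['x'] > out['x'].shift()).iloc[1:].astype(int)
    return out

def with_shift(frame):
    # Hypothetical name: compare all rows, then overwrite row 0 with NaN
    out = frame.copy()
    out['move'] = (out['x'] > out['x'].shift()).astype(float)
    out.loc[0, 'move'] = np.nan
    return out

for func in (with_slice, with_shift):
    seconds = timeit.timeit(lambda: func(df), number=100)
    print(f'{func.__name__}: {seconds / 100 * 1e3:.2f} ms per call')
```

Both functions return a new frame instead of modifying their argument, so repeated timing runs always start from the same input.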
