
Create a column in a pandas DataFrame using the previously computed value

I have for example the following input DataFrame:

> import pandas
> df = pandas.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
> df
    x
0   1
1   6
2   8
3   5
4   2
5   6
6  12

And I would like to create the column y such that:

y[i] = 0 if x[i] < 4,

y[i] = 1 if x[i] > 6,

and y[i] = y[i - 1] if 4 <= x[i] <= 6.

So that with the example above the output would be:

    x  y
0   1  0
1   6  0
2   8  1
3   5  1
4   2  0
5   6  0
6  12  1

What is the best way to do this? A simple apply() does not seem to work, because I could not find a way to reference a previously computed value in the column that apply() is creating.

You can use np.select to set y wherever x alone decides it, leave NaN elsewhere, and then forward-fill the NaN values:

>>> import numpy as np
>>> df['y'] = np.select([df['x'] < 4, df['x'] > 6], [0, 1], np.nan)
>>> df['y'] = df['y'].ffill().astype('int')
>>> df
    x  y
0   1  0
1   6  0
2   8  1
3   5  1
4   2  0
5   6  0
6  12  1
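
One case the example data does not exercise: if the first value of x falls in the 4 to 6 range, there is no earlier y to copy, so the forward fill leaves a NaN there and the cast to int fails. Below is a minimal sketch of the same idea wrapped in a function, assuming you want such leading rows to take a default value. The names hold_last_state, hold_last_state_loop and the initial parameter are made up for illustration and are not part of pandas or NumPy. The loop version is only there to cross-check the vectorized one.

```python
import numpy as np
import pandas as pd


def hold_last_state(x, low=4, high=6, initial=0):
    """Return 0 where x < low, 1 where x > high, else the previous result."""
    # Mark the rows whose value is decided by x alone; leave the rest as NaN.
    decided = np.select([x < low, x > high], [0.0, 1.0], default=np.nan)
    y = pd.Series(decided, index=x.index)
    # Forward fill copies the last decided value into the undecided rows.
    # Rows before the first decided value stay NaN, so they get `initial`
    # (an assumption: the question does not say what should happen there).
    return y.ffill().fillna(initial).astype(int)


def hold_last_state_loop(x, low=4, high=6, initial=0):
    """Plain-Python reference that follows the recurrence literally."""
    result, previous = [], initial
    for value in x:
        if value < low:
            previous = 0
        elif value > high:
            previous = 1
        # Otherwise keep `previous`, i.e. y[i] = y[i - 1].
        result.append(previous)
    return result


df = pd.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
df['y'] = hold_last_state(df['x'])
print(df)
```

The vectorized version avoids a Python-level loop, which matters mostly for large frames; for a few thousand rows the loop is usually fast enough and arguably easier to read.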

