Create a column in a pandas DataFrame using the previously computed value

Question

For example, I have the following input DataFrame:

> df = pandas.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
> df
    x
0   1
1   6
2   8
3   5
4   2
5   6
6  12

And I would like to create the column y such that:

y[i] = 0 if x[i] < 4,

y[i] = 1 if x[i] > 6,

and y[i] = y[i - 1] if 4 <= x[i] <= 6.

So that with the example above the output would be:

    x  y
0   1  0
1   6  0
2   8  1
3   5  1
4   2  0
5   6  0
6  12  1

What is the best way to do this? A simple apply() does not seem to work, because I could not find a way to reference a previously computed value of the column that apply() is creating.
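
For reference, the rule can be written directly as a plain Python loop. This is only a minimal sketch of the intended behaviour, not a recommended solution: the function name `carry_forward_labels` and the `initial` parameter are assumptions made for illustration. `initial` is the value used when the very first x falls in the 4..6 band, a case the rule above leaves undefined.

```python
import pandas as pd


def carry_forward_labels(x, low=4, high=6, initial=0):
    """Return y where y[i] = 0 if x[i] < low, 1 if x[i] > high, else y[i - 1].

    `initial` is a hypothetical seed, used only if the first value lies in [low, high].
    """
    labels = []
    previous = initial
    for value in x:
        if value < low:
            previous = 0
        elif value > high:
            previous = 1
        # Otherwise low <= value <= high: keep the previous label unchanged.
        labels.append(previous)
    return labels


df = pd.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
df['y'] = carry_forward_labels(df['x'])
print(df)
```

The loop is easy to check against the rule, but it runs in Python for every row, so a vectorized approach is usually preferable for large frames.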

Answer 1

You may use np.select followed by a forward fill:

>>> import numpy as np
>>> df['y'] = np.select([df['x'] < 4, df['x'] > 6], [0, 1], default=np.nan)
>>> df['y'] = df['y'].ffill().astype('int')  # fillna(method='ffill') is deprecated in recent pandas
>>> df
    x  y
0   1  0
1   6  0
2   8  1
3   5  1
4   2  0
5   6  0
6  12  1
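
This works because np.select leaves NaN in the rows where 4 <= x <= 6, and the forward fill then copies the last decided value into them. One edge case: if the first x is in that band, there is nothing to fill from, the NaN survives, and astype('int') raises. The sketch below wraps the same idea in a function that handles this case; the name `label_with_ffill` and the `initial` fallback are assumptions for illustration, not part of the answer above.

```python
import numpy as np
import pandas as pd


def label_with_ffill(x, low=4, high=6, initial=0):
    # Rows decided directly by the rule get 0 or 1; in-band rows become NaN.
    decided = np.select([x < low, x > high], [0, 1], default=np.nan)
    y = pd.Series(decided, index=x.index)
    # Forward fill copies the last decided label into the NaN rows;
    # a leading NaN (no previous label) falls back to `initial`.
    return y.ffill().fillna(initial).astype('int')


df = pd.DataFrame({'x': [1, 6, 8, 5, 2, 6, 12]})
df['y'] = label_with_ffill(df['x'])

# Edge case: the first value is in the 4..6 band, so `initial` is used.
edge = label_with_ffill(pd.Series([5, 8, 5, 2]))
```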

1 answer

solution1
1 ACCEPTED 2015-11-15 20:44:39
