I am an R programmer looking for the pandas equivalent of this R statement:
data[data$x > value, "y"] <- 1
(basically, take all rows where the x column is greater than some value and assign the y column at those rows the value of 1)
In pandas it would seem the equivalent would go something like:
data['y'][data['x'] > value] = 1
But this gives a SettingWithCopyWarning.
Equivalent statements I've tried are:
condition = data['x']>value
data.loc(condition,'x')=1
But I'm seriously confused. Maybe I'm thinking too much in R terms and can't wrap my head around what's going on in Python. What would be equivalent code for this in Python, or workarounds?
Your statement is incorrect; it should be:
data.loc[condition, 'x'] = 1
Example:
In [3]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a':np.random.randn(10)})
df
Out[3]:
a
0 -0.063579
1 -1.039022
2 -0.011687
3 0.036160
4 0.195576
5 -0.921599
6 0.494899
7 -0.125701
8 -1.779029
9 1.216818
In [4]:
condition = df['a'] > 0
df.loc[condition, 'a'] = 20
df
Out[4]:
a
0 -0.063579
1 -1.039022
2 -0.011687
3 20.000000
4 20.000000
5 -0.921599
6 20.000000
7 -0.125701
8 -1.779029
9 20.000000
As you are subscripting the df, you should use square brackets [] rather than parentheses (), which denote a function call. See the docs.
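Applied to the original goal from the question (set y to 1 wherever x exceeds a value), a minimal sketch might look like the following. The sample data and the threshold are made up for illustration; only the column names x and y and the variable names data, value and condition come from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names x and y follow the question.
data = pd.DataFrame({"x": [1, 5, 3, 8], "y": [0, 0, 0, 0]})
value = 4  # hypothetical threshold

# Boolean mask: True for the rows where x is greater than the threshold.
condition = data["x"] > value

# One .loc call picks the rows by mask and the column by label,
# so the assignment is made on data itself rather than on a temporary copy.
data.loc[condition, "y"] = 1

print(data)
```

Because the row mask and the column label go through a single .loc indexer, this avoids the chained data['y'][...] form that triggered the SettingWithCopyWarning.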