I'm a bit of a noob with pandas and I'm trying to perform some calculations and modifications on parts of a masked dataframe using apply. The part I want to operate on is defined by my mask, and I don't want to modify any non-masked values.
The thing is that I have no idea what the proper way is to put the result of the apply call on the masked dataframe back where it belongs in the original dataframe (or a copy of it, doesn't matter).
Here is a toy example of what I'm struggling with. I will try to make all values in the A column negative using a mask and apply:
import pandas as pd
import numpy as np
def make_df():
    np.random.seed(4)
    df = pd.DataFrame(np.random.randn(5, 2), columns=["A", "B"])
    return df
df = make_df()
mask = (df["A"]>0)
print(df)
A B
0 0.050562 0.499951
1 -0.995909 0.693599
2 -0.418302 -1.584577
3 -0.647707 0.598575
4 0.332250 -1.147477
The expected result is this:
A B
0 -0.050562 0.499951
1 -0.995909 0.693599
2 -0.418302 -1.584577
3 -0.647707 0.598575
4 -0.332250 -1.147477
What I hoped would work was this:
df = make_df()
df[mask]["A"] = df[mask]["A"].apply(lambda v: -v)
print(df)
A B
0 0.050562 0.499951
1 -0.995909 0.693599
2 -0.418302 -1.584577
3 -0.647707 0.598575
4 0.332250 -1.147477
But it fails, with pandas warning me that df[mask]["A"] is a copy, not a view, so modifications to it do not affect df.
Try using loc[]:
In [11]: df.loc[mask, 'A'] *= -1
In [12]: df
Out[12]:
A B
0 -0.050562 0.499951
1 -0.995909 0.693599
2 -0.418302 -1.584577
3 -0.647707 0.598575
4 -0.332250 -1.147477
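If you really do need apply (say the function is more involved than a sign flip), the same loc[] pattern should carry over. Here is a minimal sketch, assuming the toy frame from the question; the `original` copy is only there to make the comparison easy and is not part of the question's code:

```python
import numpy as np
import pandas as pd

# Rebuild the toy frame from the question.
np.random.seed(4)
df = pd.DataFrame(np.random.randn(5, 2), columns=["A", "B"])
original = df.copy()  # kept only to compare before/after
mask = df["A"] > 0

# Select rows and column in a single .loc call on both sides, so the
# assignment targets df itself instead of a temporary copy.
df.loc[mask, "A"] = df.loc[mask, "A"].apply(lambda v: -v)

print(df)
```

The key difference from df[mask]["A"] = ... is that there is only one indexing step on the left-hand side, so pandas writes into df directly.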
You can try:
df.loc[df['A'] > 0,'A'] = -df.A
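The one-liner above works even though the right-hand side is the whole column, because pandas aligns the assigned Series on the index and only writes the labels selected on the left. A small sketch of that behaviour on the question's toy frame follows; the `original` and `alt` names are just for illustration, and Series.mask is shown as one possible alternative rather than the recommended way:

```python
import numpy as np
import pandas as pd

# Rebuild the toy frame from the question.
np.random.seed(4)
df = pd.DataFrame(np.random.randn(5, 2), columns=["A", "B"])
original = df.copy()  # kept only to compare before/after

# The right-hand side is the full negated column; index alignment means
# only the rows selected on the left-hand side are overwritten.
df.loc[df["A"] > 0, "A"] = -df["A"]

# Alternative without assignment through .loc: Series.mask replaces
# values where the condition is True and leaves the rest untouched.
alt = original["A"].mask(original["A"] > 0, -original["A"])

print(df)
print(alt)
```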