I'm quite seasoned in R and am now learning Python by translating an existing series of scripts from R to Python (df is a pandas DataFrame). I'm stuck at this line:
df[df$id != df$id_old, c("col1", "col2")] <- NA
That is, I'm trying to set NA values in specific rows and columns. I've tried several things; the most promising route seemed to be:
index = np.where(df.id != df.id_old)
df.col1[index] = np.repeat(np.nan, np.size(index))
But this throws the following error at the second line, which I don't fully understand:
Can only tuple-index with a MultiIndex
What would be the cleanest way to achieve my objective?
Example:
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 5, 5],
                   'id_old': [1, 1, 2, 2, 3, 4, 4, 4, 4, 5, 5, 5],
                   'col1': np.random.normal(size=12),
                   'col2': np.random.randint(low=20, high=50, size=12),
                   'col3': np.repeat('other info', 12)})
print(df)
Output :
id id_old col1 col2 col3
0 1 1 0.320982 31 other info
1 1 1 0.398855 42 other info
2 1 2 -0.664073 30 other info
3 2 2 1.428694 48 other info
4 2 3 -1.240363 49 other info
5 3 4 0.023167 42 other info
6 4 4 -0.645114 44 other info
7 4 4 -1.033602 47 other info
8 4 4 0.295143 27 other info
9 4 5 0.531660 32 other info
10 5 5 -0.787401 33 other info
11 5 5 2.033503 48 other info
Expected result :
id id_old col1 col2 col3
0 1 1 0.320982 31 other info
1 1 1 0.398855 42 other info
2 1 2 NaN NaN other info
3 2 2 1.428694 48 other info
4 2 3 NaN NaN other info
5 3 4 NaN NaN other info
6 4 4 -0.645114 44 other info
7 4 4 -1.033602 47 other info
8 4 4 0.295143 27 other info
9 4 5 NaN NaN other info
10 5 5 -0.787401 33 other info
11 5 5 2.033503 48 other info
Use .loc and pass a list where in R you would use c(...). .loc allows in-place assignment.
Example:
df.loc[df.id!=df.id_old, ['col1', 'col2']] = np.nan
Output (col2 becomes float because NaN is a floating-point value):
col1 col2 col3 id id_old
0 2.411473 31.0 other info 1 1
1 0.874083 43.0 other info 1 1
2 NaN NaN other info 1 2
3 2.156903 20.0 other info 2 2
4 NaN NaN other info 2 3
5 NaN NaN other info 3 4
6 0.933760 22.0 other info 4 4
7 -1.239806 42.0 other info 4 4
8 -0.493344 41.0 other info 4 4
9 NaN NaN other info 4 5
10 -0.751290 30.0 other info 5 5
11 0.327527 31.0 other info 5 5
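For completeness, here is a self-contained sketch of the same approach, which also shows why the np.where attempt likely failed: called with a single argument, np.where returns a tuple of arrays (one per dimension), and indexing a Series with a tuple is what produces the MultiIndex error. The seed, the rng and positions names, and the explicit cast of col2 to float are assumptions added for this illustration; the cast is there because some newer pandas versions may warn or refuse to upcast an integer column implicitly when NaN is assigned.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily so the example is reproducible (assumption).
rng = np.random.default_rng(0)

df = pd.DataFrame({
    'id':     [1, 1, 1, 2, 2, 3, 4, 4, 4, 4, 5, 5],
    'id_old': [1, 1, 2, 2, 3, 4, 4, 4, 4, 5, 5, 5],
    'col1':   rng.normal(size=12),
    'col2':   rng.integers(low=20, high=50, size=12),
    'col3':   np.repeat('other info', 12),
})

# np.where with one argument returns a tuple of arrays, not an array.
index = np.where(df['id'] != df['id_old'])
print(type(index))       # a tuple
positions = index[0]     # the actual row positions live in element 0
print(positions)

# NaN is a float, so make the integer column float before assigning
# (assumption: avoids implicit-upcast warnings/errors in newer pandas).
df['col2'] = df['col2'].astype('float64')

# Boolean mask plays the role of df$id != df$id_old in R;
# the list of column names plays the role of c("col1", "col2").
mask = df['id'] != df['id_old']
df.loc[mask, ['col1', 'col2']] = np.nan

print(df)
```

The boolean mask can be passed to .loc directly, so np.where is not needed here at all; col3 and the rows where id equals id_old are left untouched.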