Starting from a single dataframe:
I,a,b,c,d,e,f
A,1,3,5,6,4,2
B,3,4,7,1,0,0
C,1,3,5,2,0,7
I would like to keep the three largest values in each row and set the remaining values to 0, without changing the order of the columns, so that the resulting dataframe looks like this:
I,a,b,c,d,e,f
A,0,0,5,6,4,0
B,3,4,7,0,0,0
C,0,3,5,0,0,7
So far I've been able to sort the dataframe with:
a = df.values
and
a.sort(axis=1)
so that:
[[1 2 3 4 5 6]
 [0 0 1 3 4 7]
 [0 1 2 3 5 7]]
which gives a sorted NumPy array but loses the information about which column each value came from.
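You can rank the values row-wise, keep only the entries whose rank is high enough, and call fillna on the rest. With six value columns, a rank greater than 3 marks the three largest values in each row: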
In [248]:
df[df.rank(axis=1, method='min')>3].fillna(0)
Out[248]:
   I  a  b  c  d  e  f
0  0  0  0  5  6  4  0
1  0  3  4  7  0  0  0
2  0  0  3  5  0  0  7
You can then concatenate the result with the original 'I' column to restore it:
In [268]:
pd.concat([df['I'], df[df.rank(axis=1, method='min')>3].fillna(0)[df.columns[1:]]], axis=1)
Out[268]:
   I  a  b  c  d  e  f
0  A  0  0  5  6  4  0
1  B  3  4  7  0  0  0
2  C  0  3  5  0  0  7
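For reference, here is a self-contained sketch of the same rank-based idea. It assumes the 'I' column can be moved into the index so that only numeric columns are ranked; the names `num`, `n_keep` and `threshold` are illustrative and not part of the original answer. Note that with method='min', ties at the cut-off can change how many values survive in a row.

```python
import io

import pandas as pd

csv_text = """I,a,b,c,d,e,f
A,1,3,5,6,4,2
B,3,4,7,1,0,0
C,1,3,5,2,0,7
"""
df = pd.read_csv(io.StringIO(csv_text))

# Move the label column into the index so that rank() only sees numbers.
num = df.set_index("I")

n_keep = 3  # how many of the largest values to keep per row (assumed)
ranks = num.rank(axis=1, method="min")

# With k columns, ranks above k - n_keep belong to the n_keep largest values.
threshold = num.shape[1] - n_keep

# where() keeps values where the condition holds and writes 0 elsewhere,
# so no fillna step and no concat are needed.
result = num.where(ranks > threshold, 0).reset_index()
print(result)
```

Using `where` with a replacement value also avoids the temporary NaN values, so the integer dtype of the columns is preserved.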
Output of the intermediate steps:
In [269]:
df.rank(axis=1, method='min')
Out[269]:
   a  b  c  d  e  f
0  1  3  5  6  4  2
1  4  5  6  3  1  1
2  2  4  5  3  1  6
In [270]:
df.rank(axis=1, method='min')>3
Out[270]:
       a      b     c      d      e      f
0  False  False  True   True   True  False
1   True   True  True  False  False  False
2  False   True  True  False  False   True
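If you would rather stay in NumPy, as in the question's attempt, one possible variation is to use np.argsort instead of sorting in place: argsort returns the column positions in value order, so the column information is not lost. This is only a sketch; `value_cols` and `n_keep` are names chosen here for illustration. Unlike the rank-based filter, it always keeps exactly `n_keep` entries per row, even when there are ties.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "I": ["A", "B", "C"],
        "a": [1, 3, 1],
        "b": [3, 4, 3],
        "c": [5, 7, 5],
        "d": [6, 1, 2],
        "e": [4, 0, 0],
        "f": [2, 0, 7],
    }
)

value_cols = ["a", "b", "c", "d", "e", "f"]  # numeric columns (assumed)
n_keep = 3                                   # values to keep per row (assumed)

values = df[value_cols].to_numpy()

# Column positions of each row ordered by value; the last n_keep are the largest.
top_idx = np.argsort(values, axis=1, kind="stable")[:, -n_keep:]

# Build a boolean mask that is True only at those positions.
mask = np.zeros(values.shape, dtype=bool)
np.put_along_axis(mask, top_idx, True, axis=1)

out = df.copy()
out[value_cols] = np.where(mask, values, 0)
print(out)
```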
I had a similar problem: I needed to select the first n True values from a pd.Series and use them as a mask to modify values in a pd.DataFrame. This is how I solved it:
import pandas as pd

df = pd.DataFrame({'animal': ['alligator', 'bee', 'falcon', 'lion',
                              'monkey', 'parrot', 'shark', 'whale', 'zebra']})
ser = pd.Series([True, False, False, True, False, True])
# True sorts above False, and ties are resolved in favour of the earliest
# positions (keep='first' is the default), so this picks the first two True entries.
df.loc[ser.nlargest(n=2).index, "animal"] = "new animal"
print(df)
       animal
0  new animal
1         bee
2      falcon
3  new animal
4      monkey
5      parrot
6       shark
7       whale
8       zebra
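A slightly more explicit alternative, offered only as a sketch, is to filter the Series by itself and slice the resulting index. This does not rely on how nlargest treats booleans or ties; the names `n` and `first_true_idx` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'animal': ['alligator', 'bee', 'falcon', 'lion',
                              'monkey', 'parrot', 'shark', 'whale', 'zebra']})
ser = pd.Series([True, False, False, True, False, True])

n = 2  # number of leading True values to select (assumed)

# ser[ser] keeps only the True entries, in their original order;
# slicing the index then gives the labels of the first n of them.
first_true_idx = ser[ser].index[:n]

df.loc[first_true_idx, "animal"] = "new animal"
print(df)
```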