I want to fill the matrix ref using the pd.DataFrame xxx, but skip the NaN values.
print(xxx)
OUT >>
   intensity name  rowtype1  rowtype2
0        100    A         1       4.0
1        200    A         2       NaN
2        300    B         3       5.0
Then I fill the matrix with ref[rowtype, col] = intensity, where there are two rowtype columns (rowtype1 and rowtype2).
ref = np.zeros(shape=(7,4))
for idx, inte, name, r1, r2 in xxx.itertuples():
    ref[r1,idx] = inte
    ref[r2,idx] = inte # error because of NaN in rowtype2
print(ref)
How can I skip the NaN here? I know one way is to use dropna(), but that requires creating a new DataFrame holding only rowtype2 and intensity. I would like a quick, simple way that just jumps over the NaN (intensity = 200) to the next value, rowtype2 = 5 with intensity = 300.
Additional info:
1) Here is how to create xxx
import numpy as np
import pandas as pd

prot = ['A','A','B']
calc_m = [1,2,3]
calc_m2 = [4, np.nan,5]
inte = [100,200,300]
# keys are listed in the same column order as the printed output above
xxx = pd.DataFrame({'intensity': pd.Series(inte),
                    'name' : pd.Series(prot),
                    'rowtype1': pd.Series(calc_m),
                    'rowtype2': pd.Series(calc_m2)
                    })
You could use melt to reshape the DataFrame into one row per (rowtype, column) pair, drop the NaN rows, and then fill ref with numpy's integer-array indexing instead of a for loop:
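# named "melted" rather than "set" to avoid shadowing the Python builtin
melted = xxx.reset_index().melt(id_vars=['intensity','index'], value_vars=['rowtype1','rowtype2']).dropna()
ref[melted['value'].astype(int).values, melted['index'].values] = melted['intensity'].values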
which gives you
array([[   0.,    0.,    0.,    0.],
       [ 100.,    0.,    0.,    0.],
       [   0.,  200.,    0.,    0.],
       [   0.,    0.,  300.,    0.],
       [ 100.,    0.,    0.,    0.],
       [   0.,    0.,  300.,    0.],
       [   0.,    0.,    0.,    0.]])
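If you would rather keep the explicit loop from the question, a minimal sketch of skipping the NaN directly might look like the following. It assumes the same xxx and ref as in the question, uses pd.notna to test each row index, and reads the fields by name so the column order does not matter. The int() cast is there because the NaN makes rowtype2 a float column, and a float cannot be used as an array index.

```python
import numpy as np
import pandas as pd

# Same data as in the question
xxx = pd.DataFrame({
    'intensity': [100, 200, 300],
    'name': ['A', 'A', 'B'],
    'rowtype1': [1, 2, 3],
    'rowtype2': [4, np.nan, 5],
})

ref = np.zeros(shape=(7, 4))
for row in xxx.itertuples():
    # Try both rowtype columns; skip any that is NaN
    for r in (row.rowtype1, row.rowtype2):
        if pd.notna(r):
            # rowtype2 is a float column because of the NaN, so cast to int
            ref[int(r), row.Index] = row.intensity

print(ref)
```

This is likely slower than the melt version on large frames, but it leaves xxx untouched and creates no intermediate DataFrame.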
I'm not sure I fully understand what behavior you are looking for, but the pandas dropna() method has a subset argument. For example, dropping all rows with NaN in the rowtype2 column could be done with
xxx.dropna(subset=['rowtype2'],inplace=True)
That way, only rows with NaN in the rowtype2 column are dropped. Note that inplace=True modifies xxx itself.
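As a rough sketch of how the subset argument could be combined with the matrix fill, the example below calls dropna without inplace, so xxx keeps all of its rows and only a filtered copy is used for rowtype2. The name valid is just an illustrative choice, and the code assumes xxx has the default 0..n-1 integer index so that index labels can serve as column positions in ref.

```python
import numpy as np
import pandas as pd

# Same data as in the question
xxx = pd.DataFrame({
    'intensity': [100, 200, 300],
    'name': ['A', 'A', 'B'],
    'rowtype1': [1, 2, 3],
    'rowtype2': [4, np.nan, 5],
})

ref = np.zeros(shape=(7, 4))

# rowtype1 has no NaN in this example, so it can be filled directly
ref[xxx['rowtype1'].values, xxx.index.values] = xxx['intensity'].values

# Filtered copy without the NaN rows in rowtype2; xxx itself is unchanged
valid = xxx.dropna(subset=['rowtype2'])
ref[valid['rowtype2'].astype(int).values, valid.index.values] = valid['intensity'].values

print(ref)
```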