Filtering a MultiIndex DataFrame based on column values, dropping all rows inside a level
I am trying to filter a DataFrame based on one or more values. Here is an example CSV:
AlignmentId,TranscriptId,classifier,value
ENSMUST00000025010-1,ENSMUST00000025010,AlnCoverage,0.99612
ENSMUST00000025010-1,ENSMUST00000025010,AlnIdentity,0.93553
ENSMUST00000025010-1,ENSMUST00000025010,Badness,0.06749
ENSMUST00000025014-1,ENSMUST00000025014,AlnCoverage,1.0
ENSMUST00000025014-1,ENSMUST00000025014,AlnIdentity,0.96382
ENSMUST00000025014-1,ENSMUST00000025014,Badness,0.03618
When loaded:
>>> df = pd.read_csv('tmp.csv', index_col=['AlignmentId', 'TranscriptId'])
>>> df
                                          classifier    value
AlignmentId          TranscriptId
ENSMUST00000025010-1 ENSMUST00000025010  AlnCoverage  0.99612
                     ENSMUST00000025010  AlnIdentity  0.93553
                     ENSMUST00000025010      Badness  0.06749
ENSMUST00000025014-1 ENSMUST00000025014  AlnCoverage  1.00000
                     ENSMUST00000025014  AlnIdentity  0.96382
                     ENSMUST00000025014      Badness  0.03618
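I want to drop every AlignmentId group that fails a series of classifiers. For this example, say I want to drop ENSMUST00000025010 because its AlnCoverage < 1.0. I would therefore like to end up with this DataFrame: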
ENSMUST00000025014-1 ENSMUST00000025014  AlnCoverage  1.00000
                     ENSMUST00000025014  AlnIdentity  0.96382
                     ENSMUST00000025014      Badness  0.03618
How can I do this?
Try this:
In [169]: df = df.drop(df[(df.classifier=='AlnCoverage') & (df.value < 1)].index)
In [170]: df
Out[170]:
                                          classifier    value
AlignmentId          TranscriptId
ENSMUST00000025014-1 ENSMUST00000025014  AlnCoverage  1.00000
                     ENSMUST00000025014  AlnIdentity  0.96382
                     ENSMUST00000025014      Badness  0.03618
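Since the question mentions a series of classifiers, here is a minimal sketch of a variant that checks several thresholds at once and drops by the AlignmentId level only. It is not part of the answer above: the helper name drop_failing_groups and the min_values dict (classifier name mapped to a minimum acceptable value) are assumptions made for illustration, and the 0.97 threshold for AlnIdentity is made up.

```python
import io

import pandas as pd

# The sample CSV from the question, inlined so the sketch is self-contained.
csv_text = """AlignmentId,TranscriptId,classifier,value
ENSMUST00000025010-1,ENSMUST00000025010,AlnCoverage,0.99612
ENSMUST00000025010-1,ENSMUST00000025010,AlnIdentity,0.93553
ENSMUST00000025010-1,ENSMUST00000025010,Badness,0.06749
ENSMUST00000025014-1,ENSMUST00000025014,AlnCoverage,1.0
ENSMUST00000025014-1,ENSMUST00000025014,AlnIdentity,0.96382
ENSMUST00000025014-1,ENSMUST00000025014,Badness,0.03618
"""

df = pd.read_csv(io.StringIO(csv_text), index_col=["AlignmentId", "TranscriptId"])


def drop_failing_groups(frame, min_values):
    """Drop every AlignmentId group in which some classifier is below its minimum.

    min_values is a hypothetical dict: classifier name -> minimum acceptable value.
    Classifiers missing from the dict are not checked.
    """
    # Per-row threshold; NaN where the row's classifier has no threshold.
    thresholds = frame["classifier"].map(min_values)
    # Comparisons against NaN are False, so unchecked rows never fail.
    failing = (frame["value"] < thresholds).to_numpy()
    alignment_ids = frame.index.get_level_values("AlignmentId")
    bad_ids = alignment_ids[failing].unique()
    # Keep only rows whose AlignmentId never failed a check.
    return frame[~alignment_ids.isin(bad_ids)]


kept = drop_failing_groups(df, {"AlnCoverage": 1.0})
print(kept)
```

With only the AlnCoverage check, this keeps the three ENSMUST00000025014-1 rows, the same result as the drop call above. Adding a further entry such as "AlnIdentity": 0.97 would also remove that group, because its AlnIdentity is 0.96382.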