How to filter column names from a MultiIndex DataFrame for a specific condition?
import pandas as pd

df1 = pd.DataFrame(
{
"empid" : [1,2,3,4,5,6],
"empname" : ['a', 'b','c','d','e','f'],
"empcity" : ['aa','bb','cc','dd','ee','ff']
})
df1
df2 = pd.DataFrame(
{
"empid" : [1,2,3,4,5,6],
"empname" : ['a', 'b','m','d','n','f'],
"empcity" : ['aa','bb','cc','ddd','ee','fff']
})
df2
df_all = pd.concat([df1.set_index('empid'),df2.set_index('empid')],axis='columns',keys=['first','second'])
df_all
df_final = df_all.swaplevel(axis = 'columns')[df1.columns[1:]]
df_final
orig = df1.columns[1:].tolist()
print(orig)
# ['empname', 'empcity']
df_final = (df_all.stack()
                  .assign(comparions=lambda x: x['first'].eq(x['second']))
                  .unstack()
                  .swaplevel(axis='columns')
                  .reindex(orig, axis=1, level=0))
print(df_final)
How can I get, from the dataframe df_final, the list of level-0 column names for which comparions is False? (Assume there are more than 300 such columns at level 0.)
First use DataFrame.xs to select the comparions entries on level 1 of the columns, then DataFrame.all to test whether each of them is all True:
s = df_final.xs('comparions', level=1, axis=1).all()
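To make this step concrete, here is a minimal sketch on a hypothetical toy frame. The names toy, flags and s_toy and the values in them are illustrative assumptions, not part of the question; only the xs and all calls mirror the line above.

```python
import pandas as pd

# Hypothetical toy frame: level 0 is the original column name,
# level 1 is 'first', 'second' or the boolean 'comparions' flag.
toy = pd.DataFrame({
    ("a", "first"): [1, 2],
    ("a", "second"): [1, 2],
    ("a", "comparions"): [True, True],
    ("b", "first"): ["x", "y"],
    ("b", "second"): ["x", "z"],
    ("b", "comparions"): [True, False],
})

# xs keeps only the 'comparions' sub-column of every level-0 group
# and drops that level, leaving plain columns 'a' and 'b'.
flags = toy.xs("comparions", level=1, axis=1)

# all() reduces each column to a single boolean:
# True only if every row of that column is True.
s_toy = flags.all()
print(s_toy)
```

The result is a boolean Series indexed by the level-0 names, which is why it can be used as a mask on its own index.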
Then invert the mask, so that it marks the columns with at least one False, and use it to filter the index:
L = s.index[~s].tolist()
print(L)
# ['empname', 'empcity']
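In the question's data every compared column has at least one mismatch, so the filter's effect is hard to see. The sketch below wraps the two steps in a helper and reruns the question's pipeline with one extra column that is identical in both frames. The helper name columns_with_mismatch and the empdept column are assumptions added for illustration; the misspelled label comparions is kept because that is the name the question assigns.

```python
import pandas as pd


def columns_with_mismatch(df_final, flag="comparions"):
    """Return level-0 column names whose `flag` sub-column contains a False."""
    # True per level-0 name only if every row compared equal.
    all_equal = df_final.xs(flag, level=1, axis=1).all()
    # Invert the mask: keep names with at least one False.
    return all_equal.index[~all_equal].tolist()


df1 = pd.DataFrame({
    "empid": [1, 2, 3, 4, 5, 6],
    "empname": ["a", "b", "c", "d", "e", "f"],
    "empcity": ["aa", "bb", "cc", "dd", "ee", "ff"],
    "empdept": ["x", "x", "y", "y", "z", "z"],  # hypothetical extra column
})
df2 = pd.DataFrame({
    "empid": [1, 2, 3, 4, 5, 6],
    "empname": ["a", "b", "m", "d", "n", "f"],
    "empcity": ["aa", "bb", "cc", "ddd", "ee", "fff"],
    "empdept": ["x", "x", "y", "y", "z", "z"],  # identical to df1 on purpose
})

orig = df1.columns[1:].tolist()

# Same construction as in the question: side-by-side frames keyed
# 'first' / 'second', then a per-cell equality flag.
df_all = pd.concat(
    [df1.set_index("empid"), df2.set_index("empid")],
    axis="columns",
    keys=["first", "second"],
)
df_final = (
    df_all.stack()
    .assign(comparions=lambda x: x["first"].eq(x["second"]))
    .unstack()
    .swaplevel(axis="columns")
    .reindex(orig, axis=1, level=0)
)

mismatched = columns_with_mismatch(df_final)
print(mismatched)
```

With this data the helper should return empname and empcity and leave out empdept, since that column never differs between the two frames. The reduction runs once per level-0 name, so the same call should carry over to frames with several hundred compared columns.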