Adding a level to the middle of a DataFrame's columns in pandas
I would like to add a new level to my DF (so that I can then use DataFrame.reindex to do something else). My DF basically looks like this:
import pandas as pd

df = pd.DataFrame({('A','a'): [-1,-1,0,10,12],
                   ('A','b'): [0,1,2,3,-1],
                   ('A','c'): [-1,1,0,10,12],
                   ('A','d'): [1,1,2,3,-1],
                   ('B','a'): [-20,-10,0,10,20],
                   ('B','b'): [-200,-100,0,-1,200],
                   ('B','c'): [-20,-10,0,10,20],
                   ('B','d'): [-200,-100,0,100,200]
                   })
##df
    A                B
    a   b   c   d    a     b    c     d
0  -1   0  -1   1  -20  -200  -20  -200
1  -1   1   1   1  -10  -100  -10  -100
2   0   2   0   2    0     0    0     0
3  10   3  10   3   10    -1   10   100
4  12  -1  12  -1   20   200   20   200
I want to assign new level keys: L1 for columns a and b, and L2 for columns c and d. How do I do this?
The desired output would be:
##df
    A                B
   L1      L2       L1         L2
    a   b   c   d    a     b    c     d
0  -1   0  -1   1  -20  -200  -20  -200
1  -1   1   1   1  -10  -100  -10  -100
2   0   2   0   2    0     0    0     0
3  10   3  10   3   10    -1   10   100
4  12  -1  12  -1   20   200   20   200
Edit: the objective is to achieve something similar to what was asked here. This means that some rows will have NAs for the same key, depending on the value of other columns. E.g. if I want to filter columns a and c by testing whether columns b and d, respectively, are negative:
##df
    A               B
   L1      L2      L1       L2
    a   b   c   d   a    b   c    d
0  -1   0  -1   1  NA   NA  NA   NA
1  -1   1   1   1  NA   NA  NA   NA
2   0   2   0   2   0    0   0    0
3  10   3  10   3  NA   NA  10  100
4  NA  NA  NA  NA  20  200  20  200
You need to create a new array with map and then assign it to the columns:
d = {'a':'L1','b':'L1','c':'L2','d':'L2'}
a = df.columns.get_level_values(1).map(lambda x: d[x])
print (a)
['L1' 'L1' 'L2' 'L2' 'L1' 'L1' 'L2' 'L2']
df.columns = [df.columns.get_level_values(0),a,df.columns.get_level_values(1)]
#equivalent alternative (use it instead of the assignment above, not after it)
df.columns = pd.MultiIndex.from_arrays([df.columns.get_level_values(0),
                                        df.columns.get_level_values(1).map(lambda x: d[x]),
                                        df.columns.get_level_values(1)])
print (df)
    A                B
   L1      L2       L1         L2
    a   b   c   d    a     b    c     d
0  -1   0  -1   1  -20  -200  -20  -200
1  -1   1   1   1  -10  -100  -10  -100
2   0   2   0   2    0     0    0     0
3  10   3  10   3   10    -1   10   100
4  12  -1  12  -1   20   200   20   200
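As a self-contained sketch of the same idea, the mapping can be wrapped in a small helper. The name insert_middle_level is hypothetical, and passing the dict straight to Index.map is a substitution for the lambda used above. It assumes a 2-level column MultiIndex and that every second-level label appears in the dict.

```python
import pandas as pd


def insert_middle_level(columns, mapping):
    """Return a 3-level MultiIndex with a new level between levels 0 and 1.

    Hypothetical helper: `columns` is assumed to be a 2-level MultiIndex and
    `mapping` a dict from every second-level label to its new group label.
    """
    top = columns.get_level_values(0)
    bottom = columns.get_level_values(1)
    middle = bottom.map(mapping)  # Index.map accepts a dict as the mapper
    return pd.MultiIndex.from_arrays([top, middle, bottom])


df = pd.DataFrame({('A', 'a'): [-1, -1, 0, 10, 12],
                   ('A', 'b'): [0, 1, 2, 3, -1],
                   ('A', 'c'): [-1, 1, 0, 10, 12],
                   ('A', 'd'): [1, 1, 2, 3, -1],
                   ('B', 'a'): [-20, -10, 0, 10, 20],
                   ('B', 'b'): [-200, -100, 0, -1, 200],
                   ('B', 'c'): [-20, -10, 0, 10, 20],
                   ('B', 'd'): [-200, -100, 0, 100, 200]})

groups = {'a': 'L1', 'b': 'L1', 'c': 'L2', 'd': 'L2'}
df.columns = insert_middle_level(df.columns, groups)
print(df.columns.tolist())
```

Because the helper returns a new MultiIndex rather than modifying the frame, it can be applied once without the risk of mapping the already-inserted L1/L2 labels a second time.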
The second output is more complicated; this works for me:
#filter columns
idx = pd.IndexSlice
mask = df.loc[:, idx[:,:,['b','d']]] < 0
print (mask)
       A             B
      L1     L2     L1     L2
       b      d      b      d
0  False  False   True   True
1  False  False   True   True
2  False  False  False  False
3  False  False   True  False
4   True   True  False  False
#extend the mask to columns a,c
mask1 = mask.reindex(columns=df.columns)
#note: groupby(..., axis=1) is deprecated since pandas 2.1
mask1 = mask1.groupby(level=[0,1], axis=1).apply(lambda x: x.bfill(axis=1))
print (mask1)
       A                           B
      L1            L2            L1            L2
       a      b      c      d      a      b      c      d
0  False  False  False  False   True   True   True   True
1  False  False  False  False   True   True   True   True
2  False  False  False  False  False  False  False  False
3  False  False  False  False   True   True  False  False
4   True   True   True   True  False  False  False  False
print (df.mask(mask1))
      A                     B
     L1         L2         L1           L2
      a    b     c    d     a      b     c      d
0  -1.0  0.0  -1.0  1.0   NaN    NaN   NaN    NaN
1  -1.0  1.0   1.0  1.0   NaN    NaN   NaN    NaN
2   0.0  2.0   0.0  2.0   0.0    0.0   0.0    0.0
3  10.0  3.0  10.0  3.0   NaN    NaN  10.0  100.0
4   NaN  NaN   NaN  NaN  20.0  200.0  20.0  200.0
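For readers on a recent pandas version, the same mask can be sketched without a column-wise groupby by looping over the columns explicitly. This is an alternative technique, not the code above; the dict name tests (which column decides each group) is an assumption introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({('A', 'a'): [-1, -1, 0, 10, 12],
                   ('A', 'b'): [0, 1, 2, 3, -1],
                   ('A', 'c'): [-1, 1, 0, 10, 12],
                   ('A', 'd'): [1, 1, 2, 3, -1],
                   ('B', 'a'): [-20, -10, 0, 10, 20],
                   ('B', 'b'): [-200, -100, 0, -1, 200],
                   ('B', 'c'): [-20, -10, 0, 10, 20],
                   ('B', 'd'): [-200, -100, 0, 100, 200]})

# Insert the middle level as in the first part of the answer.
groups = {'a': 'L1', 'b': 'L1', 'c': 'L2', 'd': 'L2'}
df.columns = pd.MultiIndex.from_arrays([df.columns.get_level_values(0),
                                        df.columns.get_level_values(1).map(groups),
                                        df.columns.get_level_values(1)])

# Assumption: within each middle-level group, this column is the one tested.
tests = {'L1': 'b', 'L2': 'd'}

# Start from an all-False mask with the same shape and labels as df.
mask1 = pd.DataFrame(False, index=df.index, columns=df.columns)
for top, mid, leaf in df.columns:
    # Every column in the (top, mid) group shares the test column's result.
    mask1[(top, mid, leaf)] = df[(top, mid, tests[mid])] < 0

result = df.mask(mask1)  # masked cells become NaN, so integers become floats
print(result)
```

The loop is slower than a vectorised approach on very wide frames, but it makes the rule "hide the whole group where its test column is negative" explicit.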
Another solution uses reindex with method='bfill', but a double transpose is necessary (I think it is a bug: it works only with a MultiIndex in the index, not with a MultiIndex in the columns):
idx = pd.IndexSlice
mask = df.loc[:, idx[:,['b','d']]] < 0
print (mask)
       A             B
       b      d      b      d
0  False  False   True   True
1  False  False   True   True
2  False  False  False  False
3  False  False   True  False
4   True   True  False  False
mask1 = mask.T.reindex(df.columns, method='bfill').T
print (mask1)
       A                           B
       a      b      c      d      a      b      c      d
0  False  False  False  False   True   True   True   True
1  False  False  False  False   True   True   True   True
2  False  False  False  False  False  False  False  False
3  False  False  False  False   True   True  False  False
4   True   True   True   True  False  False  False  False
print (df.mask(mask1))
      A                     B
      a    b     c    d     a      b     c      d
0  -1.0  0.0  -1.0  1.0   NaN    NaN   NaN    NaN
1  -1.0  1.0   1.0  1.0   NaN    NaN   NaN    NaN
2   0.0  2.0   0.0  2.0   0.0    0.0   0.0    0.0
3  10.0  3.0  10.0  3.0   NaN    NaN  10.0  100.0
4   NaN  NaN   NaN  NaN  20.0  200.0  20.0  200.0
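To see why backfilling spreads each test result onto the column before it, here is a minimal sketch on a plain single-level index. The Series s and its values are made up for illustration; only reindex with method='bfill' comes from the answer.

```python
import pandas as pd

# Test results exist only for labels 'b' and 'd' (illustrative values).
s = pd.Series([True, False], index=['b', 'd'])

# bfill gives each new label the value of the next existing label:
# 'a' takes the value at 'b', and 'c' takes the value at 'd'.
filled = s.reindex(['a', 'b', 'c', 'd'], method='bfill')
print(filled.tolist())  # [True, True, False, False]
```

This relies on the labels being sorted so that each tested column (b, d) comes right after the column it governs (a, c), which is the case in the frame above.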