Change multiple values in lower level of MultiIndex DataFrame

Consider the following DataFrame:

import numpy as np
import pandas as pd

arrays = [['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
          ['A', 'B', 'C', 'A', 'B', 'C']]
tuples = list(zip(*arrays))
index_values = pd.MultiIndex.from_tuples(tuples)
df = pd.DataFrame(np.random.rand(6), index=index_values)

print(df)

              0
foo A  0.726699
    B  0.001700
    C  0.936495
bar A  0.298490
    B  0.167234
    C  0.476725

Say I want to scale df with the following values:

df_scale = pd.DataFrame([0,1,4], index=['A','B','C'])
print(df_scale)

   0
A  0
B  1
C  4

That is, I want all A's to be multiplied by 0, all B's by 1, and all C's by 4.

Currently, I use the following approach:

df_new = df.copy()
list_df_new_index = list(df_new.index)
for index in list_df_new_index:
    cntr, prod = index
    df_new.loc[cntr, prod] = df_new.loc[cntr, prod]*df_scale.loc[prod]
print(df_new)

              0
foo A  0.000000
    B  0.001700
    C  3.745981
bar A  0.000000
    B  0.167234
    C  1.906900

While this works, I can't help but think that pandas has built-in functionality for exactly this.

I went through the answers on "Select rows in pandas MultiIndex DataFrame".

At first I thought I could use df.xs(), but if I understand correctly, this only allows me to select values, not change them.

Next I looked into pd.IndexSlice, but I don't see how I can use it to change multiple values.

Does pandas offer the functionality of changing multiple values in a lower level of a MultiIndex DataFrame?
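
On the pd.IndexSlice point raised above: it is an indexer object used with square brackets (not called), and it can be combined with .loc for assignment as well as selection. The following is only a minimal sketch under assumptions: the data is made up (the values 1.0 to 6.0 replace the random numbers so the effect is easy to see), and the index is sorted first because slicing a MultiIndex generally expects a lexsorted index.

```python
import numpy as np
import pandas as pd

# Illustrative data with the same index layout as in the question
# (the values 1.0..6.0 are made up for this sketch).
index_values = pd.MultiIndex.from_product([['foo', 'bar'], ['A', 'B', 'C']])
df = pd.DataFrame(np.arange(1.0, 7.0), index=index_values)

# Slicing a MultiIndex generally expects a lexsorted index.
df = df.sort_index()

idx = pd.IndexSlice

# Multiply every row whose second-level label is 'C' by 4.
df.loc[idx[:, 'C'], :] *= 4

# Set every row whose second-level label is 'A' to 0.
df.loc[idx[:, 'A'], :] = 0

print(df)
```

This handles one label at a time, so scaling by a whole table of factors would still need one statement per label.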

You can multiply with DataFrame.mul, aligning df_scale on the second index level (level=1):

df = df.mul(df_scale, level=1, axis=0)
# alternatively, multiply by column 0 of df_scale as a Series:
# df = df.mul(df_scale[0], level=1, axis=0)
print(df)
              0
foo A  0.000000
    B  0.393081
    C  2.495880
bar A  0.000000
    B  0.880499
    C  1.196688
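
Because the data comes from np.random.rand, the numbers above differ from those in the question. As a rough, repeatable check, the sketch below seeds the generator (the seed 0 and the variable names are arbitrary choices for illustration, not part of the original posts) and compares the DataFrame.mul result with factors written out by hand. It also shows a possible alternative that looks up one factor per row from the second-level labels.

```python
import numpy as np
import pandas as pd

# Same layout as the question's data, but seeded so the run is repeatable
# (the seed is an arbitrary choice for this sketch).
rng = np.random.default_rng(0)
index_values = pd.MultiIndex.from_product([['foo', 'bar'], ['A', 'B', 'C']])
df = pd.DataFrame(rng.random(6), index=index_values)
df_scale = pd.DataFrame([0, 1, 4], index=['A', 'B', 'C'])

# Vectorized: align df_scale with the second index level and multiply.
scaled = df.mul(df_scale, level=1, axis=0)

# Reference result: the factors for A, B, C repeated for 'foo' and 'bar'.
expected = df[0].to_numpy() * np.tile([0, 1, 4], 2)

# Alternative: map each row's second-level label to its factor.
factors = df.index.get_level_values(1).map(df_scale[0])
scaled_alt = df[0] * factors.to_numpy()

print(np.allclose(scaled[0].to_numpy(), expected))
print(np.allclose(scaled_alt.to_numpy(), expected))
```

The mapping variant works on a single column at a time, while DataFrame.mul with level=1 applies to the whole frame in one call.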
