Change Particular Column values in a Pandas MultiIndex DataFrame

Consider the following DataFrame:

import numpy as np
from pandas import DataFrame
myDF = DataFrame(np.random.randn(4,2), index=[[1,1,2,2],['Mon','Tue','Mon','Tue']])
myDF

             0           1
1   Mon -0.910930    1.592856
    Tue -0.167228   -0.763317
2   Mon -0.926121   -0.718729
    Tue  0.372288   -0.417337

If I want to change the values of the first column for all rows under index 1, I try doing this:

myDF.ix[1,:][0] = 99

But that doesn't work and returns the same DataFrame unchanged. What am I missing? Thank you.

Recent versions of pandas give a warning when you try something like this. For example, on version 0.13.1, you'd get this:

In [4]: myDF.ix[1,:][0] = 99
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead

What you have done is called chained assignment, and it fails due to subtleties in the internal workings of numpy, on which pandas depends. The first indexing step may return a copy rather than a view, in which case the assignment lands on a temporary object and never reaches myDF.
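To make the two-step nature of chained assignment concrete, here is a minimal sketch. It uses an explicit `.copy()` to stand in for the case where the first indexing step hands back a copy, and it uses `.loc` instead of the older `.ix`; the names `df`, `tmp` and `snapshot` are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.zeros((4, 2)),
    index=[[1, 1, 2, 2], ["Mon", "Tue", "Mon", "Tue"]],
)

# Chained assignment is really two operations:
#   tmp = df.loc[1]   -> a __getitem__ that may return a copy
#   tmp[0] = 99       -> a __setitem__ on that temporary object
# The explicit .copy() below simulates the "it was a copy" case.
tmp = df.loc[1].copy()
tmp[0] = 99
snapshot = df.copy()  # df as it looks after the "chained" write

# A single .loc call is one __setitem__ on df itself, so it sticks.
df.loc[(1, slice(None)), 0] = 99
```

Here `snapshot` still holds only zeros, while `df` has 99 in column 0 for both rows under outer label 1.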

Your situation is more complicated than the one the general warning addresses, because you have a MultiIndex. To select all rows with the label 1 in the outer level and the column label 0, use .loc[1, 0]. (Also see this answer.)

In [5]: myDF.loc[1, 0] = 99

In [6]: myDF
Out[6]: 
           0         1
1 Mon  99.000000  1.609539
  Tue  99.000000  1.464771
2 Mon  -0.819186 -1.122967
  Tue  -0.545171  0.475277
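If you need to select on the inner level as well, `pd.IndexSlice` lets you spell out each index level explicitly inside a single `.loc` call. This is only a sketch of that idea, not part of the answer above; the name `my_df` and the replacement values are made up for illustration.

```python
import numpy as np
import pandas as pd

my_df = pd.DataFrame(
    np.random.randn(4, 2),
    index=[[1, 1, 2, 2], ["Mon", "Tue", "Mon", "Tue"]],
)

idx = pd.IndexSlice

# Rows: outer level 1, any inner label. Columns: label 0.
my_df.loc[idx[1, :], 0] = 99

# Rows: any outer label, inner label "Tue". Columns: label 1.
my_df.loc[idx[:, "Tue"], 1] = -1
```

Both writes go through one `.loc` call each, so neither is a chained assignment.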

I believe we can have true flexibility by using the following:

index = [idx for idx, vals in enumerate(myDF.index.values) if vals[1] in ['Mon','Wed'] and vals[0] in [2,3,4]]
columns = [0,1]
myDF.iloc[index, columns] = 99

Creating the index with a for loop isn't the most efficient way, so one can instead create a dictionary whose keys are the MultiIndex tuples and whose values are the row positions.
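A minimal sketch of that dictionary idea follows. It assumes the index tuples are unique; the names `position_of` and `wanted` are hypothetical, and the `(3, "Wed")` entry is there only to show that missing keys are skipped.

```python
import numpy as np
import pandas as pd

my_df = pd.DataFrame(
    np.zeros((4, 2)),
    index=[[1, 1, 2, 2], ["Mon", "Tue", "Mon", "Tue"]],
)

# Map each (outer, inner) index tuple to its integer row position.
# Assumes the index tuples are unique.
position_of = {key: pos for pos, key in enumerate(my_df.index)}

# Hypothetical selection; keys absent from the index are ignored.
wanted = [(2, "Mon"), (3, "Wed")]
rows = [position_of[key] for key in wanted if key in position_of]

# Positional assignment on the selected rows and both columns.
my_df.iloc[rows, [0, 1]] = 99
```

The dictionary is built once, so repeated lookups avoid rescanning the index.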

This way we can specify which values in both index levels we want to change. .xs() does something similar, but you can't change values through that function.

If there is a simpler way, I would be really interested in finding it out.
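One possibly simpler route, offered as a sketch rather than a definitive answer, is to build a boolean mask from each index level with `get_level_values` and `isin`, then assign through a single `.loc` call. The names `outer`, `inner` and `mask` are made up for illustration.

```python
import numpy as np
import pandas as pd

my_df = pd.DataFrame(
    np.zeros((4, 2)),
    index=[[1, 1, 2, 2], ["Mon", "Tue", "Mon", "Tue"]],
)

# Pull each index level out as a flat Index aligned with the rows.
outer = my_df.index.get_level_values(0)
inner = my_df.index.get_level_values(1)

# True where the outer label is in [2, 3, 4] AND the inner is Mon or Wed.
mask = outer.isin([2, 3, 4]) & inner.isin(["Mon", "Wed"])

# One .loc call with a boolean row mask and a list of column labels.
my_df.loc[mask, [0, 1]] = 99
```

This expresses the same selection as the list comprehension above without looping over the index in Python.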
