Converting a single pandas index into a three-level MultiIndex in Python

Question

I have some data in a pandas DataFrame which looks like this:

gene                                  VIM  
time:2|treatment:TGFb|dose:0.1  -0.158406  
time:2|treatment:TGFb|dose:1     0.039158  
time:2|treatment:TGFb|dose:10   -0.052608  
time:24|treatment:TGFb|dose:0.1  0.157153  
time:24|treatment:TGFb|dose:1    0.206030  
time:24|treatment:TGFb|dose:10   0.132580  
time:48|treatment:TGFb|dose:0.1 -0.144209  
time:48|treatment:TGFb|dose:1   -0.093910  
time:48|treatment:TGFb|dose:10  -0.166819  
time:6|treatment:TGFb|dose:0.1   0.097548  
time:6|treatment:TGFb|dose:1     0.026664  
time:6|treatment:TGFb|dose:10   -0.008032

where the left column is the index. This is just a subsection of the data, which is actually much larger. Each index label is composed of three components: time, treatment and dose. I want to reorganize this data so that I can access it easily by slicing. The way to do this is to use a pandas MultiIndex, but I don't know how to convert my DataFrame with a single index into one with three index levels. Does anybody know how to do this?

To clarify, the desired output here is the same data with a three-level index, the outer level being treatment, the middle dose and the inner time. This would be useful because I could then access the data with something like `df['time']['dose']` or `df[0]` (or something to that effect at least).

Answer 1

You can first strip the unneeded prefixes with replace (the index has to be converted to a Series with to_series, because replace doesn't work on an Index yet) and then split the labels on '|' with str.split. Finally, set the index level names with rename_axis (new in pandas 0.18.0).

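# Strip the "time:", "treatment:" and "dose:" prefixes from every label
df.index = df.index.to_series().replace({'time:':'','treatment:': '','dose:':''}, regex=True)
# Split each label on '|'; expand=True returns a MultiIndex
df.index = df.index.str.split('|', expand=True)
# Name the three levels
df = df.rename_axis(('time','treatment','dose'))

print(df)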
                          VIM
time treatment dose          
2    TGFb      0.1  -0.158406
               1     0.039158
               10   -0.052608
24   TGFb      0.1   0.157153
               1     0.206030
               10    0.132580
48   TGFb      0.1  -0.144209
               1    -0.093910
               10   -0.166819
6    TGFb      0.1   0.097548
               1     0.026664
               10   -0.008032
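
Note that the levels above are in the order time, treatment, dose and still hold strings, which is why time 6 sorts after 48. The question asked for treatment as the outer level, dose in the middle and time as the inner level. The following is a minimal sketch of one way to get there, not part of the original answer: it assumes every label has exactly the `key:value|key:value|key:value` form shown in the question, uses a six-row subset of the data, and introduces a hypothetical helper named `parse_label`.

```python
import pandas as pd

# Six rows copied from the question's data (a subset, for illustration only).
labels = [
    "time:2|treatment:TGFb|dose:0.1",
    "time:2|treatment:TGFb|dose:1",
    "time:24|treatment:TGFb|dose:0.1",
    "time:24|treatment:TGFb|dose:1",
    "time:6|treatment:TGFb|dose:0.1",
    "time:6|treatment:TGFb|dose:1",
]
values = [-0.158406, 0.039158, 0.157153, 0.206030, 0.097548, 0.026664]
df = pd.DataFrame({"VIM": values}, index=labels)


def parse_label(label):
    """Hypothetical helper: turn one 'key:value|...' label into a typed tuple."""
    fields = dict(part.split(":", 1) for part in label.split("|"))
    # Order the tuple as (treatment, dose, time), as the question requested,
    # and convert dose and time to numbers so they sort numerically.
    return (fields["treatment"], float(fields["dose"]), int(fields["time"]))


df.index = pd.MultiIndex.from_tuples(
    [parse_label(label) for label in df.index],
    names=["treatment", "dose", "time"],
)
df = df.sort_index()  # a sorted MultiIndex makes label-based slicing efficient

# All time points for treatment TGFb at dose 0.1 (result is indexed by time).
low_dose = df.loc[("TGFb", 0.1), :]

# All treatment/dose combinations at the 24-hour time point.
at_24h = df.xs(24, level="time")
```

Selecting with `.loc` and a tuple walks the levels from the outside in, while `.xs` with `level=` picks a value from any level by name, so inner levels such as time can be selected without spelling out the outer ones.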
