Slicing data with a pandas MultiIndex

Question

I have some features that I want to write to CSV files, and I would like to use pandas for this if possible.
I am following the instructions here and have created some dummy data to try it out. Basically, there are some activities, each with a random number of features belonging to it.

import io
import pandas as pd
data = io.StringIO('''Activity,id,value,value,value,value,value,value,value,value,value
Run,1,1,2,2,5,6,4,3,2,1
Run,1,2,4,4,10,12,8,6,4,2
Stand,2,1.5,3.,3.,7.5,9.,6.,4.5,3.,1.5
Sit,3,0.5,1.,1.,2.5,3.,2.,1.5,1.,0.5
Sit,3,0.6,1.2,1.2,3.,3.6,2.4,1.8,1.2,0.6
Run, 2, 0.8, 1.6, 1.6, 4. , 4.8, 3.2, 2.4, 1.6, 0.8
''')
df_unindexed = pd.read_csv(data)
df = df_unindexed.set_index(['Activity', 'id'])

When I run:

df.xs('Run')

I get

    value  value.1  value.2  value.3  value.4  value.5  value.6  value.7  \
id                                                                         
1     1.0      2.0      2.0      5.0      6.0      4.0      3.0      2.0   
1     2.0      4.0      4.0     10.0     12.0      8.0      6.0      4.0   
2     0.8      1.6      1.6      4.0      4.8      3.2      2.4      1.6   
    value.8  
id           
1       1.0  
1       2.0  
2       0.8

which is almost what I want, namely all Run activities. I want to remove the first row and the first column, i.e. the header and the id column. How do I achieve this?

My second question is how to select only one activity.
Using

idx = pd.IndexSlice
df.loc[idx['Run', 1], :]

gives

             value  value.1  value.2  value.3  value.4  value.5  value.6  \
Activity id                                                                
Run      1     1.0      2.0      2.0      5.0      6.0      4.0      3.0   
         1     2.0      4.0      4.0     10.0     12.0      8.0      6.0   
             value.7  value.8  
Activity id                    
Run      1       2.0      1.0  
         1       4.0      2.0

but slicing the columns does not work as I would expect. For example, trying

df.loc[idx['Run', 1], 2:11]

instead produces an error:

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [2] of <class 'int'>

So, how do I get my features in this case?

PS: In case it is not clear, I am new to pandas, so be gentle. Also, the id column can be edited to be unique per activity or across the whole dataset, if that makes things easier.

Answer 1

You can use a little hack: get the column names by position, because iloc for MultiIndex is not yet supported:

print (df.columns[2:11])
Index(['value.2', 'value.3', 'value.4', 'value.5', 'value.6', 'value.7',
       'value.8'],
      dtype='object')

idx = pd.IndexSlice
print (df.loc[idx['Run', 1], df.columns[2:11]])
             value.2  value.3  value.4  value.5  value.6  value.7  value.8
Activity id                                                               
Run      1       2.0      5.0      6.0      4.0      3.0      2.0      1.0
         1       4.0     10.0     12.0      8.0      6.0      4.0      2.0
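The error in the question arises because .loc is label-based and the column labels here are strings, so an integer slice such as 2:11 cannot be resolved. As an alternative to looking up the column names, a minimal sketch (not part of the original answer) is to chain the two indexers: select the rows by label with .loc, then the columns by position with .iloc. The variable names csv_text and subset are illustrative, and the data is the dummy data from the question.

```python
import io
import pandas as pd

# Dummy data from the question; duplicate "value" headers are
# renamed by read_csv to value, value.1, ..., value.8.
csv_text = '''Activity,id,value,value,value,value,value,value,value,value,value
Run,1,1,2,2,5,6,4,3,2,1
Run,1,2,4,4,10,12,8,6,4,2
Stand,2,1.5,3.,3.,7.5,9.,6.,4.5,3.,1.5
Sit,3,0.5,1.,1.,2.5,3.,2.,1.5,1.,0.5
Sit,3,0.6,1.2,1.2,3.,3.6,2.4,1.8,1.2,0.6
Run,2,0.8,1.6,1.6,4.,4.8,3.2,2.4,1.6,0.8
'''
df = pd.read_csv(io.StringIO(csv_text)).set_index(['Activity', 'id'])

idx = pd.IndexSlice
# Step 1: rows by label (.loc). Step 2: columns by position (.iloc).
subset = df.loc[idx['Run', 1], :].iloc[:, 2:]
print(subset)
```

Chaining costs an intermediate DataFrame, but it keeps label-based and position-based selection clearly separated.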

If you want to save the result to a CSV file without the index and the column header:

df.xs('Run').to_csv(file, index=False, header=None)
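To check what actually gets written, here is a small sketch that renders the CSV to a string instead of a file. It assumes the dummy data from the question; the names csv_text, run_rows and csv_out are illustrative. It passes header=False, the boolean form of the same option.

```python
import io
import pandas as pd

csv_text = '''Activity,id,value,value,value,value,value,value,value,value,value
Run,1,1,2,2,5,6,4,3,2,1
Run,1,2,4,4,10,12,8,6,4,2
Stand,2,1.5,3.,3.,7.5,9.,6.,4.5,3.,1.5
Sit,3,0.5,1.,1.,2.5,3.,2.,1.5,1.,0.5
Sit,3,0.6,1.2,1.2,3.,3.6,2.4,1.8,1.2,0.6
Run,2,0.8,1.6,1.6,4.,4.8,3.2,2.4,1.6,0.8
'''
df = pd.read_csv(io.StringIO(csv_text)).set_index(['Activity', 'id'])

# xs('Run') selects all Run rows and drops the Activity level,
# leaving id as the only index level.
run_rows = df.xs('Run')

# With no path argument, to_csv returns the CSV text as a string.
# index=False omits the id column, header=False omits the header row.
csv_out = run_rows.to_csv(index=False, header=False)
print(csv_out)
```

Passing a file path as the first argument writes the same text to disk.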

Answer 2

I mostly look at https://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-integer when I'm stuck with this kind of issue.

I have not tested this, but I think you can remove rows and columns like this:

df = df.drop(['rowindex'], axis=0)
df = df.drop(['colname'], axis=1)
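The names 'rowindex' and 'colname' above are placeholders. As a hedged sketch of how drop could look on the question's MultiIndex frame, the example below removes all rows of one activity using the level argument and removes one column by name. The choice of 'Sit' and 'value' is purely illustrative.

```python
import io
import pandas as pd

csv_text = '''Activity,id,value,value,value,value,value,value,value,value,value
Run,1,1,2,2,5,6,4,3,2,1
Run,1,2,4,4,10,12,8,6,4,2
Stand,2,1.5,3.,3.,7.5,9.,6.,4.5,3.,1.5
Sit,3,0.5,1.,1.,2.5,3.,2.,1.5,1.,0.5
Sit,3,0.6,1.2,1.2,3.,3.6,2.4,1.8,1.2,0.6
Run,2,0.8,1.6,1.6,4.,4.8,3.2,2.4,1.6,0.8
'''
df = pd.read_csv(io.StringIO(csv_text)).set_index(['Activity', 'id'])

# Drop every row whose Activity level equals 'Sit'.
no_sit = df.drop('Sit', level='Activity')

# Drop a single column by its label.
no_first_col = df.drop(columns=['value'])

print(no_sit)
print(no_first_col.columns.tolist())
```

Note that drop removes data from the frame itself; it does not control whether the header or index is written to a CSV file.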

Answer 3

Avoid the problem by declaring the index columns when the CSV is read:

# header=0 reads the first row in as the header row;
# index_col=['id'] (or index_col=0) picks the index column.
df = pd.read_csv(data, header=0, index_col=['id'])
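For the two-level index used in the question, the same idea can be sketched by passing both column names to index_col, which makes the separate set_index call unnecessary. This assumes the question's dummy data; csv_text is an illustrative name.

```python
import io
import pandas as pd

csv_text = '''Activity,id,value,value,value,value,value,value,value,value,value
Run,1,1,2,2,5,6,4,3,2,1
Run,1,2,4,4,10,12,8,6,4,2
Stand,2,1.5,3.,3.,7.5,9.,6.,4.5,3.,1.5
Sit,3,0.5,1.,1.,2.5,3.,2.,1.5,1.,0.5
Sit,3,0.6,1.2,1.2,3.,3.6,2.4,1.8,1.2,0.6
Run,2,0.8,1.6,1.6,4.,4.8,3.2,2.4,1.6,0.8
'''

# Build the (Activity, id) MultiIndex directly while reading.
df = pd.read_csv(io.StringIO(csv_text), header=0, index_col=['Activity', 'id'])

print(df.index.names)
print(df.shape)
```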

3 answers

Answer 1 (2) 2018-05-11 08:49:51
Answer 2 (0) 2018-05-11 08:47:12
Answer 3 (0) 2018-05-11 09:14:11