Extracting and plotting data from a MultiIndex DataFrame in pandas
I've managed to get the following table into a pandas DataFrame. It has a multi-level index (file_type, server_count, file_count, thread_count, cacheclear_type) which represents the configuration for a performance measurement. I then have 5 runs for each configuration.
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| | | | | | run_001 | run_002 | run_003 | run_004 | run_005 |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| file_type | server_count | file_count | thread_count | cacheclear_type | | | | | |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| gor | 01servers | 05files | 20threads | ccALWAYS | 15.918 | 16.275 | 15.807 | 17.781 | 16.233 |
| | 08servers | 05files | 20threads | ccALWAYS | 17.061 | 15.414 | 16.819 | 15.597 | 16.818 |
| gorz | 01servers | 05files | 20threads | ccALWAYS | 12.285 | 11.218 | 12.009 | 14.122 | 10.991 |
| | 08servers | 05files | 20threads | ccALWAYS | 9.881 | 9.405 | 9.322 | 10.184 | 9.924 |
| gor | 01servers | 10files | 20threads | ccALWAYS | 17.322 | 17.636 | 16.096 | 16.484 | 16.715 |
| | 08servers | 10files | 20threads | ccALWAYS | 17.167 | 17.666 | 15.950 | 18.867 | 16.569 |
| gorz | 01servers | 10files | 20threads | ccALWAYS | 14.718 | 19.553 | 17.930 | 21.415 | 21.495 |
| | 08servers | 10files | 20threads | ccALWAYS | 10.236 | 9.948 | 12.605 | 9.780 | 10.320 |
| gor | 01servers | 15files | 20threads | ccALWAYS | 19.265 | 17.128 | 17.630 | 18.739 | 16.833 |
| | 08servers | 15files | 20threads | ccALWAYS | 23.083 | 22.084 | 25.024 | 24.677 | 20.648 |
| gorz | 01servers | 15files | 20threads | ccALWAYS | 15.401 | 28.282 | 28.727 | 24.645 | 27.509 |
| | 08servers | 15files | 20threads | ccALWAYS | 10.307 | 12.217 | 13.005 | 12.277 | 12.224 |
| gor | 01servers | 20files | 20threads | ccALWAYS | 23.744 | 20.539 | 21.416 | 22.921 | 22.794 |
| | 08servers | 20files | 20threads | ccALWAYS | 35.393 | 36.218 | 35.949 | 35.157 | 37.342 |
| gorz | 01servers | 20files | 20threads | ccALWAYS | 19.505 | 23.756 | 25.767 | 26.575 | 25.239 |
| | 08servers | 20files | 20threads | ccALWAYS | 11.398 | 11.332 | 15.086 | 16.115 | 13.479 |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
I would like to take all the gor, 01servers, 20threads, ccALWAYS configurations and create one data point for each of the XXfiles configurations. So to begin with, I'd like to somehow get a DataFrame that looks like this:
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| | | | | | run_001 | run_002 | run_003 | run_004 | run_005 |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| file_type | server_count | file_count | thread_count | cacheclear_type | | | | | |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
| gor | 01servers | 05files | 20threads | ccALWAYS | 15.918 | 16.275 | 15.807 | 17.781 | 16.233 |
| gor | 01servers | 10files | 20threads | ccALWAYS | 17.322 | 17.636 | 16.096 | 16.484 | 16.715 |
| gor | 01servers | 15files | 20threads | ccALWAYS | 19.265 | 17.128 | 17.630 | 18.739 | 16.833 |
| gor | 01servers | 20files | 20threads | ccALWAYS | 23.744 | 20.539 | 21.416 | 22.921 | 22.794 |
+-----------+--------------+------------+--------------+-----------------+---------+---------+---------+---------+---------+
How do I do that?
I managed to filter the data with the query() function, so that it looks like the second table in the question, using the following code:
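# Filter on the named index levels, then sort by file_count (level 2).
# sortlevel() was removed in pandas 1.0; sort_index(level=...) is its replacement.
df.query('file_type == "gor" & server_count == "01servers"').sort_index(level=2)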
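To try the filter end to end and then reduce each XXfiles row to a single data point, here is a minimal sketch. It rebuilds only a few rows of the question's table, and it assumes that "one data point" means the mean over the five runs (with the standard deviation kept as a spread estimate); the question does not say which reduction is wanted. The names `subset` and `points` are illustrative only. It uses `sort_index` because `sortlevel` is no longer available in pandas 1.0 and later.

```python
import pandas as pd

# A few rows copied from the question's table (05files and 10files only).
index_names = ["file_type", "server_count", "file_count",
               "thread_count", "cacheclear_type"]
keys = [
    ("gor", "01servers", "05files", "20threads", "ccALWAYS"),
    ("gor", "08servers", "05files", "20threads", "ccALWAYS"),
    ("gorz", "01servers", "05files", "20threads", "ccALWAYS"),
    ("gor", "01servers", "10files", "20threads", "ccALWAYS"),
    ("gor", "08servers", "10files", "20threads", "ccALWAYS"),
]
values = [
    [15.918, 16.275, 15.807, 17.781, 16.233],
    [17.061, 15.414, 16.819, 15.597, 16.818],
    [12.285, 11.218, 12.009, 14.122, 10.991],
    [17.322, 17.636, 16.096, 16.484, 16.715],
    [17.167, 17.666, 15.950, 18.867, 16.569],
]
run_columns = [f"run_{i:03d}" for i in range(1, 6)]
df = pd.DataFrame(
    values,
    index=pd.MultiIndex.from_tuples(keys, names=index_names),
    columns=run_columns,
)

# Named index levels can be referenced in query() as if they were columns.
subset = (
    df.query('file_type == "gor" and server_count == "01servers"')
      .sort_index(level="file_count")
)

# Assumed reduction: mean and standard deviation across the run columns.
points = pd.DataFrame({
    "mean": subset.mean(axis=1),
    "std": subset.std(axis=1),
})

# Keep only file_count as the index so it can serve as the x axis.
points = points.droplevel(
    ["file_type", "server_count", "thread_count", "cacheclear_type"]
)
print(points)
```

With matplotlib installed, something like `points["mean"].plot(yerr=points["std"])` should then draw one point per XXfiles configuration with error bars.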