Slicing data with MultiIndex columns

Question

I have a dataframe with MultiIndex columns. I want to filter the data using the columns of the dataset. When I try df.columns I get this information:

MultiIndex(levels=[['power'], ['active']],
           codes=[[0], [0]],
           names=['physical_quantity', 'type'])

A short sample of the dataset:

physical_quantity          power
type                      active
2011-04-18 09:22:13-04:00    6.0
2011-04-18 09:22:16-04:00    6.0
2011-04-18 09:22:20-04:00    6.0
2011-04-18 09:22:23-04:00    6.0
2011-04-18 09:22:26-04:00    6.0

The first thing I noticed is that although it looks like there are two columns, the dataframe reports that it has [529757 rows x 1 columns].

What I want to do is filter the data by selecting an interval of time, using the first column, called physical_quantity type.

On the other hand, I do not know what kind of data that first column (physical_quantity type) holds:

physical_quantity  type  
power              active    float32
dtype: object

Checking df.index, I managed to see this information about the dataframe:

DatetimeIndex(['2011-04-18 09:22:13-04:00', '2011-04-18 09:22:16-04:00',
               '2011-04-18 09:22:20-04:00', '2011-04-18 09:22:23-04:00',
               '2011-04-18 09:22:26-04:00', '2011-04-18 09:22:30-04:00',
               '2011-04-18 09:22:33-04:00', '2011-04-18 09:22:37-04:00',
               '2011-04-18 09:22:40-04:00', '2011-04-18 09:22:44-04:00',
               ...
               '2011-05-14 23:59:26-04:00', '2011-05-14 23:59:29-04:00',
               '2011-05-14 23:59:33-04:00', '2011-05-14 23:59:36-04:00',
               '2011-05-14 23:59:40-04:00', '2011-05-14 23:59:43-04:00',
               '2011-05-14 23:59:46-04:00', '2011-05-14 23:59:50-04:00',
               '2011-05-14 23:59:53-04:00', '2011-05-14 23:59:57-04:00'],
              dtype='datetime64[ns, US/Eastern]', length=529757, freq=None)

So I understood that the data of that column is something like dtype='datetime64[ns, US/Eastern]'.

So I aim to slice the data from a specific day and hour to another day and hour:

from 2011-05-10 19:44:51-04:00 to 2011-05-10 23:17:59-04:00

I tried doing something like this:

df[df['physical_quantity', 'type']] > 2011-05-10 19:44:51-04:00 
& 
df[df['physical_quantity', 'type']] < 2011-05-10 23:17:59-04:00

df[df['physical_quantity', 'type']] > 2011-05-10 19:44:51-04:00

File "<ipython-input-133-27848c7d6afc>", line 1
    df[df['physical_quantity', 'type']] > 2011-05-10 19:44:51-04:00
                                                ^
SyntaxError: invalid token

How can I solve my problem?

Answer 1

Try this:

df['ts'] = pd.to_datetime(df["ts"], unit="ms")
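Note that the frame shown in the question has no `ts` column: the timestamps are the index (see the df.index output), and `physical_quantity` and `type` are the names of the column levels, not column labels. The only column is labelled `('power', 'active')`. The following is a minimal sketch under those assumptions, not a tested fix for the asker's actual data. The small `df` built here is a hypothetical stand-in with the same structure, and the interval bounds are written as `pd.Timestamp` objects, since a bare `2011-05-10 19:44:51-04:00` is not valid Python (hence the SyntaxError).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's frame: a tz-aware DatetimeIndex and a
# single column whose MultiIndex label is ('power', 'active').
index = pd.date_range("2011-05-10 19:00:00", periods=6, freq="60min", tz="US/Eastern")
columns = pd.MultiIndex.from_tuples(
    [("power", "active")], names=["physical_quantity", "type"]
)
df = pd.DataFrame(
    np.arange(6, dtype="float32").reshape(-1, 1), index=index, columns=columns
)

# Interval bounds as tz-aware timestamps. In May, US/Eastern is UTC-04:00,
# so these match the -04:00 values given in the question.
start = pd.Timestamp("2011-05-10 19:44:51", tz="US/Eastern")
end = pd.Timestamp("2011-05-10 23:17:59", tz="US/Eastern")

# Option 1: label-based slicing on the (sorted) DatetimeIndex; both ends inclusive.
window = df.loc[start:end]

# Select the single column by its label tuple, not by the level names.
active_power = window[("power", "active")]

# Option 2: an equivalent boolean mask built from the index.
mask = (df.index >= start) & (df.index <= end)
same_window = df[mask]
```

Label slicing with `.loc` relies on the index being sorted, which appears to be the case in the question's output; the boolean mask does not need that.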

Solution 1
0 2019-07-03 17:29:46
