How to efficiently filter and sum by a pandas MultiIndex

Question

I have a DataFrame with a MultiIndex, and I would like to do the following as efficiently as possible:

Filter by one index level ( (flags & flag_filter) != 0 )
Group and sum by the other two ( df.groupby(['time', 'sensor'])[['col1', 'col2', 'col3']].sum() )

So, as a setup:

import pandas as pd
import numpy as np

index = pd.MultiIndex.from_product(
    [
        range(0, 0xff),
        range(0, 5000),
        range(1, 3),
    ], names = ["flags", "time", "sensor"]
)

data = pd.DataFrame({
    "col1": np.random.uniform(size=len(index), low=0.0, high=0.5),
    "col2": np.random.uniform(size=len(index), low=0.0, high=0.5),
    "col3": np.random.uniform(size=len(index), low=0.0, high=0.5),
}, index = index)

From this, I'm hoping to get a DataFrame with the same columns but with an index of just time, sensor. The idea is to throw out the rows that don't match the filter and sum the rows that do, while still maintaining the time, sensor grouping.
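To make the desired result concrete, here is a tiny worked illustration. The frame `small`, its values, and the mask value `0b01` are hypothetical and chosen only so the sums are easy to check by hand; the flags level is converted to a NumPy array so that `&` is an element-wise bitwise AND.

```python
import pandas as pd

# Hypothetical miniature of the setup: 3 flags x 2 times x 1 sensor.
index = pd.MultiIndex.from_product(
    [[1, 2, 3], [0, 1], [1]], names=["flags", "time", "sensor"]
)
small = pd.DataFrame(
    {"col1": [1.0, 2.0, 10.0, 20.0, 100.0, 200.0]}, index=index
)

flag_filter = 0b01  # assumed mask: keep rows whose flags have bit 0 set

# Parentheses matter: != binds more tightly than & in Python.
flags = small.index.get_level_values("flags").to_numpy()
mask = (flags & flag_filter) != 0

# Rows with flags 1 and 3 survive; flags 2 is thrown out.
result = small.loc[mask].groupby(level=["time", "sensor"]).sum()
# time 0 -> 1 + 100 = 101, time 1 -> 2 + 200 = 202
```

The result is indexed by (time, sensor) only, because the groupby keeps just the levels it groups on.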

Answer 1

Combine .loc with droplevel:

# Let's say we want to filter for even flags
flag_filter = data.index.get_level_values("flags") % 2 == 0

# Select matching rows and drop the first level 
data.loc[flag_filter, :].droplevel(0)
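The snippet above stops after filtering and dropping the flags level, so several rows still share the same (time, sensor) pair. A possible way to finish the job, sketched under some assumptions, is to group on the two remaining levels and sum. The helper name `filter_and_sum` and the mask value `0b0101` are made up for illustration, and the frame is a scaled-down version of the question's setup so that it runs quickly.

```python
import numpy as np
import pandas as pd

def filter_and_sum(df, flag_filter):
    """Keep rows where (flags & flag_filter) != 0, then sum per (time, sensor).

    Hypothetical helper; assumes df has index levels named
    "flags", "time" and "sensor", with integer flags.
    """
    flags = df.index.get_level_values("flags").to_numpy()
    mask = (flags & flag_filter) != 0
    return (
        df.loc[mask]
        .droplevel("flags")
        .groupby(level=["time", "sensor"])
        .sum()
    )

# Scaled-down version of the question's setup (8 flags, 4 times, 2 sensors).
index = pd.MultiIndex.from_product(
    [range(0, 8), range(0, 4), range(1, 3)],
    names=["flags", "time", "sensor"],
)
rng = np.random.default_rng(0)
data = pd.DataFrame(
    rng.uniform(low=0.0, high=0.5, size=(len(index), 3)),
    columns=["col1", "col2", "col3"],
    index=index,
)

summed = filter_and_sum(data, flag_filter=0b0101)
```

Here `summed` has one row per (time, sensor) pair and the same three columns. The droplevel call is optional, since grouping by level already discards the flags level; it is kept to mirror the answer.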

Solution 1
0 2022-10-08 14:15:23
