Python pandas: slicing a DataFrame by multiple index ranges

Question

What is the pythonic way to slice a DataFrame by more than one index range (e.g. 10:12 and 25:28)?

I want something more elegant than this:

df = pd.DataFrame({'a':range(10,100)})
df.iloc[[i for i in range(10,12)] + [i for i in range(25,28)]]

Answer 1

You can use numpy's r_ "slicing trick":

df = pd.DataFrame({'a':range(10,100)})
df.iloc[pd.np.r_[10:12, 25:28]]

Note: this now emits the warning "The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead", and newer pandas releases have removed pandas.np entirely. Instead, import numpy as np and slice as follows:

df.iloc[np.r_[10:12, 25:28]]

This gives:

    a
10  20
11  21
25  35
26  36
27  37
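As a quick check of what the trick does: np.r_ concatenates the slices into a single integer array, and iloc then treats those integers as row positions. A minimal sketch using the same frame as above (the names positions, result and expected are only for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': range(10, 100)})

# np.r_ expands each slice and concatenates them into one integer array
positions = np.r_[10:12, 25:28]
print(positions.tolist())  # [10, 11, 25, 26, 27]

# iloc selects by position, so this matches the list-comprehension version
result = df.iloc[positions]
expected = df.iloc[list(range(10, 12)) + list(range(25, 28))]
print(result.equals(expected))  # True
```

As with ordinary Python slices, the end of each range is exclusive, so 10:12 selects positions 10 and 11.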

Answer 2

You can make use of pandas' isin function.

df = pd.DataFrame({'a':range(10,100)})
ls = [i for i in range(10,12)] + [i for i in range(25,28)]
df[df.index.isin(ls)]


    a
10  20
11  21
25  35
26  36
27  37
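One caveat worth spelling out: isin matches index labels, whereas iloc selects by position. The two agree here only because the frame has the default 0..n-1 index. A small sketch, assuming an index shifted by 100 purely for illustration (the names shifted, wanted, by_label and by_position are hypothetical):

```python
import pandas as pd

# Same data as above, but with labels 100..189 instead of the default 0..89
shifted = pd.DataFrame({'a': range(10, 100)}, index=range(100, 190))
wanted = list(range(10, 12)) + list(range(25, 28))

# isin compares against labels: none of 10, 11, 25, 26, 27 exist in this index
by_label = shifted[shifted.index.isin(wanted)]

# iloc compares against positions: the 11th, 12th, 26th, 27th and 28th rows
by_position = shifted.iloc[wanted]

print(len(by_label))              # 0
print(by_position['a'].tolist())  # [20, 21, 35, 36, 37]
```

So the isin approach is the right one when the ranges refer to label values, and iloc when they refer to row positions.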

Answer 3

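Building on @KevinOelen's use of pandas' isin function, here is a Pythonic way (Python 3.8) to glance at a pandas DataFrame or GeoPandas GeoDataFrame, showing only a few rows from the head and the tail. This approach does not require importing numpy.

To use it, just call glance(your_df). Additional notes are in the docstring.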

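import pandas as pd
import geopandas as gpd  # optional: if not needed, drop gpd.GeoDataFrame from the type hint and the Union import
from typing import Union
from IPython.display import display  # implicit in Jupyter; imported here so the function is self-contained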


def glance(df: Union[pd.DataFrame, gpd.GeoDataFrame], size: int = 2) -> None:
    """Provides a shortened head and tail summary of a DataFrame or GeoDataFrame in Jupyter Notebook or Lab.

    Usage
    ----------
    # default glance (2 head rows, 2 tail rows)
    glance( df )
    
    # glance at a defined number of rows in head and tail (3 head rows, 3 tail rows)
    glance( df, size=3 )

    Parameters
    ----------
    :param df: Union[pd.DataFrame, gpd.GeoDataFrame]: A (Geo)Pandas data frame to glance at.
    :param size: int: The number of rows to display from the head and from the tail; the total number of rows shown is twice this value.
    :return: None: Displays result in Notebook or Lab.
    """
    
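    # min and max of the provided dataframe index (assumes a contiguous integer index)
    min_ = df.index.min()
    max_ = df.index.max()

    # labels of the first `size` and the last `size` rows (both ends inclusive)
    sample = list(range(min_, min_ + size)) + list(range(max_ - size + 1, max_ + 1))

    # keep only the rows whose label is in the sample
    df = df[df.index.isin(sample)]

    # display
    display(df)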
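If the index is not a contiguous run of integers (dates, strings, gaps), building label lists from the index min and max will not work. A hedged alternative sketch that selects by position instead, using only head, tail and pd.concat; the name head_tail is hypothetical and this is not the answer author's function:

```python
import pandas as pd


def head_tail(df: pd.DataFrame, size: int = 2) -> pd.DataFrame:
    """Return the first and last `size` rows of df, selected by position."""
    if len(df) <= 2 * size:
        # Head and tail would overlap, so return a copy of the whole frame
        return df.copy()
    return pd.concat([df.head(size), df.tail(size)])


# A date index, which range(min_, max_) could not handle
df = pd.DataFrame({'a': range(10, 100)},
                  index=pd.date_range('2022-01-01', periods=90))
print(head_tail(df)['a'].tolist())  # [10, 11, 98, 99]
```

Returning the frame rather than calling display leaves it to the caller (or the notebook) to render the result.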

3 answers

Answer 1
Score 71, accepted, 2016-09-08 14:43:19

Answer 2
Score 4, 2017-10-13 09:56:38

Answer 3
Score 0, 2022-01-22 19:12:53
