Creating a Pandas Panel with Non-Unique Index Objects

I have the following data file, LawSchoolSample.csv:

LSAT,GPA
622,3.23
542,2.83
579,3.24
653,3.12
606,3.09

I'd like to load this into a pandas DataFrame and then resample from it B times (with replacement) to form a pandas Panel. Here's my attempt (critiques welcome):

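import pandas as pd

df = pd.read_csv("LawSchoolSample.csv")

B = 3  # number of resamples
resamples = {}

for i in range(B):
    name = "Resample {}".format(i)
    # draw 5 rows with replacement; repeated rows keep their original index labels
    resamples[name] = df.sample(5, replace=True)

print(resamples)

resamples_panel = pd.Panel(resamples)  # this line raises InvalidIndexError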

Everything works except the last line, resamples_panel = pd.Panel(resamples), which raises:

pandas.core.index.InvalidIndexError: Reindexing only valid with uniquely valued Index objects

I have two questions, then:

  1. Is using a Panel worth it for this? Or is the plain dict of DataFrames in resamples good enough?
  2. What's the preferred method of adding dataframes to a panel?

The long-term plan is to deprecate Panel; see the pandas 0.18.0 release notes:

In a future version of pandas, we will be deprecating Panel and other >2 ndim objects. In order to provide for continuity, all NDFrame objects have gained the .to_xarray() method in order to convert to xarray objects, which has a pandas-like interface for > 2 ndim. (GH11972)

http://pandas.pydata.org/pandas-docs/version/0.18.0/whatsnew.html#to-xarray
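As for the error itself: it appears to come from sampling with replacement, which can repeat row labels within a single resample, so the frames cannot be aligned on their index. A minimal sketch of one way around both issues, without Panel, is to reset each resample's index and stack the frames with pd.concat under a two-level index. The inlined CSV text, the random_state argument (added only for reproducibility), and the level names "resample" and "row" are assumptions made for illustration, not part of the original code.

```python
import io
import pandas as pd

# Same five rows as LawSchoolSample.csv, inlined so the sketch is self-contained.
csv_text = """LSAT,GPA
622,3.23
542,2.83
579,3.24
653,3.12
606,3.09
"""
df = pd.read_csv(io.StringIO(csv_text))

B = 3  # number of resamples
resamples = {}
for i in range(B):
    name = "Resample {}".format(i)
    # Sampling with replacement can repeat row labels; reset_index(drop=True)
    # replaces them with a fresh 0..n-1 index so every label is unique.
    # random_state is an assumption, used here only to make runs repeatable.
    sample = df.sample(len(df), replace=True, random_state=i)
    resamples[name] = sample.reset_index(drop=True)

# Stack the resamples into one DataFrame indexed by (resample name, row position).
stacked = pd.concat(resamples, names=["resample", "row"])

# Pull out one resample, or compute a statistic per resample.
print(stacked.loc["Resample 0"])
print(stacked.groupby(level="resample").mean())
```

The plain dict is often enough if you only loop over the resamples; the stacked frame is convenient when you want per-resample statistics via groupby.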
