
How to fix this error while using pandas-profiling in Jupyter Notebook

Every time I use pandas-profiling on a different data set, the notebook shows me this error:

IndexError: only integers, slices ( : ), ellipsis ( ... ), numpy.newaxis ( None ) and integer or boolean arrays are valid indices.

import pandas as pd

# raw string, so the backslashes in the Windows path are not read as escape sequences
df = pd.read_csv(r'H:\DATA Sets\cereal.csv')

from pandas_profiling import ProfileReport

profile = ProfileReport(df, title='cereal-eda', html={'style': {'full_width': True}})

Dataset used: cereal.csv from Kaggle, https://www.kaggle.com/crawford/80-cereals

Edit: A PR has already been made to fix this. It seems to be an issue when using Pandas 1.4.[01]. See this issue on pandas-profiling's GitHub.

I think the error occurs because NumPy deprecated a way of indexing arrays that one of pandas-profiling's modules still uses.
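To show what the message itself means, here is a minimal sketch on a plain NumPy array. It is not a reproduction of the pandas-profiling failure, and the names (values, by_mask, message) are made up for the illustration. It only shows which index types NumPy accepts and that any other type, a float in this case, raises the same IndexError.

```python
import numpy as np

values = np.arange(5)

# Valid indices: an integer, a slice, and a boolean array.
by_int = values[1]
by_slice = values[1:3]
by_mask = values[values > 2]

# An index of another type (a float here) is rejected with the IndexError
# quoted in the question.
try:
    values[1.0]
    message = ""
except IndexError as exc:
    message = str(exc)

print(message)
```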

If you are getting the same traceback as I am, where the error is raised in pandas_profiling.model.pandas.utils_pandas, you should be able to fix it by changing:

w_median = data[weights == np.max(weights)][0]

to

w_median = data[np.where(weights == np.max(weights))][0]

in the weighted_median function in $(YOUR_VIRTUAL_ENVIRONMENT_OR_PYTHON_DIR)/lib/python$(PYVERSION)/site-packages/pandas_profiling/model/pandas/utils_pandas.py

(line 13 for pandas-profiling version 3.1.0)
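As a rough check that the change does not alter the result, the sketch below compares the two indexing styles on plain NumPy arrays. The sample values are invented, and the real function may receive other types (which is presumably where the error comes from), so treat this as an illustration of the two expressions only.

```python
import numpy as np

# Made-up sample data; the real inputs come from pandas-profiling.
data = np.array([10.0, 20.0, 30.0, 40.0])
weights = np.array([1, 5, 2, 5])

# Original form: index directly with a boolean mask.
mask = weights == np.max(weights)
first_by_mask = data[mask][0]

# Patched form: np.where turns the mask into a tuple of integer index arrays,
# so the array is indexed with integers instead of booleans.
positions = np.where(mask)
first_by_where = data[positions][0]

print(positions[0], first_by_mask, first_by_where)
```

On plain arrays both expressions select the first element whose weight equals the maximum, so the patch only changes how the index is expressed.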
