
get_dummies(), Exception: Data must be 1-dimensional

I have this data

[image not preserved]

I am trying to apply this:

one_hot = pd.get_dummies(df)

But I get this error:

[image not preserved; per the title, the error is "Exception: Data must be 1-dimensional"]

Here is my code up until then:

# Import modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import tree
df = pd.read_csv('AllMSAData.csv')
df.head()
corr_matrix = df.corr()
corr_matrix
df.describe()
# Get features and targets
labels = np.array(df['CurAV'])
# Remove the labels from the features
# axis 1 refers to the columns
df = df.drop('CurAV', axis = 1)
# Saving feature names for later use
feature_list = list(df.columns)
# Convert to numpy array
df = np.array(df)

IMO, the documentation should be updated: it says pd.get_dummies accepts array-like data, and a 2-D numpy array is array-like (even though "array-like" has no formal definition). In practice, however, get_dummies does not accept multi-dimensional arrays. In your code, df = np.array(df) turns the DataFrame into a 2-D numpy array, so the later pd.get_dummies(df) call fails.

Take this tiny example:

>>> df
   a  b  c
0  a  1  d
1  b  2  e
2  c  3  f

You can't get dummies on the underlying 2D numpy array:

>>> pd.get_dummies(df.values)

Exception: Data must be 1-dimensional

But you can get dummies on the dataframe itself:

>>> pd.get_dummies(df)
   b  a_a  a_b  a_c  c_d  c_e  c_f
0  1    1    0    0    1    0    0
1  2    0    1    0    0    1    0
2  3    0    0    1    0    0    1

Or on the 1D array underlying an individual column:

>>> pd.get_dummies(df['a'].values)
   a  b  c
0  1  0  0
1  0  1  0
2  0  0  1
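
Applied to the code in the question, this suggests calling get_dummies while the data is still a DataFrame and converting to a numpy array only afterwards. Here is a minimal sketch of that ordering, not a definitive fix: the small DataFrame is a made-up stand-in for AllMSAData.csv, and the Region and Units columns are hypothetical (only CurAV comes from the question). Passing dtype=int is my own choice, to keep the resulting array numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for AllMSAData.csv; only 'CurAV' is from the question.
df = pd.DataFrame({
    "CurAV": [100.0, 250.0, 175.0],
    "Region": ["north", "south", "north"],  # assumed categorical column
    "Units": [10, 25, 18],                  # assumed numeric column
})

# Separate the target from the features, as in the question.
labels = np.array(df["CurAV"])
features = df.drop("CurAV", axis=1)

# One-hot encode while the data is still a DataFrame ...
features = pd.get_dummies(features, dtype=int)
feature_list = list(features.columns)

# ... and only then convert to a numpy array.
X = np.array(features)

# If only the 2-D array is available, wrap it back into a DataFrame first.
# infer_objects() restores numeric dtypes lost in the object array.
raw = df.drop("CurAV", axis=1).values
rewrapped = pd.DataFrame(raw, columns=["Region", "Units"]).infer_objects()
one_hot = pd.get_dummies(rewrapped, dtype=int)
```

Without infer_objects(), every column of the rewrapped frame would have object dtype, so get_dummies would also expand the numeric column into one dummy per distinct value.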

