
Using a label-encoding function with pandas (df.apply) and a dimension problem in Python

I'm using a function, encode_labels, to encode the labels in the `Make` column of train.csv.

train.csv looks as follows:

Make,Model,Year,Engine Fuel Type,Engine HP,Engine Cylinders,Transmission Type,Driven_Wheels,Number of Doors,Market Category,Vehicle Size,Vehicle Style,highway MPG,city mpg,Popularity,MSRP
BMW,1 Series M,2011,premium unleaded (required),335,6,MANUAL,rear wheel drive,2,"Factory Tuner,Luxury,High-Performance",Compact,Coupe,26,19,3916,46135
Audi,100,1992,regular unleaded,172,6,MANUAL,front wheel drive,4,Luxury,Midsize,Sedan,24,17,3105,2000
Chrysler,200,2015,flex-fuel (unleaded/E85),184,4,AUTOMATIC,front wheel drive,4,Flex Fuel,Midsize,Sedan,36,23,1013,25170

and the code is:

from sklearn import preprocessing
from keras.utils import to_categorical

def encode_labels(y):
    # Map each distinct label to an integer code.
    encoder = preprocessing.LabelEncoder()
    encoded_y = encoder.fit_transform(y)
    # One-hot encode the integer codes; the result is a 2-D array.
    return to_categorical(encoded_y)

Normally, the output of the encode_labels function looks like this: [[0., 1., 0.]], which is two-dimensional.

I want to use df['encoded_label'] = df.apply(lambda x: encode_labels(['Make']), axis=1), but the output of this call is [[1.0]]. I could not find what I am doing wrong.

I got a printout like this:

1-) I think the problem is in how I use the lambda; it doesn't work as expected. Is there something wrong with using a lambda here?

2-) The fact that the output of encode_labels is two-dimensional also creates a problem for me. How can we transform this output ([[0., 0., 0., 1.]]) into a one-dimensional one?

How can we deal with these two problems?

Thanks a lot.

First, regarding the first question: when we use apply with a lambda on a DataFrame (with axis=1), the function is called once per row, and the whole result for that row is printed on a single line. This is what I understood from my own experiments; if I'm wrong, I'd be glad to be corrected.
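
A minimal sketch of what seems to be going on, under two assumptions made only for illustration: a three-row frame stands in for train.csv, and a NumPy-based one-hot step (np.eye) stands in for Keras' to_categorical. With axis=1 the lambda runs once per row, and because it passes the literal list ['Make'] instead of the row's value, the encoder only ever sees a single class, which would explain the [[1.0]] output:

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing

def encode_labels(y):
    # Same idea as the function in the question; np.eye replaces
    # to_categorical here (an assumption, to avoid the Keras dependency).
    encoder = preprocessing.LabelEncoder()
    encoded_y = encoder.fit_transform(y)
    n_classes = len(encoder.classes_)
    return np.eye(n_classes)[encoded_y]

# Hypothetical stand-in for the Make column of train.csv.
df = pd.DataFrame({"Make": ["BMW", "Audi", "Chrysler"]})

# The lambda ignores the row x and encodes the one-element list ['Make'],
# so every row gets a 1x1 array.
per_row = df.apply(lambda x: encode_labels(["Make"]), axis=1)

# Encoding the whole column at once lets the encoder see every class.
whole_column = encode_labels(df["Make"])

print(per_row.iloc[0])
print(whole_column.shape)
```

Passing the whole column to encode_labels, rather than calling it row by row, is probably closer to what the question intended, since a label encoder needs to see all classes before it can assign codes.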

Second, I solved my second problem by using pandas and its pd.get_dummies function.
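
A short sketch of how pd.get_dummies could be applied here, assuming a small hypothetical frame in place of train.csv. It produces one indicator column per distinct value of Make, so no separate label encoder or row-wise apply is needed:

```python
import pandas as pd

# Hypothetical stand-in for the Make column of train.csv.
df = pd.DataFrame({"Make": ["BMW", "Audi", "Chrysler"]})

# One indicator column per distinct Make; dtype=int gives 0/1 values.
dummies = pd.get_dummies(df["Make"], dtype=int)

# Keep the one-hot columns next to the original data.
df_encoded = pd.concat([df, dummies], axis=1)

print(df_encoded)
```

Each indicator column is an ordinary one-dimensional Series, which sidesteps the problem of storing a two-dimensional array in a single DataFrame cell.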

Good luck.
