Using a label encoding function with Pandas (df.apply) and a dimension problem in Python

Question

I'm using a function called encode_labels to encode the labels in the `Make` column of train.csv.

train.csv is as follows:

Make,Model,Year,Engine Fuel Type,Engine HP,Engine Cylinders,Transmission Type,Driven_Wheels,Number of Doors,Market Category,Vehicle Size,Vehicle Style,highway MPG,city mpg,Popularity,MSRP
BMW,1 Series M,2011,premium unleaded (required),335,6,MANUAL,rear wheel drive,2,"Factory Tuner,Luxury,High-Performance",Compact,Coupe,26,19,3916,46135
Audi,100,1992,regular unleaded,172,6,MANUAL,front wheel drive,4,Luxury,Midsize,Sedan,24,17,3105,2000
Chrysler,200,2015,flex-fuel (unleaded/E85),184,4,AUTOMATIC,front wheel drive,4,Flex Fuel,Midsize,Sedan,36,23,1013,25170

and the code:

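from sklearn import preprocessing
from keras.utils import to_categorical  # older Keras also exposed this as keras.utils.np_utils

def encode_labels(y):
    # Map each distinct label to an integer code
    encoder = preprocessing.LabelEncoder()
    encoder.fit(y)
    encoded_y = encoder.transform(y)
    # One-hot encode the integer codes; the result is a 2-D array
    y = to_categorical(encoded_y)
    return y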

Normally the output of the encode_labels function looks like this: [[0., 1., 0.]], so it is two-dimensional.

I want to use df['encoded_label'] = df.apply(lambda x: encode_labels(['Make']), axis=1). But the output of this call is [[1.0]]. I could not find what I am doing wrong.

I got a printout like this

1-) I think the problem is in how I use the lambda; it doesn't seem to work properly. Is there a problem with using lambda here?

2-) The fact that the output of encode_labels is two-dimensional also creates a problem for me. So how can we transform this output ([[0., 0., 0., 1.]]) into a one-dimensional one?

How can we deal with these two problems?

Thanks a lot.

Answer 1

First, regarding the first question: as far as I can tell from my own work, when a lambda is used with apply on a DataFrame, the whole result is printed on a single line. If I'm wrong, I'd be glad to be corrected.
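A possible reading of the [[1.0]] output, offered as a sketch and not as a definitive diagnosis: the lambda in the question passes the literal list ['Make'] to encode_labels on every row, so the encoder only ever sees one class. Encoding the whole column in a single call avoids that. The snippet below assumes a three-row frame built from the sample data, and it uses np.eye as a stand-in for Keras' to_categorical so that it runs without Keras installed.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def encode_labels(y):
    """Label-encode y, then one-hot encode the integer codes."""
    encoder = LabelEncoder()
    codes = encoder.fit_transform(y)
    # Assumption: np.eye(...)[codes] stands in for keras' to_categorical here
    return np.eye(len(encoder.classes_))[codes]


# Hypothetical miniature of train.csv, using only the Make column
df = pd.DataFrame({"Make": ["BMW", "Audi", "Chrysler"]})

# What each row's lambda call computes in the question:
# the input is the literal string 'Make', so there is exactly one class
single = encode_labels(["Make"])
print(single)  # [[1.]]

# Encoding the whole column at once: one matrix row per DataFrame row
one_hot = encode_labels(df["Make"])

# Store each row's vector as a plain 1-D list in the new column
df["encoded_label"] = one_hot.tolist()
print(df)
```

With this change the BMW row gets [0.0, 1.0, 0.0], which matches the shape of output the question expected, and each cell holds a one-dimensional vector.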

Second, I solved my second problem by using pandas and the pd.get_dummies function.
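A minimal sketch of the pd.get_dummies approach, assuming the same three Make values as in the sample data. The frame and the variable names are illustrative only, and dtype=int is passed so the indicator columns are 0/1 regardless of the pandas version.

```python
import pandas as pd

# Hypothetical miniature of train.csv, using only the Make column
df = pd.DataFrame({"Make": ["BMW", "Audi", "Chrysler"]})

# One indicator column per distinct value of Make
dummies = pd.get_dummies(df["Make"], dtype=int)

# Attach the indicator columns to the original frame
encoded = df.join(dummies)
print(encoded)

# A single row of the indicator frame is already one-dimensional
first_row = dummies.iloc[0].to_numpy()
print(first_row)  # [0 1 0]
```

Because get_dummies works on the whole column, no per-row lambda is needed, and selecting a row with iloc gives a flat vector instead of a nested [[...]] array.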

Good luck.

Solution 1
0 Accepted 2021-02-01 10:34:35
