LabelEncoder: TypeError: '>' not supported between instances of 'float' and 'str'

I'm facing this error for multiple variables, even after treating missing values. For example:

le = preprocessing.LabelEncoder()
categorical = list(df.select_dtypes(include=['object']).columns.values)
for cat in categorical:
    print(cat)
    df[cat].fillna('UNK', inplace=True)
    df[cat] = le.fit_transform(df[cat])
#     print(le.classes_)
#     print(le.transform(le.classes_))


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-24-424a0952f9d0> in <module>()
      4     print(cat)
      5     df[cat].fillna('UNK', inplace=True)
----> 6     df[cat] = le.fit_transform(df[cat].fillna('UNK'))
      7 #     print(le.classes_)
      8 #     print(le.transform(le.classes_))

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py in fit_transform(self, y)
    129         y = column_or_1d(y, warn=True)
    130         _check_numpy_unicode_bug(y)
--> 131         self.classes_, y = np.unique(y, return_inverse=True)
    132         return y
    133 

C:\Users\paula.ceccon.ribeiro\AppData\Local\Continuum\Anaconda3\lib\site-packages\numpy\lib\arraysetops.py in unique(ar, return_index, return_inverse, return_counts)
    209 
    210     if optional_indices:
--> 211         perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
    212         aux = ar[perm]
    213     else:

TypeError: '>' not supported between instances of 'float' and 'str'

Checking the variable that led to the error gives:

df['CRM do Médico'].isnull().sum()
0

Besides NaN values, what could be causing this error?

This happens because the series df[cat] contains elements of different data types (for example, both strings and floats). That can come from the way the data is read (numbers are read as floats and text as strings), or the column was float and its contents changed after the fillna operation.
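A minimal sketch of how such a column triggers the error. The series below is made up for illustration; it calls np.unique directly because the traceback in the question shows that is where LabelEncoder fails.

```python
import numpy as np
import pandas as pd

# Hypothetical column: text values mixed with a float, stored as object dtype.
s = pd.Series(["A123", 456.0, "B789"], dtype=object)

try:
    # np.unique sorts the values, so it has to compare a str with a float.
    np.unique(s.to_numpy(), return_inverse=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Note that the column has no missing values at all, which matches the isnull().sum() result of 0 in the question.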

In other words, the pandas dtype 'object' does not guarantee that every value is a str; the column can hold mixed types.
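One way to confirm this on a suspect column is to count the Python types it actually holds. This is only a sketch; the DataFrame and the column name "code" are assumptions, not data from the question.

```python
import pandas as pd

# Hypothetical object column with two strings and two floats.
df = pd.DataFrame({"code": ["A123", 456.0, "B789", 12.5]}, dtype=object)

# Map each value to the name of its Python type, then count the names.
type_counts = df["code"].map(lambda v: type(v).__name__).value_counts()

print(df["code"].dtype)  # object
print(type_counts)
```

More than one type name in the result means LabelEncoder will not be able to sort the values.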

So using the following line:

df[cat] = le.fit_transform(df[cat].astype(str))


should help.
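A small end-to-end sketch of that fix, assuming scikit-learn is installed. The data is invented, and filling missing values before the cast is a choice made here so that the placeholder stays explicit.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical column: strings, a float, and a missing value.
df = pd.DataFrame({"code": ["A123", 456.0, None, "A123"]}, dtype=object)

le = LabelEncoder()
# Fill missing values first, then force every element to str before encoding.
encoded = le.fit_transform(df["code"].fillna("UNK").astype(str))
classes = list(le.classes_)

print(classes)
print(list(encoded))
```

After the cast the float 456.0 is encoded as the string "456.0", so it is treated as just another category label.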

Because strings have variable length, pandas stores them as object dtype by default. I faced this problem after treating missing values too. In my case, converting all those columns to type 'category' before label encoding worked.

df[cat]=df[cat].astype('category')

Then check df.dtypes and perform the label encoding.
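A sketch of the category conversion on made-up data. Reading the integer codes straight from pandas with .cat.codes is an alternative added here for illustration; the answer itself goes on to use LabelEncoder.

```python
import pandas as pd

# Hypothetical column with one missing value, filled before conversion.
df = pd.DataFrame({"code": ["b", None, "a", "b"]}, dtype=object)
df["code"] = df["code"].fillna("UNK").astype("category")

categories = list(df["code"].cat.categories)
codes = list(df["code"].cat.codes)  # one integer per row

print(df.dtypes)
print(categories)
print(codes)
```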

Or cast to the uniform type str and split:

unique, counts = numpy.unique(str(a).split(), return_counts=True)

df['cat'] = df['cat'].apply(str) works.

In my case, I had NaN in a list, which limits the operations you can perform on it.
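NaN is itself a float, so a list of strings that contains it cannot be sorted as is. A minimal sketch of replacing it first; the list, the is_nan helper, and the "UNK" placeholder are all illustrative assumptions.

```python
import math

# Hypothetical list of labels with one missing value.
values = ["b", float("nan"), "a"]

def is_nan(v):
    """Return True only for float NaN, leaving strings untouched."""
    return isinstance(v, float) and math.isnan(v)

# Swap NaN for a string placeholder so every element is a str.
cleaned = ["UNK" if is_nan(v) else v for v in values]
ordered = sorted(cleaned)

print(ordered)
```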
