DataFrame fit_transform throwing error with seemingly incorrect error
I am running the following line in Python:
df = df.apply(lambda x: d[x.name].fit_transform(x))
And getting the following error:
~/anaconda3/envs/python3/lib/python3.6/site-packages/numpy/lib/arraysetops.py in _unique1d(ar, return_index, return_inverse, return_counts)
278
279 if optional_indices:
--> 280 perm = ar.argsort(kind='mergesort' if return_index else 'quicksort')
281 aux = ar[perm]
282 else:
TypeError: ("'<' not supported between instances of 'str' and 'float'", 'occurred at index name')
I don't have the character '<' anywhere in my file, so I'm not sure what the error means.
I'm new to Python, so any insights on how to understand these errors would be greatly appreciated.
I think this may be happening because you are not passing clean or correct data to your fit_transform. It's hard to tell without an answer to my question in the comments (what does the d stand for in df = df.apply(lambda x: d[x.name].fit_transform(x))?).
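To illustrate what the message itself is saying: the '<' is not a character in the file, it is the comparison operator Python applies when it sorts values (the traceback shows the failure inside an argsort call). Sorting fails when a column mixes strings with floats, and a missing value (NaN) is a float. The "occurred at index name" suffix suggests the failing column is the one labelled name. Below is a minimal sketch in plain Python; the values in mixed are made up for illustration and are not taken from the asker's data.

```python
# Hypothetical column contents: two strings and one missing value.
# float("nan") is how a missing entry typically shows up, and it is a float.
mixed = ["Theft", "Drugs", float("nan")]

try:
    sorted(mixed)  # sorting has to compare elements with '<'
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The message names the two types that could not be compared.
print(error_message)

# One possible fix: replace floats with a placeholder string before sorting.
cleaned = sorted("missing" if isinstance(v, float) else v for v in mixed)
print(cleaned)
```

If this is the cause, checking each column for missing values or mixed types before calling fit_transform should locate the problem.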
I took some dummy data and made an example of how you might apply a fit_transform to a dataframe with apply.
import random

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Random dummy data: two columns of 300 randomly chosen crime types each.
columns = "Crime Type Summer|Crime Type Winter".split("|")
crime_types = ["ASB", "Violence", "Theft", "Public Order", "Drugs"]
data = {col: [random.choice(crime_types) for _ in range(300)] for col in columns}
df = pd.DataFrame(data)

# Instantiate the vectorizer for use in the lambda function.
cv = CountVectorizer()

# Now we can call fit_transform directly in the lambda function.
# The same vectorizer is refitted on each column in turn.
df = df.apply(lambda x: cv.fit_transform(df[x.name].values))
This completes successfully and gives:
Crime Type Summer (0, 1)\t1\n (1, 4)\t1\n (2, 2)\t1\n (2, 3...
Crime Type Winter (0, 5)\t1\n (1, 0)\t1\n (2, 0)\t1\n (3, 5...
dtype: object
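For the original line, here is a hedged sketch of how the same apply pattern might look once the data is cleaned first. It assumes, without confirmation from the question, that d is a dictionary mapping each column name to its own scikit-learn LabelEncoder; the column names, values, and the "missing" placeholder are all hypothetical.

```python
from collections import defaultdict

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical data: the "name" column contains a missing value (a float NaN).
df = pd.DataFrame({
    "name": ["alice", "bob", np.nan, "alice"],
    "city": ["Leeds", "York", "York", "Leeds"],
})

# Assumption: d creates one LabelEncoder per column name on first access.
d = defaultdict(LabelEncoder)

# Fill missing values with a placeholder and cast to str, so that every
# column holds a single type before it reaches the encoder.
clean = df.fillna("missing").astype(str)

# Same shape as the line in the question, applied to the cleaned frame.
encoded = clean.apply(lambda col: d[col.name].fit_transform(col))
print(encoded)
```

Keeping the fitted encoders in d means d["name"].inverse_transform can later map the integer codes back to the original labels.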