How to solve "AttributeError: 'float' object has no attribute 'lower'"
I'm having issues with my code and can't work out what to do next. Can anyone help me out?
# Importing the libraries
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
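from tensorflow.keras.layers import Dense, Embedding, LSTM, SpatialDropout1D, Bidirectional
from tensorflow.keras import regularizers
from tensorflow.keras.optimizers import Adam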
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical
import pickle
import re
# Importing the dataset
filename = "MoviePlots.csv"
data = pd.read_csv(filename, encoding= 'unicode_escape')
# Keeping only the necessary columns
data = data[['Plot']]
# Clean the data
data['Plot'] = data['Plot'].apply(lambda x: x.lower())
data['Plot'] = data['Plot'].apply((lambda x: re.sub('[^a-zA-z0-9\s]', '', x)))
# Create the tokenizer
tokenizer = Tokenizer(num_words=5000, split=" ")
tokenizer.fit_on_texts(data['Plot'].values)
# Save the tokenizer
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)
# Create the sequences
X = tokenizer.texts_to_sequences(data['Plot'].values)
X = pad_sequences(X)
# Create the model
model = Sequential()
model.add(Embedding(5000, 256, input_length=X.shape[1]))
model.add(Bidirectional(LSTM(256, return_sequences=True, dropout=0.1, recurrent_dropout=0.1)))
model.add(LSTM(256, return_sequences=True, dropout=0.1, recurrent_dropout=0.1))
model.add(LSTM(256, dropout=0.1, recurrent_dropout=0.1))
model.add(Dense(256, activation='relu', kernel_regularizer=regularizers.l2(0.01)))
model.add(Dense(5000, activation='softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.01), metrics=['accuracy'])
# Train the model
model.fit(X, X, epochs=100, batch_size=128, verbose=1)
# Saving the model
model.save('visioniser.h5')
This is my code; the error is shown in the attached image.
Could anyone please help me diagnose and solve this problem?
It appears that the error occurs on this line:
data['Plot'] = data['Plot'].apply(lambda x: x.lower())
You are calling apply on a column of data, and one of the values in that column is not a string, so it has no lower method.
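To see the failure in isolation, here is a minimal sketch. The three-row Series is a made-up stand-in for the Plot column (the real MoviePlots.csv is not available here), with one missing value in it:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data['Plot']; the middle row is a missing value
plots = pd.Series(["A Hero Rises", np.nan, "The End"], dtype=object, name="Plot")

# Find the entries that are not strings and show their Python type
non_strings = plots[~plots.apply(lambda x: isinstance(x, str))]
print(non_strings.apply(type))

# Calling .lower() on every element fails as soon as it reaches the float NaN
try:
    plots.apply(lambda x: x.lower())
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Running the same isinstance filter on the real column should show which rows are causing the trouble.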
You could fix this by checking whether each value is actually a string:
data['Plot'] = data['Plot'].apply(lambda x: x.lower() if isinstance(x, str) else x)
Or, instead of using a lambda function:
data['Plot'] = data['Plot'].str.lower()
pandas' str.lower skips values that are not strings: they come back as NaN instead of raising an error.
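As a quick check, the two approaches can be compared on a small made-up Series (the sample values are an assumption, not the real data):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data['Plot'] with one missing value
plots = pd.Series(["A Hero Rises", np.nan, "The End"], dtype=object, name="Plot")

# Option 1: only call .lower() on real strings, pass everything else through
with_lambda = plots.apply(lambda x: x.lower() if isinstance(x, str) else x)

# Option 2: the vectorised string accessor, which leaves missing values as NaN
with_str = plots.str.lower()

print(with_lambda.tolist())
print(with_str.tolist())
```

In both cases the missing value is still NaN afterwards, so any later step that expects only strings may still need those rows dropped or filled.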
It seems that your Plot column holds some NaN values (which pandas treats as float), hence the error. Try casting the column to str with pandas.Series.astype before calling pandas.Series.apply:
data['Plot'] = data['Plot'].astype(str).apply(lambda x: x.lower())
Or simply use pandas.Series.str.lower:
data['Plot'] = data['Plot'].astype(str).str.lower()
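One thing to watch with the cast: depending on the pandas version, astype(str) may turn a missing value into the literal text 'nan', which would then be tokenized like any other word. A rough sketch of two alternatives follows, on made-up sample data; whether empty strings or dropped rows suit the model better is an assumption to check against your dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for data['Plot'] with one missing value
plots = pd.Series(["A Hero Rises", np.nan, "The End"], dtype=object, name="Plot")

# Casting first: inspect what the missing value turned into on your pandas version
as_text = plots.astype(str).str.lower()
print(as_text.tolist())

# Alternative 1: replace missing plots with an empty string before lowercasing
filled = plots.fillna("").str.lower()

# Alternative 2: drop rows that have no plot at all
dropped = plots.dropna().str.lower()

print(filled.tolist())
print(dropped.tolist())
```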
The same goes for re.sub; you could use pandas.Series.replace instead:
data['Plot'] = data['Plot'].astype(str).replace(r'[^a-zA-Z0-9\s]', '', regex=True)
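A related detail in the character class: the range A-z, as written in the question's pattern, covers more than letters. In ASCII it also includes the characters [ \ ] ^ _ and the backtick, which sit between Z and a, so those are not stripped. The sketch below compares the two ranges on made-up strings, using Series.str.replace as one possible way to apply the pattern:

```python
import pandas as pd

# Hypothetical sample plots containing punctuation that should be stripped
plots = pd.Series(["it's a [wild]_ride!", "plot #2: the ^end^"], name="Plot")

# 'A-z' also keeps the ASCII characters between 'Z' and 'a': [ \ ] ^ _ `
loose = plots.str.replace(r"[^a-zA-z0-9\s]", "", regex=True)

# 'A-Z' keeps only letters, digits and whitespace
strict = plots.str.replace(r"[^a-zA-Z0-9\s]", "", regex=True)

print(loose.tolist())
print(strict.tolist())
```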