Preprocessing class error: "AttributeError: 'function' object has no attribute 'str'"
I did an NLP project earlier. I have now pickled the model and am trying to apply it to a new data set that I scraped from Twitter. Of course, the new dataframe doesn't have the same columns as the old dataset, so I am writing a class to preprocess the data and bring it closer to the old dataframe that was used for the NLP project.
This is what I did:
import pandas as pd
import textstat

class Preprocesser:
    def __init__(self):
        pass
    def fit(self, text_column):
        df = pd.DataFrame(text_column)
        df.text_length = self.text_length(text_column)
        df.num_capital_letters = self.num_capital_letters(text_column)
        df.percentage_of_capital_letters = self.percentage_of_capital_letters(text_column)
        df.greater_than_50_percent = self.greater_than_50_percent(text_column)
        df.reading_level = self.reading_level(text_column)
        #df =pd.DataFrame(Text.df_user_tweets
        return df
    def text_length(self,column):
        return column.apply(lambda x: len(x))
    def num_capital_letters(self,column):
        return column.apply.str.findall(r"[A-Z]").str.len()
    def percentage_of_capital_letters(self,column):
        return column.apply.str.findall(r"[A-Z]").str.len()/column.apply(lambda x: len(x))
    def greater_than_50_percent(self,column):
        return column.apply(lambda x: x>= .5 )
    def reading_level(self,column):
        return column.apply(lambda x :textstat.flesch_reading_ease(x))

pre = Preprocesser()
pre.fit(text_column = df_user_tweets.Text)
This is the error that I got:
<ipython-input-136-3b74ba5d2425> in num_capital_letters(self, column)
17 return column.apply(lambda x: len(x))
18 def num_capital_letters(self,column):
---> 19 return column.apply.str.findall(r"[A-Z]").len()
20 def percentage_of_capital_letters(self,column):
21 return column.apply.str.findall(r"[A-Z]").str.len()/column.apply(lambda x: len(x))
AttributeError: 'function' object has no attribute 'str'
It looks like my error is on line 19, but I am not sure what I need to do to fix it. I appreciate any help.
df_user_tweets.Text is of type pd.Series, and it has a method apply. That method takes a function (for example a lambda) and calls it on each value of the Series (which is a column). Because column.apply is itself a method, that is, a function object, it has no str attribute, which is exactly what the error message says. The str accessor lives on the Series, not on apply.
So instead of column.apply.str.findall(...), do column.str.findall(...). The same change is needed on line 21.
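To make the fix concrete, here is a minimal sketch of the class with the `.str` accessor called directly on the Series. It is not the asker's exact code. It makes three assumptions: the sample tweets are made up, new columns are assigned with `df["name"]` (attribute assignment such as `df.text_length = ...` does not create a column in pandas), and `greater_than_50_percent` is meant to compare the capital-letter percentage rather than the raw text. The `reading_level` step is left out so the sketch depends only on pandas.

```python
import pandas as pd


class Preprocesser:
    """Builds simple text features from a Series of strings."""

    def fit(self, text_column):
        # Bracket assignment creates real columns; df.new_col = ... would not.
        df = pd.DataFrame({"Text": text_column})
        df["text_length"] = self.text_length(text_column)
        df["num_capital_letters"] = self.num_capital_letters(text_column)
        df["percentage_of_capital_letters"] = (
            df["num_capital_letters"] / df["text_length"]
        )
        # Assumption: the threshold applies to the percentage column.
        df["greater_than_50_percent"] = df["percentage_of_capital_letters"] >= 0.5
        return df

    def text_length(self, column):
        # Vectorized string length, equivalent to column.apply(len).
        return column.str.len()

    def num_capital_letters(self, column):
        # .str is an accessor on the Series itself, not on column.apply.
        return column.str.findall(r"[A-Z]").str.len()


# Hypothetical sample data standing in for df_user_tweets.Text.
tweets = pd.Series(["Hello World", "LOUD TWEET", "quiet"], name="Text")
features = Preprocesser().fit(tweets)
print(features)
```

An empty tweet would give a length of 0 and therefore a NaN percentage, so real data may need that case handled before the comparison.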