
TFIDF in Python

Hi, below is my function to create a TF-IDF matrix in Python:

def tf_idf(self,job_id,method='local'):
    jobtext = self.get_job_text ( job_id , method=method )
    tfidf_vectorizer = TfidfVectorizer( max_df=0.8 , max_features=200000 ,
                                        min_df=0.2 , stop_words='english' ,
                                        use_idf=True , tokenizer=self.tokenize_and_stem(jobtext), ngram_range=(1, 3) )
    #tfidf_vectorizer.fit(jobtext)
    tfidf_matrix = tfidf_vectorizer.fit_transform(jobtext) #fit the vectorizer to synopses
    print(tfidf_matrix.shape)

and I am getting the following error:

Traceback (most recent call last):
  File ".../employment_skills_extraction-master/api/process_request.py", line 206, in <module>
    main()
  File ".../employment_skills_extraction-master/api/process_request.py", line 202, in main
    print pr.process(json.dumps(test))
  File ".../employment_skills_extraction-master/api/process_request.py", line 188, in process
    termVector=self.tf_idf(job_id)
  File ".../employment_skills_extraction-master/api/process_request.py", line 174, in tf_idf
    tfidf_matrix = tfidf_vectorizer.fit_transform(jobtext) #fit the vectorizer to synopses
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 1285, in fit_transform
    X = super(TfidfVectorizer, self).fit_transform(raw_documents)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 804, in fit_transform
    self.fixed_vocabulary_)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 739, in _count_vocab
    for feature in analyze(doc):
  File "/usr/local/lib/python2.7/dist-packages/sklearn/feature_extraction/text.py", line 236, in <lambda>
    tokenize(preprocess(self.decode(doc))), stop_words)
TypeError: 'list' object is not callable

Please help: why am I getting this error?

TypeError: 'list' object is not callable is the relevant part of the error, and the last frame of the traceback shows where it is raised: tokenize(preprocess(self.decode(doc))). scikit-learn is trying to call your tokenizer, but what it was given is a list, not a function.

That list comes from the tokenizer argument in your TfidfVectorizer call. Writing tokenizer=self.tokenize_and_stem(jobtext) calls tokenize_and_stem immediately and passes its return value (presumably a list of tokens) as the tokenizer. Pass the method itself instead, without calling it:

tokenizer=self.tokenize_and_stem

The vectorizer will then call it once per document, and it will probably work.

It is still worth examining what get_job_text returns for your job_id. fit_transform expects an iterable of strings, one per document, so if jobtext is a single string you probably want to pass [jobtext] instead.
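
The last frame of the traceback is the call tokenize(...), so the tokenizer argument is worth reproducing in isolation. The following is a minimal sketch, not the asker's code: tokenize is a hypothetical stand-in for tokenize_and_stem, and docs is made-up sample text. It shows that passing the result of a call as tokenizer fails with a TypeError, while passing the function itself works.

```python
from sklearn.feature_extraction.text import TfidfVectorizer


def tokenize(text):
    # Hypothetical stand-in for tokenize_and_stem: lowercase, split on whitespace.
    return text.lower().split()


# Made-up sample documents (assumption: one string per job posting).
docs = [
    "Python developer wanted",
    "Senior Python engineer",
    "Java developer wanted",
]

# Wrong: tokenize(docs[0]) runs right away, so the vectorizer gets a list.
error = None
try:
    broken = TfidfVectorizer(tokenizer=tokenize(docs[0]))
    broken.fit_transform(docs)
except TypeError as exc:
    error = exc
print("broken tokenizer:", error)

# Right: pass the function itself; the vectorizer calls it for each document.
working = TfidfVectorizer(tokenizer=tokenize)
matrix = working.fit_transform(docs)
print(matrix.shape)
```

The exact error message depends on the scikit-learn version, but in both cases the fix is the same: drop the parentheses and the argument so that the vectorizer receives something it can call.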

