
Using the scikit-learn tutorial and unable to understand the output of vectorizer.get_feature_names_out()

Is the output part of my 20newsgroups_train data, or does it come from a default library? I ask because words like "zz_g9q3" do not make sense. I am currently using the 20newsgroups_train and 20newsgroups_test datasets.

Input:

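from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()
vectors = vectorizer.fit_transform(newsgroups_train.data)  # assumed fit step (not shown in the post); transform() fails on an unfitted vectorizer
vectors_test = vectorizer.transform(newsgroups_test.data)
print(vectorizer.get_feature_names_out()[-50:])  # last 50 terms of the learned vocabulary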

Output:

['zyra' 'zysec' 'zysgm3r' 'zysv' 'zyt' 'zyu' 'zyv' 'zyxel' 'zyxel1496b'
 'zz' 'zz20d' 'zz93sigmc120' 'zz_g9q3' 'zzcrm' 'zzd' 'zzg6c' 'zzi776'
 'zzneu' 'zznki' 'zznkj' 'zznkjz' 'zznkzz' 'zznp' 'zzo' 'zzr11' 'zzr1100'
 'zzrk' 'zzt' 'zztop' 'zzy_3w' 'zzz' 'zzzoh' 'zzzz' 'zzzzzz' 'zzzzzzt'
 'ªl' '³ation'
 'º_________________________________________________º_____________________º'
 'ºnd' 'çait' 'çon' 'ère' 'ée' 'égligent' 'élangea' 'érale' 'ête'
 'íålittin' 'ñaustin' 'ýé'] 
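
For what it's worth, TfidfVectorizer does not ship with a built-in dictionary: the names returned by get_feature_names_out() are the tokens found in whatever text the vectorizer was fitted on, returned in sorted order. Assuming the vectorizer was fitted on newsgroups_train.data with default settings, strings like 'zz_g9q3' would come from the posts themselves (headers, message IDs, encoded attachments and so on). Below is a minimal sketch that imitates the documented default tokenization (lowercasing plus the token pattern `(?u)\b\w\w+\b`) using only Python's `re` module, not scikit-learn itself. The helper `build_vocabulary` and the `sample_docs` strings are hypothetical stand-ins, not real newsgroup data.

```python
import re

# Default token pattern documented for TfidfVectorizer / CountVectorizer:
# runs of two or more "word" characters (letters, digits, underscore).
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def build_vocabulary(documents):
    """Lowercase each document, tokenize it, and return the sorted unique tokens."""
    tokens = set()
    for doc in documents:
        tokens.update(TOKEN_PATTERN.findall(doc.lower()))
    return sorted(tokens)


# Hypothetical stand-in documents, NOT taken from the 20 newsgroups data.
sample_docs = [
    "Message-ID: <zz_g9q3@example.com>",
    "My ZyXEL modem plays ZZTop",
    "Il fallait être négligent",
]

vocabulary = build_vocabulary(sample_docs)
print(vocabulary[-5:])  # the tail of the sorted vocabulary
```

Two things this suggests about the output above: the underscore counts as a word character, so an ID fragment such as zz_g9q3 survives as a single token, and sorting is by Unicode code point, so tokens starting with accented or other non-ASCII characters land after everything starting with 'z'. That would explain why slicing [-50:] shows mostly odd-looking strings rather than ordinary words.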


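Notice: technical posts on this site are published under the CC BY-SA 4.0 license. If you reprint this post, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.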

 