
How to train a machine learning model with FastText output

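Is there any FastText method, or any other approach, that lets me train my ML model from the FastText output shown below? With TF-IDF I got a sparse matrix and trained an ML model on it, but now I want to train the model with FastText.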

fasttext_out=model_ted.wv.most_similar("The Lemon Drop Kid , a New York City swindler, is illegally touting horses at a Florida racetrack. After several successful hustles, the Kid comes across a beautiful, but gullible, woman intending to bet a lot of money. The Kid convinces her to switch her bet, employing a prefabricated con. Unfortunately for the Kid, the woman belongs to notorious gangster Moose Moran , as does the money. The Kid's choice finishes dead last and a furious Moran demands the Kid provide him with $10,000  by Christmas Eve, or the Kid won't make it to New Year's. The Kid decides to return to New York to try to come up with the money. He first tries his on-again, off-again girlfriend Brainy Baxter . However, when talk of long-term commitment arises, the Kid quickly makes an escape.")

model_ted.wv.most_similar("school")

Output:

[('Psycho-biddy', 0.9323669672012329),
 ('Slasher', 0.8850599527359009),
 ('Demonic child', 0.8805997967720032),
 ('Giallo', 0.8504119515419006),
 ('Road-Horror', 0.821454644203186),
 ('Anthology', 0.8191317915916443),
 ('Czechoslovak New Wave', 0.8187490105628967),
 ('Supernatural', 0.813347339630127),
 ('Psychological thriller', 0.8018383979797363),
 ('Kitchen sink realism', 0.8017964959144592)]

My main goal is to convert the output into vectors and train a machine learning model. Please confirm.

My answer to your previous, similar question still applies. Specifically:

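FastText inherently gives you only word vectors: one vector per word. If you want a single vector for a longer run of text, say many words, you need to make further decisions about how to turn a bunch of individual word vectors into something else.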

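A decent first thing to try is simple: average all the word vectors of the text together. (There are many other ways to represent a larger text as a vector or some other bag of values.)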

You can then try passing those averages as features to a downstream classifier.
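The averaging idea can be sketched roughly as follows. This is only an illustration under stated assumptions: average_vector, toy_vectors and the whitespace tokenization are hypothetical choices, and the toy dictionary stands in for the trained model so the snippet runs on its own. With the gensim model from the question, the lookup would presumably be something like lambda w: model_ted.wv[w], and dim would be model_ted.wv.vector_size.

```python
from statistics import fmean


def average_vector(text, lookup, dim):
    """Average the word vectors of a whitespace-tokenized text.

    lookup: callable mapping a word to a sequence of floats,
            or to None when the word has no vector (assumed interface).
    dim:    vector size, used only for the all-zero fallback.
    """
    vectors = [lookup(token) for token in text.split()]
    vectors = [v for v in vectors if v is not None]
    if not vectors:
        # No known words at all: fall back to a zero vector.
        return [0.0] * dim
    # zip(*vectors) groups the i-th component of every word vector.
    return [fmean(component) for component in zip(*vectors)]


# Toy stand-in for a trained model's word vectors (hypothetical values).
toy_vectors = {"kid": [1.0, 0.0], "bets": [0.0, 1.0], "money": [1.0, 1.0]}

texts = ["kid bets money", "money money", "unknown words"]

# One fixed-length, dense feature row per text.
features = [average_vector(t, toy_vectors.get, dim=2) for t in texts]
```

The resulting features list has one dense row per text, so it can take the place of the sparse TF-IDF matrix as input to whatever classifier was being trained before.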

Also, as the earlier answer pointed out, you will not get a meaningful set of .most_similar() results if you pass in one long string, as your example code does.
