KeyError even though the key exists in the dict?

Question

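I'm trying to look up word vectors for sentiment analysis of reviews (NLP), but I get a KeyError even though the key exists in the dictionary. I don't understand why I'm getting this error.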

import numpy as np
import pandas as pd
from nltk.tokenize import word_tokenize

txt_fname = 'C:\\Users\\arune\\Desktop\\sentiment labelled sentences\\amazon_cells_labelled.txt'
df = pd.read_table(txt_fname,names=['sentence','sentiment'])

df['tokenized'] = df['sentence'].apply(lambda a: word_tokenize(a))


```
vocab = set()
for tokens in df['tokenized']:
    for a in tokens:
        vocab.add(a)
        
len(vocab)
```

vocab = {a:b for a,b in enumerate(sorted(vocab))}
vocab   

rand_wv = np.random.rand(len(vocab),300)
rand_wv.shape

from sklearn.model_selection import train_test_split
train, test = train_test_split(df, test_size=0.2,random_state=42)

X_train = []
for tok_sent in train['tokenized']:
    doc_vec = np.zeros(300)
    for t in tok_sent:
        word_index = vocab[t]
        word_vec = rand_wv[word_index]
        doc_vec += word_vec
    doc_vec = doc_vec/len(tok_sent)
    X_train.append(doc_vec)
    
X_train = np.array(X_train)
y_train = train['sentiment']
X_train.shape

The error I get is:

KeyError                                  Traceback (most recent call last)
<ipython-input-64-add70741d289> in <module>
      3     doc_vec = np.zeros(300)
      4     for t in tok_sent:
----> 5         word_index = vocab[t]
      6         word_vec = rand_wv[word_index]
      7         doc_vec += word_vec

KeyError: 'does'

Answer 1

As Tim Roberts answered in the comments:

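Your vocab table is built from enumerate, so its keys are integers starting at 0 and the words are its values. That is why vocab['does'] raises a KeyError: 'does' is in the dict, but as a value, not as a key.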
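A small illustration of this, using a made-up three-word vocabulary (the words are placeholders, not taken from the dataset):

```python
# Hypothetical toy vocabulary; the real one is built from df['tokenized'].
words = {"does", "not", "work"}

# Same comprehension as in the question: enumerate yields (index, word) pairs,
# so the index becomes the key and the word becomes the value.
vocab = {a: b for a, b in enumerate(sorted(words))}

print(vocab)                      # {0: 'does', 1: 'not', 2: 'work'}
print("does" in vocab)            # False: 'does' is a value, not a key
print("does" in vocab.values())   # True
```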

As suggested, you should build the vocabulary with the word as the key and the index as the value:

vocab = {w: id_ for id_, w in enumerate(sorted(vocab))}
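A minimal sketch of the averaging loop with a word-to-index mapping, assuming two made-up tokenized sentences in place of train['tokenized'] (the sentences and the seeded generator are illustrative assumptions, not part of the original code):

```python
import numpy as np

# Hypothetical tokenized sentences standing in for train['tokenized'].
tokenized = [["it", "does", "work"], ["does", "not", "work"]]

words = {t for sent in tokenized for t in sent}
# Word -> row index, so vocab[token] can be used to index the vector table.
vocab = {w: id_ for id_, w in enumerate(sorted(words))}

rng = np.random.default_rng(42)
rand_wv = rng.random((len(vocab), 300))  # one random 300-d vector per word

# Average the word vectors of each sentence into one document vector.
X = np.array([
    np.mean([rand_wv[vocab[t]] for t in sent], axis=0)
    for sent in tokenized
])

print(vocab["does"])  # 0
print(X.shape)        # (2, 300)
```

With the mapping in this direction, vocab[t] returns an integer row index, which is what rand_wv expects.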
