'KeyError:' when iterating over pandas data frame
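I have a DataFrame df with two columns: 'label' and 'review'. As part of data cleaning, I removed all null values. Now I want to remove all stopwords and punctuation from the review column.
When I try this code, I get a KeyError.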
stemmer = PorterStemmer()

for i in range(len(df)):
    review = re.sub('[^a-zA-Z]', ' ',df['review'][i] )
    review = review.lower()
    review = review.split()
    review = [ stemmer.stem(word) for word in review if word not in stopwords.words('english')]
    df['review'][i] = " ".join(review)
KeyError                                  Traceback (most recent call last)
<ipython-input-44-91ef309cd900> in <module>
      2
      3 for i in range(len(df)):
----> 4     review = re.sub('[^a-zA-Z]', ' ',df['review'][i] )
      5     review = review.lower()
      6     review = review.split()

~\Anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
    866         key = com.apply_if_callable(key, self)
    867         try:
--> 868             result = self.index.get_value(self, key)
    869
    870             if not is_scalar(result):

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_value(self, series, key)
   4373         try:
   4374             return self._engine.get_value(s, k,
-> 4375                                           tz=getattr(series.dtype, 'tz', None))
   4376         except KeyError as e1:
   4377             if len(self) > 0 and (self.holds_integer() or self.is_boolean()):

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 140
Please help.
Here is a solution without a loop. In pandas, use loops only as a last resort:
df['review'] = df['review'].replace('[^a-zA-Z]',' ',regex=True)
df['review'] = df['review'].str.lower()
df['review'] = df['review'].str.split()
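A likely cause of the KeyError, given that null rows were dropped beforehand: dropping rows leaves gaps in the integer index, and df['review'][i] looks up the index label i, not the i-th position, so a label such as 140 may no longer exist. The sketch below reproduces that situation on made-up data and shows two ways around it. The sample rows, the STOPWORDS set, and the clean helper are assumptions for illustration, not part of the original post.

```python
import re
import pandas as pd

# Hypothetical sample data; the None row stands in for the nulls that were dropped.
df = pd.DataFrame({
    "label": [1, 0, 1, 0],
    "review": ["Great movie!", None, "Not the best, but OK.", "I loved it"],
})
df = df.dropna()

# After dropna() the index labels are 0, 2, 3: label 1 is gone,
# so df['review'][1] would raise KeyError even though len(df) == 3.
print(list(df.index))

# Option 1: renumber the rows so the labels run 0..len(df)-1 again.
df = df.reset_index(drop=True)

# Stand-in stopword set (an assumption for this sketch); with NLTK installed,
# set(stopwords.words('english')) could be used instead.
STOPWORDS = {"the", "but", "not", "i", "it"}

def clean(text, stem=lambda w: w):
    """Keep letters only, lowercase, drop stopwords, apply `stem` to each word."""
    words = re.sub("[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stem(w) for w in words if w not in STOPWORDS)

# Option 2: skip index-based access altogether and map a function over the column.
df["review"] = df["review"].apply(clean)
print(df["review"].tolist())
```

To include stemming as in the question, the stemmer's stem method can be passed in, for example df['review'].apply(lambda t: clean(t, stem=stemmer.stem)). Building the stopword set once, rather than calling stopwords.words('english') for every word, also avoids rebuilding the list on each check.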