I have a dataframe that looks like this:
df:
+---------------------------------------------------+-------------+------------+------------+
|Text_en | pos_score | neg_score | sent_score|
+---------------------------------------------------+-------------+------------+------------+
| inspir afternoon bang stori ahmad sahroni v AS...| 0.000 | 0.0 | 0 |
| | 0.000 | 0.0 | 0 |
| some drastic measur taken manag bodi temperatu. | 1.625 | 0.5 | 1 |
| ahmad sahroni tu | 0.000 | 0.0 | 0 |
| busi success mudah legisl mandat who make inte...| 1.125 | 0.0 | 1 |
+---------------------------------------------------+-------------+------------+------------+
I want to build positive, negative, and neutral text strings for further processing using this code:
pos_text = ""
neg_text = ""
neut_text = ""
for i in range(len(df_copy.index)):
    if df_copy.loc[i]["sent_score"] == 1:
        pos_text += df_copy.loc[i]["Text_en"]
    elif df_copy.loc[i]["sent_score"] == -1:
        neg_text += df_copy.loc[i]["Text_en"]
    else:
        neut_text += df_copy.loc[i]["Text_en"]
list_text = [pos_text, neg_text, neut_text]
But it raised an error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2645 try:
-> 2646 return self._engine.get_loc(key)
2647 except KeyError:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 1
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-11-8ff17d5161f0> in <module>
4
5 for i in range(len(df_copy.index)):
----> 6 if(df_copy.loc[i]["sent_score"]==1):
7 pos_text+=df_copy.loc[i]["Text"]
8 elif(df_copy.loc[i]["sent_score"]==-1):
~\anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
1765
1766 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1767 return self._getitem_axis(maybe_callable, axis=axis)
1768
1769 def _is_scalar_access(self, key: Tuple):
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
1962 # fall thru to straight lookup
1963 self._validate_key(key, axis)
-> 1964 return self._get_label(key, axis=axis)
1965
1966
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _get_label(self, label, axis)
622 raise IndexingError("no slices here, handle elsewhere")
623
--> 624 return self.obj._xs(label, axis=axis)
625
626 def _get_loc(self, key: int, axis: int):
~\anaconda3\lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, drop_level)
3535 loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
3536 else:
-> 3537 loc = self.index.get_loc(key)
3538
3539 if isinstance(loc, np.ndarray):
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2646 return self._engine.get_loc(key)
2647 except KeyError:
-> 2648 return self._engine.get_loc(self._maybe_cast_indexer(key))
2649 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2650 if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 1
For context: the first time I ran my code it worked like a charm. Then I did some cleaning, including dropping duplicate rows, and when I ran it again I got the error above.
df.loc[i] expects i to be a label that exists in the DataFrame's index. If you have dropped rows (e.g., duplicates), their labels are no longer in the index, so looking one of them up raises a KeyError.
Call df.reset_index(drop=True, inplace=True) to give the cleaned DataFrame a fresh index of consecutive integers.
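To make this concrete, here is a minimal sketch that reproduces the KeyError and then clears it. The four-row frame is made-up sample data, not the asker's, and the helper variable names are only for illustration.

```python
import pandas as pd

# Hypothetical sample data; row 1 duplicates row 0.
df = pd.DataFrame({
    "Text_en": ["good day", "good day", "bad day", "plain day"],
    "sent_score": [1, 1, -1, 0],
})

# drop_duplicates keeps the original labels, so label 1 disappears.
df_copy = df.drop_duplicates()
index_before = list(df_copy.index)

# Label-based lookup of the missing label fails.
try:
    df_copy.loc[1]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# reset_index(drop=True) renumbers the rows 0..n-1 and discards the old labels.
df_copy = df_copy.reset_index(drop=True)
index_after = list(df_copy.index)
row_one_text = df_copy.loc[1, "Text_en"]
```

After the reset, range(len(df_copy.index)) and the index labels line up again, which is what the original loop relies on.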
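You generally don't want to loop over a dataframe in this way. Instead, use loc with a boolean mask to select the rows you want, then work with the result. Build the mask from the same dataframe you are indexing (df_copy here), otherwise the mask and the frame may not align. Each result is a Series of the matching texts.
pos_text = df_copy.loc[df_copy['sent_score'] == 1, 'Text_en']
neg_text = df_copy.loc[df_copy['sent_score'] == -1, 'Text_en']
neut_text = df_copy.loc[df_copy['sent_score'] == 0, 'Text_en']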
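If single strings are needed, as in the question's list_text, the selected Series can be joined. The sketch below makes a few assumptions: the sample rows are invented, join_text is a hypothetical helper, the texts are separated by a space (the original loop concatenated with no separator), and "neutral" is taken to mean a score of exactly 0.

```python
import pandas as pd

# Hypothetical sample data with a gappy index, as left behind by dropping rows.
df_copy = pd.DataFrame(
    {
        "Text_en": ["great stori", "bad news", "ahmad sahroni tu", "busi success"],
        "sent_score": [1, -1, 0, 1],
    },
    index=[0, 2, 5, 7],
)

def join_text(frame, score):
    """Join the Text_en values of all rows whose sent_score equals score."""
    mask = frame["sent_score"] == score
    return " ".join(frame.loc[mask, "Text_en"])

pos_text = join_text(df_copy, 1)
neg_text = join_text(df_copy, -1)
neut_text = join_text(df_copy, 0)
list_text = [pos_text, neg_text, neut_text]
```

Because the mask is matched to rows by label, this works whether or not the index has been reset.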
It's the [i] part for sure, because that's where it breaks: df_copy.loc[1] raises an error because there is no label 1 in the index anymore. So I needed to reset my index using
df_copy=df_copy.reset_index(drop=True)
It worked like a charm.
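As an aside, if the original labels need to be kept, the same loop can be written with position-based access instead. This is only a sketch on invented sample data: iloc[i] selects the i-th row regardless of its label, so no reset is required.

```python
import pandas as pd

# Hypothetical sample data whose index has a gap at label 1.
df_copy = pd.DataFrame(
    {
        "Text_en": ["inspir afternoon", "some drastic measur", "ahmad sahroni tu"],
        "sent_score": [0, 1, 0],
    },
    index=[0, 2, 3],
)

pos_text = ""
neg_text = ""
neut_text = ""
for i in range(len(df_copy)):
    row = df_copy.iloc[i]  # i is a position here, not an index label
    if row["sent_score"] == 1:
        pos_text += row["Text_en"]
    elif row["sent_score"] == -1:
        neg_text += row["Text_en"]
    else:
        neut_text += row["Text_en"]
list_text = [pos_text, neg_text, neut_text]
```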