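How to impute more than one specific column in a dataset: Python (sklearn)
Without wasting time, straight to the question. I am using SimpleImputer from sklearn in Python to impute my dataset. However, my dataset contains some columns with integers and some columns with alphabetic data points. I am using the median to fill the blanks, so I only want to impute my specific integer columns, not the whole dataset. I tried this: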
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(strategy="median")
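imputer.fit(students['age', 'sex', 'failures'])
I want to impute only these columns, which hold integer values, rather than the whole dataset, because the dataset also contains columns with alphabetic data points, for which a median cannot be computed.
Running the code above gives me this error: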
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2894 try:
-> 2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: ('age', 'sex', 'failures')
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-26-8961e0ce249f> in <module>
2 from sklearn.impute import SimpleImputer
3 imputer = SimpleImputer(strategy="median")
----> 4 imputer.fit(students['age', 'sex', 'failures'])
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2900 if self.columns.nlevels > 1:
2901 return self._getitem_multilevel(key)
-> 2902 indexer = self.columns.get_loc(key)
2903 if is_integer(indexer):
2904 indexer = [indexer]
~\anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2895 return self._engine.get_loc(casted_key)
2896 except KeyError as err:
-> 2897 raise KeyError(key) from err
2898
2899 if tolerance is not None:
KeyError: ('age', 'sex', 'failures')
The data is available at https://archive.ics.uci.edu/ml/machine-learning-databases/00320/
Thanks, I hope you understand the question. I did my best to explain it.
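Try selecting the columns with a list (note the double brackets), so that the imputer receives a DataFrame with one row per sample and one column per feature:
imputer.fit_transform(students[['age', 'sex', 'failures']])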
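For context, the KeyError in the question arises because students['age', 'sex', 'failures'] asks pandas for a single column whose label is the tuple ('age', 'sex', 'failures'), whereas a list inside the brackets selects several columns. Below is a minimal sketch of imputing only chosen numeric columns and writing the result back. The small DataFrame is a made-up stand-in for `students`, not the real data, and `numeric_cols` and `auto_cols` are illustrative names. The sketch leaves out `sex` on the assumption that it is stored as text in the linked data; if it is numeric in your copy, it can simply be added to the list.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for the `students` DataFrame (values are made up).
students = pd.DataFrame({
    "school": ["GP", "GP", "MS", "MS", "GP"],   # text column, left untouched
    "age": [15, np.nan, 17, 18, 16],
    "failures": [0, 1, np.nan, 3, 0],
})

# A list of labels selects several columns and returns a DataFrame.
numeric_cols = ["age", "failures"]

imputer = SimpleImputer(strategy="median")

# fit_transform returns a 2-D NumPy array with one column per selected
# feature; assigning it back replaces only those columns in place.
students[numeric_cols] = imputer.fit_transform(students[numeric_cols])

# Alternative: let pandas pick every numeric column instead of naming them.
auto_cols = students.select_dtypes(include="number").columns.tolist()

print(students)
print(auto_cols)
```

The text column passes through unchanged because it is never handed to the imputer. Note that the imputed columns come back as floats, so cast them with astype(int) afterwards if integer values are required.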