Adding new column in pandas dataframe based on another column
I have a dataframe with a bmi column. Based on that column, I want to create another column that shows the BMI range each row's bmi value falls into. Here is my code:
for i in range(df["bmi"].count()):
    if df["bmi"][i] < 18.5:
        df["bmi_category"] = "Under Weight"
    elif 25 > df["bmi"][i] >= 18.5:
        df["bmi_category"] = "Healthy Weight"
    elif 30 > df["bmi"][i] >= 25:
        df["bmi_category"] = "Overweight"
    elif df["bmi"][i] >= 30:
        df["bmi_category"] = "Obese"
But when I run this code, I get this error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3079 try:
-> 3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 228
The above exception was the direct cause of the following exception:
KeyError Traceback (most recent call last)
<ipython-input-220-e7569ff34eec> in <module>
1 for i in range(cardio["bmi"].count()):
----> 2 if cardio["bmi"][i] < 18.5:
3 cardio["bmi_category"] = "Under Weight"
4 elif 25 > cardio["bmi"][i] >= 18.5:
5 cardio["bmi_category"] = "Healthy Weight"
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
849
850 elif key_is_scalar:
--> 851 return self._get_value(key)
852
853 if is_hashable(key):
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\series.py in _get_value(self, label, takeable)
957
958 # Similar to Index.get_value, but we do not fall back to positional
--> 959 loc = self.index.get_loc(label)
960 return self.index._get_values_for_loc(self, loc, label)
961
c:\users\hridoy\appdata\local\programs\python\python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
3080 return self._engine.get_loc(casted_key)
3081 except KeyError as err:
-> 3082 raise KeyError(key) from err
3083
3084 if tolerance is not None:
KeyError: 228
Can anyone tell me what I am doing wrong here, and how to fix it?
The following maps the values in the bmi column to values in the bmi_category column:
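import pandas as pd

def get_category(bmi):
    # Missing values (None/NaN) get no category; a plain `not bmi` check
    # would let NaN fall through to "Obese".
    if pd.isna(bmi):
        return None
    if bmi < 18.5:
        return "Under Weight"
    if bmi < 25:
        return "Healthy Weight"
    if bmi < 30:
        return "Overweight"
    return "Obese"

# Call get_category once per row and store the results in a new column.
df['bmi_category'] = df['bmi'].apply(get_category)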
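As for what went wrong in the question's loop: `KeyError: 228` is consistent with an index that has no label 228, for example because rows were dropped earlier. That is an assumption, since the question does not show how the frame was built. On an integer index, `df["bmi"][i]` looks up the label `i`, not the i-th row. Separately, `df["bmi_category"] = "..."` assigns the string to the whole column on every iteration. A minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical data: the index skips label 1, as it would after dropping a row.
df = pd.DataFrame({"bmi": [17.0, 22.0, 31.0]}, index=[0, 2, 3])

# On an integer index, df["bmi"][i] looks up the *label* i, not the i-th row.
try:
    df["bmi"][1]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Position-based access works regardless of the index labels.
second_bmi = df["bmi"].iloc[1]

# Assigning a scalar to a column overwrites every row, not just one.
df["bmi_category"] = "Obese"
categories = df["bmi_category"].tolist()
```

`apply` sidesteps both problems because it never indexes by label and returns one value per row.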
PS: If you find yourself iterating over a dataframe, there is almost always a function that will do it faster and more cleanly.
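One such function is `np.select`, which evaluates whole-column conditions instead of looping over rows. This is only a sketch on made-up values; the `"Unknown"` default for missing BMI is a placeholder of mine, not something from the question:

```python
import numpy as np
import pandas as pd

# Made-up sample values, including the boundaries and a missing entry.
df = pd.DataFrame({"bmi": [17.0, 18.5, 24.9, 25.0, 30.0, np.nan]})

conditions = [
    df["bmi"] < 18.5,
    df["bmi"] < 25,
    df["bmi"] < 30,
    df["bmi"] >= 30,
]
labels = ["Under Weight", "Healthy Weight", "Overweight", "Obese"]

# np.select takes, for each row, the label of the first condition that is True;
# rows where no condition holds (here, NaN) get the default.
df["bmi_category"] = np.select(conditions, labels, default="Unknown")
```

Because the first matching condition wins, the conditions can be written as simple upper bounds in ascending order, just like the chain of `if` statements in `get_category`.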
You can use pd.cut to do this efficiently:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(16,35,(50,1)), columns=["bmi"])
df['bmi_category'] = pd.cut(df['bmi'], [0, 18.5, 25, 30, np.inf], labels=["Under Weight", "Healthy Weight", "Overweight", "Obese"], right=False)
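The `right=False` argument matters here: it makes each bin closed on the left, so a BMI of exactly 18.5 counts as "Healthy Weight" and exactly 25 as "Overweight", matching the thresholds in the question. A small check on hand-picked boundary values (the sample values are mine) shows the difference from the default:

```python
import numpy as np
import pandas as pd

# Hand-picked values sitting on or next to each threshold.
bmi = pd.Series([18.4, 18.5, 24.9, 25.0, 29.9, 30.0])
edges = [0, 18.5, 25, 30, np.inf]
labels = ["Under Weight", "Healthy Weight", "Overweight", "Obese"]

# right=False: bins are [0, 18.5), [18.5, 25), [25, 30), [30, inf)
left_closed = pd.cut(bmi, edges, labels=labels, right=False)

# Default right=True: bins are (0, 18.5], (18.5, 25], (25, 30], (30, inf]
right_closed = pd.cut(bmi, edges, labels=labels)

comparison = pd.DataFrame(
    {"bmi": bmi, "right=False": left_closed, "right=True": right_closed}
)
print(comparison)
```

With the default, the boundary values 18.5, 25.0 and 30.0 each land one category lower than the question's conditions intend.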