Replacing specific values of a specific column of a Python dataframe throws "SettingWithCopyWarning"
This code works:
import pandas as pd
df = pd.DataFrame({'name':["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"], 'score':[1,16,2,32,11,9,50]})
print(df)
colName = 'score'
df[colName][df[colName] <= 10] = 1
df[colName][(df[colName] > 10) & (df[colName] <= 20)] = 11
df[colName][df[colName] > 20] = 21
print(df)
...but it throws this warning:
test.py:9: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[colName][df[colName] <= 10] = 1
test.py:10: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[colName][(df[colName] > 10) & (df[colName] <= 20)] = 11
test.py:11: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[colName][df[colName] > 20] = 21
I guess this is a deep/shallow copy problem? But how do I fix it? Surely there must be a simple, readable way to do such a simple operation?
EDIT: It works with:
df.loc[df[colName] <= 10, colName] = 1
...but this is somewhat illogical, since passing colName as the second argument is counterintuitive...
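For what it's worth, the argument order follows from how .loc indexes: it takes a row selector first and a column selector second, like [row, column] in a matrix. The chained form df[colName][mask] = ... is two separate indexing operations, so pandas cannot guarantee that the assignment reaches the original dataframe, which is what the warning is about. A minimal sketch of all three replacements with .loc, assuming the same sample data as in the question:

```python
import pandas as pd

df = pd.DataFrame({
    'name': ["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"],
    'score': [1, 16, 2, 32, 11, 9, 50],
})
colName = 'score'

# .loc[row_selector, column_selector] is a single indexing call,
# so each assignment is applied to df itself rather than to a temporary.
df.loc[df[colName] <= 10, colName] = 1
df.loc[(df[colName] > 10) & (df[colName] <= 20), colName] = 11
df.loc[df[colName] > 20, colName] = 21

print(df)
```

The replacement values 1, 11 and 21 each fall inside their own range, so running the three statements in sequence does not re-bucket a value that was already replaced.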
You can get rid of the warning message in the following way.
The key element here is the apply method.
import pandas as pd

def change(x):
    """function to change column 'score' value """
    if x <= 10:
        return 1
    elif 10 < x <= 20:
        return 11
    elif x > 20:
        return 21
    else:
        return x  # do no changes

if __name__ == "__main__":
    df = pd.DataFrame({
        'name': ["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"],
        'score': [1, 16, 2, 32, 11, 9, 50]
    })
    print(df)
    print("*" * 100)
    # apply calls change once per value and returns a new column
    df['score'] = df['score'].apply(change)
    print(df)  # changed dataframe
Let me know if it helps.
Try the code below; hopefully it helps.
import pandas as pd
import numpy as np
df = pd.DataFrame({'name':["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"], 'score':[1,16,2,32,11,9,50]})
print(df)
colName = 'score'
df[colName] = np.where(df[colName] <= 10, 1, df[colName])
df[colName] = np.where((df[colName] > 10) & (df[colName] <= 20), 11 , df[colName])
df[colName] = np.where(df[colName] > 20, 21 , df[colName])
print(df)
The output will be:
    name  score
0   Adam      1
1  Sarah     16
2    Tom      2
3  Sarah     32
4   Adam     11
5    Tom      9
6   Will     50
****NEW*********
    name  score
0   Adam      1
1  Sarah     11
2    Tom      1
3  Sarah     21
4   Adam     11
5    Tom      1
6   Will     21
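This does not produce any warning, because you are not assigning into a slice of the dataframe. Instead, each np.where call builds a new array from the condition, and that array replaces the whole column.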
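If the three np.where calls feel repetitive, the same bucketing can probably be done in one pass with np.select, which returns the choice for the first condition that matches. This is only a sketch and not part of the answer above; the thresholds and sample data are copied from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"],
    'score': [1, 16, 2, 32, 11, 9, 50],
})
colName = 'score'

# Conditions are checked in order and the first match wins, so the
# second condition effectively means "greater than 10 and at most 20".
conditions = [df[colName] <= 10, df[colName] <= 20]
choices = [1, 11]

# Anything matching neither condition (here: scores above 20) gets the default.
df[colName] = np.select(conditions, choices, default=21)

print(df)
```

As with np.where, the whole column is replaced by a freshly built array, so no chained assignment is involved.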