
Replacing specific values of a specific column of a Python dataframe throws “SettingWithCopyWarning”

This code runs and does what I want:

import pandas as pd


df = pd.DataFrame({'name':["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"], 'score':[1,16,2,32,11,9,50]})

print(df)

colName = 'score'
df[colName][df[colName] <= 10] = 1
df[colName][(df[colName] > 10) & (df[colName] <= 20)] = 11
df[colName][df[colName] > 20] = 21

print(df)

...but it throws this warning:

test.py:9: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[colName][df[colName] <= 10] = 1
test.py:10: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[colName][(df[colName] > 10) & (df[colName] <= 20)] = 11
test.py:11: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df[colName][df[colName] > 20] = 21

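I guess this is a deep vs. shallow copy issue? But how do I fix it? Surely there is a simple, readable way to do such a simple operation?

EDIT: it works with: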

df.loc[df[colName] <= 10, colName] = 1

...but that seems a bit illogical, because passing colName as the second argument is counterintuitive...
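On the argument order: .loc is indexed as [rows, columns], the same order as a 2-D array, so the boolean mask selects the rows and colName selects the column. Because it is a single indexing call on df itself, pandas never assigns into an intermediate object, which is what the chained df[colName][mask] form does. A minimal sketch covering all three ranges from the question (the mask names low, mid and high are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    'name': ["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"],
    'score': [1, 16, 2, 32, 11, 9, 50],
})
colName = 'score'

# Build all three masks from the original values before changing anything,
# so an earlier assignment cannot affect a later condition.
low = df[colName] <= 10
mid = (df[colName] > 10) & (df[colName] <= 20)
high = df[colName] > 20

# One indexing call per assignment: df.loc[row_selector, column_selector]
df.loc[low, colName] = 1
df.loc[mid, colName] = 11
df.loc[high, colName] = 21

print(df)  # score column is now 1, 11, 1, 21, 11, 1, 21
```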

You can get rid of the warning message in the following way.

The key element here is the apply method.

import pandas as pd


def change(x):
    """function to change column 'score' value """
    if x <= 10:
        return 1
    elif 10 < x <= 20:
        return 11
    elif x > 20:
        return 21
    else:
        return x  # do no changes


if __name__ == "__main__":
    df = pd.DataFrame({
        'name': ["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"],
        'score': [1, 16, 2, 32, 11, 9, 50]
    })

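    print(df)
    print("*" * 100)

    # Apply change() to every score and assign the result back to the
    # whole column, so no chained indexing is involved.
    df['score'] = df['score'].apply(change)
    print(df)  # changed dataframe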

Let me know if it helps.
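For what it's worth, the same three ranges can probably also be expressed without a Python-level function by using pd.cut. This is a variation on the answer above and not part of it; it assumes every score is a non-missing number and that the right-closed intervals (-inf, 10], (10, 20] and (20, inf) match the conditions in change(). The name binned is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"],
    'score': [1, 16, 2, 32, 11, 9, 50],
})

# pd.cut bins are right-closed by default: (-inf, 10], (10, 20], (20, inf)
binned = pd.cut(
    df['score'],
    bins=[-np.inf, 10, 20, np.inf],
    labels=[1, 11, 21],
)

# pd.cut returns a categorical Series; convert the labels back to integers
df['score'] = binned.astype(int)

print(df)  # score column is now 1, 11, 1, 21, 11, 1, 21
```

Unlike change(), this has no "leave the value alone" branch, so it only fits when all values are meant to be replaced.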

Try the code below, I hope it helps.

import pandas as pd
import numpy as np

df = pd.DataFrame({'name':["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"], 'score':[1,16,2,32,11,9,50]})

print(df)

colName = 'score'
df[colName] = np.where(df[colName] <= 10, 1, df[colName])
df[colName] = np.where((df[colName] > 10) & (df[colName] <= 20), 11 , df[colName])
df[colName] = np.where(df[colName] > 20, 21 , df[colName])
print(df)

The output will be:

    name  score
0   Adam      1
1  Sarah     16
2    Tom      2
3  Sarah     32
4   Adam     11
5    Tom      9
6   Will     50

****NEW*********


    name  score
0   Adam      1
1  Sarah     11
2    Tom      1
3  Sarah     21
4   Adam     11
5    Tom      1
6   Will     21

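This does not give any warning because you are not assigning into a slice of the dataframe. Instead, np.where builds a new array from the condition, and that array is assigned back to the whole column.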
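If the three separate np.where calls feel repetitive, np.select can take all conditions at once. This is a sketch of an alternative, not part of the answer above; the names conditions and choices are made up for illustration. All conditions are evaluated against the original column, so the order of replacement no longer matters.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ["Adam", "Sarah", "Tom", "Sarah", "Adam", "Tom", "Will"],
    'score': [1, 16, 2, 32, 11, 9, 50],
})
colName = 'score'

# One condition per range, paired by position with its replacement value
conditions = [
    df[colName] <= 10,
    (df[colName] > 10) & (df[colName] <= 20),
    df[colName] > 20,
]
choices = [1, 11, 21]

# Rows matching none of the conditions keep their current value
df[colName] = np.select(conditions, choices, default=df[colName])

print(df)  # score column is now 1, 11, 1, 21, 11, 1, 21
```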


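Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you republish, please credit this site's URL or the original post's address. For any questions, contact: yoyou2525@163.com.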

 