Columns must be same length as key. Why am I getting this error? Can anyone help me out?

cols = ["Gender", "Married", "Education", "Self_Employed", "Property_Area", "Loan_Status", "Dependents"]
for col in cols:
    df[col] = pd.get_dummies(df[col], drop_first=True)

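get_dummies returns a DataFrame that may have many columns (even if you use drop_first=True), so it cannot be assigned to a single column. You need join() to add all the new columns.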
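To see where the error comes from, here is a minimal sketch. The data is hypothetical (a single "Property_Area" column with three distinct values) and is not taken from the question's dataset. With three categories, drop_first=True still leaves two dummy columns, and assigning that two-column DataFrame to one key raises the ValueError.

```python
import pandas as pd

# Hypothetical data: "Property_Area" has three distinct values.
df = pd.DataFrame({"Property_Area": ["Urban", "Rural", "Semiurban", "Urban"]})

dummies = pd.get_dummies(df["Property_Area"], drop_first=True)
print(dummies.shape)  # 4 rows, 2 columns remain after dropping the first level

try:
    # One key on the left, a two-column DataFrame on the right.
    df["Property_Area"] = dummies
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__, error)
```

A column with only two categories happens to work with drop_first=True, because a single dummy column is left. That is probably why the loop in the question fails only for some columns.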

Afterwards you can use del df[col] to remove the old column.
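Applied to a loop like the one in the question, this could look roughly as follows. It is only a sketch: the frame is hypothetical, "LoanAmount" is an invented numeric column that should stay untouched, and prefix=col is added so the new column names stay unique.

```python
import pandas as pd

# Hypothetical data; "LoanAmount" is an invented column that is not encoded.
df = pd.DataFrame({
    "Gender": ["Female", "Male", "Female"],
    "Property_Area": ["Urban", "Rural", "Semiurban"],
    "LoanAmount": [120, 95, 150],
})

cols = ["Gender", "Property_Area"]

for col in cols:
    # get_dummies returns a DataFrame, so join it instead of assigning it
    dummies = pd.get_dummies(df[col], prefix=col, drop_first=True)
    df = df.join(dummies)
    # the original text column is no longer needed
    del df[col]

print(df.columns.tolist())
```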


Minimal example:

import pandas as pd

data = {
    'Gender':  ['Female','Male','Female'], 
    'Married': ['Yes','No','Yes'], 
}

df = pd.DataFrame(data)

print('\n--- before ---\n')
print(df)

for col in df.columns:
    result = pd.get_dummies(df[col])

    df = df.join(result)

    #del df[col]
    
print('\n--- after ---\n')
print(df)

Result:

--- before ---

   Gender Married
0  Female     Yes
1    Male      No
2  Female     Yes

--- after ---

   Gender Married  Female  Male  No  Yes
0  Female     Yes       1     0   0    1
1    Male      No       0     1   1    0
2  Female     Yes       1     0   0    1
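Depending on the pandas version, the dummy columns may print as True/False instead of 1/0. As far as I know, newer releases default to a boolean dtype. If you want the integer form shown above, passing dtype=int should work. A small sketch:

```python
import pandas as pd

gender = pd.Series(["Female", "Male", "Female"], name="Gender")

# dtype=int asks for integer 0/1 columns instead of the default dtype
result = pd.get_dummies(gender, dtype=int)
print(result)
```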

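Different columns may contain the same values (e.g. Yes and No), which would give the new columns identical names, so you may need to add a prefix to the new columns.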

    result = pd.get_dummies(df[col], prefix=col)

or alternatively

    result = result.add_prefix(f"{col}_")
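The following sketch illustrates both points under assumed data (two hypothetical Yes/No columns). Without a prefix the second join() fails because the names "No" and "Yes" already exist, and prefix= and add_prefix() produce the same column names.

```python
import pandas as pd

# Hypothetical data: both columns contain only "Yes" and "No".
df = pd.DataFrame({
    "Married": ["Yes", "No", "Yes"],
    "Self_Employed": ["No", "Yes", "Yes"],
})

# Without a prefix both columns produce dummies named "No" and "Yes".
plain = df.join(pd.get_dummies(df["Married"]))
try:
    plain.join(pd.get_dummies(df["Self_Employed"]))
    overlap_error = None
except ValueError as exc:
    # join() refuses overlapping column names unless suffixes are given
    overlap_error = exc

print(overlap_error)

# Both ways of prefixing lead to the same column names.
with_prefix = pd.get_dummies(df["Married"], prefix="Married")
with_add_prefix = pd.get_dummies(df["Married"]).add_prefix("Married_")
print(list(with_prefix.columns) == list(with_add_prefix.columns))
```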

Minimal example:

import pandas as pd

data = {
    'Gender':  ['Female','Male','Female'], 
    'Married': ['Yes','No','Yes'],
    'Self_Employed': ['No','Yes','Yes'],
}

df = pd.DataFrame(data)

print('\n--- before ---\n')
print(df)

for col in df.columns:
    result = pd.get_dummies(df[col], prefix=col)
    #result = result.add_prefix(f"{col}_")

    df = df.join(result)

    #del df[col]

print('\n--- after ---\n')
print(df.to_string())

Result:

--- before ---

   Gender Married Self_Employed
0  Female     Yes            No
1    Male      No           Yes
2  Female     Yes           Yes

--- after ---

   Gender Married Self_Employed  Gender_Female  Gender_Male  Married_No  Married_Yes  Self_Employed_No  Self_Employed_Yes
0  Female     Yes            No              1            0           0            1                 1                  0
1    Male      No           Yes              0            1           1            0                 0                  1
2  Female     Yes           Yes              1            0           0            1                 0                  1

EDIT:

You can do the same without a for-loop by passing columns= to get_dummies():

all_results = pd.get_dummies(df, columns=df.columns) #, drop_first=True)
df = df.join(all_results)
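If you do not need to keep the original text columns, the join() can probably be skipped as well. When columns= is given, get_dummies() returns the frame with the listed columns already replaced by their dummies and the other columns left as they are. A sketch with hypothetical data ("LoanAmount" is an invented column):

```python
import pandas as pd

# Hypothetical data; "LoanAmount" is an invented column that is not encoded.
df = pd.DataFrame({
    "Gender": ["Female", "Male", "Female"],
    "Property_Area": ["Urban", "Rural", "Semiurban"],
    "LoanAmount": [120, 95, 150],
})

cols = ["Gender", "Property_Area"]

# Only the listed columns are encoded; each new column is prefixed with
# the name of the column it came from.
encoded = pd.get_dummies(df, columns=cols, drop_first=True)
print(encoded.columns.tolist())
```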
