
Memory error when using fit_transform with OneHotEncoder

I am trying to one-hot encode the categorical columns in my dataset. I am using the following function:

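import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

def create_ohe(df, col):
    le = LabelEncoder()
    # Integer-encode the column, then reshape to the 2-D input OneHotEncoder expects
    a = le.fit_transform(df[col]).reshape(-1, 1)
    ohe = OneHotEncoder(sparse=False)  # dense output: one column per category
    column_names = [col + "_" + str(i) for i in le.classes_]
    return pd.DataFrame(ohe.fit_transform(a), columns=column_names)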

I get a MemoryError when I call the function in this loop:

for column in categorical_columns:
    temp_df = create_ohe(df_new, column)
    temp = pd.concat([temp, temp_df], axis=1)

Error Traceback:

MemoryError                               Traceback (most recent call last)
<ipython-input-40-9b241e8bf9e6> in <module>
      1 for column in categorical_columns:
----> 2     temp_df = create_ohe(df_new, column)
      3     temp = pd.concat([temp, temp_df], axis=1)
      4 print("\nShape of final df after one hot encoding: ", temp.shape)

<ipython-input-34-1530423fdf06> in create_ohe(df, col)
      8     ohe = OneHotEncoder(sparse=False)
      9     column_names = [col + "_" + str(i) for i in le.classes_]
---> 10     return (pd.DataFrame(ohe.fit_transform(a), columns=column_names))

MemoryError: 

A MemoryError means that either your machine has run out of available memory (RAM) or the Python process has reached its own limit; see the related question "Memory errors and list limits?".
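To get a feel for whether the dense output can fit in RAM at all, a rough estimate helps. The sketch below assumes the encoder's default float64 output (8 bytes per cell). The helper names and the row and category counts are hypothetical, since the question does not give the dataset's size.

```python
def dense_one_hot_bytes(n_rows, n_categories, itemsize=8):
    """Bytes needed for a dense one-hot array of shape (n_rows, n_categories)."""
    return n_rows * n_categories * itemsize


def to_gib(n_bytes):
    """Convert a byte count to gibibytes."""
    return n_bytes / 1024**3


# Hypothetical column: one million rows, five thousand distinct categories.
size = dense_one_hot_bytes(1_000_000, 5_000)
print(f"{to_gib(size):.1f} GiB")  # prints "37.3 GiB"
```

A single high-cardinality column can therefore exceed typical RAM on its own, before pd.concat copies anything.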

You could try splitting the le.fit_transform call into two steps. First run b = le.fit(df_new[col]) so that the label encoder is fitted on the full dataset. Then call transform on smaller batches of rows instead of transforming every row at once; this may help. If b = le.fit(df_new[col]) also fails, you have a more general memory problem. Replace col with your own column names.
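One way to sketch the "fit once, transform in batches" idea is shown below. The function name encode_in_chunks, the chunk_size default, and the choice of uint8 for the one-hot cells are assumptions made for illustration, not part of the original question or answer.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder


def encode_in_chunks(series, chunk_size=100_000):
    """Fit on the whole column once, then one-hot encode it chunk by chunk."""
    le = LabelEncoder()
    le.fit(series)  # learns every category; fit() returns the encoder itself
    names = [f"{series.name}_{c}" for c in le.classes_]
    for start in range(0, len(series), chunk_size):
        chunk = series.iloc[start:start + chunk_size]
        codes = le.transform(chunk)  # integer codes for this chunk only
        # Dense block for this chunk only: len(chunk) rows x n_categories columns
        block = np.zeros((len(codes), len(names)), dtype=np.uint8)
        block[np.arange(len(codes)), codes] = 1
        yield pd.DataFrame(block, columns=names, index=chunk.index)


# Toy data; in the question this would be df_new[column].
s = pd.Series(["a", "b", "a", "c", "b"], name="color")
chunks = list(encode_in_chunks(s, chunk_size=2))
full = pd.concat(chunks)
```

Each yielded frame can be written to disk or reduced before the next one is built, so only one chunk's dense block is alive at a time. Concatenating all chunks, as the toy usage does, brings the full memory cost back.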

fit_transform is a combination of fit and transform.
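As a small check of that statement, the following sketch compares the one-step and two-step forms on toy values (the color names are made up for the example):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

values = ["red", "green", "red", "blue"]

# One step: learn the categories and encode in a single call.
one_step = LabelEncoder().fit_transform(values)

# Two steps: learn the categories first, encode afterwards.
le = LabelEncoder()
le.fit(values)
two_step = le.transform(values)

same = np.array_equal(one_step, two_step)
print(same)  # prints True
```

Because the fitted encoder keeps its classes_, the transform step can be repeated later on any subset of the rows.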
