
What's the easiest way to replace categorical columns of data with codes in Pandas?

I have a table of data in .dta format that I have read into Python using pandas. Most of the columns have the categorical dtype, and I want to replace them with numerical data that can be used for machine learning, such as boolean (1/0) values or integer codes. The trouble is that I can't assign new values directly: pandas won't let me set a value that isn't already one of the column's categories unless I add it as a category first.

I have tried using pd.get_dummies(), but it keeps raising an error:
TypeError: 'columns' is an invalid keyword argument for this function

print(pd.get_dummies(feature).head(), columns=['smkevr', 'cignow', 'dnnow', 
                                               'dnever', 'complst'])

Is there a simple way to replace this data with numerical codes based on the value (for example 'Not applicable' = 0)?

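I do it the following way: pass columns= to pd.get_dummies() itself rather than to print(). In your snippet the closing parenthesis of get_dummies() comes too early, so columns= ends up as a keyword argument of print(), which is what raises the TypeError.

import pandas as pd

# columns= belongs inside get_dummies(), not print()
df_dumm = pd.get_dummies(feature, columns=['smkevr', 'cignow', 'dnnow',
                                           'dnever', 'complst'])
print(df_dumm.head())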
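
If integer codes are wanted instead of dummy columns (for example 'Not applicable' = 0), the following is a minimal sketch of one way to do it. The small `feature` frame and its category labels are made up here to stand in for the .dta data, so treat the names and values as assumptions rather than the real survey contents.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame read from the .dta file
feature = pd.DataFrame({
    "smkevr": pd.Categorical(["Yes", "No", "Not applicable", "Yes"]),
    "cignow": pd.Categorical(["No", "No", "Not applicable", "Yes"]),
})

# Option 1: replace every categorical column with its integer codes.
# Codes follow the order of the categories; missing values get -1.
coded = feature.copy()
for col in feature.select_dtypes(include="category").columns:
    coded[col] = feature[col].cat.codes

# Option 2: control which label gets which code by reordering the
# categories first, so that 'Not applicable' becomes 0.
order = ["Not applicable", "No", "Yes"]  # assumed labels
recoded = feature["smkevr"].cat.reorder_categories(order).cat.codes

# Option 3: one-hot (1/0) columns, with columns= passed to get_dummies()
dummies = pd.get_dummies(feature, columns=["smkevr", "cignow"])

print(coded.head())
print(recoded.head())
print(dummies.head())
```

Option 1 is the shortest, but the code assigned to each label depends on how the categories happen to be ordered. Option 2 makes that order explicit, which helps when a specific label has to map to a specific number.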
