The main problem is creating a list of the indices of the categorical columns.
I have a DataFrame with many columns whose types were specified before importing the file with pd.read_csv():
dtypes = {
...
'Format_type': 'category',
'Geo_new': 'category',
'Age_min': 'int16',
'Age_max': 'int16',
'Sex': 'category',
...}
So I built a table of column names and their indices, and then picked out the categorical columns by hand:
col_list = [i for i in (df.columns.get_values())]
idx_list = [i for i in range(len(df.columns.get_values()))]
column_num = pd.DataFrame(data = {'column_name': col_list,
'idx_list': idx_list})
column_num
This gives a table of column names (column_name) and their indices (idx_list):
column_name idx_list
...
Format_type 5
Geo_new 6
Age_min 7
Age_max 8
Sex 9
...
and then I put the indices of the categorical columns into a list:
categorical_features = [...5, 6, 9...]
So I fill the list by hand. Is there a way to automatically create the list of columns whose dtype is category?
I believe you need DataFrame.select_dtypes with Index.get_indexer for the indices:
df = pd.DataFrame({
'A':list('abcdef'),
'B':pd.Categorical([4,5,4,5,5,4]),
'C':[7,8,9,4,2,3],
'D': pd.Categorical([1,3,5,7,1,0]),
'E':[5,3,6,9,2,4],
'F':list('aaabbb')
})
c = df.select_dtypes('category').columns
print (c)
Index(['B', 'D'], dtype='object')
i = df.columns.get_indexer(df.select_dtypes('category').columns)
print (i)
[1 3]
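Applied to the setup from the question, the same two calls could look roughly like the sketch below. The inline CSV text and the name `sample` are made up for illustration; only the column names and the dtype mapping come from the question. Calling `.tolist()` turns the NumPy array into a plain Python list.

```python
import io
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
csv_text = """Format_type,Geo_new,Age_min,Age_max,Sex
banner,EU,18,24,F
video,US,25,34,M
banner,US,35,44,F
"""

dtypes = {
    'Format_type': 'category',
    'Geo_new': 'category',
    'Age_min': 'int16',
    'Age_max': 'int16',
    'Sex': 'category',
}

# Read the CSV with the dtypes fixed up front, as in the question.
sample = pd.read_csv(io.StringIO(csv_text), dtype=dtypes)

# Names of the categorical columns, then their positions in sample.columns.
cat_cols = sample.select_dtypes('category').columns
categorical_features = sample.columns.get_indexer(cat_cols).tolist()
print(categorical_features)  # [0, 1, 4]
```

The positions are relative to the column order of the DataFrame, so the list should be recomputed if columns are added, dropped, or reordered.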
Also, your code can be simplified:
col_list = df.columns.tolist()
idx_list = range(len(col_list))
column_num = pd.DataFrame(data = {'column_name': col_list, 'idx_list': idx_list})
There is another way!
import numpy as np
categorical_list = list(np.where(df.dtypes == 'category')[0])
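A variant of the same idea, sketched here as an assumption rather than as part of the original answer, checks each dtype with `isinstance` instead of comparing it to the string 'category'. This avoids relying on how dtype equality treats strings and yields plain Python integers instead of NumPy integers. The frame `df_demo` is a hypothetical example.

```python
import pandas as pd

# Hypothetical example frame with two categorical columns (B and D).
df_demo = pd.DataFrame({
    'A': list('abcdef'),
    'B': pd.Categorical([4, 5, 4, 5, 5, 4]),
    'C': [7, 8, 9, 4, 2, 3],
    'D': pd.Categorical([1, 3, 5, 7, 1, 0]),
})

# Keep the position of every column whose dtype is a CategoricalDtype.
categorical_list = [
    i for i, dtype in enumerate(df_demo.dtypes)
    if isinstance(dtype, pd.CategoricalDtype)
]
print(categorical_list)  # [1, 3]
```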