I have a DataFrame storing columns of different types: float, int, and object. Since the dataset is too large, I am looking for ways to reduce memory usage. I found that "Categorical" can reduce memory usage on object-type columns, and I applied it to them. But when I convert a column whose values are lists, I get the error "TypeError: unhashable type: 'list'".
Here is my DataFrame:
import numpy as np
import pandas as pd

vs_df = pd.DataFrame({'plan_name': ['abc', 'def'], 'plan_id': [10001, 10002]})
vs_df['handled_plans_id'] = np.empty((len(vs_df), 0)).tolist()  # one empty list per row
vs_df.at[0, 'handled_plans_id'] = [105, 120]  # .at sets a single cell at a time
vs_df.handled_plans_id = vs_df.handled_plans_id.astype('category')  # Error here
print(vs_df)
plan_id plan_name handled_plans_id
0 10001 abc [105, 120]
1 10002 def []
Error:
TypeError: unhashable type: 'list'
File "pandas\_libs\hashtable_class_helper.pxi", line 1367, in pandas._libs.hashtable.PyObjectHashTable.get_labels
Any method that solves this, or otherwise reduces the size of this column of list values, would be appreciated. Thanks!
Update
Many of the values in the handled_plans_id column are different from one another. I would like to see any method that reduces memory usage for this column.
Use tuples instead of lists. Tuples are hashable, so the column can be converted to a category.
import pandas as pd

vs_df = pd.DataFrame({'plan_name': ['abc', 'def'], 'plan_id': [10001, 10002]})
vs_df['handled_plans_id'] = [(105, 120), ()]  # tuples are hashable, unlike lists
vs_df.handled_plans_id = vs_df.handled_plans_id.astype('category')
print(vs_df)
plan_id plan_name handled_plans_id
0 10001 abc (105, 120)
1 10002 def ()
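If the column already holds lists, one way to get to the tuple form is to convert each value before changing the dtype. The sketch below uses made-up data (3,000 rows with only three distinct lists, which is an assumption for illustration and not the asker's real data) and compares memory before and after with `memory_usage(deep=True)`.

```python
import pandas as pd

# Hypothetical data: 3,000 rows but only three distinct lists.
lists = [[105, 120], [], [7]] * 1000
s_list = pd.Series(lists, dtype=object)

# Lists are unhashable, so turn each one into a tuple before converting.
s_cat = s_list.apply(tuple).astype('category')

# deep=True also counts the Python objects stored in the column,
# not only the pointers to them.
before = s_list.memory_usage(deep=True)
after = s_cat.memory_usage(deep=True)
print(before, after)
```

With few distinct values the categorical version should come out much smaller. If nearly every row holds a different tuple, the categories take roughly as much space as the original values, so most of the saving is likely to disappear.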
If the tuples have a known maximum length, you could split them into separate columns. Categorical should help more then, although not if there are too many distinct numbers. Categorical data usually takes one of a small set of values, an enumeration such as 'hearts', 'spades', 'diamonds', 'clubs'. If you have too many distinct values, converting to a category won't help much.
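A rough sketch of the column-splitting idea, assuming the tuples never exceed two elements. `MAX_LEN` and the `handled_plan_0`, `handled_plan_1` column names are hypothetical, chosen only for this example.

```python
import pandas as pd

vs_df = pd.DataFrame({
    'plan_name': ['abc', 'def'],
    'plan_id': [10001, 10002],
    'handled_plans_id': [(105, 120), ()],
})

MAX_LEN = 2  # assumed upper bound on tuple length (hypothetical)

for i in range(MAX_LEN):
    # Nullable Int32 keeps missing slots as <NA> instead of forcing floats.
    vs_df[f'handled_plan_{i}'] = pd.array(
        [t[i] if i < len(t) else None for t in vs_df['handled_plans_id']],
        dtype='Int32',
    )

# The original tuple column is no longer needed once it is split.
vs_df = vs_df.drop(columns='handled_plans_id')
print(vs_df.dtypes)
```

Each new column is a plain integer column, so it can be downcast further or converted to a category if it turns out to have few distinct values.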
If the file is too large to fit in memory, you can read and process it in chunks.
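A minimal sketch of chunked reading, assuming the data lives in a CSV file (the question does not say what format it is in). A small in-memory string stands in for the real file here, and the chunk size of 2 is only for demonstration.

```python
import io

import pandas as pd

# Stand-in for a large CSV file on disk (hypothetical contents).
csv_text = "plan_name,plan_id\nabc,10001\ndef,10002\nghi,10003\n"

total_rows = 0
# chunksize makes read_csv yield DataFrames of at most that many rows.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    # Process each chunk here; only one chunk is held in memory at a time.
    total_rows += len(chunk)

print(total_rows)
```

In practice you would pass a file path instead of the `StringIO` object and pick a chunk size that fits comfortably in memory.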