
Pandas performance: Multiple dtypes in one column or split into different dtypes?

Original post 2014-05-21 13:25:16 2 1 python/pandas

I work with huge pandas DataFrames: about 20 million rows and 30 columns. Each row carries a lot of data and has a "type" that determines which columns it uses. Because of this, I've designed the DataFrame so that some columns hold mixed dtypes, depending on the row's "type".

My question is, performance-wise, should I split each mixed-dtype column into separate single-dtype columns, or keep them as one? I'm having trouble even saving some of these DataFrames (to_pickle), and I'm trying to be as efficient as possible.

As currently constructed, the columns can be mixes of float/str, float/int, or float/int/str.
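To make the situation concrete, here is a minimal sketch of what a float/int/str mix looks like to pandas. The Series names and values are made up for illustration and are not taken from the asker's data. A column that mixes numbers and strings is stored with the generic `object` dtype, which generally costs more memory than a plain float column of the same length.

```python
import pandas as pd

# Hypothetical stand-ins for one mixed column and one purely numeric column.
mixed = pd.Series([1.5, "abc", 2, "def"] * 1000)
floats = pd.Series([1.5, 2.0, 3.0, 4.0] * 1000)

# A float/int/str mix falls back to the generic object dtype.
print(mixed.dtype)
print(floats.dtype)

# Count which Python types are actually stored in the mixed column.
print(mixed.map(type).value_counts())

# deep=True also counts the Python objects the object column points to.
mixed_bytes = mixed.memory_usage(deep=True)
float_bytes = floats.memory_usage(deep=True)
print(mixed_bytes, float_bytes)
```

The exact byte counts depend on the pandas and Python versions, so treat them as indicative only.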

1 answer

It seems to me that this depends on your subsequent use case. But IMHO I would give each column a single dtype; otherwise functions such as groupby with totals and other common pandas operations simply won't work on the mixed columns.
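One way to follow this advice, sketched under assumptions: the column names (`row_type`, `value`, `value_num`, `value_str`) and the sample data are hypothetical, and `pd.to_numeric` with `errors="coerce"` is just one possible way to separate the numeric entries from the rest. It is not necessarily what the answerer had in mind.

```python
import pandas as pd

# Hypothetical frame: "value" mixes float, int and str depending on row_type.
df = pd.DataFrame({
    "row_type": ["a", "b", "a", "b"],
    "value": [1.5, "red", 2, "blue"],
})

# Entries that cannot be parsed as numbers become NaN.
# Caveat: numeric-looking strings such as "3" would be parsed as numbers too.
numeric = pd.to_numeric(df["value"], errors="coerce")

df["value_num"] = numeric
# Keep the original entry only where it was not numeric.
df["value_str"] = df["value"].where(numeric.isna()).astype("string")
df = df.drop(columns="value")

# With a single dtype per column, a grouped total works as usual.
totals = df.groupby("row_type")["value_num"].sum()
print(df.dtypes)
print(totals)
```

Whether this split also helps with the to_pickle problem in the question would need to be measured on the real data.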

pandas dtypes column coercion

pandas unique values multiple columns different dtypes

Assign pandas dataframe column dtypes

Changing dataframe column dtypes in Pandas

Pandas Converting columns to different dtypes

Pandas apply based on column dtypes

Filter pandas dataframe based on different dtypes

Eliminate outliers in a dataframe with different dtypes - Pandas

How to set dtypes by column in pandas DataFrame

Loading pandas table with column names and dtypes


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.


solution1 0 2014-05-21 13:37:26
