
Ignore columns that only contain missing values using pd.read_csv

I have created an application that reads in data using pd.read_csv. Some of the datasets we receive have columns that contain only missing values (empty cells). Is there any way to tell pandas not to load those columns into the dataframe? As the datasets can be quite large, it would be more convenient to ignore them at the loading stage.

Of course I could delete them from the excel sheet, but I am aiming to make the data loading as automated as possible.

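pd.read_csv has no option that skips columns consisting only of missing values, because it cannot know a column is empty until it has read every row. keep_default_na=False does not do this either: it only stops pandas from treating the default markers (empty strings, "NA", "NaN" and so on) as missing, so empty cells are loaded as empty strings and the column is still kept. The usual approach is to load the file and then drop the empty columns with df.dropna(axis=1, how='all'). If memory is a concern, you can scan the file once in chunks to find the columns that contain data, then pass those names to usecols on a second read. For more details you can read: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html .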
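A minimal sketch of both approaches, assuming a small in-memory CSV in place of the real file. The helper names read_csv_drop_empty_columns and find_non_empty_columns, and the sample data, are made up for illustration and are not part of pandas.

```python
import io

import pandas as pd


def read_csv_drop_empty_columns(source, **kwargs):
    """Load a CSV, then drop every column whose values are all missing."""
    df = pd.read_csv(source, **kwargs)
    return df.dropna(axis=1, how="all")


def find_non_empty_columns(source, chunksize=100_000, **kwargs):
    """Scan the CSV in chunks and return the columns holding at least one value."""
    keep = None
    for chunk in pd.read_csv(source, chunksize=chunksize, **kwargs):
        has_data = chunk.notna().any()  # one boolean per column
        keep = has_data if keep is None else (keep | has_data)
    return [] if keep is None else list(keep.index[keep])


# Hypothetical sample: column "b" is empty in every row.
sample = "a,b,c\n1,,x\n2,,\n3,,z\n"

# Approach 1: load everything, then drop the all-missing columns.
df = read_csv_drop_empty_columns(io.StringIO(sample))

# Approach 2: first pass finds the useful columns, second pass loads only those.
cols = find_non_empty_columns(io.StringIO(sample), chunksize=2)
df2 = pd.read_csv(io.StringIO(sample), usecols=cols)

print(list(df.columns))
print(cols)
```

The first approach is simpler and reads the file once, but the empty columns briefly occupy memory. The second reads the file twice in exchange for never holding the empty columns in the final dataframe, which may or may not be worth it depending on how many columns are empty.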

