
Pandas: Remove all NaN values in all columns

I have a data frame with many null records:

Col_1    Col_2    Col_3
10       5        2
22       7        7
3        9        5
4        NaN      NaN
5        NaN      NaN
6        4        NaN
7        6        7
8        10       NaN
12       NaN      1

I want to remove the NaN values from every column and shift the remaining values up. As you can see, each column then has a different number of rows. So I want to get something like this:

Col_1    Col_2    Col_3
10       5        2
22       7        7
3        9        5
4        4        7
6        6        1
7        10
8
12

I tried

filtered_df = df.dropna(how='any')

But that removes all records in the DataFrame. How can I do this?

Using Divakar's justify function (a helper from another Stack Overflow answer, not part of NumPy or pandas):

df[:] = justify(df.values, invalid_val=np.nan, axis=0, side='up')
df = df.fillna('')

print(df)

   Col_1 Col_2 Col_3
0   10.0     5     2
1   22.0     7     7
2    3.0     9     5
3    4.0     4     7
4    5.0     6     1
5    6.0    10      
6    7.0            
7    8.0            
8   12.0            
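Since justify has to be defined before the snippet above will run, here is a minimal sketch of such a helper. It is a reconstruction for illustration, not a verbatim copy of Divakar's code, so treat the signature and defaults as assumptions. It builds a new DataFrame instead of assigning into df[:], which is also a choice made here.

```python
import numpy as np
import pandas as pd

def justify(a, invalid_val=np.nan, axis=0, side='up'):
    """Push valid entries of a 2-D array to one side along `axis` (sketch)."""
    # Mark the entries that should be kept.
    if isinstance(invalid_val, float) and np.isnan(invalid_val):
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    # Sorting booleans puts True last; flip so True comes first for 'up'/'left'.
    justified_mask = np.sort(mask, axis=axis)
    if side in ('up', 'left'):
        justified_mask = np.flip(justified_mask, axis=axis)
    out = np.full(a.shape, invalid_val, dtype=a.dtype)
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        # Work on transposed views so values are filled column by column.
        out.T[justified_mask.T] = a.T[mask.T]
    return out

df = pd.DataFrame({
    'Col_1': [10, 22, 3, 4, 5, 6, 7, 8, 12],
    'Col_2': [5, 7, 9, np.nan, np.nan, 4, 6, 10, np.nan],
    'Col_3': [2, 7, 5, np.nan, np.nan, np.nan, 7, np.nan, 1],
})

justified = pd.DataFrame(
    justify(df.to_numpy(dtype=float), invalid_val=np.nan, axis=0, side='up'),
    columns=df.columns,
)
print(justified.fillna(''))
```

The order of the surviving values within each column is preserved; only the NaN entries move to the bottom.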

"As you could see, each column has different number of rows."

A DataFrame is a tabular data structure: you look up an index label and a column, and find the value. If the number of rows differs per column, the index becomes meaningless and misleading. A dict might be a better alternative:

{c: df[c].dropna().values for c in df.columns}

or

{c: list(df[c].dropna()) for c in df.columns}
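As a quick illustration on the sample data from the question, the dict approach might look like the sketch below. The variable name columns_without_nan is made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_1': [10, 22, 3, 4, 5, 6, 7, 8, 12],
    'Col_2': [5, 7, 9, np.nan, np.nan, 4, 6, 10, np.nan],
    'Col_3': [2, 7, 5, np.nan, np.nan, np.nan, 7, np.nan, 1],
})

# One plain list per column, with NaN entries dropped.
columns_without_nan = {c: df[c].dropna().tolist() for c in df.columns}

for name, values in columns_without_nan.items():
    print(name, len(values), values)
```

Each column keeps its own length here, so nothing has to be padded and there is no shared index to misread.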

You can also use pd.concat on a list of Series.

Note that Col_2 and Col_3 are unavoidably float because of the NaN elements, unless you fall back to dtype=object.

res = pd.concat([df[x].dropna().reset_index(drop=True) for x in df], axis=1)

print(res)

   Col_1  Col_2  Col_3
0     10    5.0    2.0
1     22    7.0    7.0
2      3    9.0    5.0
3      4    4.0    7.0
4      5    6.0    1.0
5      6   10.0    NaN
6      7    NaN    NaN
7      8    NaN    NaN
8     12    NaN    NaN
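If the float columns are a nuisance, one possible workaround is pandas' nullable integer dtype Int64. This is a suggestion added here, not something the answer proposes, and it assumes every remaining value is a whole number. A minimal sketch on the question's sample data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_1': [10, 22, 3, 4, 5, 6, 7, 8, 12],
    'Col_2': [5, 7, 9, np.nan, np.nan, 4, 6, 10, np.nan],
    'Col_3': [2, 7, 5, np.nan, np.nan, np.nan, 7, np.nan, 1],
})

# Drop NaN per column, renumber each column from 0, then align side by side.
res = pd.concat([df[x].dropna().reset_index(drop=True) for x in df], axis=1)

# Nullable integers: missing entries are shown as <NA> instead of forcing float.
res_int = res.astype('Int64')
print(res_int)
```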

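You can also try this (note that it drops every row containing a NaN in any column, like dropna() with no arguments):

censos_data.dropna(subset=censos_data.columns, inplace=True)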
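To see what this call does on the question's sample data, here is a small sketch. It uses df in place of censos_data and returns a new frame rather than modifying in place; both are choices made for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Col_1': [10, 22, 3, 4, 5, 6, 7, 8, 12],
    'Col_2': [5, 7, 9, np.nan, np.nan, 4, 6, 10, np.nan],
    'Col_3': [2, 7, 5, np.nan, np.nan, np.nan, 7, np.nan, 1],
})

# With every column listed in `subset`, a row is dropped as soon as
# any of its values is NaN.
filtered = df.dropna(subset=list(df.columns))
print(filtered)
```

Only the rows that are complete across all three columns survive, so values such as 4, 10 and 1 in the partially filled rows are lost. That differs from the per-column result the question asks for.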
