
How to delete only '0' from pandas dataframe

If my data looks like this:

0    1    2    3
1    0   19   12
2    5    0   13
3    6   21    0
4    7    4   15

How can I just remove the '0'?

I don't want to delete the entire row or column, just the value.

I want it to look like this:

0    1    2    3
1    5   19   12
2    6   21   13
3    7    4   15
 

I'm reading the dataframe into pandas from a csv file. This is where I read in the dataframe.

import pandas as pd

df = pd.read_csv('Plate0-s4_R1.csv')
df.fillna(0, inplace=True)

This is where I'm doing the minimum calculations.

df = df[df.columns].groupby(lambda x: x, axis=1).min()
df = df.groupby(df[0]).min()
df = df.reset_index()

The other solutions I found on the internet involve dropping entire columns or rows. I don't want to lose the other data in those columns and rows; I just don't want the 0 to interfere with my min calculation.

For further clarification, this is how my dataframe looks at the moment:

df

The '0.0' is what I need to get rid of.

If you're just trying to get the min, as per the comments above, replace all zeros with np.nan. min() skips NaN values by default, so they no longer affect the result.

Original df (from your post above):

df=pd.DataFrame({'1':[0,5,6,7],'2':[19,0,21,4],'3':[12,13,0,15]})

Temp df for min calc:

df_temp=df.replace(0,np.nan)

Then you can get the min of each column and save the result to a new Series:

df_min=df_temp.min()
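
Putting those three steps together, here is a minimal runnable sketch. It assumes the sample data from the question; the names col_min and row_min are only illustrative, and the row-wise variant is an addition in case the minimum is needed per row rather than per column.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'1': [0, 5, 6, 7], '2': [19, 0, 21, 4], '3': [12, 13, 0, 15]})

# Zeros become NaN in a temporary copy; df itself is left unchanged.
df_temp = df.replace(0, np.nan)

# min() skips NaN by default (skipna=True).
col_min = df_temp.min()          # minimum of each column
row_min = df_temp.min(axis=1)    # minimum of each row

print(col_min)
print(row_min)
```

Note that the columns of df_temp become float, because NaN cannot be stored in an integer column. If the zeros in your real data only come from the fillna(0) call, skipping that call may be enough, since the missing values would then already be NaN.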

If the table above is stored in a string df_str, you can do:

import io
import pandas as pd

df = pd.read_csv(io.StringIO(df_str), engine='python', sep=r'\s+', index_col=0)
df.apply(lambda x: x[x != 0].reset_index(drop=True))

and it gives you

   1   2   3
0  5  19  12
1  6  21  13
2  7   4  15
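
A caveat worth checking before relying on this: the filtered columns only line up when every column contains the same number of zeros. The sketch below builds the frame directly instead of parsing df_str; the uneven frame is a made-up example, not data from the question.

```python
import pandas as pd

# Sample data from the question: exactly one zero per column.
df = pd.DataFrame({'1': [0, 5, 6, 7], '2': [19, 0, 21, 4], '3': [12, 13, 0, 15]})

# Every filtered column has 3 values, so the result is a clean 3x3 frame.
compact = df.apply(lambda col: col[col != 0].reset_index(drop=True))

# Hypothetical uneven case: column 'a' has one zero, column 'b' has none.
uneven = pd.DataFrame({'a': [0, 1, 2], 'b': [3, 4, 5]})
padded = uneven.apply(lambda col: col[col != 0].reset_index(drop=True))

# 'a' is one value short, so pandas fills the gap with NaN when it
# aligns the columns on their index.
print(compact)
print(padded)
```

If the zero counts differ between columns, the shorter columns end up padded with NaN rather than the frame shrinking evenly.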

You could sort the values using equality to zero as key. This will move the zeros to the bottom while keeping the other values in order (stable sort).

Then slice the dataframe to remove the last N rows, where N is the max number of zeros in a column.

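import pandas as pd

df = pd.DataFrame({1: [0, 5, 6, 7],
                   2: [19, 0, 21, 4],
                   3: [12, 13, 0, 15]
                   })
# kind='stable' keeps the non-zero values in their original order
# (the default quicksort does not guarantee this)
(df.apply(lambda c: c.sort_values(key=lambda x: x == 0, kind='stable').reset_index(drop=True))
   .iloc[:-df.eq(0).sum().max()]
 )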

Output:

   1   2   3
0  5  19  12
1  6  21  13
2  7   4  15
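
One edge case with this approach: if no column contains a zero, N is 0 and .iloc[:-0] returns an empty frame. A possible way to guard against that is sketched below; drop_zeros_by_sorting is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def drop_zeros_by_sorting(df: pd.DataFrame) -> pd.DataFrame:
    """Push zeros to the bottom of each column, then trim the trailing rows."""
    n = int(df.eq(0).sum().max())  # largest number of zeros in any column
    if n == 0:
        # iloc[:-0] would select nothing, so return the data untouched
        return df.copy()
    pushed = df.apply(
        lambda c: c.sort_values(key=lambda x: x == 0, kind='stable')
                   .reset_index(drop=True)
    )
    return pushed.iloc[:-n]

df = pd.DataFrame({1: [0, 5, 6, 7], 2: [19, 0, 21, 4], 3: [12, 13, 0, 15]})
result = drop_zeros_by_sorting(df)

# Frame without any zeros: returned with its shape intact.
no_zeros = drop_zeros_by_sorting(pd.DataFrame({1: [1, 2], 2: [3, 4]}))
print(result)
```

Because the slice removes N rows from every column, columns with fewer than N zeros would also lose real values, so this is best suited to data where each column has the same number of zeros.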

