
In a pandas DataFrame, how do I select the rows in which all column values are the same?

I have a DataFrame in the following format:

Name  A1   A2  A3  A4
def   0    0   0   0
def1  0    1   0   0
def2  0    0   0   0
def3  1    0   0   0
def4  0    0   0   0

Expected output:

Name  A1   A2  A3  A4
def   0    0   0   0
def2  0    0   0   0
def4  0    0   0   0

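If Name is the first column, select the remaining columns with DataFrame.iloc, compare each of them with the first of those columns using DataFrame.eq, and keep the rows where every comparison is True with DataFrame.all: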

df1 = df.iloc[:, 1:]
#if `Name` can be at any position, drop it by label instead
#df1 = df.drop('Name', axis=1)

df = df[df1.eq(df1.iloc[:, 0], axis=0).all(axis=1)]
print (df)
   Name  A1  A2  A3  A4
0   def   0   0   0   0
2  def2   0   0   0   0
4  def4   0   0   0   0
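
For readers who want to reproduce this, here is a minimal, self-contained sketch of the same comparison. The frame is rebuilt by hand from the question's sample data, and the names `values`, `mask` and `result` are only illustrative.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "Name": ["def", "def1", "def2", "def3", "def4"],
    "A1": [0, 0, 0, 1, 0],
    "A2": [0, 1, 0, 0, 0],
    "A3": [0, 0, 0, 0, 0],
    "A4": [0, 0, 0, 0, 0],
})

# Value columns only (everything except `Name`).
values = df.drop(columns="Name")

# Compare every value column with the first one, row by row,
# then require all comparisons in a row to be True.
mask = values.eq(values.iloc[:, 0], axis=0).all(axis=1)

result = df[mask]
print(result)
```

Keeping the boolean `mask` in its own variable makes it easy to inspect which rows were rejected (`df[~mask]`) before filtering.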

If Name is index:

print (df)
      A1  A2  A3  A4
Name                
def    0   0   0   0
def1   0   1   0   0
def2   0   0   0   0
def3   1   0   0   0
def4   0   0   0   0


df = df[df.eq(df.iloc[:, 0], axis=0).all(axis=1)]
print (df)
      A1  A2  A3  A4
Name                
def    0   0   0   0
def2   0   0   0   0
def4   0   0   0   0

If performance is not important (this is slow on a large DataFrame), use DataFrame.nunique :

df = df[df.nunique(axis=1).eq(1)]

Using pandas.DataFrame.nunique with axis=1 :

df.set_index("Name").nunique(axis=1).eq(1)

Output:

Name
def      True
def1    False
def2     True
def3    False
def4     True
dtype: bool
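
One difference between the nunique and eq approaches shows up when rows contain missing values. The sketch below uses a small made-up frame (not the question's data) to illustrate it: by default nunique ignores NaN, whereas eq treats NaN as unequal to everything, including itself.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one constant row, one row with a gap, one all-NaN row.
df = pd.DataFrame(
    {
        "A1": [0.0, 1.0, np.nan],
        "A2": [0.0, np.nan, np.nan],
        "A3": [0.0, 1.0, np.nan],
    },
    index=pd.Index(["same", "with_nan", "all_nan"], name="Name"),
)

# nunique drops NaN by default, so "with_nan" counts as one unique value
# and "all_nan" counts as zero.
by_nunique = df.nunique(axis=1).eq(1)

# With dropna=False, NaN is counted as a value of its own.
by_nunique_keep_nan = df.nunique(axis=1, dropna=False).eq(1)

# eq compares element-wise; NaN never compares equal.
by_eq = df.eq(df.iloc[:, 0], axis=0).all(axis=1)

print(pd.DataFrame({
    "nunique": by_nunique,
    "nunique_dropna_False": by_nunique_keep_nan,
    "eq_all": by_eq,
}))
```

Which behaviour is right depends on whether a missing value should count as "different"; the question's data has no NaN, so all three agree there.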

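An alternative approach is to check the variance of each row, which is zero when all values in the row are equal (numeric_only=True skips the Name column):

df[df.var(axis=1, numeric_only=True) == 0]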
Name  A1   A2  A3  A4
def   0    0   0   0
def2  0    0   0   0
def4  0    0   0   0
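
A runnable sketch of the variance approach on the question's data follows. Passing `numeric_only=True` is an assumption made here so that the string column `Name` is skipped; without it, recent pandas versions may raise an error on the mixed-type rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["def", "def1", "def2", "def3", "def4"],
    "A1": [0, 0, 0, 1, 0],
    "A2": [0, 1, 0, 0, 0],
    "A3": [0, 0, 0, 0, 0],
    "A4": [0, 0, 0, 0, 0],
})

# Row-wise sample variance over the numeric columns only.
row_var = df.var(axis=1, numeric_only=True)

# A row whose values are all equal has zero variance.
result = df[row_var.eq(0)]
print(result)
```

This only works for numeric columns, and with floating-point data an exact comparison with zero can be fragile because of rounding, so the eq or nunique approaches are usually the safer choice.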

Using the drop method on your DataFrame, you can delete entire rows by their index labels (this assumes Name is the index):

data.drop(["def1", "def3"], inplace = False)

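The first argument is a list of index labels. The inplace argument modifies the original DataFrame if True; with inplace=False (the default), a new DataFrame is returned and the original is left unchanged.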

To learn more, see the pandas DataFrame documentation.
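
Hard-coding the labels only works when the rows to remove are known in advance. As a rough sketch, the labels can instead be computed and then passed to drop; the frame below is rebuilt from the question with Name as the index, and `to_drop` is an illustrative name.

```python
import pandas as pd

data = pd.DataFrame(
    {
        "A1": [0, 0, 0, 1, 0],
        "A2": [0, 1, 0, 0, 0],
        "A3": [0, 0, 0, 0, 0],
        "A4": [0, 0, 0, 0, 0],
    },
    index=pd.Index(["def", "def1", "def2", "def3", "def4"], name="Name"),
)

# Labels of the rows whose values are not all equal.
to_drop = data.index[data.nunique(axis=1).ne(1)]

# drop returns a new DataFrame; `data` itself is not modified.
result = data.drop(to_drop)
print(result)
```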
