pandas: How to compare float values of two columns

My problem is that I do not know how to compare the numbers in two different columns of the same dataframe. For each row, I would like to check whether the number in the second column is at least twice the number in the first column, and then filter the rows so that I end up with a dataframe in which every number in the second column is at least twice the number in the first column. At first I did this:

ac = ab.dropna()
ad = ac.drop_duplicates()

There were so many NaN values that I decided to get rid of them.

ad["first column"] = ad["first column"].astype(float)
ad["second column"] = ad["second column"].astype(float)

Even without these lines, I still get the same error in the following step.

Then I tried to take the next step:

boolean = []

def comp(number):
    if ad.loc[:, "first column"] >= ad.loc[:, "second column"]*2:
        boolean.append[True]
    else:
        boolean.append[False]

At first I wrote it as a for loop, but then I changed it to this function so that I could use the apply() method. Either way, I get this error:

ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index Probe Set ID')

You can pull each column out as its own Series and compare the two Series directly; the result is a boolean Series that you can use to filter the dataframe.

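df = pd.DataFrame(... all your data with columns...)
df = df.astype(float)  # convert the whole dataframe to float

firstcol = df['firstcol']
secondcol = df['secondcol']*2  # second column, doubled

# a new Series of True/False values, one per row
# (first column greater than twice the second, the direction used in the question's code)
booleanmatch = firstcol>secondcol

# keep only the rows where the comparison is True
df = df.loc[booleanmatch,:]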

Hope this solves the problem.
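For reference, here is a minimal self-contained sketch of the same boolean-mask idea, written for the goal stated in the question's text (second column at least twice the first). The sample values are made up for illustration and are not the asker's data; the try/except is only there to show where the "truth value of a Series is ambiguous" error comes from.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "first column": [1.0, 2.0, 3.0, 4.0],
    "second column": [2.0, 3.0, 7.5, 8.0],
})

# Comparing two columns produces a Series of booleans, one per row.
mask = df["second column"] >= 2 * df["first column"]

# A plain `if` needs a single True/False value, so using the whole
# Series as a condition raises the ValueError from the question.
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)

# Use the mask to select rows instead of branching on it.
filtered = df[mask]
print(filtered)
```

The mask is applied to all rows at once, so no loop, helper function, or apply() call is needed.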

To compare two columns in a dataframe, you can use DataFrame.query: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.query.html

import pandas as pd
d = {'col1': [1, 2, 6], 'col2': [3, 4, 5]}
df = pd.DataFrame(data=d)
df.query('col1 > col2')
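The query string can also contain arithmetic, so the "at least twice" condition from the question can be written directly. The sketch below assumes a reasonably recent pandas version, where column names containing spaces can be quoted with backticks inside the query string; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the question.
df = pd.DataFrame({
    "first column": [1.0, 2.0, 3.0, 4.0],
    "second column": [2.0, 3.0, 7.5, 8.0],
})

# Backticks quote column names that are not valid Python identifiers.
result = df.query("`second column` >= 2 * `first column`")
print(result)
```

query returns a new dataframe and leaves df unchanged, so assign the result if you want to keep it.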
