
Compare the values of two columns in a DataFrame

I have a CSV file loaded into a DataFrame like the one below. I'd like to compare the values in two columns and generate a third column that is True where the values are the same and False where they differ. How can I do this with pandas in Python?

one two
1   a
2   b
3   a
4   b
5   5
6   6
7   7
8   8
9   9
10  10

If the values in column two are mixed (strings and ints), a plain comparison is enough:

df['three'] = df.one == df.two
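
As a minimal sketch of the mixed-type case, assuming column two really holds Python integers alongside strings (the frame below is built by hand for illustration and is not read from the CSV in the question):

```python
import pandas as pd

# Hypothetical frame: `two` is an object column mixing str and int values.
df = pd.DataFrame({
    "one": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "two": ["a", "b", "a", "b", 5, 6, 7, 8, 9, 10],
})

# Element-wise comparison: an int and a str are simply unequal,
# while two ints are compared by value.
df["three"] = df["one"] == df["two"]
print(df)
```

Here the first four rows come out False and the remaining six True, because the numbers in column two are already integers.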

If the values are not mixed, meaning the dtype of the first column is int and the second is object holding only strings, convert the second column with to_numeric first. Column one contains no NaN values, so this is safe: to_numeric with the parameter errors='coerce' returns NaN for non-numeric values, and NaN never compares equal to an integer:

print (pd.to_numeric(df.two, errors='coerce'))
0     NaN
1     NaN
2     NaN
3     NaN
4     5.0
5     6.0
6     7.0
7     8.0
8     9.0
9    10.0
Name: two, dtype: float64

df['three'] = df.one == pd.to_numeric(df.two, errors='coerce')
print (df)
   one two  three
0    1   a  False
1    2   b  False
2    3   a  False
3    4   b  False
4    5   5   True
5    6   6   True
6    7   7   True
7    8   8   True
8    9   9   True
9   10  10   True
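
To see why the conversion matters when the data comes from a file, here is a rough, self-contained sketch. It assumes the CSV is whitespace-separated and uses an in-memory string as a stand-in for the real file; the names `raw` and `converted` are only illustrative.

```python
import io
import pandas as pd

# Stand-in for the CSV file from the question (assumed whitespace-separated).
raw = """one two
1 a
2 b
3 a
4 b
5 5
6 6
7 7
8 8
9 9
10 10
"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# `one` parses as integers; `two` stays text because it contains letters.
print(df.dtypes)

# Even the digit-only cells of `two` are strings, and 5 == "5" is False in Python.
print(type(df.loc[4, "two"]))

# Converting `two` first turns the letters into NaN, which never equals a number.
converted = df["one"] == pd.to_numeric(df["two"], errors="coerce")
print(converted)
```

The type check on a single cell shows the reason a direct comparison would not match the numeric rows: the parsed values are strings, not integers.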

A possibly faster solution with Series.eq, the method form of ==:

df['three'] = df.one.eq(pd.to_numeric(df.two, errors='coerce'))
print (df)
   one two  three
0    1   a  False
1    2   b  False
2    3   a  False
3    4   b  False
4    5   5   True
5    6   6   True
6    7   7   True
7    8   8   True
8    9   9   True
9   10  10   True
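
As a small check that Series.eq and == agree, the sketch below compares both spellings on a shortened, made-up sample (the Series `one` and `two` here are illustrative, not the full data from the question):

```python
import pandas as pd

one = pd.Series([1, 2, 5, 6])
# Strings that are not numbers become NaN under errors="coerce".
two = pd.to_numeric(pd.Series(["a", "b", "5", "6"]), errors="coerce")

via_operator = one == two
via_method = one.eq(two)

# Both spellings give the same boolean Series.
print(via_operator.equals(via_method))

# The method form chains without extra parentheses, e.g. to count matching rows.
print(one.eq(two).sum())
```

Which form to use is mostly a matter of style; the method form tends to read more cleanly inside a longer chain of calls.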

