How to get percentage similarity between two float columns of a DataFrame in Python?

Question

This is what I'm imagining. Here's a sample table:

Col A	Col B	Similarity
5.0	5.0	1.000
3.8	2.3	0.700
1.3	6.7	0.300
2.7	8.5	0.350
2.9	2.9	1.000

What algorithm should I use to do this?

Answer 1

If you just need an arbitrary similarity function that returns a value between 0 and 1, this one is very simple but works: divide one column by the other and invert any ratio greater than 1.

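# Ratio of the two columns; it exceeds 1 wherever Col A is larger than Col B.
df['similarity'] = df['Col A'] / df['Col B']
# Invert those ratios so every value falls between 0 and 1.
df.loc[df['similarity'] > 1, 'similarity'] = 1 / df['similarity']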
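For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same ratio idea on the sample data from the question. It assumes both columns hold strictly positive values (a zero or negative entry would need separate handling), and the `mask` variable is only a local name introduced here for readability.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col A": [5.0, 3.8, 1.3, 2.7, 2.9],
    "Col B": [5.0, 2.3, 6.7, 8.5, 2.9],
})

# Ratio of the two columns; values above 1 mean Col A is the larger one.
df["similarity"] = df["Col A"] / df["Col B"]

# Invert only those ratios so every score ends up in (0, 1].
mask = df["similarity"] > 1
df.loc[mask, "similarity"] = 1 / df.loc[mask, "similarity"]

print(df.round(6))
```

In effect this computes smaller value / larger value per row. For the second row it gives roughly 0.605 rather than the 0.700 shown in the question's table, so the numbers in that table are presumably only illustrative.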

Answer 2

You can try something like this:

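# Row-wise maximum of the two value columns only
x = df[['Col A', 'Col B']].max(axis=1)
# Absolute difference between the columns
y = (df['Col A'] - df['Col B']).abs()
df['Similarity'] = 1 - (y / x)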

df:

    Col A   Col B   Similarity
0   5.0     5.0     1.000000
1   3.8     2.3     0.605263
2   1.3     6.7     0.194030
3   2.7     8.5     0.317647
4   2.9     2.9     1.000000
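For positive values, 1 - |a - b| / max(a, b) simplifies to min(a, b) / max(a, b), so this formula and the ratio approach in Answer 1 produce the same numbers. One edge case neither snippet covers is a row where both values are 0, which gives 0/0. Below is a rough sketch of one way to guard against that, assuming non-negative inputs. The helper name `relative_similarity` and the extra all-zero row are made up for illustration and are not part of the original answers.

```python
import numpy as np
import pandas as pd

def relative_similarity(a: pd.Series, b: pd.Series) -> pd.Series:
    """Return 1 - |a - b| / max(a, b), assuming non-negative inputs."""
    diff = (a - b).abs()
    largest = np.maximum(a, b)
    # Avoid 0/0: blank out zero denominators, then score those rows as 1.0,
    # since both values are zero and therefore identical.
    sim = 1 - diff / largest.where(largest != 0)
    return sim.mask(largest == 0, 1.0)

df = pd.DataFrame({
    # Last row (0.0, 0.0) is an illustrative addition, not from the question.
    "Col A": [5.0, 3.8, 1.3, 2.7, 2.9, 0.0],
    "Col B": [5.0, 2.3, 6.7, 8.5, 2.9, 0.0],
})
df["Similarity"] = relative_similarity(df["Col A"], df["Col B"])
print(df)
```

Treating two zeros as fully similar is a design choice; if that does not suit your data, leave those rows as NaN instead of masking them to 1.0.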
