
Element-wise comparison of numpy array

I have a question about the table below. How can I compare each row with every other row in the array, excluding the row itself?

For example, I want to compare the first row with each of the remaining rows to check whether every value in the first row is greater than the corresponding value in that row.

Comparing the first row with rows 1, 2, and 3 in turn:

5>2, 8>9, 9>5: not a match, because 8>9 is false
5>4, 8>5, 9>11: not a match either, because 9>11 is false
5>3, 8>7, 9>8: all true, so this should be the final answer

import pandas as pd

a=[5,8,9]
b=[2,9,5]
c=[4,5,11]
d=[3,7,8]
df = pd.DataFrame({"c1":[5,2,4,3], "c2":[8,9,5,7], "c3":[9,5,11,8]})

#    c1  c2  c3
# 0   5   8   9
# 1   2   9   5
# 2   4   5  11
# 3   3   7   8

What Python code would produce this result?

I've tried many approaches without success, so I'd appreciate help from anyone who knows how to do this.

You could use a simple function to compare two items, whether they are separate lists or rows of a numpy array:

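def cmp(i, j):
    # Return True only if every value in i is strictly greater
    # than the value at the same position in j.
    for x, y in zip(i, j):
        if x <= y:
            return False
    return True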

So, for your lists:

print(cmp(a, c))  # False
print(cmp(a, d))  # True
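
The same check can also be written without an explicit loop. The following is a minimal sketch, assuming the four lists from the question are stacked into one array; the names `rows`, `first_is_greater`, and `matches` are illustrative and do not come from the original post.

```python
import numpy as np

a = [5, 8, 9]
b = [2, 9, 5]
c = [4, 5, 11]
d = [3, 7, 8]

rows = np.array([a, b, c, d])

# rows[0] has shape (3,) and rows[1:] has shape (3, 3), so broadcasting
# compares the first row with each remaining row, column by column.
first_is_greater = np.all(rows[0] > rows[1:], axis=1)

# Positions are relative to rows[1:], so add 1 to get indices into `rows`.
matches = np.flatnonzero(first_is_greater) + 1

print(first_is_greater.tolist())  # [False, False, True]
print(matches.tolist())           # [3]
```

Here index 3 corresponds to list `d`, which matches the answer expected in the question.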

You can do this, for example (disclaimer for mathematicians: I use 'not comparable' in a very loose sense :) ):

import numpy as np
    
a=[5,8,9]
b=[2,9,5]
c=[4,5,11]
d=[3,7,8]

ax = np.array([a, b, c, d])

def compare(l1,l2):
    if all([x1>x2 for x1,x2 in zip(l1,l2)]):
        return f'{l1} > {l2}'
    elif all([x1<x2 for x1,x2 in zip(l1,l2)]):
        return f'{l1} < {l2}'
    else:
        return f'{l1} and {l2} are not comparable'
    
for i in range(len(ax)):
    print([compare(ax[i],ax[j]) for j in range(len(ax)) if j!=i])

Output:

# ['[5 8 9] and [2 9 5] are not comparable', '[5 8 9] and [ 4  5 11] are not comparable', '[5 8 9] > [3 7 8]']
# ['[2 9 5] and [5 8 9] are not comparable', '[2 9 5] and [ 4  5 11] are not comparable', '[2 9 5] and [3 7 8] are not comparable']
# ['[ 4  5 11] and [5 8 9] are not comparable', '[ 4  5 11] and [2 9 5] are not comparable', '[ 4  5 11] and [3 7 8] are not comparable']
# ['[3 7 8] < [5 8 9]', '[3 7 8] and [2 9 5] are not comparable', '[3 7 8] and [ 4  5 11] are not comparable']
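
If only the pairs where one row is greater everywhere are of interest, the double loop can be replaced by a single broadcast comparison. This is a sketch under the same data as above; `dominates` and `pairs` are hypothetical names introduced for illustration.

```python
import numpy as np

ax = np.array([[5, 8, 9], [2, 9, 5], [4, 5, 11], [3, 7, 8]])

# ax[:, None, :] has shape (4, 1, 3) and ax[None, :, :] has shape (1, 4, 3).
# dominates[i, j] is True when every value in row i is strictly greater
# than the matching value in row j. The diagonal is False because no row
# is strictly greater than itself.
dominates = (ax[:, None, :] > ax[None, :, :]).all(axis=2)

# Each pair [i, j] means "row i > row j" element-wise.
pairs = np.argwhere(dominates).tolist()
print(pairs)  # [[0, 3]]
```

The intermediate array has one entry per pair of rows per column, so this trades memory for speed and is probably best suited to arrays with a modest number of rows.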

First step: create a second dataframe with the same dimensions as the first, filled only with the values of the first row

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'c1':[5,2,4,3], 'c2':[8,9,5,7], 'c3':[9,5,11,8]})

# df2: start with only the first row of df1
df2 = df1[:1]
# Then repeat that row once for each row of df1
df2 = df2.loc[np.repeat(df2.index.values, len(df1))]
# Give df2 the same index as df1
df2.index = df1.index

df2:

   c1  c2  c3
0   5   8   9
1   5   8   9
2   5   8   9
3   5   8   9

Second step: difference between the two dataframes

# difference between df1 and df2
df = df1-df2

Third step: query df for the row in which every difference is negative, and get its row number (which is 3, corresponding to 'd')

r = df.query('c1<0 and c2<0 and c3<0')
print(r.index[0])

Complete code:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'c1':[5,2,4,3], 'c2':[8,9,5,7], 'c3':[9,5,11,8]})

# df2: start with only the first row of df1
df2 = df1[:1]
# Then repeat that row once for each row of df1
df2 = df2.loc[np.repeat(df2.index.values, len(df1))]
# Give df2 the same index as df1
df2.index = df1.index
# difference between df1 and df2
df = df1-df2

r = df.query('c1<0 and c2<0 and c3<0')
print(df1.iloc[r.index[0]])

Output:

# c1    3
# c2    7
# c3    8
# Name: 3, dtype: int64
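
The three steps above can also be sketched more compactly by comparing the remaining rows with the first row directly, without building the repeated dataframe. This is an alternative formulation rather than the method described above, and it assumes the same df1; the names `first`, `others`, `smaller_everywhere`, and `result` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({"c1": [5, 2, 4, 3], "c2": [8, 9, 5, 7], "c3": [9, 5, 11, 8]})

first = df1.iloc[0]    # Series indexed by column name: c1, c2, c3
others = df1.iloc[1:]  # every row except the first

# Comparing a DataFrame with a Series aligns the Series index with the
# DataFrame columns, so each row of `others` is compared with `first`
# column by column.
smaller_everywhere = (others < first).all(axis=1)

result = others[smaller_everywhere]
print(result.index.tolist())  # [3]
```

Unlike the query string, this does not hard-code the column names, so it should carry over to dataframes with a different number of columns.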
