How to compare values in pandas pivot_table with different indices?

Question

Pivot Table:

COURSE          ENGLISH       MATH       ART
STUDENT
StudentA        95.0          83.0       97.0
StudentB        91.0          93.0       47.0
StudentC        85.0          84.0       92.0
StudentD        97.0          84.0       85.0
StudentE        93.0          88.0       85.0
StudentAvg      94.5          83.7       96.9

I want a list of students who have a grade more than 5% lower than StudentAvg by subject. So in this case I'd want something like:

English: StudentC
Math:
Art: StudentB, StudentD, StudentE

How can I do this in Pandas?

Answer 1

This returns a list of (student, subject) tuples, one for each grade that is more than 5% below the average for that subject.

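import numpy as np

# Row of per-subject averages; dividing aligns it with the columns of every student row
avg = df.loc['StudentAvg', :]
i, j = np.where(((df / avg) - 1) < -.05)
list(zip(df.index[i], df.columns[j]))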

[('StudentB', 'ART'),
 ('StudentC', 'ENGLISH'),
 ('StudentC', 'ART'),
 ('StudentD', 'ART'),
 ('StudentE', 'ART')]

We can speed this up a bit by working on the underlying NumPy array instead of the DataFrame:

p = df.index.get_loc('StudentAvg')
v = df.values
i, j = np.where(((v / v[p]) - 1) < -.05)
list(zip(df.index[i], df.columns[j]))

[('StudentB', 'ART'),
 ('StudentC', 'ENGLISH'),
 ('StudentC', 'ART'),
 ('StudentD', 'ART'),
 ('StudentE', 'ART')]
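
To get from these tuples to the per-subject listing asked for in the question, the pairs can be grouped by column. The following is only a sketch: it rebuilds the table from the values posted in the question, and the name `by_subject` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the pivot table from the question (values copied from the post).
df = pd.DataFrame(
    {
        "ENGLISH": [95.0, 91.0, 85.0, 97.0, 93.0, 94.5],
        "MATH": [83.0, 93.0, 84.0, 84.0, 88.0, 83.7],
        "ART": [97.0, 47.0, 92.0, 85.0, 85.0, 96.9],
    },
    index=pd.Index(
        ["StudentA", "StudentB", "StudentC", "StudentD", "StudentE", "StudentAvg"],
        name="STUDENT",
    ),
)
df.columns.name = "COURSE"

# Same test as above: relative difference to the average row below -5%.
avg = df.loc["StudentAvg"]
i, j = np.where(((df / avg) - 1).to_numpy() < -0.05)

# Group flagged students by subject; subjects with no flags keep an empty list.
by_subject = {course: [] for course in df.columns}
for student, course in zip(df.index[i], df.columns[j]):
    by_subject[course].append(student)

for course, students in by_subject.items():
    print(f"{course}: {', '.join(students)}")
```

Starting from a dict with every subject as a key keeps subjects such as MATH in the listing even when no student is flagged.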

Timing

%%timeit
p = df.index.get_loc('StudentAvg')
v = df.values
i, j = np.where(((v / v[p]) - 1) < -.05)
list(zip(df.index[i], df.columns[j]))
10000 loops, best of 3: 41.7 µs per loop

%%timeit
avg = df.loc['StudentAvg', :]
i, j = np.where(((df / avg) - 1) < -.05)
list(zip(df.index[i], df.columns[j]))
1000 loops, best of 3: 662 µs per loop

Answer 2

EDIT: to get the per-subject listing asked for in the question:

df.apply(lambda x: str(x.name)+ ': ' + ', '.join(df[((x-x.loc['StudentAvg'])/x.loc['StudentAvg']*100<-5.0)].index.tolist())).values.tolist()

Output:

['ENGLISH: StudentC', 'MATH: ', 'ART: StudentB, StudentC, StudentD, StudentE']

To get just the students who are more than 5% below the average in at least one subject, build a boolean mask:

mask = df.apply(lambda x: (x-x.loc['StudentAvg'])/x.loc['StudentAvg']*100<-5.0).any(axis=1)
df[mask].index.tolist()

Output:

['StudentB', 'StudentC', 'StudentD', 'StudentE']
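
The same two results can probably be had without `apply`, by comparing every grade against 95% of its subject average in one step. This is a sketch, not part of the original answer: it assumes the averages are positive (so `x < 0.95 * avg` is equivalent to the percentage test above), it rebuilds the table from the question's values, and the names `below`, `flagged` and `any_subject` are made up for illustration.

```python
import pandas as pd

# Rebuild the pivot table from the question (values copied from the post).
df = pd.DataFrame(
    {
        "ENGLISH": [95.0, 91.0, 85.0, 97.0, 93.0, 94.5],
        "MATH": [83.0, 93.0, 84.0, 84.0, 88.0, 83.7],
        "ART": [97.0, 47.0, 92.0, 85.0, 85.0, 96.9],
    },
    index=pd.Index(
        ["StudentA", "StudentB", "StudentC", "StudentD", "StudentE", "StudentAvg"],
        name="STUDENT",
    ),
)

avg = df.loc["StudentAvg"]
students = df.drop(index="StudentAvg")  # keep only the real students

# True where a grade is more than 5% below that subject's average.
below = students.lt(avg * 0.95, axis="columns")

# Per-subject listing, as asked for in the question.
flagged = {course: below.index[below[course]].tolist() for course in below.columns}

# Students flagged in at least one subject.
any_subject = below.index[below.any(axis=1)].tolist()

print(flagged)
print(any_subject)
```

Dropping the `StudentAvg` row first is a design choice: it can never be flagged against itself, but leaving it out makes the intent clearer.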
