How to use numpy digitize for concatenated values that fall outside of range?

I am trying to use digitize via the numpy module to assist in maintaining a gradebook. The idea is to input the total points earned by a student in a classroom such that the output is the corresponding letter grade. My attempt is below:

import numpy as np
from collections import OrderedDict

## letter grades and points at cusps of letter grades
letter_grades = np.array(['F', 'D-', 'D', 'D+', 'C-', 'C', 'C+', 'B-', 'B', 'B+', 'A-', 'A'])
point_edges = np.concatenate((np.linspace(101, 153, len(letter_grades)), [10**3]))
point_edges[0] = 0

## each letter grade corresponds to point values within the two corresponding point edges
edge_pairs = np.array([('{} - {}'.format(point_edges[idx-1], point_edges[idx])) for idx in range(1, len(point_edges))])
criteria = OrderedDict(zip(letter_grades, edge_pairs))
# print(criteria)

## sample data (the top one works, the one below throws an error)
# point_scores = (0, 100, 100.9, 101, 101.1, 136)
point_scores = (0, 100, 100.9, 101, 101.1, 136, 146, 150, 152, 153, 154)

## use numpy to get result
indices = np.digitize(point_scores, point_edges)
final_grades = letter_grades[indices]

for point, grade in zip(point_scores, final_grades):
    print("\n .. {} POINTS :: {}\n".format(point, grade))

Running the above code outputs the following error:

IndexError: index 12 is out of bounds for axis 0 with size 12

I made 1000 the last element of point_edges so that any input value greater than 153 would output 'A' (as seen in the print(criteria) statement commented out above). However, the algorithm only works for input values strictly less than 153. Why is this happening and how can I fix it?

np.digitize uses a different numbering than np.histogram to indicate values that fall outside the boundaries.

From the docs:

If values in x are beyond the bounds of bins, 0 or len(bins) is returned as appropriate.
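To make that convention concrete, here is a small sketch with made-up edges (the bins and x values are illustrative only, not taken from the question). With the default right=False, an index i means bins[i-1] <= x < bins[i]:

```python
import numpy as np

# Three edges define two valid bins: [0, 10) and [10, 20).
bins = np.array([0, 10, 20])
x = np.array([-5, 0, 9.9, 10, 19.9, 20, 25])

# 0 means "below bins[0]", len(bins) means "at or above bins[-1]";
# valid bins are numbered starting at 1, not 0.
idx = np.digitize(x, bins)
print(idx)  # [0 1 1 2 2 3 3]
```

So the returned indices run one higher than the zero-based positions you would use to index a labels array.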

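Your point_edges has 13 entries, so np.digitize returns 1 through 12 for values between the edges, 0 for values below point_edges[0], and 13 (len(bins)) for values at or above point_edges[-1]. The index 12 in your case therefore marks the last bin (153 <= x < 1000), but letter_grades only has indices 0 through 11, hence the IndexError. Since index 0 is reserved for values below the lower boundary and index 1 is the first valid bin, subtract 1 from the result before indexing letter_grades; the last bin then maps to index 11 ('A'). This also means the shorter sample only appears to work: it runs without an error, but every grade comes out one letter too high.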
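A minimal sketch of the off-by-one correction applied to the question's data. Two things here are my own assumptions rather than part of the original code: np.append is used to build the edges, and np.clip is added as an optional guard so that scores below 0 or at/above 1000 fall back to 'F' or 'A' instead of raising or wrapping around:

```python
import numpy as np

letter_grades = np.array(['F', 'D-', 'D', 'D+', 'C-', 'C', 'C+',
                          'B-', 'B', 'B+', 'A-', 'A'])

# 12 linspace values plus the 1000 cap give 13 edges, i.e. 12 bins.
point_edges = np.append(np.linspace(101, 153, len(letter_grades)), 10**3)
point_edges[0] = 0

point_scores = np.array([0, 100, 100.9, 101, 101.1, 136, 146, 150, 152, 153, 154])

# digitize numbers valid bins from 1, so shift to zero-based label indices.
indices = np.digitize(point_scores, point_edges) - 1

# Optional guard (an assumption, not in the question): keep out-of-range
# scores inside the valid label indices 0..11.
indices = np.clip(indices, 0, len(letter_grades) - 1)

final_grades = letter_grades[indices]

for point, grade in zip(point_scores, final_grades):
    print("{} POINTS :: {}".format(point, grade))
```

With this shift, 153 and 154 both land on 'A', and the low scores map to 'F' rather than 'D-'.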
