
Numpy - How can I transform this array without using python loops?

My input is a list of y_true labels, where the element at position i is a value in the range 0..len(classes)-1 and gives the true class of that element of the data set. i ranges from 0 to len(data)-1. Example below:

# 5 elements in data, 3 classes, all of which had representation in the data:
y_true = [0,2,1,0,1]

I want my output to be a list of lists that is len(data) by len(classes) , where inner list i would have a 1 in the position of y_true[i], and 0 in the other len(classes)-1 slots, example:

#same configuration as the previous example
y_true = [0,2,1,0,1]  
result = [[1,0,0],[0,0,1],[0,1,0],[1,0,0],[0,1,0]]

Here's how I'm initializing result:

result = np.zeros((len(y_true), max(y_true)+1))

However, I haven't been able to make any further progress. I tried np.add.at(result, y_true, 1), and the same call with y_true's shape flipped, but neither produced the result I wanted. What function(s) can achieve what I'm trying to do here?

Edit: For better clarity on what I want to achieve, I made it using a for loop:

result = np.zeros((len(y_true), max(y_true)+1))
for x in range(len(y_true)):
  result[x][y_true[x]] = 1

You can use fancy (integer array) indexing. Pairing each row index with its label selects exactly one element per row:

result = np.zeros((len(y_true), max(y_true)+1), dtype=int)
result[np.arange(len(y_true)), y_true] = 1

output:

array([[1, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [1, 0, 0],
       [0, 1, 0]])
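
As for why np.add.at(result, y_true, 1) did not work: with only y_true as the index, NumPy treats the labels as row indices and adds 1 to whole rows. Passing an index for each axis should give the one-hot result. Another common idiom is to index an identity matrix with the labels. The snippet below is a minimal sketch of both, assuming the labels are non-negative integers and that the largest class appears in y_true; the names n_classes, one_hot_eye and one_hot_add are only illustrative.

```python
import numpy as np

y_true = np.array([0, 2, 1, 0, 1])
# Assumption: the largest label is present, so max + 1 is the class count.
n_classes = y_true.max() + 1

# Row k of the identity matrix is the one-hot vector for class k,
# so indexing with the labels picks one such row per sample.
one_hot_eye = np.eye(n_classes, dtype=int)[y_true]

# np.add.at with a (rows, columns) tuple touches one element per sample,
# instead of the whole rows selected by y_true alone.
one_hot_add = np.zeros((len(y_true), n_classes), dtype=int)
np.add.at(one_hot_add, (np.arange(len(y_true)), y_true), 1)

print(one_hot_eye)
```

If some classes may be missing from y_true, it is safer to set n_classes from the known class list rather than from the maximum label.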

Alternative

An interesting alternative might be to use pandas.get_dummies:

import pandas as pd
result = pd.get_dummies(y_true).to_numpy()

output:

array([[1, 0, 0],
       [0, 0, 1],
       [0, 1, 0],
       [1, 0, 0],
       [0, 1, 0]], dtype=uint8)
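
Note that the dtype of the output depends on the pandas version: older releases return uint8 as shown, while pandas 2.0 and later return bool by default. If you want integers either way, a small sketch using the dtype parameter of get_dummies:

```python
import pandas as pd

y_true = [0, 2, 1, 0, 1]

# dtype=int requests integer 0/1 columns regardless of the pandas default.
result = pd.get_dummies(y_true, dtype=int).to_numpy()

print(result)
```

Keep in mind that get_dummies only creates columns for labels that actually occur in y_true, so a class with no samples would be missing from the result.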

