How to convert a list of ndarray of strings into floats

How could one map a list of ndarrays containing string objects to specific floats? For instance, suppose the user decides to map orange to 1.0 and grapefruit to 2.0:

myList = [np.array([['orange'], ['orange'], ['grapefruit']], dtype=object), np.array([['orange'], ['grapefruit'], ['orange']], dtype=object)] 

So one would have:

convList = [np.array([[1.0], [1.0], [2.0]], dtype=float), np.array([[1.0], [2.0], [1.0]], dtype=float)]

I tried to implement this function:

def map_str_to_float(iterator):
    d = {}
    for ndarr in iterator:
        for string_ in ndarr:
            d[string_] = float(input('Enter your map for {}: '.format(string_)))
    return d

test = map_str_to_float(myList)
print(test)

But I get the following error:

d[string_] = float(input('Enter your map for {}: '.format(string_)))
TypeError: unhashable type: 'numpy.ndarray'

I believe it's because the type of string_ is a numpy array instead of a string...

With that nested loop you will ask the user for input 6 times, even though there are only 2 distinct values (grapefruit and orange). I would suggest getting the unique values first and asking only for those.

To do so:

unique_values = np.unique(np.array(myList))
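As a quick check (this assumes all arrays in the list have the same shape, so `np.array(myList)` can stack them):

```python
import numpy as np

myList = [np.array([['orange'], ['orange'], ['grapefruit']], dtype=object),
          np.array([['orange'], ['grapefruit'], ['orange']], dtype=object)]

# np.unique flattens the stacked array and returns the sorted unique values
unique_values = np.unique(np.array(myList))
print(unique_values)  # ['grapefruit' 'orange']
```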

Now ask the user for a number for each unique value:

d = {}

for unique_value in unique_values:
    d[unique_value] = float(input(f"give me a number for {unique_value} ")) 

Now you have your map in the variable d .
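Once d is filled, applying it back to the original arrays could look like this (a sketch with a hard-coded mapping standing in for the interactive input()):

```python
import numpy as np

myList = [np.array([['orange'], ['orange'], ['grapefruit']], dtype=object),
          np.array([['orange'], ['grapefruit'], ['orange']], dtype=object)]

# Suppose the user entered these numbers at the prompts
d = {'orange': 1.0, 'grapefruit': 2.0}

# d.get is applied element-wise; the (3, 1) shape of each array is preserved
convList = [np.vectorize(d.get)(arr).astype(float) for arr in myList]
print(convList[0].tolist())  # [[1.0], [1.0], [2.0]]
```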

Update after a comment

Then you can write your own unique method. Note that the code below collects all unique values no matter how long the arrays are, as long as each row holds a single value (as in your data):

unique_values = []
for each_ndarray in myList:
    for value in each_ndarray:
        if value[0] not in unique_values:
            unique_values.append(value[0])
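Running this on the example list yields the values in first-seen order (note the order differs from np.unique, which sorts):

```python
import numpy as np

myList = [np.array([['orange'], ['orange'], ['grapefruit']], dtype=object),
          np.array([['orange'], ['grapefruit'], ['orange']], dtype=object)]

unique_values = []
for each_ndarray in myList:
    for value in each_ndarray:
        # value is a length-1 row like ['orange']; value[0] is the string
        if value[0] not in unique_values:
            unique_values.append(value[0])

print(unique_values)  # ['orange', 'grapefruit']
```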

As for the error: on debugging, string_ is a one-element array ( ['orange'] ), and a numpy array is unhashable, so it can't be used as a dictionary key.

As for how to convert a list of ndarrays of strings into floats: we use indices. Get the index of each string, then use those indices to pull the required new values out in the same order. Basically, np.array([1, 2])[[0, 1, 0, 0]] gives a new array of size 4 with entries taken in the order of the indices. The same logic applies here and skips the Python-level dictionary mapping: the mapping happens through indexing in C, so it should be fast.

The comments explain what happens:

import numpy as np

dataSet = np.array(['kevin', 'greg', 'george', 'kevin'], dtype='U21')

# Get all the unique strings, and their indices
# Values of indices are based on uniques ordering
uniques, indices = np.unique(dataSet, return_inverse=True)
# >>> uniques
# array(['george', 'greg', 'kevin'], dtype='<U21')
# >>> indices
# array([2, 1, 0, 2])
# Original array
# >>> uniques[indices]
# array(['kevin', 'greg', 'george', 'kevin'], dtype='<U21')

new_indices = np.array([float(input(f'Enter a number for {e}: ')) for e in uniques])

# Get new indices indexed using original positions of unique strings in numpy array
print(new_indices[indices])

# You can do the same for multi dimensional arrays
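Putting this together for the original myList could look like the sketch below; the hard-coded values array stands in for the interactive input(), ordered to match the sorted uniques:

```python
import numpy as np

myList = [np.array([['orange'], ['orange'], ['grapefruit']], dtype=object),
          np.array([['orange'], ['grapefruit'], ['orange']], dtype=object)]

# Flatten all arrays into one 1-D array so the uniques are shared
flat = np.concatenate([arr.ravel() for arr in myList])
uniques, indices = np.unique(flat, return_inverse=True)
# uniques: ['grapefruit' 'orange']; indices refer into it

# Numbers the user would have entered, in the order of `uniques`
values = np.array([2.0, 1.0])  # grapefruit -> 2.0, orange -> 1.0

mapped = values[indices]  # one indexing operation maps every element

# Split back into the original per-array shapes
splits = np.cumsum([arr.size for arr in myList])[:-1]
convList = [chunk.reshape(arr.shape)
            for chunk, arr in zip(np.split(mapped, splits), myList)]
print(convList[0].tolist())  # [[1.0], [1.0], [2.0]]
```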

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
