
How to convert 2d np.array of lists of floats into a 2d np.array of floats, stacking the list values to rows

I have a huge 2D NumPy array of lists (dtype object) that I want to convert into a 2D NumPy array of dtype float, stacking the dimension represented by the lists onto the 0th axis (rows). The lists within each row always have the exact same length and contain at least one element.

Here is a minimal reproduction of the situation:

import numpy as np
current_array = np.array(
    [[[0.0], [1.0]],
    [[2.0, 3.0], [4.0, 5.0]]],
    dtype=object,  # needed on NumPy >= 1.24, where ragged input otherwise raises ValueError
)
desired_array = np.array(
    [[0.0, 1.0], 
    [2.0, 4.0],
    [3.0, 5.0]]
)

I looked around for solutions: the stack and dstack functions work only if the first level is a tuple, and reshape would require the third level to be part of the array. Is there a relatively efficient way to do this?

Currently, I am just counting the dimensions, creating an empty array, and filling in the values one by one, which honestly does not seem like a good solution.
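For reference, the element-by-element approach described above might look roughly like the sketch below. The function name `fill_one_by_one` is hypothetical, and the object array is built with np.empty so the example does not depend on how a given NumPy version treats ragged nested lists.

```python
import numpy as np

# Build the (2, 2) object array of lists explicitly.
current_array = np.empty((2, 2), dtype=object)
current_array[0, 0], current_array[0, 1] = [0.0], [1.0]
current_array[1, 0], current_array[1, 1] = [2.0, 3.0], [4.0, 5.0]


def fill_one_by_one(arr):
    """Hypothetical helper: copy every float into a preallocated array."""
    n_rows, n_cols = arr.shape
    # Each input row contributes as many output rows as its lists are long.
    heights = [len(arr[i, 0]) for i in range(n_rows)]
    out = np.empty((sum(heights), n_cols), dtype=float)
    start = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k, value in enumerate(arr[i, j]):
                out[start + k, j] = value
        start += heights[i]
    return out


filled = fill_one_by_one(current_array)
print(filled)
```

It works, but the triple Python loop touches every value individually, which is the cost the question is trying to avoid.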

In [321]: current_array = np.array( 
     ...:     [[[0.0], [1.0]],  
     ...:     [[2.0, 3.0], [4.0, 5.0]]] 
     ...: )                                                                     
In [322]: current_array                                                         
Out[322]: 
array([[list([0.0]), list([1.0])],
       [list([2.0, 3.0]), list([4.0, 5.0])]], dtype=object)
In [323]: _.shape                                                               
Out[323]: (2, 2)

Rework the two rows:

In [328]: current_array[1,:]                                                    
Out[328]: array([list([2.0, 3.0]), list([4.0, 5.0])], dtype=object)
In [329]: np.stack(current_array[1,:],1)                                        
Out[329]: 
array([[2., 4.],
       [3., 5.]])

In [330]: np.stack(current_array[0,:],1)                                        
Out[330]: array([[0., 1.]])

Combine them:

In [331]: np.vstack((_330, _329))                                               
Out[331]: 
array([[0., 1.],
       [2., 4.],
       [3., 5.]])

In one line:

In [333]: np.vstack([np.stack(row, 1) for row in current_array])                
Out[333]: 
array([[0., 1.],
       [2., 4.],
       [3., 5.]])
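The one-liner can be wrapped in a small function and checked against the expected result. This is only a sketch: the name `stack_list_rows` is made up for illustration, and the input is built with np.empty to stay independent of the NumPy version.

```python
import numpy as np

# Same data as in the question, built as a (2, 2) object array of lists.
current_array = np.empty((2, 2), dtype=object)
current_array[0, 0], current_array[0, 1] = [0.0], [1.0]
current_array[1, 0], current_array[1, 1] = [2.0, 3.0], [4.0, 5.0]


def stack_list_rows(arr):
    """Hypothetical wrapper around the vstack/stack one-liner."""
    # np.stack(row, axis=1) turns one row of equal-length lists into a
    # (list_length, n_cols) float block; np.vstack joins the blocks.
    return np.vstack([np.stack(row, axis=1) for row in arr])


result = stack_list_rows(current_array)
desired_array = np.array([[0.0, 1.0], [2.0, 4.0], [3.0, 5.0]])
print(np.array_equal(result, desired_array))
```

The Python-level loop here runs once per row rather than once per value, so it should scale better than filling the array element by element when the lists are long.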

Author of the question here.

I found a slightly more elegant (and faster) way than filling the array one by one, which is:

desired = np.array([np.concatenate([np.array(d) for d in lis]) for lis in current_array.T]).T
print(desired)
'''
[[0. 1.]
 [2. 4.]
 [3. 5.]]
 '''

But it still performs quite a number of operations. It transposes the table so that the neighboring 'dimensions' (one of which is the lists) can be stacked with np.concatenate, then converts the result to an np.array and transposes it back.
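Broken into separate steps, the transpose-based expression might read as follows. The intermediate names `transposed` and `columns` are introduced here only for illustration.

```python
import numpy as np

# Same data as in the question, built as a (2, 2) object array of lists.
current_array = np.empty((2, 2), dtype=object)
current_array[0, 0], current_array[0, 1] = [0.0], [1.0]
current_array[1, 0], current_array[1, 1] = [2.0, 3.0], [4.0, 5.0]

# Step 1: transpose, so each row of `transposed` holds the lists that
# belong to one column of the final result.
transposed = current_array.T

# Step 2: concatenate the lists of each transposed row into one 1-D array.
columns = [np.concatenate([np.array(d) for d in lis]) for lis in transposed]

# Step 3: stack those 1-D arrays and transpose back to rows.
desired = np.array(columns).T
print(desired)
```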


 