How to deal with itertools.product when using numpy arrays?

I want to pair every row of one numpy array with every row of another (the Cartesian product of their rows). For example, given

a = np.random.uniform(-1,1,size=[10,2])
b = np.random.uniform(2,3,size=[20,3])

and I want to take

products = np.array(list(prod for prod in itertools.product(a,b)))

However, when I do this, the resulting array is

[[array([-0.0954691 ,  0.36734629])
  array([2.20196909, 2.02029329, 2.29627849])]
  ...
 [array([-0.07571476,  0.95934329])
  array([2.46847944, 2.3456241 , 2.28091522])]]

I want to get rid of 'array' in the list to get

 [[[-0.0954691 ,  0.36734629],[2.20196909, 2.02029329, 2.29627849]]
  ...
  [[-0.07571476,  0.95934329],[2.46847944, 2.3456241 , 2.28091522]]]

Possibly I can use

for i in range(products.shape[0]):
    np.concatenate((products[i][0], products[i][1]))

But I think there is a more clever way to do it. Can anyone help me? Thanks.

In [129]: products = np.array(list(prod for prod in itertools.product(a,b)))    

The result is a 2d array - but with object dtype:

In [130]: products.shape                                                        
Out[130]: (200, 2)

The first row of this array is also object dtype, with 2 elements, each an array:

In [131]: products[0]                                                           
Out[131]: 
array([array([-0.38279696,  0.51916671]),
       array([2.26576386, 2.50428761, 2.1463347 ])], dtype=object)

It contains the first rows of a and b:

In [132]: a[0]                                                                  
Out[132]: array([-0.38279696,  0.51916671])
In [133]: b[0]                                                                  
Out[133]: array([2.26576386, 2.50428761, 2.1463347 ])

Since those arrays have different lengths, the resulting combination must be object dtype. If a and b had the same number of columns, you'd get a numeric array.

For example, with a and a:

In [134]: arr = np.array(list(prod for prod in itertools.product(a,a)))         
In [135]: arr.shape                                                             
Out[135]: (100, 2, 2)
In [136]: arr.dtype                                                             
Out[136]: dtype('float64')
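
As a side note, on recent NumPy releases (1.24 and later, as far as I know) calling np.array on ragged input like product(a,b) without dtype=object raises a ValueError rather than quietly returning an object array. Below is a minimal sketch of one way to build the same (200, 2) object array without relying on shape inference; the seeded rng and the variable names are my own choices, not part of the question.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = rng.uniform(-1, 1, size=(10, 2))
b = rng.uniform(2, 3, size=(20, 3))

pairs = list(itertools.product(a, b))

# Allocate the object array first, then fill it cell by cell, so NumPy never
# has to infer a shape from the ragged (length-2, length-3) pairs.
products = np.empty((len(pairs), 2), dtype=object)
for i, (row_a, row_b) in enumerate(pairs):
    products[i, 0] = row_a
    products[i, 1] = row_b

print(products.shape, products.dtype)
```

Each cell still holds a separate array, so this only reproduces the object-dtype layout; it does not make the data numeric.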

If we convert a and b to lists (nested), we again get a 2d object array - containing lists:

In [137]: products = np.array(list(prod for prod in itertools.product(a.tolist()
     ...: ,b.tolist())))                                                        
In [138]: products.shape                                                        
Out[138]: (200, 2)
In [139]: products[0,:]                                                         
Out[139]: 
array([list([-0.38279696426849363, 0.5191667144605163]),
       list([2.2657638604936015, 2.50428761464766, 2.1463346999537767])],
      dtype=object)

If we omit the array wrapper, we get a list of tuples (of lists):

In [140]: products = list(prod for prod in itertools.product(a.tolist(),b.tolist
     ...: ()))                                                                  
In [141]: len(products)                                                         
Out[141]: 200
In [142]: products[0]                                                           
Out[142]: 
([-0.38279696426849363, 0.5191667144605163],
 [2.2657638604936015, 2.50428761464766, 2.1463346999537767])
In [143]: type(products)                                                        
Out[143]: list

product produces tuples (see its docs), as seen in this simpler example:

In [145]: list(itertools.product('abc','def'))                                  
Out[145]: 
[('a', 'd'),
 ('a', 'e'),
 ('a', 'f'),
 ('b', 'd'),
 ('b', 'e'),
 ('b', 'f'),
 ('c', 'd'),
 ('c', 'e'),
 ('c', 'f')]

edit

Splitting the prod tuple, as you suggest in your comment:

In [147]: arr0 = np.array(list(prod[0] for prod in itertools.product(a,b)))     
In [148]: arr1 = np.array(list(prod[1] for prod in itertools.product(a,b)))     
In [149]: arr0.shape                                                            
Out[149]: (200, 2)
In [150]: arr1.shape                                                            
Out[150]: (200, 3)
In [151]: arr0[:3,:]                                                            
Out[151]: 
array([[-0.38279696,  0.51916671],
       [-0.38279696,  0.51916671],
       [-0.38279696,  0.51916671]])
In [152]: arr1[:3,:]                                                            
Out[152]: 
array([[2.26576386, 2.50428761, 2.1463347 ],
       [2.63018066, 2.64559639, 2.51747175],
       [2.14425882, 2.39274225, 2.6460254 ]])

These are two numeric arrays.
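
The split above runs itertools.product twice. A possible variation, sketched here with my own seeded sample data, is to transpose the stream of tuples once with zip(*...) and convert each half separately:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = rng.uniform(-1, 1, size=(10, 2))
b = rng.uniform(2, 3, size=(20, 3))

# zip(*...) turns the sequence of (row_a, row_b) tuples into two tuples:
# one with all the left elements and one with all the right elements.
left, right = zip(*itertools.product(a, b))

arr0 = np.array(left)   # every element has length 2, so this is numeric
arr1 = np.array(right)  # every element has length 3, so this is numeric

print(arr0.shape, arr1.shape)
```

Because each half is homogeneous, neither conversion needs object dtype.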

They could be joined on axis=1 to make an array with 5 columns:

In [153]: arr3 = np.hstack((arr0,arr1))                                         
In [154]: arr3[:3,:]                                                            
Out[154]: 
array([[-0.38279696,  0.51916671,  2.26576386,  2.50428761,  2.1463347 ],
       [-0.38279696,  0.51916671,  2.63018066,  2.64559639,  2.51747175],
       [-0.38279696,  0.51916671,  2.14425882,  2.39274225,  2.6460254 ]])
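
If the 5-column layout is what you are after, it can probably be built without itertools at all. The sketch below assumes the same row order as product(a,b) (rows of a vary slowest) and uses np.repeat and np.tile; the expected array is there only to check the result against the itertools version.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = rng.uniform(-1, 1, size=(10, 2))
b = rng.uniform(2, 3, size=(20, 3))

# Repeat each row of a once per row of b: a[0] x20, a[1] x20, ...
left = np.repeat(a, len(b), axis=0)
# Stack the whole of b once per row of a: b[0..19], b[0..19], ...
right = np.tile(b, (len(a), 1))

combined = np.hstack((left, right))

# Reference result built from itertools.product, for comparison only.
expected = np.array([np.concatenate(pair) for pair in itertools.product(a, b)])

print(combined.shape, np.array_equal(combined, expected))
```

This stays in compiled NumPy code, which tends to matter once the inputs get large.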

bonus

Making a structured array from these 2 arrays:

In [159]: dt=np.dtype([('a',float,2),('b',float,3)])                            
In [160]: arr3 = np.zeros(200,dt)                                               
In [161]: arr3['a']=arr0                                                        
In [162]: arr3['b']=arr1                                                        
In [163]: arr3[:3]                                                              
Out[163]: 
array([([-0.38279696,  0.51916671], [2.26576386, 2.50428761, 2.1463347 ]),
       ([-0.38279696,  0.51916671], [2.63018066, 2.64559639, 2.51747175]),
       ([-0.38279696,  0.51916671], [2.14425882, 2.39274225, 2.6460254 ])],
      dtype=[('a', '<f8', (2,)), ('b', '<f8', (3,))])
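
The same structured array can also be filled by indexing instead of going through the intermediate arr0 and arr1. This is only a sketch: the index arithmetic with np.divmod and the names ia, ib and pairs are my own, chosen to reproduce the order of product(a,b).

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = rng.uniform(-1, 1, size=(10, 2))
b = rng.uniform(2, 3, size=(20, 3))

dt = np.dtype([('a', float, (2,)), ('b', float, (3,))])

# For flat position k in the product, the row of a is k // len(b)
# and the row of b is k % len(b).
ia, ib = np.divmod(np.arange(len(a) * len(b)), len(b))

pairs = np.zeros(len(a) * len(b), dtype=dt)
pairs['a'] = a[ia]
pairs['b'] = b[ib]

print(pairs[:3])
```

Fields are then read back by name, for example pairs['a'] for the (200, 2) block and pairs[21]['b'] for one row.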

You can solve it with one for loop instead of two:

a = np.random.uniform(-1,1,size=[1,2])
b = np.random.uniform(2,3,size=[1,3])

temp_array = []
result_array=[]
for prod in itertools.product(a,b):
    temp_array.append(list(prod[0]))
    temp_array.append(list(prod[1]))
    result_array.append(temp_array)
    temp_array=[]

The answer is in result_array:

[[[0.5345439210605363, -0.3895013480686571], [2.6760262824054353, 2.1221940892354487, 2.4009406883314517]],
 [[0.5345439210605363, -0.3895013480686571], [2.367796128612561, 2.1553525177821724, 2.638708096912526]],
...
...
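
The loop with the temporary list can likely be collapsed into a single list comprehension. A small sketch, using smaller inputs than the question (2 and 3 rows, my own choice) and tolist() so the result holds plain Python floats:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = rng.uniform(-1, 1, size=(2, 2))
b = rng.uniform(2, 3, size=(3, 3))

# One [row_of_a, row_of_b] pair per element of the product.
result_array = [
    [row_a.tolist(), row_b.tolist()]
    for row_a, row_b in itertools.product(a, b)
]

print(len(result_array))
print(result_array[0])
```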

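Try a nested list comprehension that converts each array into a list (note that it puts all the rows into one flat list rather than keeping them grouped in pairs):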

import numpy as np 
import itertools

a = np.random.uniform(-1,1,size=[10,2])
b = np.random.uniform(2,3,size=[20,3])

products = np.array(list(prod for prod in itertools.product(a,b)))
print(products)

products = [list(i) for product in products for i in product]
print(products)

Example:

a = np.random.uniform(-1,1,size=[1,2])
b = np.random.uniform(2,3,size=[1,3])

Output of example:

# This is the first print output
[[array([-0.76844481, -0.77955549])
  array([ 2.73748408,  2.65023585,  2.49984462])]] 

# This is the second print output
[[-0.76844480922803649, -0.77955548831103427], [2.7374840778087144, 2.6502358496635754, 2.4998446233196443]]
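
With a single row in each input, the flat list happens to look like one pair. With more rows it has two entries per pair, so the pairs need regrouping. The sketch below illustrates that on small inputs of my own choosing and iterates over itertools.product directly instead of an intermediate object array:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = rng.uniform(-1, 1, size=(2, 2))
b = rng.uniform(2, 3, size=(3, 3))

# Same comprehension shape as above: every row of every pair goes
# into one flat list, alternating between rows of a and rows of b.
flat = [list(row) for pair in itertools.product(a, b) for row in pair]
print(len(flat))   # two entries per pair

# Regroup: even positions came from a, odd positions came from b.
pairs = [list(p) for p in zip(flat[0::2], flat[1::2])]
print(len(pairs))  # one entry per pair
```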
