
How to combine multiple numpy arrays into a dictionary list

I have the following list of column names:

column_names = ['id', 'temperature', 'price']

And three NumPy arrays as follows:

idArry = ([1,2,3,4,....])

tempArry = ([20.3,30.4,50.4,.....])

priceArry = ([1.2,3.5,2.3,.....])

I want to combine the above into a list of dictionaries as follows:

table_dict = ( {'id':1, 'temperature':20.3, 'price':1.2 },
               {'id':2, 'temperature':30.4, 'price':3.5},...)

I could build this with a for loop and append, but the data is large, at about 15000 rows. Can someone show me how to use Python's zip function, or another more efficient way, to achieve this?

You can use a list comprehension with the zip() function:

[{'id': i, 'temperature': j, 'price': k} for i, j, k in zip(idArry, tempArry, priceArry)]
# [{'id': 1, 'temperature': 20.3, 'price': 1.2}, {'id': 2, 'temperature': 30.4, 'price': 3.5}]
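
One detail worth keeping in mind: iterating over NumPy arrays yields NumPy scalar types (for example numpy.int64) rather than plain Python numbers, which can matter if the dictionaries are later serialized with json. Here is a minimal sketch that converts with tolist() first, assuming small sample arrays in place of the real data:

```python
import json
import numpy as np

# Sample data standing in for the arrays from the question (assumed values).
idArry = np.array([1, 2, 3])
tempArry = np.array([20.3, 30.4, 50.4])
priceArry = np.array([1.2, 3.5, 2.3])

# tolist() converts each array to a list of native Python int/float values.
table_dict = [
    {'id': i, 'temperature': t, 'price': p}
    for i, t, p in zip(idArry.tolist(), tempArry.tolist(), priceArry.tolist())
]

# Native types serialize cleanly; numpy.int64 values would raise a TypeError here.
as_json = json.dumps(table_dict)
```

If the dictionaries are only used inside Python, the plain zip over the arrays is usually fine and this conversion can be skipped.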

If your ids are 1, 2, 3... and you use a list, you don't need the ids in your dicts. They are redundant, because each id can be recovered from the item's position in the list (index + 1).

[{'temperature': i, 'price': j} for i, j in zip(tempArry, priceArry)]

You can also use a dict of dicts keyed by id. Looking up a row by id in a dict is a hash lookup, which is generally faster than searching through a list.

{i: {'temperature': j, 'price': k} for i, j, k in zip(idArry, tempArry, priceArry)}
# {1: {'temperature': 20.3, 'price': 1.2}, 2: {'temperature': 30.4, 'price': 3.5}}
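
To make the lookup difference concrete, here is a small sketch comparing the two layouts. The names ids, temps, prices, records and by_id are placeholders chosen for this example, and plain lists are used in place of the arrays:

```python
# Placeholder data (assumed values, plain lists instead of NumPy arrays).
ids = [1, 2, 3]
temps = [20.3, 30.4, 50.4]
prices = [1.2, 3.5, 2.3]

records = [{'id': i, 'temperature': t, 'price': p}
           for i, t, p in zip(ids, temps, prices)]
by_id = {i: {'temperature': t, 'price': p}
         for i, t, p in zip(ids, temps, prices)}

# List of dicts: finding id 3 means scanning the records one by one.
from_list = next(r for r in records if r['id'] == 3)

# Dict of dicts: finding id 3 is a single hash lookup.
from_dict = by_id[3]
```

The dict layout only pays off if rows are actually looked up by id. For plain iteration over all rows, the list is just as suitable.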

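This could work. enumerate creates a counter that starts at 0, which is used to index into tempArry and priceArry; the id is taken as i + 1, so this assumes the ids are consecutive and start at 1. Writing the expression in parentheses makes it a generator expression, which produces the dictionaries lazily and helps with memory (especially if your arrays are really large). Despite its name, new_dict is a generator that can be iterated only once; call list(new_dict) if you need a list.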

new_dict = ({'id': i + 1 , 'temperature': tempArry[i], 'price': priceArry[i]} for i, _ in enumerate(idArry))
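
A short sketch of how a generator expression like this behaves when consumed. The sample values and the names rows, first_two, remaining and leftover are made up for illustration, and zip is used instead of indexing purely to keep it short:

```python
from itertools import islice

# Placeholder data (assumed values).
idArry = [1, 2, 3, 4]
tempArry = [20.3, 30.4, 50.4, 4.0]
priceArry = [1.2, 3.5, 2.3, 4.5]

# Nothing is built yet; dictionaries are created only when requested.
rows = ({'id': i, 'temperature': t, 'price': p}
        for i, t, p in zip(idArry, tempArry, priceArry))

first_two = list(islice(rows, 2))  # pulls only the first two rows
remaining = list(rows)             # the generator resumes where it stopped
leftover = list(rows)              # already exhausted, so this is empty
```

This is the trade-off of the lazy approach: lower peak memory, but the rows can be walked only once unless they are stored in a list.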

I'd take a look at the functionality of the pandas package. In particular there is a pandas.DataFrame.to_dict method.

I'm confident that for large arrays this method should be pretty fast (though I'm willing to have the zip method proved more efficient).

In the example below I first construct a pandas dataframe from your arrays and then use the to_dict method.

import numpy as np
import pandas as pd

column_names = ['id', 'temperature', 'price']

idArry = np.array([1, 2, 3])
tempArry = np.array([20.3, 30.4, 50.4])
priceArry = np.array([1.2, 3.5, 2.3])

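# Build the frame from a dict of columns so each column keeps its own dtype;
# stacking the arrays with np.vstack would upcast the integer ids to floats.
df = pd.DataFrame(dict(zip(column_names, [idArry, tempArry, priceArry])))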

table_dict = df.to_dict(orient='records')
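
Since the relative speed of the two approaches is left open above, one way to settle it on your own machine is a quick timeit comparison. The sketch below is only a measurement harness under assumptions: the data is random, the function names with_zip and with_pandas are invented here, and the DataFrame is built from a dict of columns so the ids stay integers. No particular outcome is claimed.

```python
import timeit
import numpy as np
import pandas as pd

n = 15000  # row count mentioned in the question
rng = np.random.default_rng(0)
idArry = np.arange(1, n + 1)
tempArry = rng.uniform(0, 60, n)   # random stand-in temperatures
priceArry = rng.uniform(0, 5, n)   # random stand-in prices

def with_zip():
    # Plain Python: zip over native values obtained with tolist().
    return [{'id': i, 'temperature': t, 'price': p}
            for i, t, p in zip(idArry.tolist(), tempArry.tolist(), priceArry.tolist())]

def with_pandas():
    # pandas: build a DataFrame, then export one dict per row.
    df = pd.DataFrame({'id': idArry, 'temperature': tempArry, 'price': priceArry})
    return df.to_dict(orient='records')

zip_seconds = timeit.timeit(with_zip, number=5)
pandas_seconds = timeit.timeit(with_pandas, number=5)
print(f'zip: {zip_seconds:.3f}s  pandas: {pandas_seconds:.3f}s')
```

Timings depend on the pandas and NumPy versions installed, so it is worth running this against your real data before choosing.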

You can use a list comprehension that iterates over the indices of one of the arrays to achieve this:

[{'id': idArry[i], 'temperature': tempArry[i], 'price': priceArry[i]} for i in range(len(idArry))]

You could build a 2-D NumPy array (a matrix) and then convert it to a list of dictionaries as follows. Given your data (I changed the values just for the example):

import numpy as np

idArry = np.array([1,2,3,4])
tempArry = np.array([20,30,50,40])
priceArry = np.array([200,300,100,400])

Build the matrix:

table = np.array([idArry, tempArry, priceArry]).transpose()

Create the list of dictionaries (column_names is the list from the question):

dict_table = [ dict(zip(column_names, values)) for values in table ]
#=> [{'id': 1, 'temperature': 20, 'price': 200}, {'id': 2, 'temperature': 30, 'price': 300}, {'id': 3, 'temperature': 50, 'price': 100}, {'id': 4, 'temperature': 40, 'price': 400}]


I don't know your purpose, but you may also be able to work with the matrix directly, for example to filter rows:

temp_col = table[:,1]
table[temp_col >= 40]
# [[  3  50 100]
#  [  4  40 400]]
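
If the filtered rows are still needed as dictionaries, the same boolean mask can be applied to each array separately before zipping. This also avoids the shared dtype that a single matrix imposes when the columns mix integers and floats. A sketch using the example values above; the names mask and filtered are introduced here for illustration:

```python
import numpy as np

column_names = ['id', 'temperature', 'price']
idArry = np.array([1, 2, 3, 4])
tempArry = np.array([20, 30, 50, 40])
priceArry = np.array([200, 300, 100, 400])

# Boolean array, True where the row should be kept.
mask = tempArry >= 40

# Apply the mask per column, convert to native values, then build the dicts.
filtered = [dict(zip(column_names, row))
            for row in zip(idArry[mask].tolist(),
                           tempArry[mask].tolist(),
                           priceArry[mask].tolist())]
```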

A way to do it would be as follows:

column_names = ['id', 'temperature', 'price']

idArry = ([1,2,3,4])
tempArry = ([20.3,30.4,50.4, 4])
priceArry = ([1.2,3.5,2.3, 4.5])

You could zip the elements of the different lists together. In Python 3, zip returns a one-shot iterator, so it is materialized with list() here; otherwise printing it would exhaust it before the next step:

rows = list(zip(idArry, tempArry, priceArry))

print(rows)
[(1, 20.3, 1.2), (2, 30.4, 3.5), (3, 50.4, 2.3), (4, 4, 4.5)]

Then build the inner dictionaries with a list comprehension that iterates over the rows:

[dict(zip(column_names, row)) for row in rows]

[{'id': 1, 'temperature': 20.3, 'price': 1.2},
 {'id': 2, 'temperature': 30.4, 'price': 3.5},
 {'id': 3, 'temperature': 50.4, 'price': 2.3},
 {'id': 4, 'temperature': 4, 'price': 4.5}]

The advantage of this method is that it uses only built-ins and works for an arbitrary number of columns.
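
To illustrate the point about an arbitrary number of columns, the idea can be wrapped in a small helper. The function name to_records and its length check are additions made for this sketch, not part of the answer above:

```python
def to_records(column_names, *columns):
    """Return one dict per row, pairing each column name with its value."""
    if len(column_names) != len(columns):
        raise ValueError('need exactly one column per name')
    # zip(*columns) yields one tuple per row across all the columns.
    return [dict(zip(column_names, row)) for row in zip(*columns)]

records = to_records(['id', 'temperature', 'price'],
                     [1, 2, 3, 4],
                     [20.3, 30.4, 50.4, 4],
                     [1.2, 3.5, 2.3, 4.5])
```

Adding a fourth column only requires one more name and one more sequence in the call. Note that zip stops at the shortest column, so columns of unequal length would be truncated silently.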


 