
Create a list of dictionaries in a Pythonic way

I want to create a list of dictionaries from my random variables.

I know how to build it with basic Python code, but I'd like to know whether there is a faster and more Pythonic way to do it, since I have to create this for more than 10,000 random variables.

Note: The list has the same size as the number of random variables.

My random variables:

lat_lon = [(42.35,-121.73),(35.67,-71.19),(38.17,-74.83)]
id = [1,2,3]

Dictionary structure:

{ 'origin_lat': 42.67,
  'origin_lon': -122.67,
  'id': 2
}

Expected output using the above random variables:

[
{ 'origin_lat': 42.35,
  'origin_lon': -121.73,
  'id': 1
},
{ 'origin_lat': 35.67,
  'origin_lon': -71.19,
  'id': 2
},
{ 'origin_lat': 38.17,
  'origin_lon': -74.83,
  'id': 3
}
]

My code:

lst = []
for lat_lon, id in zip(lat_lon, id):
    lst.append({
        'origin_lat': lat_lon[0],
        'origin_lon': lat_lon[1],
        'id': id,
    })

Slightly nicer (and slightly faster) code would unpack completely to useful names (unpacking can handle nested sequences as long as they're all of known length), e.g.:

for (lat, lon), id in zip(lat_lon, id):
   lst.append({'origin_lat': lat, 'origin_lon': lon, 'id': id})

Note: I'd recommend changing the name of the id sequence to ids or the like; using the same name for the iteration variable is going to bite you at some point, even if it happens to work in this case. Similarly, lat_lon sounds like a single value; for the collection, pluralize the name (even though unpacking means we aren't reusing it).
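A minimal sketch of the renaming suggested above, written as a list comprehension. The plural names lat_lons and ids, and the loop variable point_id, are illustrative assumptions and do not appear in the question:

```python
# Hypothetical plural names for the input sequences, per the note above.
lat_lons = [(42.35, -121.73), (35.67, -71.19), (38.17, -74.83)]
ids = [1, 2, 3]

# Nested unpacking names each coordinate; the loop variable point_id
# no longer collides with the name of the sequence being iterated.
points = [
    {'origin_lat': lat, 'origin_lon': lon, 'id': point_id}
    for (lat, lon), point_id in zip(lat_lons, ids)
]
print(points)
```

Because the inputs keep their names after the loop, the comprehension can be rerun without rebuilding the lists.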

Of course, even better would be to dispense with dict-emulating objects and just make a useful class for your data; if you don't need mutability, collections.namedtuple/typing.NamedTuple will generate most of the code for you, while for mutable data you can use a dataclass. For an example of the former:

from collections import namedtuple

MapPoint = namedtuple('MapPoint', ['origin_lat', 'origin_lon', 'id'])

lst = [MapPoint(lat, lon, id) for (lat, lon), id in zip(lat_lon, id)]

This will save a non-trivial amount of memory too; on my CPython 3.8 x64 install, a three-key dict incurs 232 bytes of overhead (ignoring the cost of the actual key/value objects), while the equivalent namedtuple only eats 64 bytes. Access is different (you use obj.origin_lat instead of obj['origin_lat']), but namedtuples can be converted back to dicts easily on an as-needed basis with the _asdict method.
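The two points above that have no example yet can be sketched as follows: converting a namedtuple back to a dict with _asdict, and a mutable dataclass alternative. The class name MutableMapPoint, the plural input names, and the edited latitude are assumptions made for illustration only:

```python
from collections import namedtuple
from dataclasses import dataclass, asdict

lat_lons = [(42.35, -121.73), (35.67, -71.19), (38.17, -74.83)]
ids = [1, 2, 3]

MapPoint = namedtuple('MapPoint', ['origin_lat', 'origin_lon', 'id'])
points = [MapPoint(lat, lon, point_id)
          for (lat, lon), point_id in zip(lat_lons, ids)]

# _asdict() maps field names to values (a plain dict on Python 3.8+).
first_as_dict = points[0]._asdict()
print(first_as_dict)


# Hypothetical mutable counterpart for data that must be edited in place.
@dataclass
class MutableMapPoint:
    origin_lat: float
    origin_lon: float
    id: int


mutable_points = [MutableMapPoint(lat, lon, point_id)
                  for (lat, lon), point_id in zip(lat_lons, ids)]
mutable_points[0].origin_lat = 42.36  # allowed; a namedtuple would raise
print(asdict(mutable_points[0]))
```

dataclasses.asdict plays the same role for dataclass instances that _asdict plays for namedtuples.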

If memory is a concern, you can use a generator:

lat_lon = [(42.35,-121.73),(35.67,-71.19),(38.17,-74.83)]
id = [1,2,3]
def get_dict():
    for i in range(len(lat_lon)):
        yield {'origin_lat': lat_lon[i][0], 'origin_lon': lat_lon[i][1], 'id': id[i]}
print(*get_dict(), sep='\n')
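Note that print(*get_dict(), sep='\n') unpacks the whole generator into arguments at once, so every dict exists in memory at the same time. A sketch of consuming the records one at a time instead, assuming a hypothetical helper named iter_points that takes the sequences as parameters and uses zip rather than indexing:

```python
def iter_points(lat_lons, ids):
    """Lazily yield one dict per (lat, lon) pair and its id."""
    for (lat, lon), point_id in zip(lat_lons, ids):
        yield {'origin_lat': lat, 'origin_lon': lon, 'id': point_id}


lat_lons = [(42.35, -121.73), (35.67, -71.19), (38.17, -74.83)]
ids = [1, 2, 3]

# Only one dict is alive per iteration unless the caller keeps a reference.
for point in iter_points(lat_lons, ids):
    print(point)
```

The memory saving only holds if the consumer processes each record and lets it go; wrapping the generator in list() brings back the full list.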

Alternative methods

result = [{'origin_lat': lat_lon[i][0], 'origin_lon': lat_lon[i][1], 'id': id[i]} for i in range(len(lat_lon))]

Or, using zip with a list comprehension:

result = [{'origin_lat': i[0][0], 'origin_lon': i[0][1], 'id': i[1]} for i in zip(lat_lon, id)]
result

Output:

[{'origin_lat': 42.35, 'origin_lon': -121.73, 'id': 1},
 {'origin_lat': 35.67, 'origin_lon': -71.19, 'id': 2},
 {'origin_lat': 38.17, 'origin_lon': -74.83, 'id': 3}]

Here is an example without zip as requested in the comments.

lat_lon = [(42.35,-121.73),(35.67,-71.19),(38.17,-74.83)]
id_ = [1,2,3]


output = []
for i in range(len(id_)):
    output.append(dict(id=id_[i], origin_lat=lat_lon[i][0], origin_lon=lat_lon[i][1]))
print(output)

You can try it on glot.io here.

It should be O(n). This assumes id_ and lat_lon are the same length.

len is O(1), and in Python 3 range is a lazy sequence, so it doesn't build a list of indices.
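The same-length assumption can be made explicit. With the index loop above, a longer lat_lon is silently truncated and a longer id_ raises IndexError; zip always stops silently at the shorter input. A minimal sketch with an up-front check, where the function name build_points is hypothetical:

```python
def build_points(lat_lons, ids):
    """Build the list of dicts, refusing inputs of different lengths."""
    if len(lat_lons) != len(ids):
        raise ValueError(
            f"length mismatch: {len(lat_lons)} coordinates vs {len(ids)} ids"
        )
    return [
        {'origin_lat': lat, 'origin_lon': lon, 'id': point_id}
        for (lat, lon), point_id in zip(lat_lons, ids)
    ]


lat_lon = [(42.35, -121.73), (35.67, -71.19), (38.17, -74.83)]
id_ = [1, 2, 3]
print(build_points(lat_lon, id_))
```

On Python 3.10 and later, zip(lat_lons, ids, strict=True) performs a similar check during iteration.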

@Shibiraj's answer has a similar code but with generators - which is even better!

