I have an array of objects of class Person, like the one below, with thisRate initially set to None:
class Person(object):
    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.thisRate = None
I loaded around 21K Person objects into an array; it is not sorted by name.
Then I loaded another array from a file that holds the thisRate data, about 13K entries; it is not sorted by name either:
person_data = []
# read from file; each row ends up looking like this
row = {}
row['name'] = 'Peter'
row['thisRate'] = '0.12334'
person_data.append(row)
Now, with these 2 arrays, whenever a name matches between them, I want to assign thisRate from person_data to Person.thisRate.
What I am doing is a loop like this:
for person in persons:
    data = None
    try:
        data = next(personData for personData in person_data
                    if personData['name'] == person.name)
    except StopIteration:
        print("No rate for this person: {}".format(person.name))
    if data:
        person.thisRate = float(data['thisRate'])
This loop, with the lookup
    data = next(personData for personData in person_data
                if personData['name'] == person.name)
runs fine and takes 21 seconds on my machine with Python 2.7.13.
My question is, is there a faster or better way to achieve the same thing with the 2 arrays I have?
Yes. Make a dictionary mapping name to thisRate:
import csv

nd = {}
with open(<whatever>) as f:
    reader = csv.DictReader(<whatever>)
    for row in reader:
        # convert once here, as the loop in the question does
        nd[row['name']] = float(row['thisRate'])
Now, use this dictionary to do a single pass over your Person list:
for person in persons:
    thisRate = nd.get(person.name, None)
    person.thisRate = thisRate
    if thisRate is None:
        print("No rate for this person: {}".format(person.name))
Dictionaries have a .get method that lets you provide a default value in case the key is not in the dict. I used None (which is already the default value when you omit the second argument), but you can use whatever you want.
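For anyone who wants to try the idea without the file, here is a minimal self-contained sketch. The sample people and rows are made up for illustration, and rate_by_name and missing are names I chose; they are not from the question.

```python
class Person(object):
    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.thisRate = None

# Hypothetical sample data standing in for the real file contents.
persons = [Person(1, 'Peter'), Person(2, 'Mary'), Person(3, 'John')]
person_data = [
    {'name': 'Peter', 'thisRate': '0.12334'},
    {'name': 'John', 'thisRate': '0.5'},
]

# Build the name -> rate lookup once (a single pass over person_data).
rate_by_name = {row['name']: float(row['thisRate']) for row in person_data}

# Single pass over persons; each .get is a hash lookup, not a scan.
missing = []
for person in persons:
    person.thisRate = rate_by_name.get(person.name)
    if person.thisRate is None:
        missing.append(person.name)
        print("No rate for this person: {}".format(person.name))
```

One behavioural difference to be aware of: if a name appears more than once in person_data, the dictionary keeps the last row for that name, whereas the next(...) version in the question picks the first.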
This is a linear-time solution. Your solution was quadratic time, because you are essentially doing:
for person in persons:
    for data in person_data:
        if data['name'] == person.name:
            person.thisRate = data['thisRate']
            break
    else:
        print("No rate for this person: {}".format(person.name))
That is the same nested for-loop, just obscured inside a generator expression. This is not really a good use case for a generator expression: you should have just used a for-loop to begin with, and then you wouldn't have to deal with a try/except around StopIteration.
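To make the linear-versus-quadratic difference concrete, here is a rough sketch that counts the work each approach does on synthetic names at 1/100 of the sizes in the question. The function names and the generated data are my own, purely for illustration; it counts operations rather than measuring time.

```python
def nested_scan_comparisons(names, data_names):
    """Count name comparisons made by the nested-loop approach."""
    comparisons = 0
    for name in names:
        for candidate in data_names:
            comparisons += 1
            if candidate == name:
                break  # stop at the first match, like next(...)
    return comparisons

def dict_operations(names, data_names):
    """Count dict operations: one insert per row, one lookup per person."""
    lookup = {}
    operations = 0
    for candidate in data_names:
        lookup[candidate] = True
        operations += 1
    for name in names:
        lookup.get(name)
        operations += 1
    return operations

# Synthetic data: 210 people, 130 of whom have a rate row.
names = ["person{}".format(i) for i in range(210)]
data_names = ["person{}".format(i) for i in range(130)]

nested = nested_scan_comparisons(names, data_names)
linear = dict_operations(names, data_names)
print("nested scan: {} comparisons, dict: {} operations".format(nested, linear))
```

The nested scan grows with the product of the two lengths (every person without a match scans all rows), while the dictionary version grows with their sum. At the sizes in the question, the worst case for the nested scan is on the order of 21,000 x 13,000 = 273 million comparisons.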