Making sense of Python

I am reading the book Programming Collective Intelligence. What exactly does the following piece of Python code do?

  # Add up the squares of all the differences 
  sum_of_squares=sum([pow(prefs[person1][item]-prefs[person2][item],2) 
                      for item in prefs[person1] if item in prefs[person2]]) 

I am trying to play with the examples in Java.

prefs is a map from each person to that person's movie ratings; the movie ratings are in turn a map from movie names to ratings.

First it constructs a list containing the results from:

for each item in prefs for person1:
    if that is also an item in the prefs for person2:
        find the difference between the two people's ratings for that item
        and square it (pow(x, 2) is "x squared", like Math.pow(x, 2) in Java)

Then it adds those up.
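
Spelled out as an explicit loop, the steps above might look like the following sketch. The sample data and the names "Alice" and "Bob" are made up for illustration; only the dict-of-dicts shape of prefs comes from the question.

```python
# Hypothetical sample data: person -> {movie title -> rating}
prefs = {
    "Alice": {"Up": 4, "Cars": 1, "Brave": 3},
    "Bob": {"Up": 2, "Cars": 2},
}
person1, person2 = "Alice", "Bob"

squares = []
for item in prefs[person1]:
    # Only consider movies that both people have rated
    if item in prefs[person2]:
        difference = prefs[person1][item] - prefs[person2][item]
        squares.append(difference ** 2)

# Add up the squared differences
sum_of_squares = sum(squares)
print(sum_of_squares)
```

"Brave" is skipped because only one of the two people rated it, which is what the `if item in prefs[person2]` filter in the original one-liner does.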

This might be a little more readable if the call to pow were replaced with the '**' exponentiation operator:

sum_of_squares=sum([(prefs[person1][item]-prefs[person2][item])**2
                   for item in prefs[person1] if item in prefs[person2]])

Lifting out some invariants also helps readability:

p1_prefs = prefs[person1]
p2_prefs = prefs[person2]

sum_of_squares=sum([(p1_prefs[item]-p2_prefs[item])**2
                      for item in p1_prefs if item in p2_prefs])

Finally, since Python 2.4 there is no need for the list comprehension notation: sum will accept a generator expression, so the []'s can also be removed:

sum_of_squares=sum((p1_prefs[item]-p2_prefs[item])**2
                      for item in p1_prefs if item in p2_prefs)

Seems a bit more straightforward now.

Ironically, in pursuit of readability, we have also done some performance optimization (two endeavors that are often at odds):

  • lifted invariants out of the loop
  • replaced the function call pow with inline evaluation of the '**' operator
  • removed unnecessary construction of a list
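
One way to check both claims is to confirm that the two variants agree and then time them with the standard-library timeit module. This is only a rough sketch: the sample data and the function names original and rewritten are invented here, and timings vary by machine and Python version, so no particular numbers are assumed.

```python
import timeit

# Hypothetical sample data: p2 has rated only every other movie
prefs = {
    "p1": {f"movie{i}": i % 5 + 1 for i in range(1000)},
    "p2": {f"movie{i}": (i * 3) % 5 + 1 for i in range(0, 1000, 2)},
}
person1, person2 = "p1", "p2"

def original():
    # The expression from the question: pow() and a list comprehension
    return sum([pow(prefs[person1][item] - prefs[person2][item], 2)
                for item in prefs[person1] if item in prefs[person2]])

def rewritten():
    # Invariants lifted out, '**' instead of pow(), generator expression
    p1_prefs = prefs[person1]
    p2_prefs = prefs[person2]
    return sum((p1_prefs[item] - p2_prefs[item]) ** 2
               for item in p1_prefs if item in p2_prefs)

# Both versions must produce the same result
assert original() == rewritten()

for func in (original, rewritten):
    seconds = timeit.timeit(func, number=200)
    print(f"{func.__name__}: {seconds:.4f}s for 200 runs")
```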

Is this a great language or what?!

01 sum_of_squares =
02 sum(
03  [
04      pow(
05         prefs[person1][item]-prefs[person2][item],
06         2
07      ) 
08    for
09       item
10    in
11       prefs[person1]
12    if
13       item in prefs[person2]
14  ]
15 )

Line 2 sums a list made up of the values computed on lines 4-7, one for each 'item' in the dictionary named on line 11 for which the condition on line 13 holds.

It computes the sum of the squares of the difference between prefs[person1][item] and prefs[person2][item], for every item in the prefs dictionary for person1 that is also in the prefs dictionary for person2.

In other words, say both person1 and person2 have a rating for the film Ratatouille, with person1 rating it 5 stars, and person2 rating it 2 stars.

prefs[person1]['Ratatouille'] = 5
prefs[person2]['Ratatouille'] = 2

The square of the difference between person1's rating and person2's rating is 3^2 = 9.
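
To see how the per-film squares combine into the total, here is a small sketch. The Ratatouille ratings are the ones above; the "WALL-E" and "Cars" ratings are hypothetical, added only so that there is more than one term to sum.

```python
person1, person2 = "person1", "person2"
prefs = {
    person1: {"Ratatouille": 5, "WALL-E": 4, "Cars": 3},  # last two are made up
    person2: {"Ratatouille": 2, "WALL-E": 5},             # no rating for Cars
}

# Squared difference for each film that both people have rated
per_film = {
    item: (prefs[person1][item] - prefs[person2][item]) ** 2
    for item in prefs[person1]
    if item in prefs[person2]
}
sum_of_squares = sum(per_film.values())

print(per_film)        # {'Ratatouille': 9, 'WALL-E': 1}
print(sum_of_squares)  # 10
```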

It looks like a squared Euclidean distance between the two people's ratings (over the films both have rated) rather than a variance.

