Python evaluation of compound statement as if condition

I'm having trouble understanding how a compound condition in an if statement is evaluated.

Assume I have a dictionary like this that maps words to a list of webpages:

index = { WORD: [url1, url2, url3] }

When inserting into this index there are two cases:

1) The key (WORD) does not exist in the index yet, so I need to create a list and set WORD as a key in the map

2) The key (WORD) exists in the index already, so I just need to append the current url to the list already in the dictionary

What I expected to work:

def update_index(word, url):
    if word in index and not(url in index[word]):
        index[word].append(url) # list already exists append to it
    else:
        index[word] = [url] # new list with url as a single element

This, however, only ever allows one url per word.

What did work:

def update_index(word, url):
    if word in index:  # <- isn't having two consecutive if statements
                       # the same as an AND???
        if not(url in index[word]):
            index[word].append(url) # list already exists append to it
    else:
        index[word] = [url] # new list with url as a single element

Any help clearing this up would be appreciated.

They're definitely different (since you have an else clause). In the first case, you enter the else clause when the dictionary has the key and the url is already in the list (which you probably don't want).

In other words, when the url is already in the list, you replace the list with [url] instead of just doing nothing.
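
To see the difference concretely, here is a small sketch that runs both versions from the question on the same inputs. The names `update_compound` and `update_nested`, the explicit `index` parameter, and the example URLs are made up for this illustration.

```python
def update_compound(index, word, url):
    # Version from the question with the combined condition.
    if word in index and not (url in index[word]):
        index[word].append(url)
    else:
        # Also reached when word is present and url is already listed.
        index[word] = [url]


def update_nested(index, word, url):
    # Version from the question with nested ifs.
    if word in index:
        if not (url in index[word]):
            index[word].append(url)
    else:
        # Only reached when word is missing.
        index[word] = [url]


compound, nested = {}, {}
for url in ["a.html", "b.html", "a.html"]:
    update_compound(compound, "python", url)
    update_nested(nested, "python", url)

print(compound)  # {'python': ['a.html']}
print(nested)    # {'python': ['a.html', 'b.html']}
```

The third call passes a duplicate url: the compound version falls into the else branch and replaces the list, while the nested version does nothing.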

To understand the logic problem, look at the other answers. But as I said in the comments, you can end-run the whole problem with:

from collections import defaultdict

url_store = defaultdict(set)
url_store[word].add(url)
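
As a rough sketch of how that behaves, assuming a hypothetical `update_index` wrapper and made-up URLs: looking up a missing word creates an empty set, and adding a url that is already present has no effect. Note that a set does not keep insertion order, so this only fits if the order of the urls does not matter.

```python
from collections import defaultdict

url_store = defaultdict(set)


def update_index(word, url):
    # Looking up a missing word creates an empty set first;
    # adding a url that is already present changes nothing.
    url_store[word].add(url)


for url in ["a.html", "b.html", "a.html"]:
    update_index("python", url)

print(sorted(url_store["python"]))  # ['a.html', 'b.html']
print("java" in url_store)          # False: a membership test does not create the key
```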

The problem is that you always overwrite the entire list of URLs whenever you find a URL that's already in the list.

Your condition checks whether the word is in the index and whether the URL is not yet in the list for that word. So if the word is in the index, and the URL is already in the list, the entire condition evaluates to false and the else-case is executed, overwriting the existing list for that word with a list holding only the duplicate URL.
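
Enumerating the four possible cases makes this visible. In this sketch `word_in_index` and `url_in_list` are stand-in booleans for `word in index` and `url in index[word]`. By De Morgan's law, the else branch of `A and not B` runs whenever `not A or B` holds.

```python
from itertools import product

rows = []
for word_in_index, url_in_list in product([False, True], repeat=2):
    takes_if = word_in_index and not url_in_list
    # The else branch is the negation of the whole condition.
    takes_else = not takes_if
    assert takes_else == ((not word_in_index) or url_in_list)
    rows.append((word_in_index, url_in_list, "append" if takes_if else "overwrite"))

for row in rows:
    print(row)
# (False, False, 'overwrite')
# (False, True, 'overwrite')
# (True, False, 'append')
# (True, True, 'overwrite')
```

The last row is the unwanted one: the word is present, the url is a duplicate, and the list is still overwritten. (The second row cannot occur with a real dictionary; it is listed only to complete the table.)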

Instead, you should try this:

def update_index(word, url):
    if word not in index:
        index[word] = [] # create new empty list for word
    # now we know that a list exists -> append
    if url not in index[word]:
        index[word].append(url)

If you use a defaultdict, as suggested in another answer, the defaultdict will do this check (the first if statement) for you.
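
A minimal sketch of that variant, assuming you want to keep lists (and therefore insertion order) rather than sets; the example URLs are made up:

```python
from collections import defaultdict

index = defaultdict(list)


def update_index(word, url):
    # index[word] creates an empty list on first access, which replaces
    # the explicit "if word not in index" check.
    if url not in index[word]:
        index[word].append(url)


update_index("python", "a.html")
update_index("python", "b.html")
update_index("python", "a.html")  # duplicate, ignored

print(index["python"])  # ['a.html', 'b.html']
```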

Update: Got the composite if-condition wrong myself... First paragraph is now fixed.


 