
Can someone help me understand this for loop in Python?

I am trying to reuse this code from another source, but I am having trouble understanding the for loop in the second line. Can someone please clarify what exactly this line title = [x for x in title if x not in stopWords] is doing? stopWords is a list of words.

def title_score(title, sentence):

    title = [x for x in title if x not in stopWords]
    count = 0.0
    for word in sentence:
        if (word not in stopWords and word in title):
            count += 1.0

    if len(title) == 0:
        return 0.0

    return count/len(title)
[x for x in title if x not in stopWords]

It's a list comprehension. It means construct a list of all items in title (that's the x for x in title bit) that are not also in stopWords (per the if x not in stopWords bit).


You can see a similar effect with the following snippets. The first creates a list of all numbers in the inclusive range 0..9:

>>> [x for x in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The second adds an if clause to only include odd numbers:

>>> [x for x in range(10) if x % 2 != 0]
[1, 3, 5, 7, 9]

And here's perhaps a better example, more closely aligned to your code:

>>> stopWords = "and all but if of the".split() ; stopWords
['and', 'all', 'but', 'if', 'of', 'the']

>>> title = "the sum of all fears".split() ; title
['the', 'sum', 'of', 'all', 'fears']

>>> [x for x in title]
['the', 'sum', 'of', 'all', 'fears']

>>> [x for x in title if x not in stopWords]
['sum', 'fears']

There you can see the "noise" words being removed in the final step.
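To tie this back to the function in the question, here is a minimal sketch that runs title_score on the same sample title. The stopWords list and the sample sentence are made up for illustration; the original source presumably defines stopWords elsewhere.

```python
# Illustrative stop-word list; the real one is assumed to be defined elsewhere.
stopWords = "and all but if of the".split()

def title_score(title, sentence):
    # Keep only the title words that are not stop words.
    title = [x for x in title if x not in stopWords]
    count = 0.0
    # Count sentence words that are meaningful and also appear in the title.
    for word in sentence:
        if word not in stopWords and word in title:
            count += 1.0
    # Avoid dividing by zero when the title held only stop words.
    if len(title) == 0:
        return 0.0
    return count / len(title)

title = "the sum of all fears".split()
sentence = "the sum was large".split()   # hypothetical sample sentence
score = title_score(title, sentence)
print(score)  # 0.5
```

Only 'sum' survives the filter and also appears in the sentence, so the score is 1 match out of 2 remaining title words.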

Well, they say that Python is like runnable pseudocode, and I guess that applies here. It creates a list and puts into it every item in title that is not in stopWords.

That is a list comprehension, equivalent to this loop:

newtitle = []
for x in title:
    if x not in stopWords:
        newtitle.append(x)
title = newtitle

In other words, it effectively removes any words from title that also appear in stopWords.
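As a quick check, the two forms can be run side by side. This is a small sketch using made-up sample data, not part of the original code:

```python
# Sample data assumed for illustration only.
stopWords = "and all but if of the".split()
title = "the sum of all fears".split()

# Explicit loop form.
newtitle = []
for x in title:
    if x not in stopWords:
        newtitle.append(x)

# List comprehension form.
filtered = [x for x in title if x not in stopWords]

print(newtitle == filtered)  # True
print(filtered)              # ['sum', 'fears']
```

Both build a new list and leave the original title list untouched until the name is reassigned.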

