
What is an equivalent list comprehension to these nested for loops?

I have a list of blog titles called lst and a list of stop words called stops.

This code does exactly what I want: it removes from each title in lst every word that also appears in stops:

for line in lst:
    for stop in stops:
        line = re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower())
    print(line)

However, out of both curiosity and a desire to write more concise/efficient code, I want to turn this into a list comprehension.

I tried this:

lst = [[re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower()) for stop in stops] for line in list]

...but to no avail. When executed, the code raises a TypeError, as shown below:

Traceback (most recent call last):
  File "F:\Visual Studio Projects\RBTrends\RBTrends\main.py", line 55, in <module>
    prepData()
  File "F:\Visual Studio Projects\RBTrends\RBTrends\main.py", line 42, in prepData
    filelst = aps.stripStopWords(filelst, STOP_WORDS_PATH)
  File "F:\Visual Studio Projects\RBTrends\RBTrends\articleprocesses.py", line 34, in stripStopWords
    lst = [[re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower()) for stop in stops] for line in list]
TypeError: 'type' object is not iterable

Could someone please explain the reason for this error, and how I can fix it by writing a different list comprehension?

You have a typo in your code here:

lst = [[.... for stop in stops] for line in list]
                                          ----^

Replace that list with lst. list is the name of a built-in type in Python, and a type object is not iterable, hence the TypeError.
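To see the failure in isolation, here is a minimal sketch. The titles variable and its contents are made up for illustration, and the exact wording of the error message may differ between Python versions:

```python
# Hypothetical sample data standing in for the question's `lst`.
titles = ["The Art of Code", "A Day in the Life"]

try:
    # Bug: `list` is the built-in type, not the list we meant to loop over.
    broken = [title.lower() for title in list]
except TypeError as exc:
    error_type = type(exc)
    print("TypeError:", exc)

# Fix: iterate over the actual list instance.
fixed = [title.lower() for title in titles]
print(fixed)
```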

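To reproduce your inner loop inside a single expression, you would need something like reduce, because each substitution has to be applied to the result of the previous one: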

from functools import reduce  # for Python 3

result = [reduce(lambda line, stop: re.sub(r'\b' + stop.rstrip('\n') + r'\b', '', line), stops, line.lower()) for line in lst]

Please don't do this. Your code is fine. If you want to speed it up, just pre-compile a regex that replaces all of the words at once.

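# Strip the trailing newline from each stop word (as the original loop does) before escaping it.
stop_regex = re.compile(r'\b(?:' + '|'.join(re.escape(stop.rstrip('\n')) for stop in stops) + r')\b')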

for line in lst:
    print(stop_regex.sub('', line.lower()))
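Since the question asked for a list comprehension, here is a minimal, self-contained sketch of the compiled-regex idea written as one. The sample titles and stop words are made up for illustration, and I am assuming the stop words were read from a file and still carry a trailing newline, as the rstrip("\n") in the question suggests:

```python
import re

# Hypothetical sample data (assumption: stop words still end in "\n").
lst = ["The Art of Code", "A Day in the Life"]
stops = ["the\n", "of\n", "a\n", "in\n"]

# One alternation pattern that matches any stop word as a whole word.
stop_regex = re.compile(
    r"\b(?:" + "|".join(re.escape(stop.rstrip("\n")) for stop in stops) + r")\b"
)

# A single-level comprehension: one substitution pass per title.
cleaned = [stop_regex.sub("", line.lower()) for line in lst]

# Removing words leaves the surrounding spaces behind; collapse them if needed.
tidy = [" ".join(line.split()) for line in cleaned]
print(tidy)  # ['art code', 'day life']
```

For an ordinary word list this should give the same result as the nested loops, but it scans each title only once instead of once per stop word.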
