I have a list of blog titles called lst and a list of stop words called stops. This code does exactly what I want, removing every word which appears in both lists from lst:
for line in lst:
    for stop in stops:
        line = re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower())
    print(line)
However, out of both curiosity and a desire to write more concise/efficient code, I want to turn this into a list comprehension.
I tried this:
lst = [[re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower()) for stop in stops] for line in list]
...but to no avail. When executed, the code throws a TypeError exception, as seen below:
Traceback (most recent call last):
  File "F:\Visual Studio Projects\RBTrends\RBTrends\main.py", line 55, in <module>
    prepData()
  File "F:\Visual Studio Projects\RBTrends\RBTrends\main.py", line 42, in prepData
    filelst = aps.stripStopWords(filelst, STOP_WORDS_PATH)
  File "F:\Visual Studio Projects\RBTrends\RBTrends\articleprocesses.py", line 34, in stripStopWords
    lst = [[re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower()) for stop in stops] for line in list]
TypeError: 'type' object is not iterable
Could someone please explain the reason for this error, and how I can fix it by writing a different list comprehension?
You have a typo in your code here:
lst = [[.... for stop in stops] for line in list]
--------------------------------------------^
Replace that list with lst. list is the name of a built-in type in Python, hence the TypeError.
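To make the error concrete, here is a minimal sketch. The sample titles and stop words are made up for illustration and only stand in for the real lst and stops. It shows the TypeError from iterating over the type list, and also that the nested comprehension, once the typo is fixed, runs but builds a list of lists (one entry per stop word for each title) rather than the fully stripped titles the loop prints.

```python
import re

# Hypothetical sample data; the real lst and stops come from the asker's files.
lst = ["The Cat and the Hat", "A Tale of Two Cities"]
stops = ["the\n", "and\n", "a\n", "of\n"]

# Iterating over the built-in type instead of a list instance fails.
try:
    [line for line in list]
except TypeError as exc:
    error_message = str(exc)

# With the typo fixed, the nested comprehension runs, but each title becomes
# an inner list with one entry per stop word, and each entry has only that
# single stop word removed.
nested = [
    [re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower()) for stop in stops]
    for line in lst
]

print(error_message)
print(nested[0])
```

So fixing the name removes the exception, but the result still differs from the original loop, which applies every stop word to the same line in turn.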
The only way to reproduce your inner loop would be with reduce:
from functools import reduce # for Python 3
result = [reduce(lambda line, stop: re.sub(r'\b' + stop.rstrip('\n') + r'\b', '', line), stops, line.lower()) for line in lst]
Please don't do this. Your code is fine. If you want to speed it up, just pre-compile a regex that replaces all of the words at once:
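# Strip the trailing newline from each stop word (as in the question) before escaping it
stop_regex = re.compile(r'\b' + r'\b|\b'.join(re.escape(stop.rstrip('\n')) for stop in stops) + r'\b')
for line in lst:
    print(stop_regex.sub('', line.lower()))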
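As a rough check, the sketch below compares the original nested loop with a precompiled pattern on made-up sample data. The function names strip_loop and strip_compiled and the sample lists are hypothetical, and the pattern uses a non-capturing group, which is just another way of writing the same alternation. For plain-word stop lists like this the two should agree; stop words containing punctuation or regex metacharacters may behave differently, since the original loop does not escape them.

```python
import re

# Hypothetical sample data standing in for the asker's lists.
lst = ["The Cat and the Hat", "A Tale of Two Cities"]
stops = ["the\n", "and\n", "a\n", "of\n"]


def strip_loop(lines, stop_words):
    """The original nested loop, collecting results instead of printing."""
    out = []
    for line in lines:
        for stop in stop_words:
            line = re.sub(r"\b" + stop.rstrip("\n") + r"\b", "", line.lower())
        out.append(line)
    return out


def strip_compiled(lines, stop_words):
    """One precompiled alternation, applied once per line."""
    words = [re.escape(stop.rstrip("\n")) for stop in stop_words]
    pattern = re.compile(r"\b(?:" + "|".join(words) + r")\b")
    # A single, flat list comprehension is enough here.
    return [pattern.sub("", line.lower()) for line in lines]


print(strip_loop(lst, stops))
print(strip_compiled(lst, stops))
```

This also gives you the list comprehension you were after, without nesting: the stop words are folded into the pattern, so only the loop over lines remains.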