Removing Stop Words From a Text in Python Without Using NLTK

Question

I made a list of stop words in my native language in Python. How can I remove them from a text that I type in, without using NLTK?

Answer 1

Try the following. It only works if the language in question can be split on spaces, which the question has not clarified (thanks to Oso):

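import numpy as np

your_stop_words = ['something', 'sth_else', 'and ...']  # your own stop word list
new_string = input()
words = np.array(new_string.split())  # split the text on whitespace
is_stop_word = np.isin(words, your_stop_words)  # boolean mask: True where a word is a stop word
filtered_words = words[~is_stop_word]  # keep only the words that are not stop words
clean_text = ' '.join(filtered_words)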
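
If you would rather not depend on NumPy, the same split-and-filter idea can be sketched in plain Python. This is only an illustration under the same assumption (words are separated by whitespace); the function name remove_stop_words and the sample stop words are made up for the example, and punctuation attached to a word (such as "and,") is not handled.

```python
def remove_stop_words(text, stop_words):
    """Drop every whitespace-separated token that is in stop_words."""
    stop_set = set(stop_words)  # a set gives fast membership tests
    kept = [word for word in text.split() if word not in stop_set]
    return " ".join(kept)


# Hypothetical English stop words, used only for the demonstration.
clean_text = remove_stop_words("the cat and the dog", ["the", "and"])
print(clean_text)
```

The comparison is case-sensitive, so "The" would be kept unless you lowercase the text or add the capitalised form to your list.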

If the language in question cannot be split on spaces, you can use this solution instead:

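your_stop_words = ['something', 'sth_else', 'and ...']  # your own stop word list
new_string = input()
clean_text = new_string
# Delete every occurrence of each stop word, wherever it appears in the text
for stop_word in your_stop_words:
    clean_text = clean_text.replace(stop_word, "")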

In this case, you need to make sure that a stop word cannot be part of another word. How you do that depends on your language; for example, you can put spaces around your stop words.
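
One possible way to keep a stop word from matching inside a longer word is a regular expression with word boundaries. The sketch below is not part of the original answer: the function name remove_whole_word_stop_words and the sample words are hypothetical, and it assumes that Python's \b boundary is meaningful for your language. For scripts that do not mark word boundaries at all, \b will not help and a language-specific tokenizer would be needed.

```python
import re


def remove_whole_word_stop_words(text, stop_words):
    """Remove stop words only where they appear as whole words."""
    # Try longer stop words first so a longer entry wins over a shorter prefix.
    ordered = sorted(stop_words, key=len, reverse=True)
    # re.escape protects stop words that contain regex metacharacters.
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in ordered) + r")\b"
    without_stop_words = re.sub(pattern, "", text)
    # Collapse the extra whitespace left behind by the removed words.
    return " ".join(without_stop_words.split())


# Hypothetical English stop words, used only for the demonstration.
print(remove_whole_word_stop_words("the theory and the band", ["the", "and"]))
```

With the plain str.replace loop, the same input would lose the "the" in "theory" and the "and" in "band"; the boundary check is what leaves those words intact.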

solution1
0 ACCEPTED 2021-01-22 20:15:50
