How can I remove unwanted characters from a list of words and put the cleaned words in another list using Python?

I'm new to Python and working on a lexicon database. I have three lists: the first contains several words from the database that I want to test, the second contains prefixes, and the third contains suffixes. I need to build another list (called "radicals") containing the words from the first list that match the other two lists, but with their prefixes or suffixes removed.

I'm sure I'm not using the right method here, but here's my code:

#coding UTF-8
import re 
from re import search 


words = ["flore", "fleur", "fleuriste", "remaniement", "remanier", "manier", "maniable", "désaimer", "aimer", "aimant", "mêler", "emmêler", "désemmêler"]
radicals = []
i = 0
motifp = "^[re|em|dés]"
motifs = "[iste|ment|er|ant]$"

while i < len(words) : 
    if re.search(motifs, words[i]) : 
        del(motifp, words[i])
        del(motifs, words[i])
        radicals.append(words[i])
    i = i + 1
print(radicals)

It raises the following error:

['fleur']
Traceback (most recent call last):
  File "C:\Users\alice\OneDrive\Documents\Visual Studio 2017\Projects\PythonApplication4\PythonApplication4\PythonApplication4.py", line 14, in <module>
    del(motifp, words[i])
NameError: name 'motifp' is not defined
Press any key to continue . . .

I could really use your help... Thanks a lot!

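What you want is to iterate over each word and remove any defined prefix or suffix. Two things go wrong in your code. First, del is a statement that deletes names and list items, so del(motifp, words[i]) deletes the variable motifp itself (and the list item words[i]) rather than stripping the pattern from the word, which is where the NameError comes from. Second, square brackets in a regex define a character class, so "[iste|ment|er|ant]$" matches any single one of those characters at the end of the word; use parentheses to group whole alternatives, and re.sub to remove the match. And since some radicals will be the same, e.g. for fleur and fleuriste, use a set.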

import re 

words = ["flore", "fleur", "fleuriste", "remaniement", "remanier", "manier", "maniable", "désaimer", "aimer", "aimant", "mêler", "emmêler", "désemmêler"]
radicals = set()
motifp = "^(re|em|dés)"
motifs = "(iste|ment|er|ant)$"

for word in words:
    word = re.sub(motifp, '', word)
    word = re.sub(motifs, '', word)
    radicals.add(word)
print(radicals)
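
The question mentions that the prefixes and suffixes live in their own lists and that only the words that matched should be kept. A minimal sketch of that variant follows; the names prefixes, suffixes, prefix_re, suffix_re and strip_affixes are illustrative assumptions, not part of the original code. It builds the two patterns from the lists, strips at most one prefix and one suffix per word, and keeps a radical only when something was actually removed.

```python
import re

words = ["flore", "fleur", "fleuriste", "remaniement", "remanier", "manier",
         "maniable", "désaimer", "aimer", "aimant", "mêler", "emmêler", "désemmêler"]
# Assumed contents of the prefix and suffix lists described in the question.
prefixes = ["re", "em", "dés"]
suffixes = ["iste", "ment", "er", "ant"]

# Join each list into a non-capturing group of alternatives.
# re.escape protects against entries that contain regex metacharacters.
prefix_re = re.compile("^(?:" + "|".join(map(re.escape, prefixes)) + ")")
suffix_re = re.compile("(?:" + "|".join(map(re.escape, suffixes)) + ")$")

def strip_affixes(word):
    """Remove at most one leading prefix and one trailing suffix."""
    stem = prefix_re.sub("", word)
    return suffix_re.sub("", stem)

radicals = []
for word in words:
    stem = strip_affixes(word)
    # Keep only words that matched a prefix or suffix, without duplicates,
    # in order of first appearance.
    if stem != word and stem not in radicals:
        radicals.append(stem)

print(radicals)
```

Unlike the set, the list keeps the order in which radicals first appear. Note that only one prefix is stripped per word, so désemmêler yields emmêl rather than mêl; removing stacked prefixes would need a loop around prefix_re.sub.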

