
Python package to find predefined keywords/tags in a file / URL / string

Are there any Python packages that can take a list of keywords/tags and match them against a given string, file, or URL?

Specifically, one that uses stemming and/or some kind of synonym matching.

For example, my pre-saved keywords:

Ski, Bike, Climb

my text:

Skiing in the mountains is great

Should get tagged with Ski

Skiing and mountain biking is fun

Should get tagged with Ski and Bike

And if I've got a synonyms file somewhere mapping Bike to MTB

MTB is a great way to spend the day

Should get tagged with Bike

See Thesaurus (you can also try other modules, such as the synonym module).

You can also test whether a sentence contains a specific string using the in operator:

>>> 'Ski' in 'Skiing in the mountains is great'
True
>>> 'Bike' in 'Skiing in the mountains is great'
False
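Note that in is a case-sensitive substring test, so it will not match 'ski' against 'Skiing', and it cannot match Bike against an inflected form such as biking. A minimal sketch of a case-insensitive variant follows; the helper name contains_keyword is hypothetical and only for illustration:

```python
def contains_keyword(keyword, text):
    """Case-insensitive substring test (hypothetical helper)."""
    return keyword.lower() in text.lower()


print(contains_keyword("ski", "Skiing in the mountains is great"))    # True
# "biking" does not contain the substring "bike", so this is still False
print(contains_keyword("Bike", "Skiing and mountain biking is fun"))  # False
```

Lowercasing fixes the case problem, but the second call shows why stemming or a synonym lookup is still needed for the tagging described in the question.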

I don't know of any package that does this, but it is simple in plain Python using the standard re (regex) module. Something like:

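import re

key_words = ['ski', 'bike', 'climb']
text = "Skiing and mountain biking is fun"  # renamed to avoid shadowing the built-in input()

# lowercase every word so the match is case-insensitive, then split on whitespace
text_words = [word.lower() for word in text.split()]
tags = []
for word in text_words:
    for key in key_words:
        # re.escape makes the keyword match literally; skip keys already tagged
        if re.search(re.escape(key), word) and key not in tags:
            tags.append(key)

# tags is now ['ski']: 'biking' does not contain 'bike', so stemming is still needed for that case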
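To cover the stemming and synonym cases from the question, the same idea can be extended as in the sketch below. Everything here is an assumption for illustration: crude_stem is a deliberately rough suffix stripper (not a real stemming algorithm), SYNONYMS is a hypothetical in-memory stand-in for the synonyms file, and tag_text is a made-up helper name. A proper stemmer, for example NLTK's PorterStemmer, could be swapped in for crude_stem.

```python
import re

KEYWORDS = ["Ski", "Bike", "Climb"]
# Hypothetical synonyms mapping: lowercase synonym -> canonical keyword
SYNONYMS = {"mtb": "Bike"}


def crude_stem(word):
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "ers", "er", "es", "ed", "e", "s"):
        # only strip when at least three characters remain
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tag_text(text, keywords=KEYWORDS, synonyms=SYNONYMS):
    """Return the keywords whose stem or synonym appears in text."""
    stem_to_keyword = {crude_stem(k): k for k in keywords}
    tags = []
    for word in re.findall(r"[A-Za-z]+", text):
        # try the synonym table first, then fall back to stem comparison
        keyword = synonyms.get(word.lower()) or stem_to_keyword.get(crude_stem(word))
        if keyword and keyword not in tags:
            tags.append(keyword)
    return tags


print(tag_text("Skiing in the mountains is great"))     # ['Ski']
print(tag_text("Skiing and mountain biking is fun"))    # ['Ski', 'Bike']
print(tag_text("MTB is a great way to spend the day"))  # ['Bike']
```

Comparing stems rather than raw substrings is what lets biking match Bike here, since both reduce to bik under this crude rule. The suffix list will mis-stem many English words, so it should be treated as a placeholder only.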
