
Substituting Words for Synonyms

I have recently been trying to write a Python program that takes a word and lists all of its synonyms. Here is the code I'm using:

from urllib import quote_plus
import urllib2
import re

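def get_search_result(key):
    # Download the synonyms.net page for the given word.
    page = urllib2.urlopen('http://www.synonyms.net/synonym/%s'%quote_plus(key)).read()
    words_ = []  # every synonym found, duplicates included
    words = []   # de-duplicated result, in first-seen order
    # Take the text between 'Synonyms:&nbsp;' and 'Antonyms', then strip the HTML tags.
    for i in [re.sub('<.*?>', '', i) for i in re.findall('Synonyms:&nbsp;(.*?)Antonyms', page)]:
        words_.extend(i.split(', '))
    # Drop duplicates while keeping the original order.
    for i in words_:
        if i not in words:
            words.append(i)
    return words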

if __name__ == '__main__':
    res = get_search_result('sack')
    print res, len(res)

The thing is, while it works, it is INCREDIBLY slow: a single lookup took about a minute. My question: is there a better way of doing this? Right now, it fetches the page from synonyms.net and scrapes the HTML. The problem is that synonyms.net itself is slow.
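To check where the time goes, the parsing step can be pulled out of the download so each part can be run on its own. The following is a minimal sketch in Python 3 (the code above is Python 2); `parse_synonyms` and the `sample` string are made up for illustration, and the sample is not real synonyms.net markup. It uses the same two regular expressions as the original, adds whitespace trimming, and tracks seen words in a set.

```python
import re

def parse_synonyms(page):
    """Extract an order-preserving, de-duplicated synonym list from page HTML."""
    words = []
    seen = set()
    # Same pattern as the original: text between 'Synonyms:&nbsp;' and 'Antonyms'.
    for chunk in re.findall(r'Synonyms:&nbsp;(.*?)Antonyms', page):
        text = re.sub(r'<.*?>', '', chunk)  # strip HTML tags
        for word in text.split(', '):
            word = word.strip()  # trimming is an addition, not in the original
            if word and word not in seen:
                seen.add(word)
                words.append(word)
    return words

# Hypothetical snippet for illustration only; not real synonyms.net markup.
sample = ('Synonyms:&nbsp;<a href="#">bag</a>, <a href="#">pouch</a> Antonyms: none '
          'Synonyms:&nbsp;<a href="#">bag</a>, <a href="#">fire</a> Antonyms: hire')
print(parse_synonyms(sample))  # ['bag', 'pouch', 'fire']
```

Running this on a saved copy of the page should show whether the parsing is a meaningful part of the delay or whether nearly all of it is the network request.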

I have looked into the synonyms.net API. It seemed to be exactly what I needed, as it was very fast (it returned the list in 0.23 seconds). The only problem is that, at the bottom of the page, in small print, it says 'The Synonyms API service is free to use for up to 1,000 queries per day'. That limit is lifted if you buy the product, but I don't really want to pay $10 a month for a program that gives me synonyms.
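One way to stretch a daily query quota is to remember answers so that a repeated word never reaches the service twice. Below is a minimal Python 3 sketch using `functools.lru_cache` from the standard library. `fetch_synonyms` is a hypothetical stand-in that returns canned data; it is not the synonyms.net API, whose request format is not shown here.

```python
from functools import lru_cache

calls = []  # records every word that actually reaches the stand-in service

def fetch_synonyms(word):
    """Hypothetical stand-in for a real API request; returns canned data."""
    calls.append(word)
    return ("bag", "pouch") if word == "sack" else ()

@lru_cache(maxsize=None)
def cached_synonyms(word):
    # Return a tuple so the cached value cannot be modified by callers.
    return tuple(fetch_synonyms(word))

cached_synonyms("sack")
cached_synonyms("sack")  # answered from the cache, no second request
print(len(calls))  # 1
```

The cache here lives only for the lifetime of the process; keeping it across runs would require writing the results to disk.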

I have also looked into http://thesaurus.com. Because the code is flexible, I quickly modified it to use that site. It was better, taking only 10 seconds to respond, but that is still too slow. Thesaurus.com does not have an API, as far as a quick search of the website showed. The final solution, the one that would be guaranteed to work, would be to make my own synonym list and write a program to parse it. However, this option seems messy and not very appealing. Does anyone have any alternatives that would at least be faster than 10 seconds?
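For reference, the "own synonym list" option could look roughly like the Python 3 sketch below. The line format (`word: synonym, synonym, ...`) and the names `load_thesaurus` and `lookup` are assumptions made up for this example; the sample entries are illustrative, not taken from any real thesaurus.

```python
def load_thesaurus(lines):
    """Build a word -> synonyms dict from lines like 'sack: bag, pouch'."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or ':' not in line:
            continue  # skip blank or malformed lines
        word, _, rest = line.partition(':')
        synonyms = [s.strip() for s in rest.split(',') if s.strip()]
        table[word.strip().lower()] = synonyms
    return table

def lookup(table, word):
    """Return the synonyms for a word, or an empty list if it is unknown."""
    return table.get(word.lower(), [])

# Illustrative data; in practice the lines would come from an open file.
sample_lines = ["sack: bag, pouch, fire", "", "big: large, huge"]
table = load_thesaurus(sample_lines)
print(lookup(table, "Sack"))  # ['bag', 'pouch', 'fire']
```

Once the table is loaded, each lookup is a dictionary access with no network involved, so the cost moves from query time to building and maintaining the list.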

Thanks in advance!

Reposting my comment, since it seems to fix the issue:

thesaurus.com also has a mobile version at m.dictionary.com/t. Using it should reduce the amount of data transferred, and mobile pages are also much easier to parse.
