I am developing a Python program to find the etymology of the words in a text. As far as I can tell, there are basically two options: parsing an online dictionary that provides etymologies, or using an API. I found this reply here, but I don't understand how to link the Oxford API with my Python program.
Can anyone explain how to look up a word in an English dictionary? Thank you in advance.
Link to the question here
Note that WordNet does not contain every English word. What about the Oxford English Dictionary (http://developer.oxforddictionaries.com/)? Depending on the scope of your project, it could be a killer API. Have you tried looking at Grady Ward's Moby (http://icon.shef.ac.uk/Moby/)? You could add it as a lexicon in NLTK (see the notes on "Loading your own corpus" in Section 2.1).
from nltk.corpus import PlaintextCorpusReader
# Treat every file under corpus_root as a plain-text word list
corpus_root = '/usr/share/dict'
wordlists = PlaintextCorpusReader(corpus_root, '.*')
from nltk.corpus import BracketParseCorpusReader
# Raw strings: single backslashes in the Windows path, and \. for a literal dot
corpus_root = r"C:\corpora\penntreebank\parsed\mrg\wsj"
file_pattern = r".*/wsj_.*\.mrg"
ptb = BracketParseCorpusReader(corpus_root, file_pattern)
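As a rough illustration of the lexicon idea without NLTK: a plain word list (one word per line, like the files under /usr/share/dict) can be loaded into a set and queried directly. This is only a sketch using the standard library, not the NLTK approach quoted above. The helper names load_lexicon and is_known and the tiny sample file are made up for the example.

```python
import tempfile
from pathlib import Path


def load_lexicon(path):
    """Read a one-word-per-line file into a set of lowercase words."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def is_known(word, lexicon):
    """Case-insensitive membership test against the loaded word list."""
    return word.lower() in lexicon


# Tiny stand-in word list so the sketch runs anywhere (hypothetical data).
# A real list would be passed to load_lexicon() in the same way.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "words.txt"
    sample.write_text("potato\ndrink\nAerodynamically\n", encoding="utf-8")
    lexicon = load_lexicon(sample)

print(is_known("Potato", lexicon))
print(is_known("qwerty", lexicon))
```

A set gives constant-time lookups, which matters once the list holds hundreds of thousands of entries.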
You could use the open-source ety package. Disclosure: I'm a contributor to the project.
It's based on the data used in the research paper "Etymological Wordnet: Tracing the History of Words", which has already been pre-scraped from Wiktionary.
Some examples:
>>> import ety
>>> ety.origins("potato")
[Word(batata, language=Taino)]
>>> ety.origins('drink', recursive=True)
[Word(drync, language=Old English (ca. 450-1100)),
 Word(drinken, language=Middle English (1100-1500)),
 Word(drincan, language=Old English (ca. 450-1100))]
>>> print(ety.tree('aerodynamically'))
aerodynamically (English)
├── -ally (English)
└── aerodynamic (English)
    ├── aero- (English)
    │   └── ἀήρ (Ancient Greek (to 1453))
    └── dynamic (English)
        └── dynamique (French)
            └── δυναμικός (Ancient Greek (to 1453))
                └── δύναμις (Ancient Greek (to 1453))
                    └── δύναμαι (Ancient Greek (to 1453))
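To apply a per-word lookup such as ety.origins to a whole text, as the question asks, something along these lines might work. It is a minimal sketch: the etymologies helper and the stub lookup are hypothetical names introduced here, and the regex tokenizer is a simplification that only keeps runs of letters, optionally joined by hyphens or apostrophes.

```python
import re


def etymologies(text, lookup):
    """Map each distinct word in `text` to whatever `lookup(word)` returns.

    `lookup` is any callable that takes a single word, e.g. ety.origins.
    """
    words = re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", text.lower())
    results = {}
    for word in words:
        if word not in results:  # look each distinct word up only once
            results[word] = lookup(word)
    return results


# Stub lookup so the sketch runs without ety installed (hypothetical helper;
# the single entry mirrors the "potato" example above).
def fake_lookup(word):
    demo = {"potato": ["batata"]}
    return demo.get(word, [])


found = etymologies("Potato, potato; drink!", fake_lookup)
print(found)
```

With ety installed, the stub would be swapped for the real function, i.e. etymologies(text, ety.origins).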
Using PyDictionary might be a good option.