How to access a text file of Afrikaans words as an nltk corpus

Question

I have a text file containing plain-text sentences in Afrikaans. I would like to run nltk corpus functions on this file, but I can't find any examples of how to do this.

I would like to do things such as:

mytext.concordance("woord")
mytext.similar("woord")

Can anyone help me?

Answer 1

I managed to figure something out:

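# How to load a text file as a corpus.
import nltk
from nltk.corpus import PlaintextCorpusReader
from nltk.corpus.util import LazyCorpusLoader

# Looks for a corpora/afrikaans directory on the nltk data path and
# reads every non-hidden .txt file in it.
afrikaans = LazyCorpusLoader('afrikaans', PlaintextCorpusReader, r'(?!\.).*\.txt')

# Second sentence of the corpus (sents() needs NLTK's Punkt tokenizer data).
print(afrikaans.sents()[1])

# Wrap the tokens in nltk.Text to get concordance(), similar(), etc.
af = nltk.Text(afrikaans.words())
af.concordance("mense")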

This assumes your corpus text file is at C:\nltk_data\corpora\afrikaans\afrikaans.txt
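
If the file does not live under nltk_data, one alternative is to point PlaintextCorpusReader directly at any directory. The following is only a minimal sketch, not part of the original answer: the temporary directory and the sample sentences are made up so that the snippet runs on its own, and you would replace corpus_dir with the folder that holds your own .txt file. It sticks to words(), which uses a regex-based tokenizer and should not need any extra NLTK data downloads.

```python
import os
import tempfile

import nltk
from nltk.corpus import PlaintextCorpusReader

# Hypothetical sample data so the sketch is self-contained;
# in practice corpus_dir would be the directory containing your text file.
corpus_dir = tempfile.mkdtemp()
with open(os.path.join(corpus_dir, "afrikaans.txt"), "w", encoding="utf-8") as f:
    f.write("Baie mense praat Afrikaans. Die mense lees elke dag 'n woord.\n")

# Read every .txt file in corpus_dir as a plain-text corpus.
reader = PlaintextCorpusReader(corpus_dir, r".*\.txt", encoding="utf-8")
words = reader.words()

# nltk.Text provides concordance(), similar(), count() and friends.
af = nltk.Text(words)
af.concordance("mense")          # prints each occurrence with its context
mense_count = af.count("mense")  # number of times the token appears
```

af.similar("woord") can be called in the same way, although on a corpus this small it is unlikely to find any matches.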

Answer 1 was accepted on 2013-01-10 21:10:25.
