Auto-correcting spell checker

Question

I have a TSV (tab-separated values) file that I need to check for misspellings and run-together words (e.g. 'Iloveyou' instead of 'I love you').

I've installed Aspell on my machine and can run it from R using the aspell() function.

files <- "train2.tsv"
res <- aspell(files)
str(res)
summary(res)

However, running it in R only produces a list of possibly misspelled words and suggested replacements.

>  summary(res)
Possibly mis-spelled words:
 [1] "amant"        "contaneir"    "creat"        "ddition"      "EssaySet"     "EssayText"    "experiament"  "expireiment"  "expirement"  
[10] "Fipst"        "infomation"   "Inorder"      "measureing"   "mintued"      "neccisary"    "officialy"    "renuminering" "rinsen"      
[19] "sticlenx"     "sucessfully"  "tipe"         "vineager"     "vinigar"      "yar"   

>  str(res)
Classes ‘aspell’ and 'data.frame':      27 obs. of  5 variables:
 $ Original   : chr  "EssaySet" "EssayText" "expirement" "expireiment" ...
 $ File       : chr  "train2.tsv" "train2.tsv" "train2.tsv" "train2.tsv" ...
 $ Line       : int  1 1 3 3 3 3 3 3 6 6 ...
 $ Column     : int  4 27 27 108 132 222 226 280 120 156 ...
 $ Suggestions:List of 27
  ..$ : chr  "Essay Set" "Essay-Set" "Essayist" "Essays" ...
  ..$ : chr  "Essay Text" "Essay-Text" "Essayist" "Sedatest" ...
  ..$ : chr  "experiment" "excrement" "excitement" "experiments" ...
  ..$ : chr  "experiment" "experiments" "experimenter" "excrement" ...
  ..$ : chr  "Amandy" "am ant" "am-ant" "Amanda" ...
  ..$ : chr  "year" "ya" "Yard" "yard" ...

Is there a way to have aspell (or any other spell checker) automatically correct the misspelled words?

Answer 1

It looks like you can do the following:

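import aspell  # aspell-python bindings

s = aspell.Speller('lang', 'en')  # load the English dictionary
text_to_check = ['measureing', 'the', 'vinigar']  # example input, one word per item
corrected = []
for word in text_to_check:
    if s.check(word):
        corrected.append(word)
    else:
        new_words = s.suggest(word)
        # Pick the first word from the returned list, if there is one.
        corrected.append(new_words[0] if new_words else word)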

I only glanced over the documentation, but that looks like what you would have to do to automatically apply the suggested spelling.

http://0x80.pl/proj/aspell-python/index-c.html
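To make the loop above concrete for a tab-separated file, here is a minimal sketch. It assumes only that the speller object exposes `check(word)` and `suggest(word)`, as the aspell-python documentation describes for its `Speller` objects. `StubSpeller`, `autocorrect_line`, and the tiny word lists are hypothetical stand-ins, included so the example runs without Aspell installed.

```python
import re

# A word is a run of letters, optionally with one apostrophe part ("I've").
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


class StubSpeller:
    """Hypothetical stand-in exposing check()/suggest() like a real speller."""

    def __init__(self, known, fixes):
        self.known = {w.lower() for w in known}
        self.fixes = fixes  # misspelling -> ordered list of suggestions

    def check(self, word):
        return word.lower() in self.known

    def suggest(self, word):
        return self.fixes.get(word, [])


def autocorrect_line(line, speller):
    """Replace each unknown word in `line` with the speller's first suggestion.

    Tabs, digits, spaces, and punctuation are left untouched, so the TSV
    columns keep their positions. Words with no suggestion are kept as-is.
    """

    def fix(match):
        word = match.group(0)
        if speller.check(word):
            return word
        suggestions = speller.suggest(word)
        return suggestions[0] if suggestions else word

    return WORD_RE.sub(fix, line)


speller = StubSpeller(
    known=["the", "was", "in", "a", "experiment", "vinegar", "container"],
    fixes={"expirement": ["experiment", "excrement"], "vinigar": ["vinegar"]},
)
row = "1\tThe expirement was in a contaneir.\tvinigar"
print(autocorrect_line(row, speller))
```

With aspell-python installed, passing `aspell.Speller('lang', 'en')` instead of the stub should behave the same way, since only `check` and `suggest` are called. Applying `autocorrect_line` to each line of the file and writing the results back out would give a corrected copy.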

Edit: I realize you may not be looking for Python code, but since the question was tagged python, this would be the easiest way to do it in Python. There is probably a more efficient method, but it's getting late and this is what came to mind first.
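One caveat with always taking the first suggestion: in the output shown in the question, the first suggestion for "amant" is "Amandy", which is unlikely to be what the writer meant. A possible guard, sketched below, is to accept a suggestion only if it is sufficiently similar to the original word. The helper name `pick_correction` and the 0.75 threshold are assumptions chosen for illustration, not part of Aspell or aspell-python. The similarity measure is `difflib.SequenceMatcher` from the Python standard library.

```python
from difflib import SequenceMatcher


def pick_correction(word, suggestions, min_similarity=0.75):
    """Return the first suggestion similar enough to `word`, else `word`.

    `min_similarity` is an arbitrary cut-off on SequenceMatcher.ratio(),
    which ranges from 0.0 (nothing in common) to 1.0 (identical).
    """
    for candidate in suggestions:
        ratio = SequenceMatcher(None, word.lower(), candidate.lower()).ratio()
        if ratio >= min_similarity:
            return candidate
    return word  # nothing close enough: leave the word for manual review


print(pick_correction("expirement", ["experiment", "excrement"]))
print(pick_correction("EssaySet", ["Essay Set", "Essay-Set"]))
print(pick_correction("amant", ["Amandy"]))
```

The threshold would need tuning on real data: too low and poor suggestions slip through, too high and genuine fixes are skipped. Words that come back unchanged can be collected for manual review.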

Solution 1
8, accepted, 2012-07-07 06:42:58
