
Haskell words from an alphabet of a given length

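I have this function, which generates a list of all words with a minimum length of 0 and a maximum length of n, where n is given as input to the function: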

import Data.List

words :: Int -> String -> [String]
words 0 alph = [[]]
words n alph = words (n-1) alph ++ [ ch:w | w <-words (n-1) alph, ch <- alph]

When I run it, the output is as follows:

> words 3 "AB"
["","A","B","A","B","AA","BA","AB","BB","A","B","AA","BA","AB","BB","AA","BA","AB","BB","AAA","BAA","ABA","BBA","AAB","BAB","ABB","BBB"]

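The problem is that some words are repeated, in this example most visibly the words of length 2 ("AA" appears 3 times). Can you see what I am doing wrong in my function, or do you know how to fix it?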

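This is because the words (n-1) alph inside the list comprehension generates not only words of length n-1, but also words of length n-2, n-3, and so on, since that is how you defined the words function. Those shorter words are then extended again, on top of already being included by the left operand of (++).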
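
To make the overlap visible, here is a small sketch that tallies how many words of each length the question's function returns. The names buggyWords and lengthCounts are hypothetical; buggyWords is the question's definition renamed so that it does not clash with the Prelude's words.

```haskell
import Data.List (group, sort)

-- The question's definition, renamed (hypothetical name) to avoid
-- clashing with the Prelude's `words`.
buggyWords :: Int -> String -> [String]
buggyWords 0 _    = [[]]
buggyWords n alph =
  buggyWords (n-1) alph ++ [ ch:w | w <- buggyWords (n-1) alph, ch <- alph ]

-- Pair each word length with the number of words of that length.
lengthCounts :: [String] -> [(Int, Int)]
lengthCounts ws = [ (l, length g) | g@(l:_) <- group (sort (map length ws)) ]
```

For buggyWords 3 "AB" this gives [(0,1),(1,6),(2,12),(3,8)], whereas a duplicate-free result would contain only 2 words of length 1 and 4 words of length 2.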

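It is better to write a helper function that generates only the words of length exactly n, and then use it in a second function that collects the words of every length from 0 up to n: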

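import Prelude hiding (words)  -- the Prelude already exports a different words

-- All words of exactly length n over the alphabet.
words :: Int -> String -> [String]
words 0 _    = [[]]
words n alph = [ ch:w | w <- words (n-1) alph, ch <- alph ]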

wordsUpTo :: Int -> String -> [String]
wordsUpTo n alph = concatMap (flip words alph) [0 .. n]

However, this words function already exists: it is just a special case of replicateM :: Applicative m => Int -> m a -> m [a], so we can write it as:

import Control.Monad(replicateM)

wordsUpTo :: Int -> String -> [String]
wordsUpTo n alph = [0 .. n] >>= (`replicateM` alph)

This yields:

Prelude Control.Monad> wordsUpTo 3 "AB"
["","A","B","AA","AB","BA","BB","AAA","AAB","ABA","ABB","BAA","BAB","BBA","BBB"]
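
As a rough check that replicateM really plays the role of the fixed-length helper, the sketch below compares the two. The names wordsOfLength and sameWords are hypothetical; wordsOfLength is the list-comprehension helper from above under a different name.

```haskell
import Control.Monad (replicateM)
import Data.List (sort)

-- The list-comprehension helper from the answer, renamed (hypothetical name).
wordsOfLength :: Int -> String -> [String]
wordsOfLength 0 _    = [[]]
wordsOfLength n alph = [ ch:w | w <- wordsOfLength (n-1) alph, ch <- alph ]

-- True when both approaches produce the same words, ignoring order.
sameWords :: Int -> String -> Bool
sameWords n alph = sort (wordsOfLength n alph) == sort (replicateM n alph)
```

The two agree on the set of words but not on the order: replicateM 2 "AB" gives ["AA","AB","BA","BB"], while the comprehension varies the first character fastest and gives ["AA","BA","AB","BB"].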

The Applicative instance for lists effectively computes a cross product,

> (,) <$> ["A", "B"] <*> ["C", "D"]
[("A","C"),("A","D"),("B","C"),("B","D")]

whose elements can be combined with (++) instead of (,):

> (++) <$> ["A", "B"] <*> ["C", "D"]
["AC","AD","BC","BD"]

If you apply this operation repeatedly, you get the strings you want:

> (++) <$> ["A", "B"] <*> [""] -- base case
["A","B"]
> (++) <$> ["A", "B"] <*> ["A","B"]
["AA","AB","BA","BB"]
> (++) <$> ["A", "B"] <*> ["AA","AB","BA","BB"]
["AAA","AAB","ABA","ABB","BAA","BAB","BBA","BBB"]

The function you want to repeat is the section ((++) <$> ["A", "B"] <*>), which we will call f from now on:

> f = ((++) <$> ["A", "B"] <*>)

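This repeated application is captured by the iterate function, which repeatedly feeds the output of one function application in as the input to the next.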

> take 3 $ iterate f [""]
[[""],["A","B"],["AA","AB","BA","BB"]]

We want to concatenate the results into a single list:

> take 7 $ concat $ iterate f [""]
["","A","B","AA","AB","BA","BB"]

So all the combinations are just

-- The alphabet is a list of one-letter strings, e.g. ["A", "B"].
allWords :: [String] -> [String]
allWords alph = concat $ iterate f [""]
  where f = ((++) <$> alph <*>)

To keep only the elements up to a maximum length n, we can

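  1. use takeWhile (\x -> length x <= n), or
  2. use take (2^(n+1) - 1) for a two-letter alphabet (given the order in which the items are generated, all strings of a given length appear before any longer string, so we can compute the total number of strings up to a given maximum length)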
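
The count in option 2 comes from a geometric sum. As a worked derivation, with k as a symbol introduced here for the alphabet size (the examples above use k = 2): there are k^i words of length exactly i, so the number of words of length at most n is

```latex
\[
  \sum_{i=0}^{n} k^{i} \;=\; \frac{k^{n+1} - 1}{k - 1} \quad (k \ge 2),
  \qquad\text{and for } k = 2:\quad \sum_{i=0}^{n} 2^{i} \;=\; 2^{n+1} - 1 .
\]
```

For n = 3 and k = 2 this gives 15, which matches the number of entries in the wordsUpTo 3 "AB" output shown earlier.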

So we can define

words n = takeWhile p . allWords
  where p x = length x <= n

or

words n = take n' . allWords
  where n' = 2^(n+1) - 1  -- number of words of length <= n over a two-letter alphabet
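
The counting approach can be sketched for alphabets of any size. This is a variation rather than the answer's exact code: the names allWordsFrom, wordCount and wordsUpToN are hypothetical, the alphabet is taken as a plain String (as in the question), and characters are prepended with (:) instead of appending one-letter strings with (++).

```haskell
-- All words over the alphabet, shortest first. The list is infinite
-- for a non-empty alphabet, so it must be cut off by the caller.
allWordsFrom :: String -> [String]
allWordsFrom alph = concat (iterate step [""])
  where step = ((:) <$> alph <*>)  -- prepend every letter to every word

-- Number of words of length <= n over an alphabet of k letters:
-- k^0 + k^1 + ... + k^n.
wordCount :: Int -> Int -> Int
wordCount k n = sum [ k ^ i | i <- [0 .. n] ]

-- Cut the infinite list off after the right number of words.
wordsUpToN :: Int -> String -> [String]
wordsUpToN n alph = take (wordCount (length alph) n) (allWordsFrom alph)
```

With this definition wordsUpToN 3 "AB" produces the same 15 words as the replicateM version. One reason to prefer counting over takeWhile is the empty alphabet: there the list has no word longer than "", so a takeWhile on the length never finds a stopping point and does not terminate, while take stops after the single empty word.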

