Haskell words from an alphabet, up to a given length

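I have this function, which generates a list of all words with a minimum length of 0 and a maximum length of n, where n is given as input to the function: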

import Prelude hiding (words)  -- our words would otherwise clash with Prelude.words

words :: Int -> String -> [String]
words 0 alph = [[]]
words n alph = words (n-1) alph ++ [ ch:w | w <-words (n-1) alph, ch <- alph]

When I run it, the output is as follows:

> words 3 "AB"
["","A","B","A","B","AA","BA","AB","BB","A","B","AA","BA","AB","BB","AA","BA","AB","BB","AAA","BAA","ABA","BBA","AAB","BAB","ABB","BBB"]

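The problem is that some words are repeated, in this example notably the words of length 2 ("AA" is in there 3 times). Can you see what I am doing wrong in my function, or do you know how to fix it?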

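This is because the words (n-1) alph inside the list comprehension generates not only words of length n-1 but also words of length n-2, n-3, and so on, since that is how you defined the words function. Those shorter words are then extended again, and the results duplicate words that the first operand of ++ already contains.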
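To make the overlap visible, here is a minimal sketch that counts how many results of each length come back. The names buggyWords and lengthCounts are hypothetical; buggyWords is the question's definition, renamed only to avoid the clash with Prelude's words.

```haskell
import Data.List (group, sort)

-- The question's definition under a hypothetical name.
buggyWords :: Int -> String -> [String]
buggyWords 0 _    = [[]]
buggyWords n alph =
  buggyWords (n-1) alph ++ [ ch:w | w <- buggyWords (n-1) alph, ch <- alph ]

-- Pair each word length with the number of results of that length.
lengthCounts :: [String] -> [(Int, Int)]
lengthCounts ws = [ (l, length g) | g@(l:_) <- group (sort (map length ws)) ]
```

For a two-letter alphabet, a duplicate-free result of maximum length 3 would contain 1, 2, 4 and 8 words of lengths 0 to 3. Running lengthCounts (buggyWords 3 "AB") reports 1, 6, 12 and 8 instead, which accounts for the 27 entries in the output above.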

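It is probably better to write a helper function that generates only the words of length exactly n, and then use it in a second function that builds the strings of length up to n: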

import Prelude hiding (words)  -- avoid the clash with Prelude.words

words :: Int -> String -> [String]
words 0 alph = [[]]
words n alph = [ ch:w | w <-words (n-1) alph, ch <- alph]

wordsUpTo :: Int -> String -> [String]
wordsUpTo n alph = concatMap (flip words alph) [0 .. n]

However, this fixed-length words already exists in the standard library: it is just a special case of replicateM :: Applicative m => Int -> m a -> m [a], so we can write it as:

import Control.Monad(replicateM)

wordsUpTo :: Int -> String -> [String]
wordsUpTo n alph = [0 .. n] >>= (`replicateM` alph)

This produces:

Prelude Control.Monad> wordsUpTo 3 "AB"
["","A","B","AA","AB","BA","BB","AAA","AAB","ABA","ABB","BAA","BAB","BBA","BBB"]
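One detail worth checking: the comprehension-based helper and replicateM produce the same words, but not in the same order within a given length. The comprehension varies the first letter fastest, while replicateM varies the last letter fastest. The sketch below puts the two side by side; fixedLen, upToComprehension and upToReplicateM are hypothetical names chosen to avoid Prelude's words.

```haskell
import Control.Monad (replicateM)

-- Words of length exactly n, built with the list comprehension.
fixedLen :: Int -> String -> [String]
fixedLen 0 _    = [[]]
fixedLen n alph = [ ch:w | w <- fixedLen (n-1) alph, ch <- alph ]

-- Words of length 0..n using the comprehension-based helper.
upToComprehension :: Int -> String -> [String]
upToComprehension n alph = concatMap (`fixedLen` alph) [0 .. n]

-- Words of length 0..n using replicateM.
upToReplicateM :: Int -> String -> [String]
upToReplicateM n alph = [0 .. n] >>= (`replicateM` alph)
```

For example, fixedLen 2 "AB" gives ["AA","BA","AB","BB"], whereas replicateM 2 "AB" gives ["AA","AB","BA","BB"]. If the order matters, pick the version that matches it; otherwise the two are interchangeable.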

The Applicative instance for lists effectively computes a cross product,

> (,) <$> ["A", "B"] <*> ["C", "D"]
[("A","C"),("A","D"),("B","C"),("B","D")]

whose elements can be combined with (++) instead of (,):

> (++) <$> ["A", "B"] <*> ["C", "D"]
["AC","AD","BC","BD"]

If you apply this operation repeatedly, you get the strings you want:

> (++) <$> ["A", "B"] <*> [""] -- base case
["A","B"]
> (++) <$> ["A", "B"] <*> ["A","B"]
["AA","AB","BA","BB"]
> (++) <$> ["A", "B"] <*> ["AA","AB","BA","BB"]
["AAA","AAB","ABA","ABB","BAA","BAB","BBA","BBB"]

The function you want to repeat is the section ((++) <$> ["A", "B"] <*>), which we will call f from now on:

> f = ((++) <$> ["A", "B"] <*>)

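This repeated application is captured by the iterate function, which repeatedly feeds the output of one function application in as the input to the next.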

> take 3 $ iterate f [""]
[[""],["A","B"],["AA","AB","BA","BB"]]

We want to concatenate the results into a single list:

> take 7 $ concat $ iterate f [""]
["","A","B","AA","AB","BA","BB"]

So all the combinations are just

-- alph is a list of one-letter strings, e.g. ["A", "B"]
allWords :: [String] -> [String]
allWords alph = concat $ iterate f [""]
  where f = ((++) <$> alph <*>)
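Because it combines elements with (++), allWords needs the alphabet as a list of one-letter strings such as ["A", "B"]. If the alphabet is a plain String, as in the question, a variant that prepends characters with (:) should behave the same way. This is a minimal sketch; allWordsFromChars is a hypothetical name.

```haskell
-- Infinite list of all words over a String alphabet, shortest first.
-- Each step prepends every letter of alph to every word of the previous step.
allWordsFromChars :: String -> [String]
allWordsFromChars alph = concat $ iterate step [""]
  where step ws = (:) <$> alph <*> ws
```

The result is infinite, so it has to be consumed lazily, for example with take 7 (allWordsFromChars "AB").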

To get only the elements up to a maximum length n, we can

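  1. use takeWhile (\x -> length x <= n), or
  2. use take (2^(n+1) - 1) (given the order in which the items are generated, all strings of a given length appear before any longer string, so we can compute the total number of strings up to a given maximum length; 2^(n+1) - 1 is that total for a two-letter alphabet)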

So we can define

words n = takeWhile p . allWords
  where p x = length x <= n  -- keep words of length at most n

or

words n = take n' . allWords
  where n' = 2^(n+1) - 1  -- total count; valid only for a two-letter alphabet
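The count 2^(n+1) - 1 only holds for a two-letter alphabet. For an alphabet of k letters there are k^i words of length i, so the number of words of length at most n is the sum of k^i for i from 0 to n. The sketch below uses that count next to the takeWhile variant. All names here (allWordsOver, wordCount, wordsByCount, wordsByLength) are hypothetical, and the alphabet is assumed to be a plain String.

```haskell
-- Infinite list of all words over a String alphabet, shortest first.
allWordsOver :: String -> [String]
allWordsOver alph = concat $ iterate (\ws -> (:) <$> alph <*> ws) [""]

-- Number of words of length 0..n over an alphabet of k letters.
wordCount :: Int -> Int -> Int
wordCount k n = sum [ k ^ i | i <- [0 .. n] ]

-- Cut the infinite list by count.
wordsByCount :: Int -> String -> [String]
wordsByCount n alph = take (wordCount (length alph) n) (allWordsOver alph)

-- Cut the infinite list by word length.
wordsByLength :: Int -> String -> [String]
wordsByLength n alph = takeWhile (\w -> length w <= n) (allWordsOver alph)
```

Both versions return the same list for a non-empty alphabet. With an empty alphabet, wordsByCount still terminates and returns [""], whereas the takeWhile version would search forever for a next element.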
