
Show a list of words repeated in Haskell

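I need to write a function that finds the repeated words in a string and returns them, in order, as a list of strings, ignoring non-letters.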

For example, at the Hugs prompt:

repetitions :: String -> [String]

repetitions > "My bag is is action packed packed."
output> ["is","packed"]
repetitions > "My name  name name is Sean ."
output> ["name","name"]
repetitions > "Ade is into into technical drawing drawing ."
output> ["into","drawing"]

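To split the string into words, use the words function (in the Prelude). To eliminate non-word characters, filter with Data.Char.isAlphaNum. Zip the list with its own tail to get the adjacent pairs (x, y). Then fold over the list, building a new list that contains every x for which x == y.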

Something like:

import Data.Char (isAlphaNum)

repetitions s = map fst . filter (uncurry (==)) . zip l $ tail l
  where l = map (filter isAlphaNum) (words s)

I'm not sure it works as written, but it should give you a rough idea.
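The fold step described above can be sketched as follows. This is only an illustration: the name repetitionsFold is hypothetical, and dropping tokens that become empty after filtering (such as a lone ".") is an added assumption that the answer does not mention.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical name: a fold-based variant of the zip-with-tail idea.
repetitionsFold :: String -> [String]
repetitionsFold s = foldr keepEqual [] (zip ws (drop 1 ws))
  where
    -- Strip non-alphanumeric characters from each whitespace-separated token,
    -- then drop tokens that became empty (assumption, e.g. a lone ".").
    ws = filter (not . null) (map (filter isAlphaNum) (words s))
    -- Keep x whenever it equals its right-hand neighbour y.
    keepEqual (x, y) acc
      | x == y    = x : acc
      | otherwise = acc
```

Using drop 1 instead of tail keeps the definition total on an empty word list.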

I'm new to this language, so my solution may look ugly to Haskell veterans, but anyway:

let repetitions x = concat (map tail (filter (\x -> (length x) > 1) (List.group (words (filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') ||  c==' ') x)))))

This part removes every character of the string s that is neither a letter nor a space:

filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') ||  c==' ') s

This part splits the string s into words and groups equal adjacent words together, returning a list of lists:

List.group (words s)

This part removes all lists with fewer than two elements:

filter (\x -> (length x) > 1) s

After that, we drop one element from each list and concatenate the results into a single list:

concat (map tail s)
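To make the four steps easier to inspect one at a time, here is a sketch that gives each step its own name. The function names are hypothetical, and it imports group from Data.List rather than the older List module used above.

```haskell
import Data.List (group)

-- Step 1: keep only ASCII letters and spaces.
lettersAndSpaces :: String -> String
lettersAndSpaces =
  filter (\c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == ' ')

-- Step 2: split into words and group equal neighbours.
groupedWords :: String -> [[String]]
groupedWords = group . words . lettersAndSpaces

-- Step 3: keep only the groups with at least two elements.
repeatedGroups :: String -> [[String]]
repeatedGroups = filter (\g -> length g > 1) . groupedWords

-- Step 4: drop one element from each group and concatenate what is left.
repetitionsSteps :: String -> [String]
repetitionsSteps = concat . map tail . repeatedGroups
```

For the input "My bag is is action packed packed.", groupedWords yields [["My"],["bag"],["is","is"],["action"],["packed","packed"]], and repetitionsSteps yields ["is","packed"].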

This may be inelegant, but it is conceptually very simple. I'm assuming the goal is to find consecutive repeated words, as in the examples.

-- A wrapper that lets you give the input as a String
repetitions :: String -> [String]
repetitions s = repetitionsLogic (words s)

-- Does the real work
repetitionsLogic :: [String] -> [String]
repetitionsLogic [] = []
repetitionsLogic [_] = []
repetitionsLogic (a:as)
    | a == head as = a : repetitionsLogic as
    | otherwise = repetitionsLogic as
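Note that the recursive version above compares raw tokens, so for "My bag is is action packed packed." the tokens "packed" and "packed." differ and only "is" is reported. A possible variant that strips non-letters first is sketched below; the names cleanWords, adjacentRepeats and repetitionsClean are hypothetical, and discarding tokens that end up empty is an assumption.

```haskell
import Data.Char (isAlpha)

-- Hypothetical helper: remove non-letters from each token and
-- discard tokens that become empty (assumption, e.g. a lone ".").
cleanWords :: String -> [String]
cleanWords = filter (not . null) . map (filter isAlpha) . words

-- Explicit recursion over adjacent pairs, matching on the first two
-- elements instead of calling head.
adjacentRepeats :: [String] -> [String]
adjacentRepeats (a : rest@(b : _))
  | a == b    = a : adjacentRepeats rest
  | otherwise = adjacentRepeats rest
adjacentRepeats _ = []

repetitionsClean :: String -> [String]
repetitionsClean = adjacentRepeats . cleanWords
```

Matching on the pattern (a : rest@(b : _)) covers the empty and single-element cases with one fallback clause.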

Building on Alexander Prokofyev's answer:

repetitions x = concat (map tail (filter (\x -> (length x) > 1) (List.group (words (filter (\c -> (c >= 'a' && c <= 'z') || (c>='A' && c <= 'Z') || c==' ') x)))))

Remove the unnecessary parentheses:

repetitions x = concat (map tail (filter (\x -> length x > 1) (List.group (words (filter (\c -> c >= 'a' && c <= 'z' || c>='A' && c <= 'Z' || c==' ') x)))))

Use $ to remove more parentheses (each $ can replace an opening parenthesis whose closing parenthesis is at the end of the expression):

repetitions x = concat $ map tail $ filter (\x -> length x > 1) $ List.group $ words $ filter (\c -> c >= 'a' && c <= 'z' || c>='A' && c <= 'Z' || c==' ') x

Replace the character ranges with functions from Data.Char, and merge concat and map into concatMap:

repetitions x = concatMap tail $ filter (\x -> length x > 1) $ List.group $ words $ filter (\c -> isAlpha c || isSeparator c) x

Use a section and currying in point-free style to simplify (\x -> length x > 1) to ((>1) . length). This composes length with (>1), a partially applied operator (a section), in a right-to-left pipeline.

repetitions x = concatMap tail $ filter ((>1) . length) $ List.group $ words $ filter (\c -> isAlpha c || isSeparator c) x
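As a small illustration of that rewrite, the two predicates below should behave identically. The names longerThanOneLambda and longerThanOneSection are hypothetical and exist only for this sketch.

```haskell
-- The explicit lambda form of the predicate.
longerThanOneLambda :: [a] -> Bool
longerThanOneLambda = \xs -> length xs > 1

-- The point-free form: first apply length, then the section (> 1).
longerThanOneSection :: [a] -> Bool
longerThanOneSection = (> 1) . length
```

The section (> 1) is the function that takes n and returns n > 1, so composing it after length gives the same test without naming the argument.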

Eliminate the explicit "x" variable to make the whole expression point-free:

repetitions = concatMap tail . filter ((>1) . length) . List.group . words . filter (\c -> isAlpha c || isSeparator c)

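Now the whole function, read from right to left, is a pipeline that keeps only letter or separator characters, splits the result into words, breaks the words into groups of equal neighbours, keeps the groups with more than one element, and then drops the first element of each remaining group and concatenates what is left.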
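For reference, here is the final pipeline as a complete module that can be loaded into GHCi. The imports and the type signature are additions; the qualified import of Data.List as List is one assumed way to make the List.group name resolve in current GHC.

```haskell
import Data.Char (isAlpha, isSeparator)
import qualified Data.List as List

-- Read right to left: filter characters, split into words, group equal
-- neighbours, keep groups longer than one, drop the head of each group.
repetitions :: String -> [String]
repetitions =
  concatMap tail
    . filter ((> 1) . length)
    . List.group
    . words
    . filter (\c -> isAlpha c || isSeparator c)
```

With this definition, repetitions "My name  name name is Sean ." evaluates to ["name","name"], matching the example in the question. One caveat: isSeparator is True for space characters but not for tabs or newlines, so those are removed by the filter rather than treated as word breaks.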

