Unique elements in a Haskell list
Okay, this is probably going to be in the Prelude, but: is there a standard library function for finding the unique elements in a list? My (re)implementation, for clarification, is:
has :: (Eq a) => [a] -> a -> Bool
has [] _ = False
has (x:xs) a
  | x == a    = True
  | otherwise = has xs a

unique :: (Eq a) => [a] -> [a]
unique [] = []
unique (x:xs)
  | has xs x  = unique xs
  | otherwise = x : unique xs
The nub function from Data.List (no, it's actually not in the Prelude) definitely does something like what you want, but it is not quite the same as your unique function. They both preserve the original order of the elements, but unique retains the last occurrence of each element, while nub retains the first occurrence.
You can do this to make nub act exactly like unique, if that's important (though I have a feeling it's not):
unique = reverse . nub . reverse
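To make the first-versus-last difference concrete, here is a small sketch. The names keepLast and viaNub are made up for this comparison; keepLast just restates the question's unique using elem.

```haskell
import Data.List (nub)

-- keepLast mirrors the question's unique: an element is dropped
-- whenever it occurs again later in the list, so the last copy survives.
-- (keepLast and viaNub are hypothetical names for illustration.)
keepLast :: Eq a => [a] -> [a]
keepLast [] = []
keepLast (x:xs)
  | x `elem` xs = keepLast xs
  | otherwise   = x : keepLast xs

-- The same behaviour expressed through nub, which keeps the first copy.
viaNub :: Eq a => [a] -> [a]
viaNub = reverse . nub . reverse
```

For the input [1,2,1,3,2], nub gives [1,2,3], while keepLast and viaNub both give [1,3,2].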
Also, nub is only good for small lists. Its complexity is quadratic, so it starts to get slow if your list can contain hundreds of elements.
If you limit your types to types having an Ord instance, you can make it scale better. This variation on nub still preserves the order of the list elements, but its complexity is O(n * log n):
import qualified Data.Set as Set

nubOrd :: Ord a => [a] -> [a]
nubOrd xs = go Set.empty xs
  where
    go s (x:xs)
      | x `Set.member` s = go s xs
      | otherwise        = x : go (Set.insert x s) xs
    go _ _ = []
In fact, it has been proposed to add nubOrd to Data.Set.
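As far as I know, newer releases of the containers package (0.6.0.1 and later) ship a function with this name in Data.Containers.ListUtils rather than in Data.Set. A minimal sketch, assuming such a containers version is available; dedup is just an illustrative wrapper name:

```haskell
import Data.Containers.ListUtils (nubOrd)

-- Order-preserving de-duplication that keeps the first occurrence
-- of each element, using the library's nubOrd.
-- (dedup is a hypothetical name; it only wraps nubOrd.)
dedup :: Ord a => [a] -> [a]
dedup = nubOrd
```

With this, dedup [3,1,3,2,1] gives [3,1,2], the same result as nub on that input.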
import Data.Set (toList, fromList)
uniquify lst = toList $ fromList lst
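One thing worth knowing about the Set round trip above: it needs an Ord instance, and the result comes back in ascending order rather than in the original order. A small sketch to show that (uniquifySorted is a name I made up; it does the same thing as uniquify):

```haskell
import qualified Data.Set as Set

-- De-duplicate through a Set. Set.toList returns elements in ascending
-- order, so the original ordering of the input list is lost.
-- (uniquifySorted is a hypothetical name for illustration.)
uniquifySorted :: Ord a => [a] -> [a]
uniquifySorted = Set.toList . Set.fromList
```

For example, uniquifySorted [3,1,3,2,1] gives [1,2,3], whereas nub on the same list gives [3,1,2].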
I think that unique should return a list of elements that only appear once in the original list; that is, any elements of the original list that appear more than once should not be included in the result.
May I suggest an alternative definition, unique_alt:
unique_alt :: [Int] -> [Int]
unique_alt [] = []
unique_alt (x:xs)
  | elem x (unique_alt xs) = [ y | y <- unique_alt xs, y /= x ]
  | otherwise              = x : unique_alt xs
Here are some examples that highlight the differences between unique_alt and unique:
unique [1,2,1] = [2,1]
unique_alt [1,2,1] = [2]
unique [1,2,1,2] = [1,2]
unique_alt [1,2,1,2] = []
unique [4,2,1,3,2,3] = [4,1,2,3]
unique_alt [4,2,1,3,2,3] = [4,1]
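One caveat with the recursive definition: an element that occurs three times (or any odd number of times above one) slips back in, since unique_alt [1,1,1] evaluates to [1]. A possible alternative is to count occurrences first. This is only a sketch, assuming an Ord instance is acceptable; appearOnce is a made-up name:

```haskell
import qualified Data.Map.Strict as Map

-- Keep only the elements that occur exactly once, in their original order.
-- (appearOnce is a hypothetical name, not a library function.)
appearOnce :: Ord a => [a] -> [a]
appearOnce xs = filter isSingle xs
  where
    -- Map from each element to the number of times it occurs in xs.
    counts = Map.fromListWith (+) [(x, 1 :: Int) | x <- xs]
    isSingle x = Map.lookup x counts == Just 1
```

It agrees with unique_alt on the three examples above and gives [] for [1,1,1].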
I think this would do it.
unique [] = []
unique (x:xs) = x:unique (filter ((/=) x) xs)
Another way to remove duplicates:
unique :: [Int] -> [Int]
unique xs = [x | (x,y) <- zip xs [0..], x `notElem` (take y xs)]
Algorithm in Haskell to create a unique list:
data Foo = Foo { id_   :: Int
               , name_ :: String
               } deriving (Show)

alldata :: [Foo]
alldata = [ Foo 1 "Name"
          , Foo 2 "Name"
          , Foo 3 "Karl"
          , Foo 4 "Karl"
          , Foo 5 "Karl"
          , Foo 7 "Tim"
          , Foo 8 "Tim"
          , Foo 9 "Gaby"
          , Foo 9 "Name"
          ]

isolate :: [Foo] -> [Foo]
isolate [] = []
isolate (x:xs) = fst f : isolate (snd f)
  where
    f = foldl helper (x, []) xs
    helper (a, b) y = if name_ x == name_ y
                        then if id_ x >= id_ y
                               then (x, b)
                               else (y, b)
                        else (a, y:b)

main :: IO ()
main = mapM_ (putStrLn . show) (isolate alldata)
Output:
Foo {id_ = 9, name_ = "Name"}
Foo {id_ = 9, name_ = "Gaby"}
Foo {id_ = 5, name_ = "Karl"}
Foo {id_ = 8, name_ = "Tim"}
We can use that style of Haskell programming where all looping and recursion activities are pushed out of user code and into suitable library functions. Said library functions are often optimized in ways that are way beyond the skills of a Haskell beginner.
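A way to decompose the problem into two passes goes like this: first, produce a second list, parallel to the input list, in which duplicate elements are marked; second, remove the marked elements from that second list.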
For the first step, duplicate elements don't need a value at all, so we can use [Maybe a] as the type of the second list. So we need a function of type:
pass1 :: Eq a => [a] -> [Maybe a]
Function pass1 is an example of stateful list traversal where the state is the list (or set) of distinct elements seen so far. For this sort of problem, the library provides the mapAccumL :: (s -> a -> (s, b)) -> s -> [a] -> (s, [b]) function.
Here the mapAccumL function requires, besides the initial state and the input list, a step function argument, of type s -> a -> (s, Maybe a).
If the current element x is not a duplicate, the output of the step function is Just x and x gets added to the current state. If x is a duplicate, the output of the step function is Nothing, and the state is passed unchanged.
Testing under the ghci interpreter:
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
λ>
λ> stepFn s x = if (elem x s) then (s, Nothing) else (x:s, Just x)
λ>
λ> import Data.List(mapAccumL)
λ>
λ> pass1 xs = mapAccumL stepFn [] xs
λ>
λ> xs2 = snd $ pass1 "abacrba"
λ> xs2
[Just 'a', Just 'b', Nothing, Just 'c', Just 'r', Nothing, Nothing]
λ>
Writing a pass2 function is even easier. To filter out Nothing non-values, we could use:
import Data.Maybe( fromJust, isJust)
pass2 = (map fromJust) . (filter isJust)
But why bother at all? This is precisely what the catMaybes library function does.
λ>
λ> import Data.Maybe(catMaybes)
λ>
λ> catMaybes xs2
"abcr"
λ>
Overall, the source code can be written as:
import Data.Maybe (catMaybes)
import Data.List (mapAccumL)

uniques :: (Eq a) => [a] -> [a]
uniques = let stepFn s x = if elem x s then (s, Nothing) else (x:s, Just x)
          in  catMaybes . snd . mapAccumL stepFn []
This code is reasonably compatible with infinite lists, something occasionally referred to as being “laziness-friendly”:
λ>
λ> take 5 $ uniques $ "abacrba" ++ (cycle "abcrf")
"abcrf"
λ>
Efficiency note: If we anticipate that it is possible to find many distinct elements in the input list and we can have an Ord a instance, the state can be implemented as a Set object rather than a plain list, without having to alter the overall structure of the solution.
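The Set-based variant mentioned in the efficiency note could look roughly like this. It is a sketch, not part of the code tested above, and uniquesOrd is a name chosen here for illustration:

```haskell
import Data.List (mapAccumL)
import Data.Maybe (catMaybes)
import qualified Data.Set as Set

-- Same two-pass structure as uniques, but the state is a Set of the
-- elements seen so far, so each membership test is logarithmic
-- instead of linear in the number of distinct elements.
-- (uniquesOrd is a hypothetical name chosen for this sketch.)
uniquesOrd :: Ord a => [a] -> [a]
uniquesOrd = catMaybes . snd . mapAccumL stepFn Set.empty
  where
    stepFn seen x
      | x `Set.member` seen = (seen, Nothing)
      | otherwise           = (Set.insert x seen, Just x)
```

Only the step function and the initial state change; the catMaybes . snd . mapAccumL pipeline stays the same. On "abacrba" it gives "abcr", like uniques.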
Here's a solution that uses only Prelude functions:
uniqueList theList =
  if not (null theList)
    then head theList : filter (/= head theList) (uniqueList (tail theList))
    else []