简体   繁体   中英

unique elements in a haskell list

okay, this is probably going to be in the prelude, but: is there a standard library function for finding the unique elements in a list? my (re)implementation, for clarification, is:

has :: (Eq a) => [a] -> a -> Bool
has [] _ = False
has (x:xs) a
  | x == a    = True
  | otherwise = has xs a

unique :: (Eq a) => [a] -> [a]
unique [] = []
unique (x:xs)
  | has xs x  = unique xs
  | otherwise = x : unique xs

I searched for (Eq a) => [a] -> [a] on Hoogle .

First result was nub (remove duplicate elements from a list).

Hoogle is awesome.

The nub function from Data.List (no, it's actually not in the Prelude) definitely does something like what you want, but it is not quite the same as your unique function. They both preserve the original order of the elements, but unique retains the last occurrence of each element, while nub retains the first occurrence.

You can do this to make nub act exactly like unique , if that's important (though I have a feeling it's not):

unique = reverse . nub . reverse

Also, nub is only good for small lists. Its complexity is quadratic, so it starts to get slow if your list can contain hundreds of elements.

If you limit your types to types having an Ord instance, you can make it scale better. This variation on nub still preserves the order of the list elements, but its complexity is O(n * log n) :

import qualified Data.Set as Set

nubOrd :: Ord a => [a] -> [a] 
nubOrd xs = go Set.empty xs where
  go s (x:xs)
   | x `Set.member` s = go s xs
   | otherwise        = x : go (Set.insert x s) xs
  go _ _              = []

In fact, it has been proposed to add nubOrd to Data.Set .

import Data.Set (toList, fromList)
uniquify lst = toList $ fromList lst

I think that unique should return a list of elements that only appear once in the original list; that is, any elements of the orginal list that appear more than once should not be included in the result.

May I suggest an alternative definition, unique_alt:

    unique_alt :: [Int] -> [Int]
    unique_alt [] = []
    unique_alt (x:xs)
        | elem x ( unique_alt xs ) = [ y | y <- ( unique_alt xs ), y /= x ]
        | otherwise                = x : ( unique_alt xs )

Here are some examples that highlight the differences between unique_alt and unqiue:

    unique     [1,2,1]          = [2,1]
    unique_alt [1,2,1]          = [2]

    unique     [1,2,1,2]        = [1,2]
    unique_alt [1,2,1,2]        = []

    unique     [4,2,1,3,2,3]    = [4,1,2,3]
    unique_alt [4,2,1,3,2,3]    = [4,1]

I think this would do it.

unique [] = []
unique (x:xs) = x:unique (filter ((/=) x) xs)

Another way to remove duplicates:

unique :: [Int] -> [Int]
unique xs = [x | (x,y) <- zip xs [0..], x `notElem` (take y xs)]

Algorithm in Haskell to create a unique list:

data Foo = Foo { id_ :: Int
               , name_ :: String
               } deriving (Show)

alldata = [ Foo 1 "Name"
          , Foo 2 "Name"
          , Foo 3 "Karl"
          , Foo 4 "Karl"
          , Foo 5 "Karl"
          , Foo 7 "Tim"
          , Foo 8 "Tim"
          , Foo 9 "Gaby"
          , Foo 9 "Name"
          ]

isolate :: [Foo] -> [Foo]
isolate [] = []
isolate (x:xs) = (fst f) : isolate (snd f)
  where
    f = foldl helper (x,[]) xs
    helper (a,b) y = if name_ x == name_ y
                     then if id_ x >= id_ y
                          then (x,b)
                          else (y,b)
                     else (a,y:b)

main :: IO ()
main = mapM_ (putStrLn . show) (isolate alldata)

Output:

Foo {id_ = 9, name_ = "Name"}
Foo {id_ = 9, name_ = "Gaby"}
Foo {id_ = 5, name_ = "Karl"}
Foo {id_ = 8, name_ = "Tim"}

A library-based solution:

We can use that style of Haskell programming where all looping and recursion activities are pushed out of user code and into suitable library functions. Said library functions are often optimized in ways that are way beyond the skills of a Haskell beginner.

A way to decompose the problem into two passes goes like this:

  1. produce a second list that is parallel to the input list, but with duplicate elements suitably marked
  2. eliminate elements marked as duplicates from that second list

For the first step, duplicate elements don't need a value at all, so we can use [Maybe a] as the type of the second list. So we need a function of type:

pass1 :: Eq a => [a] -> [Maybe a]

Function pass1 is an example of stateful list traversal where the state is the list (or set) of distinct elements seen so far. For this sort of problem, the library provides the mapAccumL:: (s -> a -> (s, b)) -> s -> [a] -> (s, [b]) function.

Here the mapAccumL function requires, besides the initial state and the input list, a step function argument, of type s -> a -> (s, Maybe a) .

If the current element x is not a duplicate, the output of the step function is Just x and x gets added to the current state. If x is a duplicate, the output of the step function is Nothing , and the state is passed unchanged.

Testing under the ghci interpreter:

$ ghci
 GHCi, version 8.8.4: https://www.haskell.org/ghc/  :? for help
 λ> 
 λ> stepFn s x = if (elem x s) then (s, Nothing) else (x:s, Just x)
 λ> 
 λ> import Data.List(mapAccumL)
 λ> 
 λ> pass1 xs = mapAccumL stepFn [] xs
 λ> 
 λ> xs2 = snd $ pass1 "abacrba"
 λ> xs2
 [Just 'a', Just 'b', Nothing, Just 'c', Just 'r', Nothing, Nothing]
 λ> 

Writing a pass2 function is even easier. To filter out Nothing non-values, we could use:

import Data.Maybe( fromJust, isJust)
pass2 = (map fromJust) . (filter isJust)

but why bother at all? - as this is precisely what the catMaybes library function does.

 λ> 
 λ> import Data.Maybe(catMaybes)
 λ> 
 λ> catMaybes xs2
 "abcr"
 λ> 

Putting it all together:

Overall, the source code can be written as:

import Data.Maybe(catMaybes)
import Data.List(mapAccumL)

uniques :: (Eq a) => [a] -> [a]
uniques = let  stepFn s x = if (elem x s) then (s, Nothing) else (x:s, Just x)
          in   catMaybes . snd . mapAccumL stepFn []

This code is reasonably compatible with infinite lists, something occasionally referred to as being “laziness-friendly”:

 λ> 
 λ> take 5 $ uniques $ "abacrba" ++ (cycle "abcrf")
 "abcrf"
 λ> 

Efficiency note: If we anticipate that it is possible to find many distinct elements in the input list and we can have an Ord a instance, the state can be implemented as a Set object rather than a plain list, this without having to alter the overall structure of the solution.

Here's a solution that uses only Prelude functions:

uniqueList theList =
if not (null theList)
    then head theList : filter (/= head theList) (uniqueList (tail theList))
    else []

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM