简体   繁体   中英

How do you filter a list of strings by a string in haskell?

I have a string containing letters I want to be sure are in the words in the list. Running it however results in it still leaving behind words that contain the undesired letters.

Here's my function:

import Data.List    

filterWords :: String -> [String]
filterWords str =
  let strs      = words str
      letters   = concat . words . nub $ "poultry outwits ants"
      predicate = dropWhile (`elem` letters) ['a' .. 'z']
  in  dropWhile (any (`elem` predicate)) strs

What do I need to change to make this work?

To make it clear, I want to filter away any words that contain letters not in "poultry outwits ants", meaning a word like "years" would be dropped because despite containing 'y' , 'a' , 'r' , and 's' which all satisfy the predicate, it also contains 'e' which doesn't.

A good way to filter a list of things (eg words) is to use the filter function. What you need to provide is a predicate which tells whether a string should be included or not. You commented that you want to include those strings which consists of letters in "poultry outwits ants" , so that would be

filterWords :: String -> [String]
filterWords str = filter acceptableWord (words str)
  where
    acceptableWord = all (`elem` "poultry outwits ants")

Now, in another comment you wrote that

Some of the words I get have more copies of the same letter than there are in the original.

So I suspect what you really want is to figure out which words can be formed out of the letters in "poultry outwits ants" .

To do this, you could count how often each character appears in the given word (and in the mgic string poultry outwits ants ) and then verify that not only each letter in the word appears in the magic string but also that the letter doesn't appear more often than in the magic string.

I'd start by defining a function which calculates 'character frequency table', ie it counts how often each character occurs in the given string:

freq :: String -> [(Char, Int)]
freq = map (\s -> (head s, length s)) . group . sort

Furthermore, I'd define a function which tells whether one frequency table x is a "subset" of another table y , ie it verifies that each character in x also appears in y , but it doesn't occur more often:

subset :: [(Char, Int)] -> [(Char, Int)] -> Bool
subset x y = all f x
  where
    f (ch, occ) = case lookup ch y of
                      Just occ' -> occ <= occ'
                      Nothing   -> False

You can then use this to define acceptableWord such that it only accepts words whose frequency table is a subset of the frequency table of the magic string, so we get:

filterWords :: String -> [String]
filterWords str = filter acceptableWord (words str)
  where
    acceptableWord w = subset (freq w) (freq "poultry outwits ants")

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM