Haskell Bytestrings: How to pattern match?

I'm a Haskell newbie and am having a bit of trouble figuring out how to pattern match on a ByteString. The [Char] version of my function looks like:

dropAB :: String -> String
dropAB []       = []
dropAB (x:[])   = x:[]
dropAB (x:y:xs) = if x=='a' && y=='b'
                  then dropAB xs
                  else x:(dropAB $ y:xs) 

As expected, this filters out all occurrences of "ab" from a string. However, I have problems trying to apply this to a ByteString.

The naive version

dropR :: BS.ByteString -> BS.ByteString
dropR []         = []
dropR (x:[])     = [x]
<...>

yields

Couldn't match expected type `BS.ByteString'
       against inferred type `[a]'
In the pattern: []
In the definition of `dropR': dropR [] = []

[] is clearly the culprit, as it is a pattern for a regular String, not a ByteString. Subbing in BS.empty seems like the right thing but gives "Qualified name in the binding position: BS.empty", leaving us to try

dropR :: BS.ByteString -> BS.ByteString
dropR empty              = empty        
dropR (x cons empty)     = x cons empty
<...>

This gives "parse error in pattern" for (x cons empty). I don't really know what else I can do here.

As a side note, what I'm trying to do with this function is to filter out a specific UTF-16 character from some text. If there's a clean way to accomplish that, I'd love to hear it, but this pattern matching error seems like something that a newbie Haskeller should really understand.

You can use view patterns for such things:

{-# LANGUAGE ViewPatterns #-}    
import Data.ByteString (ByteString, cons, uncons, singleton, empty)
import Data.ByteString.Internal (c2w) 

dropR :: ByteString -> ByteString
dropR (uncons -> Nothing) = empty
dropR (uncons -> Just (x,uncons -> Nothing)) = singleton x
dropR (uncons -> Just (x,uncons -> Just(y,xs))) =
    if x == c2w 'a' && y == c2w 'b'
    then dropR xs
    else cons x (dropR $ cons y xs)

GHC 7.8 and later have a feature called pattern synonyms, which can be added to gawi's example:

{-# LANGUAGE ViewPatterns, PatternSynonyms #-}

import Data.ByteString (ByteString, cons, uncons, singleton, empty)
import Data.ByteString.Internal (c2w)

infixr 5 :<

pattern b :< bs <- (uncons -> Just (b, bs))
pattern Empty   <- (uncons -> Nothing)

dropR :: ByteString -> ByteString
dropR Empty          = empty
dropR (x :< Empty)   = singleton x
dropR (x :< y :< xs)
  | x == c2w 'a' && y == c2w 'b' = dropR xs
  | otherwise                    = cons x (dropR (cons y xs))
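The synonyms above only work as patterns, which is why the right-hand sides still call cons and empty. As a hedged sketch (not part of the answer above), the same synonyms can be made explicitly bidirectional so they also work as expressions. This assumes a reasonably recent GHC (8.2 or later for the COMPLETE pragma); the literals 97 and 98 are the byte values of 'a' and 'b'.

```haskell
{-# LANGUAGE PatternSynonyms, ViewPatterns #-}

import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Word (Word8)

infixr 5 :<

-- Matching goes through uncons; building goes through cons.
pattern (:<) :: Word8 -> ByteString -> ByteString
pattern b :< bs <- (BS.uncons -> Just (b, bs))
  where b :< bs = BS.cons b bs

-- Matching checks for emptiness; building yields the empty ByteString.
pattern Empty :: ByteString
pattern Empty <- (BS.null -> True)
  where Empty = BS.empty

-- Tell the exhaustiveness checker that these two patterns cover every case.
{-# COMPLETE (:<), Empty #-}

dropR :: ByteString -> ByteString
dropR (97 :< 98 :< xs) = dropR xs        -- drop the bytes for "ab"
dropR (x :< xs)        = x :< dropR xs   -- keep any other byte
dropR Empty            = Empty
```

With this version the function reads almost exactly like the original [Char] definition.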

Going further, you can abstract this to work on any instance of a type class (this will look nicer when/if we get associated pattern synonyms). The pattern definitions stay the same:

{-# LANGUAGE ViewPatterns, PatternSynonyms, TypeFamilies #-}

import qualified Data.ByteString as BS
import Data.ByteString (ByteString, singleton)
import Data.ByteString.Internal (c2w)
import Data.Word

class ListLike l where
  type Elem l

  empty  :: l
  uncons :: l -> Maybe (Elem l, l)
  cons   :: Elem l -> l -> l

instance ListLike ByteString where
  type Elem ByteString = Word8

  empty  = BS.empty
  uncons = BS.uncons
  cons   = BS.cons

instance ListLike [a] where
  type Elem [a] = a

  empty         = []
  uncons []     = Nothing
  uncons (x:xs) = Just (x, xs)
  cons          = (:)

in which case dropR can work on both [Word8] and ByteString:

-- dropR :: [Word8]    -> [Word8]
-- dropR :: ByteString -> ByteString
dropR :: (ListLike l, Elem l ~ Word8) => l -> l
dropR Empty          = empty
dropR (x :< Empty)   = cons x empty
dropR (x :< y :< xs)
  | x == c2w 'a' && y == c2w 'b' = dropR xs
  | otherwise                    = cons x (dropR (cons y xs))

And for the hell of it:

import Data.ByteString.Internal (w2c)

infixr 5 :•    
pattern b :• bs <- (w2c -> b) :< bs

dropR :: (ListLike l, Elem l ~ Word8) => l -> l
dropR Empty              = empty
dropR (x   :< Empty)     = cons x empty
dropR ('a' :• 'b' :• xs) = dropR xs
dropR (x   :< y   :< xs) = cons x (dropR (cons y xs))

You can see more in my post on pattern synonyms.

Patterns use data constructors. See http://book.realworldhaskell.org/read/defining-types-streamlining-functions.html

Your empty is just a binding for the first parameter; it could have been x and it would not change anything.

You can't reference a normal function in a pattern, so (x cons empty) is not legal. Note: I guess (cons x empty) is really what you meant, but this is also illegal.
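To make the point about variable bindings concrete, here is a small sketch. The function names alwaysMatches and isEmptyBS are made up for illustration and do not come from the question.

```haskell
import qualified Data.ByteString as BS

-- `empty` here is a fresh variable, not BS.empty,
-- so this single clause matches every argument.
alwaysMatches :: BS.ByteString -> Bool
alwaysMatches empty = True

-- Testing for emptiness needs a function call, for example in a guard.
isEmptyBS :: BS.ByteString -> Bool
isEmptyBS bs
  | BS.null bs = True
  | otherwise  = False
```

GHC with warnings enabled will report the unused variable empty in alwaysMatches, which is a useful hint that the clause does not do what it appears to do.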

ByteString is quite different from String. String is an alias of [Char], so it's a real list and the : operator can be used in patterns.

ByteString is Data.ByteString.Internal.PS !(GHC.ForeignPtr.ForeignPtr GHC.Word.Word8) !Int !Int (i.e. a pointer to a native char* + offset + length). Since the data constructor of ByteString is hidden, you must use functions to access the data, not patterns.
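As one illustration of the "functions, not patterns" approach, the sketch below removes every "ab" using BS.breakSubstring from the bytestring package. The name dropABFn is hypothetical, and this is only one possible way to write it.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC

-- Remove every occurrence of the two-byte sequence "ab".
dropABFn :: ByteString -> ByteString
dropABFn = BS.concat . go
  where
    pat = BC.pack "ab"
    go s =
      let (before, rest) = BS.breakSubstring pat s
      in if BS.null rest
           then [before]                                   -- no further match
           else before : go (BS.drop (BS.length pat) rest) -- skip the match
```

Because "ab" cannot overlap with itself, this should give the same result as the left-to-right String version in the question.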


Here is a solution (surely not the best one) to your UTF-16 filter problem using the text package:

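module Test where

import qualified Data.ByteString as BS
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import Data.Text.Encoding (decodeUtf16LE)

-- Remove every occurrence of the character c from the text.
removeAll :: Char -> Text -> Text
removeAll c = T.filter (/= c)

-- Read the raw bytes, decode them as UTF-16 (little endian), filter, print.
main :: IO ()
main = do
  bytes <- BS.readFile "test.txt"
  TIO.putStr $ removeAll 'c' (decodeUtf16LE bytes)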

For this, I would pattern match on the result of uncons :: ByteString -> Maybe (Word8, ByteString).
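A minimal sketch of that idea, using plain case expressions and no language extensions. The helper names byteA and byteB are introduced here for illustration only.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Byte values of 'a' and 'b'.
byteA, byteB :: Word8
byteA = 97
byteB = 98

dropR :: ByteString -> ByteString
dropR bs = case BS.uncons bs of
  Nothing        -> BS.empty
  Just (x, rest) -> case BS.uncons rest of
    -- The first two bytes are "ab": skip both.
    Just (y, rest') | x == byteA && y == byteB -> dropR rest'
    -- Otherwise keep x and continue with everything after it.
    _ -> BS.cons x (dropR rest)
```

Note that BS.cons copies the whole string each time, so this is fine for showing the matching technique but not efficient on large inputs.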

Pattern matching in Haskell only works on constructors declared with 'data' or 'newtype'. The ByteString type doesn't export its constructors, so you cannot pattern match on it directly.

Just to address the error message you received and what it means:

Couldn't match expected type `BS.ByteString'
       against inferred type `[a]'
In the pattern: []
In the definition of `dropR': dropR [] = []

So the compiler expected your function to be of type BS.ByteString -> BS.ByteString because you gave it that type in your signature. Yet it inferred (by looking at the body of your function) that the function is actually of type [a] -> [a]. There is a mismatch, so the compiler complains.

The trouble is that you are thinking of (:) and [] as syntactic sugar, when they are actually just the constructors for the list type (which is VERY different from ByteString).
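To see that (:) and [] are ordinary constructors, it may help to compare the built-in list with a hand-rolled one of the same shape. The type List and the functions below are hypothetical names used only for this sketch.

```haskell
-- A hand-rolled list: Nil plays the role of [], Cons plays the role of (:).
data List a = Nil | Cons a (List a)

lengthL :: List a -> Int
lengthL Nil         = 0
lengthL (Cons _ xs) = 1 + lengthL xs

-- The built-in list is matched in exactly the same way,
-- because [] and (:) are its data constructors.
lengthB :: [a] -> Int
lengthB []     = 0
lengthB (_:xs) = 1 + lengthB xs
```

A pattern such as [] therefore fixes the argument type to a list, which is why it cannot be used against a ByteString.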
