Switching to ByteStrings

EDIT: I followed Yuras's and Dave4420's advice (thanks). I still have some errors, so I have updated the question. In the end I will use meiersi's version (thanks), but I still want to find my errors.

I have a simple script that goes like this:

import System.Environment

getRow :: Int -> String -> String
getRow n = (!!n) . lines

getField :: Int -> String -> String
getField n = (!!n) . words'

words' :: String -> [String]
words' str = case str of
                        [] -> []
                        _ -> (takeHead " ; " str) : (words' (takeTail " ; " str))

takeHead :: String -> String -> String
takeHead st1 st2 = case st2 of
                                [] -> []
                                _ -> if st1 == (nHead (length st1) st2) then [] else (head st2):(takeHead st1 (tail st2))

takeTail :: String -> String -> String
takeTail st1 st2 = case st2 of
                                [] -> []
                                _ -> if st1 == (nHead (length st1) st2) then nTail (length st1) st2 else takeTail st1 (tail st2)

nTail :: Int -> String -> String
nTail n str = let rec n str = if n == 0 then str else rec (n - 1) (tail str)
              in if (length str) < n then str else rec n str

nHead :: Int -> String -> String
nHead n str = let rec n str = if n == 0 then [] else (head str):(rec (n - 1) (tail str))
              in if (length str) < n then str else rec n str

getValue :: String -> String -> String -> String
getValue row field src = getField (read field) $ getRow (read row) src

main :: IO ()
main = do
    args <- getArgs
    case args of
        (path: opt1: opt2: _) -> do
            src <- readFile path
            putStrLn $ getValue opt1 opt2 src
        (path: _) -> do
            src <- readFile path
            putStrLn $ show $ length $ lines src

It compiles and works. Then I wanted to switch to ByteStrings. Here is my attempt:

import qualified Data.ByteString.Lazy as B
import qualified Data.ByteString.Lazy.Char8 as Bc (cons, empty,unpack)
import qualified Data.ByteString.Lazy.UTF8 as Bu (lines)
import qualified System.Posix.Env.ByteString as Bg (getArgs)

separator :: B.ByteString
separator = (Bc.cons ' ' (Bc.cons ';' (Bc.cons ' ' Bc.empty)))

getRow :: Int -> B.ByteString -> B.ByteString
getRow n = (`B.index` n) $ Bu.lines

getCol :: Int -> B.ByteString -> B.ByteString
getCol n = (`B.index` n) $ wordsWithSeparator

wordsWithSeparator :: B.ByteString -> [B.ByteString]
wordsWithSeparator str = if B.null str then [] else (takeHead separator str):(wordsWithSeparator (takeTail separator str))

takeHead :: B.ByteString -> B.ByteString -> B.ByteString
takeHead st1 st2 = if B.null st2 then B.empty else if st1 == (nHead (toInteger (B.length st1)) st2) then B.empty else B.cons (B.head st2) (takeHead st1 (B.tail st2))

takeTail :: B.ByteString -> B.ByteString -> B.ByteString
takeTail st1 st2 = if B.null st2 then B.empty else if st1 == (nHead (toInteger (B.length st1)) st2) then nTail (toInteger (B.length st1)) st2 else takeTail st1 (B.tail st2)

nTail :: Integer -> B.ByteString -> B.ByteString
nTail n str = let rec n str = if n == 0 then str else rec (n - 1) (B.tail str)
              in if (toInteger (B.length str)) < n then str else rec n str

nHead :: Integer -> B.ByteString -> B.ByteString
nHead n str = let rec n str = if n == 0 then B.empty else B.cons (B.head str)(rec (n - 1) (B.tail str))
              in if (toInteger (B.length str)) < n then str else rec n str

getValue :: B.ByteString -> B.ByteString -> B.ByteString -> B.ByteString
getValue row field = getCol (read (Bc.unpack field)) . getRow (read (Bc.unpack row))

main = do args <- Bg.getArgs
          case (map (B.fromChunks . return) args) of
                                                    (path:opt1:opt2:_) -> do src <- B.readFile (Bc.unpack path)
                                                                             B.putStrLn $ getValue opt1 opt2 src

                                                    (path:_)           -> do src <- B.readFile (Bc.unpack path)
                                                                             putStrLn $ show $ length $ Bu.lines src

It doesn't work, and I could not debug it. Here is what GHC tells me:

BETA_getlow2.hs:10:23:
    Couldn't match expected type `GHC.Int.Int64' with actual type `Int'
    In the second argument of `B.index', namely `n'
    In the expression: (`B.index` n)
    In the expression: (`B.index` n) $ Bu.lines

BETA_getlow2.hs:13:23:
    Couldn't match expected type `GHC.Int.Int64' with actual type `Int'
    In the second argument of `B.index', namely `n'
    In the expression: (`B.index` n)
    In the expression: (`B.index` n) $ wordsWithSeparator

Any tips would be appreciated.

getRow n = (!!n) . lines

Compare with

getRow n = B.index . Bu.lines

In the second version you don't use n at all, so it is the same as

getRow _ = B.index . Bu.lines

In the first example you use n as an argument to the (!!) operator. You need to do the same in the second version.

It looks like this is not the only issue in your code, but I hope it is a good place to start ;)
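To make the point concrete, here is a minimal sketch of getRow over lazy ByteStrings. It assumes the rows are split with lines from Data.ByteString.Lazy.Char8 rather than the Bu.lines from utf8-string used in the question (a substitution made here only to keep the example free of extra packages). The key observation is that B.index has type ByteString -> Int64 -> Word8, so it picks a byte out of a ByteString, whereas the result of lines is an ordinary list and is indexed with (!!).

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

-- Select the n-th line (0-based). `BL.lines` returns a plain Haskell
-- list, so the list operator (!!) is the right tool, and n stays an Int.
-- Like the String version, this is partial: it fails if n is out of range.
getRow :: Int -> BL.ByteString -> BL.ByteString
getRow n = (!! n) . BL.lines
```

With this definition, getRow 1 applied to the three-line input "a\nb\nc" selects the line "b". The same pattern applies to the column lookup, once the function that splits a row returns a list.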

I'm taking the liberty of reading the following two sub-questions into your original question.

  1. What Haskell code would one typically write for a script like the one you posted?
  2. What are the right data structures to efficiently perform the desired functionality?

The following code gives one answer to these two sub-questions. It uses the text library to represent sequences of Unicode characters. Moreover, it exploits the text library's high-level API to implement the desired functionality. This makes the code easier to grasp and thereby avoids potential mistakes in the implementation of low-level functions.

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text    as T
import qualified Data.Text.IO as T

import System.Environment (getArgs)

type Table a = [[a]]

-- | Split a text value into a text table.
toTable :: T.Text -> Table T.Text
toTable = map (T.splitOn " ; ") . T.lines

-- | Retrieve a cell from a table.
cell :: Int -> Int -> Table a -> a
cell row col = (!! col) . (!! row)

main :: IO ()
main = do
    (path:rest) <- getArgs
    src <- T.readFile path
    case rest of
        row : col : _ -> T.putStrLn $ cell (read row) (read col) $ toTable src
        _             -> putStrLn $ show $ length $ T.lines src
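As a quick illustration of how the two helpers behave, the following sketch repeats toTable and cell from the script above and applies them to an in-memory value instead of a file. The sample data and the name sample are made up for the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T

type Table a = [[a]]

-- Same definitions as in the script above.
toTable :: T.Text -> Table T.Text
toTable = map (T.splitOn " ; ") . T.lines

cell :: Int -> Int -> Table a -> a
cell row col = (!! col) . (!! row)

-- Hypothetical sample input: two rows of three fields each.
sample :: T.Text
sample = "a ; b ; c\nd ; e ; f"
```

Here toTable sample yields two rows, ["a","b","c"] and ["d","e","f"], and cell 1 2 (toTable sample) selects "f". Note that cell uses (!!), so an out-of-range row or column is a runtime error, just as in the original String script.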

I think Yuras has resolved the first two errors for you.


Re the 3rd error:

words' :: B.ByteString -> [B.ByteString]
words' str = if B.null str then B.empty else ...

The B.empty should be []. B.empty :: B.ByteString, but the result is supposed to have type [B.ByteString].


Re the 4th-7th errors:
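A minimal sketch of a splitting function whose base case returns the empty list may help here. The names splitFields and breakOnSep are hypothetical, and the code uses Data.ByteString.Lazy.Char8 throughout as an assumption; it is meant to mirror the shape of the question's words' rather than to be an efficient implementation.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

separator :: BL.ByteString
separator = BL.pack " ; "

-- Split a row on the separator. The result is a list, so the base
-- case is [] (an empty list), not BL.empty (an empty ByteString).
splitFields :: BL.ByteString -> [BL.ByteString]
splitFields str
  | BL.null str = []
  | otherwise   = field : splitFields rest
  where
    (field, rest) = breakOnSep str

-- Return the text before the first separator and the text after it.
-- If no separator is found, the whole input is the field and the
-- remainder is empty.
breakOnSep :: BL.ByteString -> (BL.ByteString, BL.ByteString)
breakOnSep s
  | BL.null s                   = (BL.empty, BL.empty)
  | separator `BL.isPrefixOf` s = (BL.empty, BL.drop (BL.length separator) s)
  | otherwise                   =
      let (h, t) = breakOnSep (BL.tail s)
      in  (BL.cons (BL.head s) h, t)
```

Applied to "a ; b ; c", splitFields produces the three fields "a", "b" and "c". Building the field one character at a time with cons is slow for long inputs; it is kept only because it follows the structure of the original takeHead and takeTail.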

  • length :: [a] -> Int
  • B.length :: B.ByteString -> Int64

In this case I would change the type signatures of nTail and nHead to use Int64 instead of Int. If that didn't work, I'd use Integer for all Integral values, using toInteger to do the conversion.


Re the 8th error:
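As a sketch of the first suggestion, the two helpers can take Int64 directly and lean on take and drop from the lazy ByteString API, which already use Int64. The bodies below are an assumption about what the helpers are meant to do, based on the String versions in the question; prefixLike is a hypothetical name added only to show the Int to Int64 conversion with fromIntegral.

```haskell
import Data.Int (Int64)
import qualified Data.ByteString.Lazy.Char8 as BL

-- First n bytes; if the input is shorter than n, the whole input.
nHead :: Int64 -> BL.ByteString -> BL.ByteString
nHead = BL.take

-- Everything after the first n bytes. As in the original String
-- version, an input shorter than n is returned unchanged.
nTail :: Int64 -> BL.ByteString -> BL.ByteString
nTail n str
  | BL.length str < n = str
  | otherwise         = BL.drop n str

-- A list length is an Int, so it must be converted before it can be
-- passed to a function expecting Int64.
prefixLike :: String -> BL.ByteString -> BL.ByteString
prefixLike pat = nHead (fromIntegral (length pat))
```

Because BL.length already returns Int64, calls such as nHead (BL.length st1) st2 need no conversion at all once the signatures are changed.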

The input to read must be a String. There's no getting round that. You'll have to convert the B.ByteString to a String and pass that to read.

(Incidentally, are you sure you want to switch to ByteString and not Text?)


Re the 9th (final) error:
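A minimal sketch of that conversion, assuming the indices arrive as lazy ByteStrings: unpack from Data.ByteString.Lazy.Char8 produces the String that read needs. The second function is an optional alternative using readInt from the same module, which avoids the intermediate String and reports failure instead of throwing. Both function names are hypothetical.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL

-- Convert to String first, then use read. Like any use of read,
-- this throws a runtime error on malformed input.
parseIndex :: BL.ByteString -> Int
parseIndex = read . BL.unpack

-- Alternative: parse directly from the ByteString and return Nothing
-- unless the whole input is a number.
parseIndexMaybe :: BL.ByteString -> Maybe Int
parseIndexMaybe bs = case BL.readInt bs of
  Just (n, rest) | BL.null rest -> Just n
  _                             -> Nothing
```

For example, parseIndexMaybe accepts "7" but rejects "7x", because readInt leaves the trailing "x" unconsumed.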

args :: [Data.ByteString.ByteString] (n.b. a list of strict bytestrings, not the lazy bytestrings you use elsewhere), but in the pattern match you expect args :: B.ByteString for some reason.

You should pattern match on a [ByteString] the same way you pattern match on a [String]: they are both lists.

Convert args to something of type [B.ByteString] with map (B.fromChunks . return) args.
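The conversion and the list pattern match can be sketched without touching the command line at all. The names toLazyArgs and describeArgs are hypothetical; describeArgs only reports which branch would be taken, standing in for the two branches of main in the question.

```haskell
import qualified Data.ByteString.Char8 as BS
import qualified Data.ByteString.Lazy.Char8 as BL

-- Turn each strict ByteString into a lazy one made of a single chunk.
-- `return` here is the list monad's return, i.e. \x -> [x].
toLazyArgs :: [BS.ByteString] -> [BL.ByteString]
toLazyArgs = map (BL.fromChunks . return)

-- Pattern match on the argument list exactly as with [String].
describeArgs :: [BL.ByteString] -> String
describeArgs args = case args of
  (path : _row : _col : _) -> "cell lookup in " ++ BL.unpack path
  (path : _)               -> "line count of " ++ BL.unpack path
  []                       -> "no arguments"
```

The last case is an addition: the patterns in the question do not cover an empty argument list, so running the script without arguments would fail with a pattern-match error. Recent versions of bytestring also export fromStrict from the lazy modules, which does the same job as fromChunks . return.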
