Splitting a ByteString on a ByteString (rather than on a Word8 or Char)

Question

I know that Haskell's Data.ByteString.Lazy already has a function for splitting a CSV on a single character:

split :: Word8 -> ByteString -> [ByteString]

But I want to split on a multi-character ByteString (like splitting on a String instead of a Char):

split :: ByteString -> ByteString -> [ByteString]

I need to parse a CSV-like text file that uses multi-character separators. The individual characters of the separator also appear inside some of the fields, so choosing just one of them as the separator and discarding the others would contaminate the data import.

I have some ideas on how to do this, but they seem hacky (e.g. take three Word8s, test whether they are the separator combination, start a new field if they are, then recurse), and I imagine I would be reinventing the wheel anyway. Is there a way to do this without rebuilding the function from scratch?

Answer 1

The bytestring package has a few functions for splitting on subsequences, for example:

breakSubstring :: ByteString -> ByteString -> (ByteString,ByteString)
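
As a rough illustration of how breakSubstring behaves, here is a minimal sketch. It assumes a strict ByteString (breakSubstring is exported by Data.ByteString and Data.ByteString.Char8); the helper name splitFirst and the "|~|" separator are made up for this example and are not part of the library.

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical helper (not in bytestring): cut the input at the FIRST
-- occurrence of the separator and drop the separator itself.
-- breakSubstring returns (prefix, rest), where rest still starts with
-- the separator, or is empty when the separator does not occur.
splitFirst :: BC.ByteString -> BC.ByteString -> (BC.ByteString, BC.ByteString)
splitFirst sep input =
  let (before, rest) = BC.breakSubstring sep input
  in  (before, BC.drop (BC.length sep) rest)
```

With these definitions, splitFirst (BC.pack "|~|") (BC.pack "a|b|~|c|~|d") should evaluate to ("a|b","c|~|d"): the lone '|' inside the first field is left alone because only the full three-byte separator matches.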

There's also:

the bytestring-csv package, http://hackage.haskell.org/package/bytestring-csv
the split package, http://hackage.haskell.org/package/split (that one works on Strings, though).

Answer 2

The documentation for bytestring's breakSubstring contains a function that does what you are asking for:

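import qualified Data.ByteString as B
tokenise :: B.ByteString -> B.ByteString -> [B.ByteString]  -- x: separator, y: input
tokenise x y = h : if B.null t then [] else tokenise x (B.drop (B.length x) t)
    where (h,t) = B.breakSubstring x y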
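
Since the question starts from Data.ByteString.Lazy, the following is one possible way to adapt the same idea, offered as a sketch rather than a definitive implementation. It assumes the whole input fits in memory (it converts to a strict ByteString with toStrict, available in bytestring 0.10 and later), and it adds a guard for an empty separator, which would otherwise make the recursion loop forever. The names splitOn and splitLazyOn are hypothetical.

```haskell
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy  as BL

-- Hypothetical helper: split a strict ByteString on every occurrence
-- of a multi-byte separator, using breakSubstring repeatedly.
splitOn :: BC.ByteString -> BC.ByteString -> [BC.ByteString]
splitOn sep input
  | BC.null sep = [input]          -- assumption: empty separator means "no split"
  | otherwise   = go input
  where
    go s =
      let (h, t) = BC.breakSubstring sep s
      in  if BC.null t
            then [h]                                  -- no more separators
            else h : go (BC.drop (BC.length sep) t)   -- skip the separator, continue

-- Hypothetical wrapper for lazy input: forces the whole lazy
-- ByteString into one strict chunk before splitting.
splitLazyOn :: BC.ByteString -> BL.ByteString -> [BC.ByteString]
splitLazyOn sep = splitOn sep . BL.toStrict
```

For example, splitOn (BC.pack "|~|") (BC.pack "a|b|~|c|~|") should give ["a|b","c",""]: the trailing separator produces a final empty field, which matches the usual CSV convention. For files too large to hold in memory, a chunk-aware approach would be needed instead of toStrict.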
