Cutting a string into a list in Haskell?

Is it possible to cut a string, e.g.

"one , Two"

into a list

["one", "two"]

or just

"one", "two"

Thanks.

There's a whole module of functions for different strategies to split a list (such as a string, which is just a list of characters): Data.List.Split, from the split package.

Using this, you could do

import Data.List.Split

> splitOn " , " "one , Two"
["one","Two"]

Regular old list operations are sufficient here:

import Data.Char

> [ w | w <- words "one , Two", all isAlpha w ]
["one","Two"]

or, equivalently,

> filter (all isAlpha) . words $ "one , Two"
["one","Two"]
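Note that the words-based version depends on whitespace around the comma: "one,Two" is a single word, fails the isAlpha test, and is dropped entirely. A minimal sketch that splits on the comma itself and trims each piece, using only base, might look like this. The names trim and splitCommas are hypothetical, not from any library.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- Remove leading and trailing whitespace from a string.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- Split on every comma, then trim each piece.
-- An empty input yields a single empty piece.
splitCommas :: String -> [String]
splitCommas s = case break (== ',') s of
  (piece, [])       -> [trim piece]
  (piece, _ : rest) -> trim piece : splitCommas rest
```

In GHCi, splitCommas "one , Two" and splitCommas "one,Two" both give ["one","Two"].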

List hacking, parsing and design

There is a scale of power and weight in text processing. At the simplest, list-based solutions, such as the one above, offer very little syntactic noise, for quick results (in the same spirit as quick'n'dirty text processing in shell scripts).

List manipulation can get quite sophisticated, and you might consider, e.g., the generalized split library, for splitting lists on arbitrary text:

> splitOn " , " "one , Two"
["one","Two"]

For harder problems, or for code that is not likely to be thrown away, more robust techniques make sense. In particular, you can avoid fragile pattern matching by describing the problem as a grammar with parser combinators, such as parsec or uu-parsinglib. String processing described via parsers tends to lead to more robust code over time, as it is relatively easy to modify parsers written in a combinator style as requirements change.
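As a rough illustration of the grammar-style approach, here is a sketch that uses Text.ParserCombinators.ReadP from base instead of parsec or uu-parsinglib, purely so that it runs without extra packages. The names word, comma, wordList and parseWords are made up for this example, and the grammar (alphabetic words separated by commas) is an assumption about the input.

```haskell
import Data.Char (isAlpha)
import Text.ParserCombinators.ReadP

-- A word is one or more alphabetic characters.
word :: ReadP String
word = munch1 isAlpha

-- A comma, with optional whitespace on either side.
comma :: ReadP ()
comma = skipSpaces *> char ',' *> skipSpaces

-- The whole input: words separated by commas, then end of input.
wordList :: ReadP [String]
wordList = sepBy word comma <* skipSpaces <* eof

-- Run the parser; Nothing means the input did not match the grammar.
parseWords :: String -> Maybe [String]
parseWords s = case readP_to_S wordList s of
  [(ws, "")] -> Just ws
  _          -> Nothing
```

If the requirements change, say to allow digits inside words, only the definition of word needs to change; malformed input such as "one ,, Two" is rejected with Nothing rather than silently producing a partial list.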

Note on regular expressions: list matching and regular expressions are approximately equivalent in ease of use and (un)safety, so for the purposes of this discussion, you can substitute "regex" for "list splitting". Parsing is almost always the right approach, if the code is intended to be long lived.

If you'd rather not install the split package (see Frerich Raabe's answer), here's an implementation of the splitOn function that's light on dependencies:

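import Data.List (isPrefixOf)

-- Split a list on every occurrence of a non-empty delimiter.
splitOn :: Eq a => [a] -> [a] -> [[a]]
splitOn []    _  = error "splitOn: empty delimiter"
splitOn delim xs = loop xs
  where
    -- End of input: close the final (possibly empty) piece.
    loop [] = [[]]
    -- Delimiter found: start a new piece after it.
    loop ys | delim `isPrefixOf` ys = [] : loop (drop len ys)
    -- Otherwise prepend the current element to the piece in progress.
    loop (y:ys) = let (piece:rest) = loop ys
                   in (y:piece) : rest
    len = length delim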
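One way to gain some confidence in a hand-written splitter is a round-trip check: joining the pieces with the delimiter should reproduce the input. The sketch below restates the function with stripPrefix so that the block compiles on its own; the names splitOn' and roundTrips are hypothetical, and the restatement is intended to behave the same as the splitOn above.

```haskell
import Data.List (intercalate, stripPrefix)

-- Restatement of the hand-written splitter, using stripPrefix to test for
-- the delimiter and drop it in one step.
splitOn' :: Eq a => [a] -> [a] -> [[a]]
splitOn' []    _ = error "splitOn': empty delimiter"
splitOn' delim s = go s
  where
    -- Delimiter found: start a new piece after it.
    go xs | Just rest <- stripPrefix delim xs = [] : go rest
    -- End of input: close the final (possibly empty) piece.
    go []       = [[]]
    -- Otherwise prepend the current element to the piece in progress.
    go (x : xs) = let (y : ys) = go xs in (x : y) : ys

-- Joining the pieces with the delimiter should give back the input.
roundTrips :: String -> String -> Bool
roundTrips delim s = intercalate delim (splitOn' delim s) == s
```

For example, roundTrips " , " "one , Two" should evaluate to True, and adjacent delimiters produce empty pieces: splitOn' "," "a,,b" gives ["a","","b"].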

Untested, using Parsec. There's probably a regex separator too.

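import Control.Monad (liftM2)
import Text.Parsec
import Text.Parsec.String (Parser)

-- An element is any run of characters other than space or comma.
firstElement :: Parser String
firstElement = many $ noneOf " ,"

-- A comma with optional spaces around it, followed by another element.
otherElement :: Parser String
otherElement = do many $ char ' '
                  char ','
                  many $ char ' '
                  firstElement

elements :: Parser [String]
elements = liftM2 (:) firstElement (many otherElement)

-- parse returns Either: Left carries the error, Right the parsed list.
parseElements :: String -> Either ParseError [String]
parseElements = parse elements "(unknown)"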

It would be nice to clean up otherElement somehow, similar to how I managed to collapse elements using liftM2.
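One possible cleanup, offered as a sketch rather than as the answerer's code, is to describe the separator on its own and let Parsec's sepBy combinator do the sequencing. The names element and separator are hypothetical; sepBy, try, skipMany, noneOf and char are Parsec's own. The try around the separator is an added choice so that trailing spaces with no comma after them do not cause a failure.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- One element: any run of characters other than space or comma.
element :: Parser String
element = many (noneOf " ,")

-- A comma with optional spaces on either side.
separator :: Parser ()
separator = blanks *> char ',' *> blanks
  where blanks = skipMany (char ' ')

-- Elements separated by commas; try lets the parser back out of a
-- separator attempt that consumed spaces but then found no comma.
elements :: Parser [String]
elements = element `sepBy` try separator

parseElements :: String -> Either ParseError [String]
parseElements = parse elements "(unknown)"
```

With this version, parseElements "one , Two" should return Right ["one","Two"]. Because the parser does not demand end of input, anything after the last element is left unconsumed rather than reported as an error.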
