Convert String into List of List
Input (String):
[[0_busswvan, 24.0, 2019-09-05 20:15:33],[05f9acb08d7c11e89e8fede614b72917, 20.0, 2019-09-05 14:06:32], [0_h2qbu9h3, 28.0, 2019-09-05 14:01:20],[2_busswvan, 24.0, 2019-09-05 20:15:33],[05f9acb08d7c11e89e8fede614b72917, 25.0, 2019-08-12 14:06:32], [1442qbu9h3, 28.0, 2019-09-05 14:01:20]]
I want to convert this string into a list of lists with type: List[List[String,Double,String]]
What's the best possible way to do it?
So far I've tried:
var a : String = "[[0_busswvan, 24.0, 2019-09-05 20:15:33], [05f9acb08d7c11e89e8fede614b72917, 20.0, 2019-09-05 14:06:32], [0_h2qbu9h3, 28.0, 2019-09-05 14:01:20]]"
var b : String = "[[2_busswvan, 24.0, 2019-09-05 20:15:33],[05f9acb08d7c11e89e8fede614b72917, 25.0, 2019-08-12 14:06:32], [1442qbu9h3, 28.0, 2019-09-05 14:01:20]]"
a = a.substring(2,a.length-1).concat(",")
b = b.substring(1,b.length-2)
var res = a.concat(b)
var res1 = res.split("\\] ?, ?\\[").map(List(_):List[Any]).toList
But the problem is that the result is of type: List[List[String]]
As an alternative to pme's solution, you could use the parser combinators module.
First, you'd need to add it as a dependency, since it was moved out of the standard library into a separate module:
libraryDependencies += "org.scala-lang.modules" %% "scala-parser-combinators" % "1.1.2"
Then you could prepare a parser:
import java.time.format.DateTimeFormatter
import java.time._
import scala.util.parsing.combinator._
val r = "[[0_busswvan, 24.0, 2019-09-05 20:15:33],[05f9acb08d7c11e89e8fede614b72917, 20.0, 2019-09-05 14:06:32], [0_h2qbu9h3, 28.0, 2019-09-05 14:01:20],[2_busswvan, 24.0, 2019-09-05 20:15:33],[05f9acb08d7c11e89e8fede614b72917, 25.0, 2019-08-12 14:06:32], [1442qbu9h3, 28.0, 2019-09-05 14:01:20]]"
object Parser extends RegexParsers {
  val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  def text: Parser[String] = """\w+""".r // parser for text
  def number: Parser[Double] = """\d+(\.\d*)?""".r ^^ { _.toDouble } // parser for numbers
  def datetime: Parser[LocalDateTime] = """\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}""".r ^^ { p => LocalDateTime.parse(p, formatter) } // parser for date
  def glue: Parser[String] = """\s*,\s*""".r // parser for the comma separators between sublists

  // parser for matching a whole sublist
  def term: Parser[List[Any]] = "[" ~ text ~ ", " ~ number ~ ", " ~ datetime ~ "]" ~ opt(glue) ^^ {
    case _ ~ text ~ _ ~ number ~ _ ~ datetime ~ _ ~ _ => List(text, number, datetime)
  }

  // parser for the whole list containing an arbitrary number of sublists
  def expr: Parser[List[List[Any]]] = "[" ~> rep(term) <~ "]"

  // return the nested list type instead of widening it to List[Any]
  def apply(input: String): List[List[Any]] = parseAll(expr, input) match {
    case Success(result, _) => result
    case failure: NoSuccess => scala.sys.error(failure.msg)
  }
}
println(Parser(r))
//List(List(0_busswvan, 24.0, 2019-09-05T20:15:33), List(05f9acb08d7c11e89e8fede614b72917, 20.0, 2019-09-05T14:06:32), List(0_h2qbu9h3, 28.0, 2019-09-05T14:01:20), List(2_busswvan, 24.0, 2019-09-05T20:15:33), List(05f9acb08d7c11e89e8fede614b72917, 25.0, 2019-08-12T14:06:32), List(1442qbu9h3, 28.0, 2019-09-05T14:01:20))
There's also an issue with your approach: when you use a List to store values of type Double, String and LocalDateTime, the compiler widens the type of the list to List[Any]. You could consider using a tuple (String, Double, LocalDateTime) instead. In this case the parser becomes:
import java.time.format.DateTimeFormatter
import java.time._
import scala.util.parsing.combinator._
val r = "[[0_busswvan, 24.0, 2019-09-05 20:15:33],[05f9acb08d7c11e89e8fede614b72917, 20.0, 2019-09-05 14:06:32], [0_h2qbu9h3, 28.0, 2019-09-05 14:01:20],[2_busswvan, 24.0, 2019-09-05 20:15:33],[05f9acb08d7c11e89e8fede614b72917, 25.0, 2019-08-12 14:06:32], [1442qbu9h3, 28.0, 2019-09-05 14:01:20]]"
object Parser extends RegexParsers {
  val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  def text: Parser[String] = """\w+""".r // parser for text
  def number: Parser[Double] = """\d+(\.\d*)?""".r ^^ { _.toDouble } // parser for numbers
  def datetime: Parser[LocalDateTime] = """\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}""".r ^^ { p => LocalDateTime.parse(p, formatter) } // parser for date
  def glue: Parser[String] = """\s*,\s*""".r // parser for the comma separators between sublists

  // parser for matching a whole sublist
  def term: Parser[(String, Double, LocalDateTime)] = "[" ~ text ~ ", " ~ number ~ ", " ~ datetime ~ "]" ~ opt(glue) ^^ {
    case _ ~ text ~ _ ~ number ~ _ ~ datetime ~ _ ~ _ => (text, number, datetime)
  }

  // parser for the whole list containing an arbitrary number of sublists
  def expr: Parser[List[(String, Double, LocalDateTime)]] = "[" ~> rep(term) <~ "]"

  // return the tuple list so the element types are not lost as List[Any]
  def apply(input: String): List[(String, Double, LocalDateTime)] = parseAll(expr, input) match {
    case Success(result, _) => result
    case failure: NoSuccess => scala.sys.error(failure.msg)
  }
}
println(Parser(r))
//List((0_busswvan,24.0,2019-09-05T20:15:33), (05f9acb08d7c11e89e8fede614b72917,20.0,2019-09-05T14:06:32), (0_h2qbu9h3,28.0,2019-09-05T14:01:20), (2_busswvan,24.0,2019-09-05T20:15:33), (05f9acb08d7c11e89e8fede614b72917,25.0,2019-08-12T14:06:32), (1442qbu9h3,28.0,2019-09-05T14:01:20))
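To see the widening in isolation, here is a minimal sketch that uses only the standard library. The object name WideningDemo and the sample values are made up for illustration; the point is just that the List needs a pattern match to get the Double back, while the tuple keeps one static type per position.

```scala
import java.time.LocalDateTime

object WideningDemo {
  // Mixed element types: the only common element type is Any.
  val asList: List[Any] =
    List("0_busswvan", 24.0, LocalDateTime.of(2019, 9, 5, 20, 15, 33))

  // A tuple keeps a separate static type for every position.
  val asTuple: (String, Double, LocalDateTime) =
    ("0_busswvan", 24.0, LocalDateTime.of(2019, 9, 5, 20, 15, 33))

  // With the list, a pattern match (or a cast) is needed to recover the Double.
  val fromList: Double = asList(1) match {
    case d: Double => d
    case other     => sys.error(s"unexpected element: $other")
  }

  // With the tuple, the second field is already typed as Double.
  val fromTuple: Double = asTuple._2
}
```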
The only small mistake is that you use map instead of flatMap.
This works as expected:
var res1 = res.split("\\] ?, ?\\[").flatMap(List(_):List[Any]).toList
Here is an answer that explains the difference, if you are interested: https://stackoverflow.com/a/45319928/2750966
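As a small illustration of that difference (a sketch with made-up sample rows, not the data from the question): map keeps one inner list per element, whereas flatMap concatenates those one-element lists into a single flat list.

```scala
object MapVsFlatMap {
  // Hypothetical sample rows, standing in for the result of split.
  val rows: Array[String] = Array("a, 1.0", "b, 2.0")

  // map wraps every element in its own List: one inner list per row.
  val nested: List[List[Any]] = rows.map(r => (List(r): List[Any])).toList

  // flatMap concatenates the one-element lists into one flat list.
  val flat: List[Any] = rows.flatMap(r => (List(r): List[Any])).toList
}
```

Note that in both cases every element is still the whole row as a single String; neither call splits a row into its three fields.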
Another way is to flatten everything in the end:
var res1 = res.split("\\] ?, ?\\[").map(List(_):List[Any]).toList.flatten
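If you also want typed fields rather than one String per row, the same split can be followed by a second split per row. The following is only a sketch under assumptions: the names RowParsing and parseRows are made up, the input is assumed to have exactly the "[[id, number, datetime], ...]" layout from the question, and an empty list or a field containing a comma is not handled.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object RowParsing {
  private val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Assumes the layout shown in the question: "[[id, number, datetime], ...]".
  def parseRows(input: String): List[(String, Double, LocalDateTime)] = {
    val trimmed = input.trim
    // Drop the leading "[[" and the trailing "]]".
    val body = trimmed.substring(2, trimmed.length - 2)
    // Split between rows, then split each row into its three fields.
    body.split("\\] ?, ?\\[").toList.map { row =>
      row.split(",").map(_.trim) match {
        case Array(id, value, timestamp) =>
          (id, value.toDouble, LocalDateTime.parse(timestamp, formatter))
        case _ =>
          sys.error(s"unexpected row: $row")
      }
    }
  }
}
```

Calling RowParsing.parseRows on the input string should then give a List[(String, Double, LocalDateTime)], similar in shape to the tuple-based parser combinator answer but without the extra dependency.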