How should I match a pattern in Scala?
I need to match a pattern in Scala. This is my code:
object Wykonaj {
  val doctype = DocType("html", PublicID("-//W3C//DTD XHTML 1.0 Strict//EN","http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"), Nil)
  def main(args: Array[String]) {
    val theUrl = "http://axv.pl/rss/waluty.php"
    val xmlString = Source.fromURL(new URL(theUrl)).mkString
    val xml = XML.loadString(xmlString)
    val zawartosc = (xml \\ "description")
    val pattern = """<descrition> </descrition>""".r
    for (a <- zawartosc) yield a match {
      case pattern => println(pattern)
    }
  }
}
The problem is that I need to define val pattern so that, from
<description><![CDATA[ <img src="http://youbookmarks.com/waluty/pic/waluty/AUD.gif"> dolar australijski 1AUD | 2,7778 | 210/A/NBP/2010 ]]> </description>
I get only
dolar australijski 1AUD | 2,7778 | 210/A/NBP/2010
val zawartosc = (xml \\ "description")
val pattern = """.*(dolar australijski.*)""".r
val allMatches = (for (a <- zawartosc; text = a.text) yield {text}) collect {
case pattern(value) => value }
val result = allMatches.headOption // or .head
This is mostly a matter of using the right regular expression. In this case you want to match the string that contains dolar australijski. The expression has to allow for extra characters before dolar, so start it with .* and then use parentheses to mark the start and end of the part you need. Refer to the Java API for the full documentation of the regular expression syntax.
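As a minimal sketch of how the capturing group behaves, the pattern can be tried on a plain string without any XML involved. The object name, the extract helper, and the sample value are assumptions made for illustration; the sample is modelled on the text content of the description element shown in the question.

```scala
object RegexCaptureSketch {
  // Same pattern as in the answer: any prefix, then one captured group
  // that starts at "dolar australijski" and runs to the end of the string.
  val pattern = """.*(dolar australijski.*)""".r

  // Hypothetical sample, modelled on the text of one <description> element.
  val sample = """ <img src="http://youbookmarks.com/waluty/pic/waluty/AUD.gif"> dolar australijski 1AUD | 2,7778 | 210/A/NBP/2010 """

  // A Regex used as an extractor must match the whole input string;
  // the captured group is bound to `value`.
  def extract(text: String): Option[String] = text match {
    case pattern(value) => Some(value.trim)
    case _              => None
  }

  def main(args: Array[String]): Unit =
    println(extract(sample))
}
```

Because the extractor has to match the entire string, the leading .* is what lets the match start before dolar. Note that . does not match line breaks by default, so this sketch assumes the text sits on a single line.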
With respect to the for comprehension, I convert each XML element into text before doing the match, and then keep only the ones that match the pattern by using the collect method. The desired result should then be the first and only element.
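The collect step can be illustrated on its own with a plain sequence of strings standing in for the element texts. This is only a sketch: the object name and the contents of texts are made up for the example and do not come from the real feed.

```scala
object CollectSketch {
  val pattern = """.*(dolar australijski.*)""".r

  // Hypothetical stand-ins for the text of each <description> element.
  val texts: Seq[String] = Seq(
    "some feed title",
    " dolar australijski 1AUD | 2,7778 | 210/A/NBP/2010 ",
    " some other currency line "
  )

  // collect applies the partial function only where it is defined,
  // so strings that do not match the pattern are silently dropped.
  val allMatches: Seq[String] = texts.collect { case pattern(value) => value.trim }

  // headOption avoids an exception when nothing matched at all.
  val result: Option[String] = allMatches.headOption

  def main(args: Array[String]): Unit =
    println(result)
}
```

Using headOption rather than head is the safer choice if the feed might not contain the currency at all.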
Try
import scala.util.matching.Regex
//...
val Pattern = new Regex(""".*; ([^<]*) </description>""")
//...
for (a <- zawartosc) a.toString match {
  case Pattern(p) => println(p)
  case _ => // skip description elements that do not match
}
It's a bit of a kludge (I don't use regular expressions with Scala very often), but it seems to work. When the element is converted to a string, the markup inside the CDATA section is escaped into entities such as &gt;, so the regular expression looks for the text after a semicolon and before the closing description tag.
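To see why the semicolon works as an anchor, here is a small sketch that runs the same expression against a hard-coded string. The serialized value is an assumption about what the element looks like once the CDATA content has been escaped; the exact output of your XML library may differ, and the object and helper names are illustrative only.

```scala
import scala.util.matching.Regex

object EscapedMatchSketch {
  val Pattern = new Regex(""".*; ([^<]*) </description>""")

  // Assumed serialized form: the <img> tag from the CDATA section appears
  // as &lt; ... &gt; entities, so the last "; " precedes the wanted text.
  val serialized =
    "<description> &lt;img src=&quot;http://youbookmarks.com/waluty/pic/waluty/AUD.gif&quot;&gt; " +
      "dolar australijski 1AUD | 2,7778 | 210/A/NBP/2010  </description>"

  // The greedy .* runs up to the last semicolon followed by a space;
  // the group then captures everything up to the closing tag.
  def extract(s: String): Option[String] = s match {
    case Pattern(p) => Some(p.trim)
    case _          => None
  }

  def main(args: Array[String]): Unit =
    println(extract(serialized))
}
```

This approach depends on the escaped form of the element, so matching on the element's text, as in the other answer, is likely the more robust option.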