Scala: Removing nodes from XML at different levels

My XML looks like this (it's a NodeSeq):

<first>...</first>
<second>...</second>
<third>
    <foo>
        <keepattr> ... </keepattr>
        <otherattr1> ... </otherattr1>
    </foo>
    <otherattr2> ... </otherattr2>
</third>

I need to keep <first>, remove <second> and everything inside it, and keep only <keepattr> inside <third>, while preserving the structure (that is, keeping the enclosing <foo> tag).

How can I do that in Scala?

I tried this, but I'm stuck on going one level down:

val removeJunk = new RewriteRule {
  override def transform(node: Node): NodeSeq = node match {
    case e: Elem => e.label match {
      case "second" => NodeSeq.Empty
      case "third" => //?
    }
    case o => o

  }
}

I may also want to go a couple of levels deeper in the schema.

Edit: I want to keep the data without compromising the data model:

<third>
    <foo>
      <keepattr> ... </keepattr> 
      <otherattr1> ... </otherattr1>
    </foo>
    <otherattr2> ... </otherattr2>
</third>

should become

<third>
    <foo>
      <keepattr> ... </keepattr> 
    </foo>
</third>

You could use a combination of filterNot and a RewriteRule. This might be inefficient because the \\ operator is re-evaluated at every level, but I can't think of any other solution right now:

import scala.xml._
import scala.xml.transform.RewriteRule

val input: NodeBuffer = <first>foo</first>
  <second>remove me</second>
  <third>
    <foo>
      <keepattr>meh</keepattr>
      <otherattr1>bar</otherattr1>
    </foo>
    <otherattr2>quux</otherattr2>
  </third>

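val extractKeepAttr = new RewriteRule {
  override def transform(node: Node): NodeSeq = node match {
    case e: Elem => e.label match {
      // the element we are after: keep it untouched
      case "keepattr" => e
      // an ancestor of keepattr: keep only the children that lead to it
      case _ if (e \\ "keepattr").nonEmpty =>
        e copy (child = e.child.filter(c => (c \\ "keepattr").nonEmpty) flatMap transform)
      // unrelated element (e.g. <first>): leave it as is
      case _ => e
    }
    // non-element nodes (text, comments) pass through instead of causing a MatchError
    case other => other
  }
}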

// returns <first>foo</first>, <third><foo><keepattr>meh</keepattr></foo></third>
val updatedXml = input.filterNot(_.label == "second").transform(extractKeepAttr)
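
If the repeated \\ lookups are a concern, one possible alternative is a single bottom-up pass that decides, for each element, whether anything below it is worth keeping. This is only a sketch: `PruneSketch` and `prune` are hypothetical names, not part of scala-xml, and unlike the rule above it has to be applied explicitly to <third> so that <first> is not pruned away.

```scala
import scala.xml._

object PruneSketch {
  // Hypothetical helper: keeps elements labelled `keep` plus the ancestors
  // leading to them, and drops everything else under `node`.
  def prune(node: Node, keep: String): Option[Node] = node match {
    case e: Elem if e.label == keep => Some(e)
    case e: Elem =>
      val kept = e.child.flatMap(c => prune(c, keep).toList)
      if (kept.isEmpty) None else Some(e.copy(child = kept))
    case _ => None // text and other nodes outside a kept element are dropped
  }

  val input: NodeBuffer = <first>foo</first>
    <second>remove me</second>
    <third>
      <foo>
        <keepattr>meh</keepattr>
        <otherattr1>bar</otherattr1>
      </foo>
      <otherattr2>quux</otherattr2>
    </third>

  // Drop <second>, prune <third>, leave every other top-level node alone.
  val result: NodeSeq = NodeSeq.fromSeq(input.toList.flatMap {
    case e: Elem if e.label == "second" => Nil
    case e: Elem if e.label == "third"  => prune(e, "keepattr").toList
    case other                          => List(other)
  })
}
```

Each node is visited once, so the cost is linear in the size of the tree. Going several levels down needs no extra code, because the recursion does not care how deep <keepattr> sits.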

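EDIT: updated answer

I'd like to point out another approach that removes a lot of the complexity but is not as pretty: extract all the information you need from the XML, store it in vals, and rebuild the XML by hand if you know the structure beforehand.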
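
A minimal sketch of that manual approach, assuming the structure from the question (<third> containing <foo> containing <keepattr>) is known in advance. The names `RebuildSketch`, `first`, `keepAttr` and `rebuilt` are made up for illustration.

```scala
import scala.xml._

object RebuildSketch {
  val input: NodeBuffer = <first>foo</first>
    <second>remove me</second>
    <third>
      <foo>
        <keepattr>meh</keepattr>
        <otherattr1>bar</otherattr1>
      </foo>
      <otherattr2>quux</otherattr2>
    </third>

  val nodes: NodeSeq = NodeSeq.fromSeq(input.toList)

  // Pull out only the pieces we want to keep.
  val first: NodeSeq    = nodes.filter(_.label == "first")
  val keepAttr: NodeSeq = nodes.filter(_.label == "third") \ "foo" \ "keepattr"

  // Rebuild the known structure by hand around the extracted data.
  val third: Elem = <third><foo>{keepAttr}</foo></third>
  val rebuilt: Seq[Node] = first ++ third
}
```

This hard-codes the layout, so it only suits documents whose shape is fixed; if the nesting changes, the literal has to be updated by hand.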
