
XQuery / BaseX - Limit depth of result

When using XPath or XQuery, is there a way to limit the depth of the result?

I am using BaseX, which supports XQuery 3.1 and XSLT 2.0.

For example, given this input document:

<country name="United States">
  <state name="California">
    <county name="Alameda" >
      <city name="Alameda" />
      <city name="Oakland" />
      <city name="Piedmont" />
    </county>
    <county name="Los Angeles">
      <city name="Los Angeles" />
      <city name="Malibu" />
      <city name="Burbank" />
    </county>
    <county name="Marin">
      <city name="Fairfax" />
      <city name="Larkspur" />
      <city name="Ross" />
    </county>
    <county name="Sacramento">
      <city name="Folsom" />
      <city name="Elk Grove" />
      <city name="Sacramento" />
    </county>
  </state>
</country>

If I execute the query /country/state, I get the following result:

<state name="California">
  <county name="Alameda">
    <city name="Alameda"/>
    <city name="Oakland"/>
    <city name="Piedmont"/>
  </county>
  <county name="Los Angeles">
    <city name="Los Angeles"/>
    <city name="Malibu"/>
    <city name="Burbank"/>
  </county>
  <county name="Marin">
    <city name="Fairfax"/>
    <city name="Larkspur"/>
    <city name="Ross"/>
  </county>
  <county name="Sacramento">
    <city name="Folsom"/>
    <city name="Elk Grove"/>
    <city name="Sacramento"/>
  </county>
</state>

I would like to limit the depth of the result. Ideally, there'd be a way for me to specify the depth, rather than hard-coding an XPath query.

As an example, I would like to limit the result to each result node and its children, excluding the grandchildren, so the result would be:

<state name="California">
  <county name="Alameda" />
  <county name="Los Angeles" />
  <county name="Marin" />
  <county name="Sacramento" />
</state>

One easy and straightforward way is to use XSLT 2.0 with an empty template that suppresses all children of <county>. The <xsl:strip-space> instruction removes the whitespace-only text nodes that would otherwise be left behind where the children were.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:strip-space elements="*" />
 
  <!-- Identity template -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>
  
  <xsl:template match="/">
      <xsl:apply-templates select="/country/state" />
  </xsl:template>
  
  <xsl:template match="county/*" />
  
</xsl:stylesheet>

Output is:

<?xml version="1.0" encoding="UTF-8"?>
<state name="California">
    <county name="Alameda"/>
    <county name="Los Angeles"/>
    <county name="Marin"/>
    <county name="Sacramento"/>
</state>

With XQuery, a solution could look like this:

(: Rebuild each state with its attributes and attribute-only copies of its counties :)
for $st in doc("b.xml")/country/state
return
  element { node-name($st) } {
    $st/@*,
    for $ct in $st/county
    return element { node-name($ct) } { $ct/@* }
  }

The output is the same.
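
The query above hard-codes the element names and exactly one level of nesting. Since the question asks for a configurable depth, one possible generalization is a recursive function that copies a node and stops descending once a counter reaches zero. This is a minimal sketch in plain XQuery, not BaseX-specific; the function name local:limit-depth is made up for illustration, and the file name b.xml is carried over from the query above.

```xquery
(: Hypothetical helper: copy $node, keeping descendants only down to $depth levels.
   Attributes are always kept; at depth 0 an element is copied without children. :)
declare function local:limit-depth($node as node(), $depth as xs:integer) as node()? {
  typeswitch ($node)
    case element() return
      element { node-name($node) } {
        $node/@*,
        if ($depth > 0) then
          for $child in $node/node()
          return local:limit-depth($child, $depth - 1)
        else ()
      }
    (: text, comments, and processing instructions are passed through unchanged :)
    default return $node
};

for $st in doc("b.xml")/country/state
return local:limit-depth($st, 1)
```

With a depth of 1 this should yield each <state> with its attribute-only <county> children, as in the desired output from the question. Whitespace-only text nodes between the counties are copied as well if the processor has not stripped them on import, so the indentation of the result may differ.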

Actually the result of your query is a single node, the state node in the source document. Some software is then displaying the results of the query - that is, the state node - in some particular format, but in principle the results could be displayed in a different way without changing the query. For example, I'm aware of software that would display the results of this query as

/country[1]/state[1]

So you need to separate two questions: what nodes does the query return, and how are they displayed? In some cases it might make sense to create a processing pipeline where the first step selects the nodes of interest, and the second step controls the presentation of the results.

Personally I would always do the second step in XSLT, but some people prefer XQuery. Take your pick.
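
To illustrate the distinction, here is a minimal sketch that selects the same nodes but presents each one as a path string instead of serialized XML. It relies on the standard fn:path function available in XQuery 3.0 and later; the file name b.xml is borrowed from the other answer and is an assumption.

```xquery
(: Step 1: select the nodes of interest :)
let $selected := doc("b.xml")/country/state

(: Step 2: choose a presentation, here one path string per selected node :)
for $node in $selected
return path($node)
```

A conforming processor should return one string per state in the form /Q{}country[1]/Q{}state[1], where element names are written in the Q{namespace}local notation. The selection step is unchanged; only the presentation step differs.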

@zx845's post got me on the right track. My ultimate goal was to limit the depth of the result, with the intent of getting a "summary" and the metadata I need to get deeper results if necessary.

BaseX has a function, db:node-id, which returns the internal node ID of a given node. Another function, db:open-id, returns the node with a given ID (in BaseX 10 and later this function is named db:get-id).

Suppose this given input:

<country name="United States">
  <state name="California">
    <county name="Alameda">
      <city name="Alameda"/>
      <city name="Oakland"/>
      <city name="Piedmont"/>
    </county>
    <county name="Los Angeles">
      <city name="Los Angeles"/>
      <city name="Malibu"/>
      <city name="Burbank"/>
    </county>
    <county name="Marin">
      <city name="Fairfax"/>
      <city name="Larkspur"/>
      <city name="Ross"/>
    </county>
    <county name="Sacramento">
      <city name="Folsom"/>
      <city name="Elk Grove"/>
      <city name="Sacramento"/>
    </county>
  </state>
  <state name="New York">
    <county name="Albany">
      <city name="Albany"/>
      <city name="Cohoes"/>
      <city name="Watervliet"/>
    </county>
    <county name="Erie">
      <city name="Buffalo"/>
      <city name="Lackawanna"/>
      <city name="Tonawanda"/>
    </county>
  </state>
</country>

I defined this function, which lets me control the depth and returns the node ID of each node.

declare function local:abbreviated($input, $depth as xs:integer)
{
  if($depth = 0) then
    (: depth exhausted: emit only a placeholder holding the node ID :)
    element node {
      db:node-id($input)
    }
  else
    (: copy the element, tag it with its node ID, and recurse into child elements :)
    element { node-name($input) } { 
      attribute node-id {
        db:node-id($input)
      },
      $input/@*,
      $input/text(),
      for $child in $input/*
        return local:abbreviated($child, $depth - 1)
    }
};

If I execute the following:

declare variable $input := /country/state;
for $result in $input
  return local:abbreviated($result, 1)

Then I get this result:

<state node-id="3" name="California">
  <node>5</node>
  <node>13</node>
  <node>21</node>
  <node>29</node>
</state>
<state node-id="37" name="New York">
  <node>39</node>
  <node>47</node>
</state>

Now, when I process the results, if the user wants more details for a state element, I can take each 'node' element and execute this query to get the actual contents of that node:

local:abbreviated(db:open-id('states', 5), 2)

Resulting in:

<county node-id="5" name="Alameda">
  <city node-id="7" name="Alameda"/>
  <city node-id="9" name="Oakland"/>
  <city node-id="11" name="Piedmont"/>
</county>


 