XQuery: Separate a sequence into multiple ones

I want to separate a sequence into multiple ones.

For example, I have this sequence:

let $allNumbers := (1,2,3,4,5)

As a result, I want one sequence with all numbers less than 3 and one sequence with all numbers greater than or equal to 3:

let $lessThanThree := (1,2)
let $moreEqualThree := (3,4,5)

How would one best achieve this? The issue is that you cannot return multiple values from a FLWOR expression.

I am aware of several ways to do this, but none of them seems like a good solution. Keep in mind that I am currently using MarkLogic 10 with the dialect

xquery version "1.0-ml";

Using multiple maps

You could loop over the sequence and simply add each element to a different map with map:put(). But this seems kind of wrong, since we are not actually storing any key-value pairs.
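For illustration, the map-based approach could look roughly like the sketch below. It uses a single MarkLogic map:map with one key per bucket instead of several maps; the key names "lessThanThree" and "moreEqualThree" are made up for this example, and it relies on the MarkLogic behaviour that map:put() updates the map in place.

```xquery
xquery version "1.0-ml";

let $allNumbers := (1, 2, 3, 4, 5)
(: One mutable map; each key holds one of the result sequences. :)
let $buckets := map:map()
let $_ :=
  for $n in $allNumbers
  (: Key names are hypothetical, chosen only for this sketch. :)
  let $key := if ($n lt 3) then "lessThanThree" else "moreEqualThree"
  (: Append $n to whatever is already stored under $key. :)
  return map:put($buckets, $key, (map:get($buckets, $key), $n))
return (
  fn:count(map:get($buckets, "lessThanThree")),
  fn:count(map:get($buckets, "moreEqualThree"))
)
```

The input is iterated only once, and each bucket can afterwards be read back with map:get().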

Using xdmp:set()

With xdmp:set() you could modify multiple variables, but this feels like bad practice and not a function you should use. But maybe I am wrong here?
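A minimal sketch of the xdmp:set() variant, assuming the two result variables start out empty and are extended inside a single loop:

```xquery
xquery version "1.0-ml";

let $allNumbers := (1, 2, 3, 4, 5)
(: Both result variables start empty and are reassigned as a side effect. :)
let $lessThanThree := ()
let $moreEqualThree := ()
let $_ :=
  for $n in $allNumbers
  return
    if ($n lt 3)
    then xdmp:set($lessThanThree, ($lessThanThree, $n))
    else xdmp:set($moreEqualThree, ($moreEqualThree, $n))
return (
  fn:count($lessThanThree),
  fn:count($moreEqualThree)
)
```

This also iterates the input only once, but it depends on side effects and on a MarkLogic-specific function, so it is not portable to other XQuery processors.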

Executing the FLWOR multiple times

This is an obvious solution; however, the issue is that sometimes the loop may need quite a bit of time. If one loop takes 10 minutes, I do not really want to execute it multiple times.

You say you want the result of the query to be "two sequences". Well, the result of a query is always an XDM value, so you'll have to think about what kind of XDM value can hold two sequences. In 3.1 that's easy - use an array, so you're returning [(1,2), (3,4,5)] , and you could get that with the query

[$in[. lt 3], $in[. ge 3]]

Before maps and arrays were introduced you would need to find some other representation, for example XML.
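As a rough sketch of the XML representation, each sequence could be wrapped in its own element. The element names result, lessThanThree, moreEqualThree and n are invented for this example. Note that the numbers become text content, so they have to be cast back when they are read.

```xquery
let $allNumbers := (1, 2, 3, 4, 5)
(: Wrap each group in its own element so both survive as a single XDM value. :)
let $result :=
  <result>
    <lessThanThree>{ for $n in $allNumbers[. lt 3] return <n>{ $n }</n> }</lessThanThree>
    <moreEqualThree>{ for $n in $allNumbers[. ge 3] return <n>{ $n }</n> }</moreEqualThree>
  </result>
(: Read one group back and restore the integer type. :)
return
  for $n in $result/lessThanThree/n
  return xs:integer($n)
```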

As regards performance, the devil is always in the detail. What form does the actual input sequence take, how is it computed? Normally I wouldn't expect that processing a sequence twice takes any longer than processing it once and doing twice as much work with each item. But it depends. And where is the output going?

Generally, applying predicate filters $allNumbers[. lt 3] and $allNumbers[. ge 3], as Martin Honnen suggested, would be the most straightforward and natural way of doing things.

There are sometimes cases in which the sequence is really large or is otherwise expensive to produce, and you want to minimize the number of times that the sequence is processed and iterated over.

For that, as you suggested in your question, you could look to use xdmp:set() or put items into a map:map (or different maps).
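If the expensive part is producing the items rather than testing them, one option is to bind the result to a variable and apply both predicates to that variable. The sketch below assumes this situation; local:expensive-numbers() is a hypothetical stand-in for the slow computation and does not come from the question.

```xquery
xquery version "1.0-ml";

(: Hypothetical placeholder for the slow computation. :)
declare function local:expensive-numbers() as xs:integer* {
  for $n in 1 to 5
  return $n
};

(: The function is called in one place only; both filters read the variable. :)
let $allNumbers := local:expensive-numbers()
let $lessThanThree := $allNumbers[. lt 3]
let $moreEqualThree := $allNumbers[. ge 3]
return (
  fn:count($lessThanThree),
  fn:count($moreEqualThree)
)
```

The variable is still scanned twice, but the costly call appears only once in the query. Whether this is enough depends on how the processor evaluates the variable and on where the time is actually spent.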

But if it's taking 10 minutes to iterate over the set, you might want to take a step back and see if you are doing something "the hard way" with brute force, where you might instead be able to leverage indexes to get facets, ranges, frequency, etc. or to run a batch process instead of trying to do it all in one query. A "boil the ocean" type query won't be fast, even if you optimize some of the work.

Since you brought the map option up:

fold-left($seq, map{ "lt3": [], "other": [] }, function ($res, $next) {
  if ($next < 3) then map:put($res, "lt3", array:append($res?lt3, $next))
  else map:put($res, "other", array:append($res?other, $next))
})

See it in action here: https://xqueryfiddle.liberty-development.net/nc4P6yn/1

