
How do I import a JSON file into a JSONiq collection?

I have looked everywhere, and even the JSONiq documentation says "this is beyond the scope of this document." I have a JSON file (an array of JSON objects) that I want to import into JSONiq (particularly Zorba, which by the way is a terrible name because it makes Internet searches for information futile) to use as a collection to query. Is there a tutorial, a spec, or anything anywhere that tells me how to do this?

Zorba supports adding documents to a collection. The framework for doing so is documented here. Note, however, that Zorba is an in-memory store and will not persist anything beyond the scope of one query, so this is of limited use without a persistence layer.

If the use case is simply to query a JSON file stored on your local drive, then it may be simpler to use EXPath's file module together with parse-json, like so:

jsoniq version "1.0";

import module namespace file = "http://expath.org/ns/file";

let $my-object := parse-json(file:read-text("/path/to/document.json"))
return $my-object.foo

The above query returns "bar" if /path/to/document.json contains:

{ "foo" : "bar" } 

parse-json gives you additional options to parse documents with multiple objects (JSON lines, etc.).
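Since the question mentions a file containing a single array of objects, here is a minimal sketch of that case. It assumes a hypothetical file /path/to/array.json (not part of the original answer) and assumes that your Zorba version supports the JSONiq array-unboxing syntax `[]`; if it does not, the members() function may be an alternative.

```jsoniq
jsoniq version "1.0";

import module namespace file = "http://expath.org/ns/file";

(: Assumption: /path/to/array.json is a hypothetical file holding
   one JSON array of objects, e.g. [ { "foo" : "bar" }, { "foo" : "foo" } ] :)
let $array := parse-json(file:read-text("/path/to/array.json"))

(: $array[] unboxes the array into a sequence of its member objects :)
for $object in $array[]
return $object.foo
```

Unboxing first means the rest of the query can treat the objects as a plain sequence, the same way it would treat the contents of a collection.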

For advanced users, this is how to use collections to avoid reading the file(s) every time:

jsoniq version "1.0";

import module namespace file = "http://expath.org/ns/file";
import module namespace ddl = "http://zorba.io/modules/store/dynamic/collections/ddl";
import module namespace dml = "http://zorba.io/modules/store/dynamic/collections/dml";

(: Populating the collection :)
variable $my-collection := QName("my-collection");
ddl:create($my-collection, parse-json(file:read-text("/tmp/doc.json")));

(: And now the query :)

for $object in dml:collection($my-collection)
group by $value := $object.foo
return {
  "value" : $value,
  "count" : count($object)
}

This is /tmp/doc.json:

{ "foo" : "bar" }
{ "foo" : "bar" }
{ "foo" : "foo" }
{ "foo" : "foobar" }
{ "foo" : "foobar" }

And the query above returns:

{ "value" : "bar", "count" : 2 }
{ "value" : "foobar", "count" : 2 }
{ "value" : "foo", "count" : 1 }

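For the sake of completeness: with Rumble (a distributed JSONiq implementation on Spark), use json-doc() (when a JSON value is spread over multiple lines) or json-line() (when there is one JSON value on each line, with possibly billions of lines).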

