Clojure & ClojureScript: clojure.core/read-string, clojure.edn/read-string and cljs.reader/read-string

I am not clear about the relationship between all these read-string functions. It is clear that clojure.core/read-string can read any serialized string that is output by pr[n] or even print-dup. It is also clear that clojure.edn/read-string reads strings that are formatted according to the EDN specification.

However, I am starting with ClojureScript, and it is not clear which of these cljs.reader/read-string complies with. This question was triggered by the fact that I had a web service that was emitting Clojure code serialized this way:

(with-out-str (binding [*print-dup* true] (prn tags)))

That produced an object serialization which includes the datatypes. However, this was not readable by cljs.reader/read-string. I always got an error of this type:

Could not find tag parser for = in ("inst" "uuid" "queue" "js")  Format should have been EDN (default)

At first, I thought that this error was thrown by cljs-ajax, but after testing cljs.reader/read-string in a Rhino REPL, I got the same error, which means it is thrown by cljs.reader/read-string itself. It is thrown by the maybe-read-tagged-type function in cljs.reader, but it is not clear if this is because the reader only works with EDN data, or if...?

Also, in the Differences from Clojure document, the only thing that is said is:

The read and read-string functions are located in the cljs.reader namespace

which suggests that they should have exactly the same behavior.

Summary: Clojure is a superset of EDN. By default, pr, prn and pr-str, when given Clojure data structures, produce valid EDN. *print-dup* changes that and makes them use the full power of Clojure to give stronger guarantees about the "sameness" of the objects in memory after a round-trip. ClojureScript can only read EDN, not full Clojure.

Easy solution: do not set *print-dup* to true, and only pass pure data from Clojure to ClojureScript.
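A minimal sketch of the difference on the Clojure (JVM) side. The map named tags is a made-up stand-in for the data in the question, and edn-readable? is a helper introduced only for this illustration:

```clojure
(require '[clojure.edn :as edn]
         '[clojure.string :as str])

;; Hypothetical stand-in for the data the web service emits.
(def tags {:lang "clojure" :count 2})

;; Default printing: plain EDN.
(def plain (pr-str tags))

;; Printing with *print-dup*: the map is wrapped in a #=(...) form.
(def dup (binding [*print-dup* true] (pr-str tags)))

(defn edn-readable?
  "True when clojure.edn/read-string accepts the string s."
  [s]
  (try (edn/read-string s)
       true
       (catch RuntimeException _ false)))

(println plain)
(println dup)
(println (edn-readable? plain) (edn-readable? dup))
```

The second string is the kind of payload that cljs.reader/read-string rejects, so printing without *print-dup* is enough when the data is made of plain maps, vectors, sets and scalars.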

Harder solution: use tagged literals, with a (possibly shared) associated reader on both sides. (This will still not involve *print-dup*, though.)
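A rough sketch of the tagged-literal approach, shown on the Clojure side only. The Point record and the tag name my.app/point are invented for this example; the :readers option of clojure.edn/read-string is what maps the tag back to a constructor:

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical record used only for this illustration.
(defrecord Point [x y])

;; Print Point as a tagged literal wrapping a plain map.
;; The tag name my.app/point is a made-up example.
(defmethod print-method Point [p ^java.io.Writer w]
  (.write w "#my.app/point ")
  (print-method (into {} p) w))

;; What would travel over the wire.
(def wire (pr-str (->Point 1 2)))

;; The reading side maps the same tag back to a constructor.
(def back
  (edn/read-string {:readers {'my.app/point map->Point}} wire))

(println wire)
(println (= back (->Point 1 2)))
```

On the ClojureScript side the same tag would need its own parser, for instance one registered with cljs.reader/register-tag-parser!.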

Tangentially related: most use-cases for EDN are covered by Transit, which is faster, especially on the ClojureScript side.


Let's start with the Clojure part. Clojure had, from the start, a clojure.core/read-string function, which reads a string in the old Lispy sense of the Read-Eval-Print-Loop, i.e. it gives access to the actual reader used in the compilation of Clojure.[0]

Later on, Rich Hickey & co. decided to promote the data notation of Clojure and published the EDN spec. EDN is a subset of Clojure; it is limited to the data elements of the Clojure language.

As Clojure is a Lisp and, like all Lisps, touts the "code is data is code" philosophy, the actual implications of the above paragraph may not be completely clear. I am not sure there is a detailed diff anywhere, but a careful examination of the Clojure Reader description and the previously mentioned EDN spec shows a few differences. The most obvious differences are around macro characters and in particular the # dispatch symbol, which has many more targets in Clojure than in EDN. For example, the #(* % %) notation is valid Clojure, which the Clojure reader will turn into the equivalent of the following EDN: (fn [x] (* x x)). Of particular importance for this question is the scarcely documented #= special reader macro, which can be used to execute arbitrary code right inside the reader.

As the complete language is available to the Clojure reader, it is possible to embed code into the character string that the reader is reading and have it evaluated right then and there in the reader. A few examples can be found here.
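As a small illustration of both reader macros with clojure.core/read-string (a sketch for a JVM REPL; *read-eval* is bound explicitly so the result does not depend on how the REPL was started):

```clojure
;; #(...) is expanded by the reader into a (fn* ...) form with generated
;; parameter names; no evaluation happens here.
(def anon (read-string "#(* % %)"))

;; #= evaluates the form at read time when *read-eval* is true.
(def evaluated
  (binding [*read-eval* true]
    (read-string "#=(+ 1 2)")))

;; With *read-eval* bound to false, the core reader refuses #=.
(def refused?
  (try (binding [*read-eval* false]
         (read-string "#=(+ 1 2)"))
       false
       (catch RuntimeException _ true)))

(println (first anon) evaluated refused?)
```

This is why clojure.core/read-string should not be pointed at untrusted input: reading alone can run code.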

The clojure.edn/read-string function is strictly limited to the EDN format, not the whole Clojure language. In particular, its operation is not influenced by the *read-eval* variable and it cannot read all of the valid Clojure code fragments possible.
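To make that restriction concrete, here is a short sketch; edn-result is a helper name made up for this example, which returns :rejected when the EDN reader throws:

```clojure
(require '[clojure.edn :as edn])

(defn edn-result
  "Returns the value read by clojure.edn/read-string, or :rejected on error."
  [s]
  (try (edn/read-string s)
       (catch RuntimeException _ :rejected)))

;; Plain data is accepted.
(println (edn-result "{:a [1 2 #{3}]}"))

;; The anonymous-function reader macro is Clojure, not EDN.
(println (edn-result "#(* % %)"))

;; #= is rejected even when *read-eval* is true.
(println (binding [*read-eval* true] (edn-result "#=(+ 1 2)")))
```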

It turns out that the Clojure reader is, for mostly historical reasons, written in Java. As it is a significant piece of software, works well, and has been largely debugged and battle-tested by a few years of active Clojure usage in the wild, Rich Hickey decided to reuse it in the ClojureScript compiler (this is the main reason why the ClojureScript compiler runs on the JVM). The ClojureScript compilation process happens mostly on the JVM, where the Clojure reader is available, and thus ClojureScript code is parsed by the clojure.core/read-string (or rather its close cousin clojure.core/read) function.

But your web application does not have access to a running JVM. Requiring a Java applet for ClojureScript applications did not look like a very promising idea, especially as the main objective of ClojureScript was to extend the reach of the Clojure language beyond the confines of the JVM (and the CLR). So the decision was taken that ClojureScript would not have access to its own reader, and consequently would not have access to its own compiler either (i.e. there is no eval nor read nor read-string in ClojureScript). This decision and its implications are discussed in greater detail here, by someone who actually knows how things happened (I was not there, so there may be some inaccuracies in the historical perspective of this explanation).

So ClojureScript has no equivalent of clojure.core/read-string (and some would argue that it is therefore not a true Lisp). Still, it would be nice to have some way to communicate Clojure data structures between a Clojure server and a ClojureScript client, and indeed that was one of the motivating factors in the EDN effort. Just as Clojure got a restricted (and safer) reading function (clojure.edn/read-string) after the publication of the EDN spec, ClojureScript also got an EDN reader in the standard distribution as cljs.reader/read-string. It may be argued that a little more consistency between the names of these two functions (or rather their namespaces) would have been good.

Before we can finally answer your original question, we need one more little piece of context regarding *print-dup*. Remember that *print-dup* was part of Clojure 1.0, which means it predates EDN, the notion of tagged literals, and records. I would argue that EDN and tagged literals offer a better alternative for most of the use-cases of *print-dup*. As Clojure is generally built on top of a few data abstractions (list, vector, set, map, and the usual scalars), the default behaviour of the print/read cycle is to preserve the abstract shape of the data (a map is a map), but not especially its concrete type. For example, Clojure has multiple implementations of the map abstraction, such as PersistentArrayMap for small maps and PersistentHashMap for bigger ones. The default behaviour of the language assumes that you do not care about the concrete type.

For the rare cases where you do, or for the more specialized types (defined with deftype or defstruct, at the time), you might want more control over how these are read, and that is what print-dup is for.

The point is, with *print-dup* set to true, pr and family will not produce valid EDN, but actual Clojure data including some explicit #=(eval build-my-special-type) forms, which are not valid EDN.
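A sorted map is a convenient way to see the concrete-type difference on the JVM. The sketch below is only an illustration; the names m, plain-copy and dup-copy are made up here:

```clojure
(require '[clojure.edn :as edn])

(def m (sorted-map :b 2 :a 1))

;; Default print/read cycle: same contents, but the sorted map comes back
;; as an ordinary (unsorted) map.
(def plain-copy (edn/read-string (pr-str m)))

;; *print-dup* plus the full Clojure reader: the concrete type survives,
;; at the price of a #=(...) form that EDN readers reject.
(def dup-copy
  (binding [*read-eval* true]
    (read-string (binding [*print-dup* true] (pr-str m)))))

(println (class m) (class plain-copy) (class dup-copy))
(println (= m plain-copy) (= m dup-copy))
```

Both copies are equal to the original as values; only the second one is still a sorted map.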

[0]: In "lisps", the compiler is explicitly defined in terms of data structures, rather than in terms of character strings. While that may seem like a small difference with usual compilers (which do indeed transform the character stream into data structures during their processing), the defining characteristic of Lisp is that the data structures that are emitted by the reader are the data structures commonly used in the language. In other words, the compiler is basically just a function available at all times in the language. This is not as unique as it used to be, as most dynamic languages support some form of eval; what is unique to Lisp is that eval takes a data structure, not a character string, which makes dynamic code generation and evaluation much easier. One important implication of the compiler being "just another function" is that the compiler actually runs with the whole language already defined and available, and all of the code read so far also available, which opens up the door to the Lisp macro system.

cljs.reader/read only supports EDN, but pr etc. will output tags (in particular, for protocols and records) which it won't read.

In general, if on the Clojure side you can verify that (= value (clojure.edn/read-string (pr-str value))), your cljs interop should work. This can be limiting, and there is some discussion of workarounds or fixes to the EDN library.
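That check can be wrapped in a small predicate. This is a sketch only; edn-round-trips? and the Tag record are names invented for the example, and a reader error is treated as "does not round-trip":

```clojure
(require '[clojure.edn :as edn])

(defn edn-round-trips?
  "True when value survives pr-str followed by clojure.edn/read-string."
  [value]
  (try (= value (edn/read-string (pr-str value)))
       (catch RuntimeException _ false)))

;; Hypothetical record used only for this illustration.
(defrecord Tag [name])

;; Plain data round-trips.
(println (edn-round-trips? {:tags ["a" "b"] :ids #{1 2}}))

;; A record prints with a class-name tag that the EDN reader has no
;; reader function for, so it does not.
(println (edn-round-trips? (->Tag "a")))
```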

Depending on what your data looks like, you might take a look at the tagged library as described in the Clojure Cookbook.

Actually, it is possible to register a custom tag parser via cljs.reader/register-tag-parser!

For a record I have, it looks like this: (register-tag-parser! (s/replace (pr-str m/M1) "/" ".") m/map->M1)

@Gary: quite a nice answer
