
Deserialize file using serde_json at compile time

At the beginning of my program, I read data from a file:

let file = std::fs::File::open("data/games.json").unwrap();
let data: Games = serde_json::from_reader(file).unwrap();

I would like to know how to do this at compile time, for the following reasons:

  1. Performance: no need to deserialize at runtime.
  2. Portability: the program can run on any machine without needing the JSON file containing the data alongside it.

It might also be useful to mention that the data is read-only, which means the solution can store it as static.

This is straightforward, but leads to some potential issues. First, we need to decide something: do we want to build the tree of objects at compile time, or embed the file and parse it at runtime?

99% of the time, parsing on boot into a static ref is enough for people, so I'm going to give you that solution; I will point you to the "other" version at the end, but that requires a lot more work and is domain-specific.

The macro (because it has to be a macro) you are looking for to include a file at compile time is in the standard library: std::include_str!. As the name suggests, it takes your file at compile time and generates a &'static str from it for you to use. You are then free to do whatever you like with it (such as parsing it).

From there, it is a simple matter to use lazy_static! to generate a static ref to our JSON Value (or whatever type you decide to go for) for every part of the program to use. In your case, for instance, it could look like this:

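use lazy_static::lazy_static;
use serde::{Deserialize, Serialize};

// Embedded into the binary at compile time; the path is relative to this source file.
const GAME_JSON: &str = include_str!("my/file.json");

#[derive(Serialize, Deserialize, Debug)]
struct Game {
    name: String,
}

lazy_static! {
    // Parsed once, on first access, then shared for the rest of the program.
    static ref GAMES: Vec<Game> = serde_json::from_str(GAME_JSON).unwrap();
}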

You need to be aware of two things when doing this:

  1. This will increase the size of your executable, potentially by a lot, as the &str isn't compressed in any way. Consider gzip.
  2. You'll need to worry about the usual concerns around access to the same static ref from multiple threads, but since it isn't mutable you only really need to worry about a portion of them.
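
To illustrate the second point, here is a minimal standard-library-only sketch of the same "parse once, share read-only" pattern. It makes several assumptions so that it compiles without external crates: the GAME_DATA constant stands in for the include_str! call, a one-name-per-line format stands in for JSON and serde_json, std::sync::OnceLock stands in for lazy_static!, and the helper names games and total_seen_by_threads are hypothetical.

```rust
use std::sync::OnceLock;
use std::thread;

// Stand-in for `include_str!(...)`: the text is baked into the binary.
// One game name per line replaces JSON so no parser crate is needed.
const GAME_DATA: &str = "chess\ngo\ncheckers";

#[derive(Debug, PartialEq)]
struct Game {
    name: String,
}

// Hypothetical helper: parses the embedded text on the first call and
// returns the cached result on every later call.
fn games() -> &'static [Game] {
    static GAMES: OnceLock<Vec<Game>> = OnceLock::new();
    GAMES.get_or_init(|| {
        GAME_DATA
            .lines()
            .map(|name| Game { name: name.to_string() })
            .collect()
    })
}

// Reads the shared data from several threads. The callers need no lock of
// their own because the value is never mutated after initialization.
fn total_seen_by_threads(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(|| games().len()))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

Whichever thread calls games() first pays the parsing cost; every other caller gets a shared reference to the same Vec. Anything that needs mutation would still require a Mutex or similar.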

The other way requires generating your objects at compile time using a procedural macro. As stated, I wouldn't recommend it unless you have a really expensive startup cost when parsing that JSON; most people will not, and the last time I had this was when dealing with deeply-nested multi-GB JSON files.

The crates you want to look out for are proc_macro2 and syn for the code generation; the rest is very similar to how you would write a normal method.

When you deserialize something at runtime, you're essentially building a representation in program memory from another representation on disk. But at compile time, there's no notion of "program memory" yet, so where would this data be deserialized to?

However, what you're trying to achieve is, in fact, possible. The main idea is as follows: to create something in program memory, you must write some code which creates the data. What if you could generate that code automatically, based on the serialized data? That's what the uneval crate does (disclaimer: I'm the author, so you're encouraged to look through the source to see if you can do better).
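
As a rough illustration of that idea (not uneval's actual implementation), the sketch below turns already-deserialized data into Rust source text that rebuilds the same value. The Game struct with a single name field and the generate_source helper are assumptions made for the example.

```rust
#[derive(Debug, PartialEq)]
struct Game {
    name: String,
}

// Hypothetical generator: emits a Rust expression that reconstructs `games`.
// Formatting a string with `{:?}` produces a quoted, escaped literal, so
// names containing quotes or newlines still yield usable source text.
fn generate_source(games: &[Game]) -> String {
    let mut src = String::from("vec![");
    for game in games {
        src.push_str(&format!("Game {{ name: {:?}.to_string() }},", game.name));
    }
    src.push(']');
    src
}
```

A build script could write the returned string to a file under OUT_DIR, and the crate could then pull that file in with include!, so the compiler sees ordinary construction code rather than JSON.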

To use this approach, you'll have to create build.rs with approximately the following content:

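// somehow include the Games struct with its Serialize and Deserialize implementations
fn main() {
    // Parse the JSON while building, not while the program runs.
    let games: Games = serde_json::from_str(include_str!("data/games.json")).unwrap();
    // Write Rust code that reconstructs `games` into games.rs in the build output directory.
    uneval::to_out_dir(games, "games.rs");
}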

And in your initialization code you'll have the following:

let data: Games = include!(concat!(env!("OUT_DIR"), "/games.rs"));

Note, however, that this might be fairly hard to do in an ergonomic way, since the necessary struct definitions must now be shared between build.rs and the crate itself, as I mentioned in the comment. It might be a little easier if you split your crate in two, keeping the struct definitions (and only them) in one crate and the logic which uses them in another. There are some other ways, such as include! trickery, or using the fact that the build script is an ordinary Rust binary and can include other modules as well, but these will complicate things even more.
