What is the idiomatic way to encode an iterator with serde_json?

I'm trying to drain() a vec in Rust and encode the results as a JSON string. What's the best, idiomatic way to do this?

#![feature(custom_derive, plugin)]
#![plugin(serde_macros)]

extern crate serde;
extern crate serde_json;

#[derive(Serialize, Deserialize, Debug)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    pub fn new(x: i32, y: i32) -> Point {
        Point {
            x: x,
            y: y
        }
    }
}

fn main() {
    let mut points = vec![Point::new(1,2), Point::new(-2,-1), Point::new(0, 0)];
    let mut drain = points.drain(..);

    println!("{}", serde_json::to_string(&drain).unwrap());
}

Draining iterators are an interesting beast. They let you remove a chunk of a collection, taking ownership of some but not necessarily all of its items, and they do so reasonably efficiently. For example, a vector can move the trailing data en masse with a single memcpy.
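To make the "some but not necessarily all" part concrete, here is a minimal std-only sketch. The helper name `take_front` is hypothetical and only for illustration; it is not part of the question's code.

```rust
// Hypothetical helper: remove the first `n` items from `v` and return them by value.
fn take_front(v: &mut Vec<String>, n: usize) -> Vec<String> {
    // Clamp `n`, because `drain` panics if the range runs past the end of the Vec.
    let n = n.min(v.len());
    // `drain` hands out the removed items by value. When the `Drain` is dropped,
    // the remaining tail is shifted down to close the gap.
    v.drain(..n).collect()
}

fn main() {
    let mut words = vec![
        "a".to_string(),
        "b".to_string(),
        "c".to_string(),
        "d".to_string(),
    ];
    let front = take_front(&mut words, 2);
    // The drained items are owned by `front`; the rest stay in `words`.
    println!("{:?} {:?}", front, words); // ["a", "b"] ["c", "d"]
}
```

The original vector stays usable afterwards, which is the main difference from consuming it outright.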

However, serde doesn't natively support serializing iterators (for a good reason, keep reading). You can look at the Serialize trait to see the kinds of types it supports.

You'd have to implement this yourself:

use serde::ser::impls::SeqIteratorVisitor;
use std::cell::RefCell;
use std::vec;

struct DrainIteratorAdapter<'a, T: 'a>(RefCell<vec::Drain<'a, T>>);

impl<'a, T: 'a> serde::Serialize for DrainIteratorAdapter<'a, T>
    where T: serde::Serialize
{
    fn serialize<S>(&self, serializer: &mut S) -> Result<(), S::Error>
        where S: serde::Serializer
    {
        let mut iter = self.0.borrow_mut();
        // Use `size_hint` here?
        serializer.visit_seq(SeqIteratorVisitor::new(iter.by_ref(), None))
    }
}

fn main() {
    let mut points = vec![Point::new(1, 2), Point::new(-2, -1), Point::new(0, 0)];
    let adapter = DrainIteratorAdapter(RefCell::new(points.drain(..)));

    println!("{}", serde_json::to_string(&adapter).unwrap());
}

The hard part is that serialization is not supposed to have any side effects, which is why serialize takes &self. This is a very reasonable decision. However, whenever you call next on an iterator, you have to mutate it in order to update its state. To combine these two mismatched concepts, we have to use something like a RefCell.

Beyond that, it's just a matter of implementing the serde::Serialize trait. Since we own neither serde::Serialize nor vec::Drain, we have to create a newtype to place the implementation on.

We can generalize this solution to apply to any iterator. This happens to make it read a bit nicer, in my opinion:

use serde::ser::impls::SeqIteratorVisitor;
use std::cell::RefCell;

struct IteratorAdapter<I>(RefCell<I>);

impl<I> serde::Serialize for IteratorAdapter<I>
    where I: Iterator,
          I::Item: serde::Serialize,
{
    fn serialize<S>(&self, serializer: &mut S) -> Result<(), S::Error>
        where S: serde::Serializer
    {
        let mut iter = self.0.borrow_mut();
        // Use `size_hint` here?
        serializer.visit_seq(SeqIteratorVisitor::new(iter.by_ref(), None))
    }
}

What's the downside to this solution? Serializing the same value twice has different results! If we simply serialize and print the value twice, we get:

[{"x":1,"y":2},{"x":-2,"y":-1},{"x":0,"y":0}]
[]

This is because iterators are transient beasts: once they have yielded a value, it's gone! This is a nice trap waiting for you to fall into.
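The same trap can be reproduced without serde. The sketch below is only an analogy: it implements `std::fmt::Display` (whose `fmt` also takes `&self`) in place of `serde::Serialize`, and the `render_twice` helper is a hypothetical name introduced for the demonstration.

```rust
use std::cell::RefCell;
use std::fmt;

// Std-only stand-in for the adapter: `Display::fmt` takes `&self`, just like `serialize`.
struct IteratorAdapter<I>(RefCell<I>);

impl<I> fmt::Display for IteratorAdapter<I>
where
    I: Iterator,
    I::Item: fmt::Display,
{
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Interior mutability: advance the iterator through a shared reference.
        let mut iter = self.0.borrow_mut();
        write!(f, "[")?;
        for (i, item) in iter.by_ref().enumerate() {
            if i > 0 {
                write!(f, ",")?;
            }
            write!(f, "{}", item)?;
        }
        write!(f, "]")
    }
}

// Hypothetical helper: format the same adapter twice and return both results.
fn render_twice(values: Vec<i32>) -> (String, String) {
    let adapter = IteratorAdapter(RefCell::new(values.into_iter()));
    let first = adapter.to_string();
    // The iterator was exhausted by the first call, so nothing is left here.
    let second = adapter.to_string();
    (first, second)
}

fn main() {
    let (first, second) = render_twice(vec![1, -2, 0]);
    println!("{}", first); // [1,-2,0]
    println!("{}", second); // []
}
```

Nothing in the type signature warns the caller that the first call has consumed the data, which is what makes this pattern easy to misuse.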


In your example, none of this really makes sense. You have access to the entire Vec, so you might as well serialize it (or a slice of it) directly. Additionally, there's no reason (right now) to drain the entire collection: that yields the same items as just calling into_iter.
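A short std-only sketch of that comparison, with no serde involved. The function names are hypothetical and exist only for this illustration. One difference worth noting is that a full-range `drain` leaves an empty but still usable Vec behind, whereas `into_iter` consumes the Vec itself.

```rust
// Hypothetical helper: take every item out of `v` with a full-range drain.
fn via_drain(v: &mut Vec<i32>) -> Vec<i32> {
    // `v` is emptied, but it remains a valid Vec that can be reused.
    v.drain(..).collect()
}

// Hypothetical helper: take every item out by consuming the Vec.
fn via_into_iter(v: Vec<i32>) -> Vec<i32> {
    v.into_iter().collect()
}

// Borrowing a slice reads the items without removing anything.
fn sum_first_two(v: &[i32]) -> i32 {
    v.iter().take(2).sum()
}

fn main() {
    let mut a = vec![1, 2, 3];
    let b = a.clone();

    // Reading through a slice leaves `a` untouched.
    println!("{}", sum_first_two(&a)); // 3

    let drained = via_drain(&mut a);
    let moved = via_into_iter(b);

    // Both approaches produce the same items in the same order.
    println!("{}", drained == moved); // true
    println!("{}", a.is_empty()); // true
}
```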
