
Is there a simple way to convert JavaRDD<String> containing JSON to a custom Java object

I have a Spark Streaming context receiving streams of data from a Kafka consumer. The data contains JSON objects. I need to convert them to a custom Java object so that I can do some processing. Is there a simple way to do this? Basically, I want a way to get the contents of the JavaRDD<String> as normal strings so that I can use gson.fromJson to convert each one into my simple POJO class.

I tried the following approach, but I am getting serialization issues:

    JavaDStream<String> jds = stream.map(x -> x.value());

    jds.foreachRDD(x -> System.out.println(x.count()));

    jds.foreachRDD(new VoidFunction<JavaRDD<String>>() {

        private static final long serialVersionUID = 1L;

        @Override
        public void call(JavaRDD<String> rdd) {
            // gson is created outside this lambda, so Spark tries to serialize it with the task
            rdd.foreach(a -> {
                TransactionData tr = gson.fromJson(a, TransactionData.class);
            });
        }
    });

TransactionData is a normal Java bean class with two fields, id and amount, and their getter and setter methods.

In the above code, I am getting an error with respect to serialization. This is the error:

    org.apache.spark.SparkException: Task not serializable
    Caused by: java.io.NotSerializableException: com.google.gson.Gson
    Serialization stack:
        - object not serializable (class: com.google.gson.Gson, value: {serializeNulls:falsefactories:[Factory[typeHierarchy=com.google.gson.JsonElement,adapter=com.google.gson.internal.bind.TypeAdapters$25@35c645ea]....

Any ideas on how to solve this?

The issue here is that Gson is not serializable. You can fix it by keeping the Gson instance out of serialization and creating it only during processing. One way is to wrap Gson in a serializable class that holds it in a transient field and creates it lazily, and then use that wrapper in the main code. The example below uses a Car class instead of TransactionData:

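import java.io.Serializable;

import com.google.gson.Gson;
import org.apache.spark.api.java.JavaRDD;

public class CarConverter implements Serializable {

    private static final long serialVersionUID = 1L;

    // transient: skipped when Spark serializes the converter along with the task
    private transient Gson gson;

    // Creates the Gson instance lazily, on first use after deserialization
    private Gson getGson() {
        if (gson == null) {
            gson = new Gson();
        }
        return gson;
    }

    // The lambda captures this converter, which is Serializable; the Gson field is not shipped
    public JavaRDD<Car> convert(JavaRDD<String> rdd) {
        return rdd.map(a -> getGson().fromJson(a, Car.class));
    }
}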

Usage example:

    List<String> data = Lists.newArrayList("{\"brand\":\"Jeep\", \"doors\": 3}", "{\"brand\":\"Slavuta\", \"doors\": 4}");
    JavaRDD<String> rdd = jsc().parallelize(data);
    CarConverter converter = new CarConverter();
    JavaRDD<Car> result = converter.convert(rdd);
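
To see why the transient field helps, here is a minimal sketch that reproduces the behaviour with plain Java serialization and no Spark or Gson on the classpath. It assumes that shipping a task behaves roughly like a serialize/deserialize round trip. `Parser`, `BrokenConverter`, `LazyConverter` and `roundTrip` are hypothetical names made up for this illustration; `Parser` only stands in for Gson.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TransientDemo {

    // Hypothetical stand-in for Gson: it does not implement Serializable.
    static class Parser {
        String parse(String s) {
            return s.trim();
        }
    }

    // Mirrors the failing code: the non-serializable helper is an ordinary field.
    static class BrokenConverter implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Parser parser = new Parser();

        String convert(String s) {
            return parser.parse(s);
        }
    }

    // Mirrors CarConverter: the helper is transient and re-created on demand.
    static class LazyConverter implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Parser parser;

        private Parser getParser() {
            if (parser == null) {
                parser = new Parser();
            }
            return parser;
        }

        String convert(String s) {
            return getParser().parse(s);
        }
    }

    // Serializes an object to bytes and reads it back (assumed to approximate shipping a task).
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    // Returns true if serializing the object fails with NotSerializableException.
    static boolean failsToSerialize(Object obj) throws ClassNotFoundException {
        try {
            roundTrip(obj);
            return false;
        } catch (NotSerializableException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        // The ordinary field drags the non-serializable Parser into the stream.
        System.out.println(failsToSerialize(new BrokenConverter())); // prints "true"

        // The transient field is skipped, arrives as null, and is rebuilt on first use.
        LazyConverter copy = roundTrip(new LazyConverter());
        System.out.println(copy.convert("  ok  ")); // prints "ok"
    }
}
```

The same reasoning applies to CarConverter above: only the converter itself is serialized, and each deserialized copy builds its own Gson the first time getGson() is called.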
