Apache Flink: can't use writeAsCsv() with a datastream of subclass tuple

Question

As recommended in Best Practices - Naming large TupleX types, I'm using a POJO instead of a Tuple for my data stream.

This is how I defined my POJO:

public class PositionEvent extends Tuple8<Integer, String, Integer, 
    Integer, Integer, Integer, Integer, Integer>

If I try to save a data stream of PositionEvent to a CSV file, an exception is thrown:

source.filter((PositionEvent e) -> e.speed > MAXIMUM_SPEED)
      .writeAsCsv(String.format("%s/%s", outputFolder, SPEED_RADAR_FILE));

Exception in thread "main" java.lang.IllegalArgumentException: The writeAsCsv() method can only be used on data streams of tuples.

However, if I explicitly cast PositionEvent to Tuple8, it works:

source.filter((PositionEvent e) -> e.speed > MAXIMUM_SPEED)
      .map((PositionEvent e) ->
              (Tuple8<Integer, String, Integer, Integer,
                      Integer, Integer, Integer, Integer>) e)
      .writeAsCsv(String.format("%s/%s", outputFolder, SPEED_RADAR_FILE));

Shouldn't Flink detect that the objects in the data stream are instances of a Tuple subclass?

====================

Edit: (thanks to twalthr)

Ok, this is my POJO now:

import org.apache.flink.api.java.tuple.Tuple8;

public class PositionEvent extends Tuple8<Integer, String, Integer,
        Integer, Integer, Integer, Integer, Integer> {

    public PositionEvent() {
    }

    public PositionEvent(int timestamp, String vid, int speed, int xway,
                         int lane, int dir, int seg, int pos) {
        super(timestamp, vid, speed, xway, lane, dir, seg, pos);
    }

    public int getSpeed() {
        return f2;
    }
}

This was my POJO before:

public class PositionEvent extends Tuple8<Integer, String, Integer,
        Integer, Integer, Integer, Integer, Integer> {

    public int timestamp;

    public String vid;

    public int speed;

    public int xway;

    public int lane;

    public int dir;

    public int seg;

    public int pos;

    public PositionEvent() {
    }

    public PositionEvent(int timestamp, String vid, int speed, int xway,
                         int lane, int dir, int seg, int pos) {
        super(timestamp, vid, speed, xway, lane, dir, seg, pos);
    }
}

Now I don't need to explicitly cast my POJO.

Answer 1

It seems that you not only extended Tuple8 but also added additional fields such as e.speed. This implicitly makes your type a POJO, which is why writeAsCsv() no longer accepts it as a tuple. To name your fields and still keep an efficient tuple type, simply implement getters and don't add any fields. Otherwise, you can use a plain POJO instead of a tuple.
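
To make the difference concrete, here is a minimal sketch in plain Java that does not depend on Flink. MiniTuple3, ShadowFieldEvent, GetterEvent and TupleFieldDemo are hypothetical names invented for this illustration; MiniTuple3 only mimics the positional f0..f2 fields of Flink's tuple classes. It also shows a side effect that the original version of the class appears to have: its constructor only calls super(...), so the extra speed field is never assigned.

```java
class TupleFieldDemo {

    // Hypothetical stand-in for a Flink TupleN class: values live in positional fields.
    static class MiniTuple3<A, B, C> {
        public A f0;
        public B f1;
        public C f2;

        MiniTuple3() {
        }

        MiniTuple3(A f0, B f1, C f2) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
        }
    }

    // Pattern from the original class: an extra declared field next to the tuple slots.
    static class ShadowFieldEvent extends MiniTuple3<Integer, String, Integer> {
        public int speed; // never assigned: the constructor only forwards to super(...)

        ShadowFieldEvent(int timestamp, String vid, int speed) {
            super(timestamp, vid, speed);
        }
    }

    // Pattern from the answer: no extra fields, only a named getter over the tuple slot.
    static class GetterEvent extends MiniTuple3<Integer, String, Integer> {
        GetterEvent(int timestamp, String vid, int speed) {
            super(timestamp, vid, speed);
        }

        int getSpeed() {
            return f2;
        }
    }

    public static void main(String[] args) {
        ShadowFieldEvent before = new ShadowFieldEvent(10, "car-1", 95);
        GetterEvent after = new GetterEvent(10, "car-1", 95);

        System.out.println(before.speed);     // prints 0: the duplicate field was never set
        System.out.println(before.f2);        // prints 95: the value sits in the tuple slot
        System.out.println(after.getSpeed()); // prints 95: the getter reads the tuple slot
    }
}
```

With the getter variant, the class declares no fields beyond those inherited from the tuple, so the names are purely a convenience on top of f0..f2.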

It might also be worth looking into Flink's Table & SQL API. It aims to ease development by handling all types automatically.

solution1
1 ACCEPTED 2017-11-27 13:28:21