ClassCastException when reading nested list of records

I am reading a BigQuery table from Dataflow in which one of the fields is a "record" and "repeated" field. I therefore expected the resulting data type in Java to be List<TableRow>.

However, when I try to iterate over the list I get the following exception:

java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to com.google.api.services.bigquery.model.TableRow

A row in the table looks something like this:

{
    "id": "my_id",
    "values": [
        {
            "nested_record": "nested"
        }
    ]
}

The code to iterate over values looks something like this:

String id = (String) row.get("id");
List<TableRow> values = (List<TableRow>) row.get("values");

for (TableRow nested : values) {
    // more  logic
}

The exception is thrown right where the loop begins. The obvious fix is to cast values to a List of LinkedHashMaps, but that doesn't feel right.

Why does Dataflow throw this kind of error for nested "records"?

I hit the same ClassCastException when using Google Cloud Dataflow to read nested tables from BigQuery. I finally solved it by casting the nested rows to a different type depending on which Dataflow runner I use:

  • with DirectRunner: cast to LinkedHashMap
  • with DataflowRunner: cast to TableRow

Example:

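Object valuesList = row.get("values");

// DirectRunner: nested records arrive as LinkedHashMap
for (LinkedHashMap<String, Object> v : (List<LinkedHashMap<String, Object>>) valuesList) {
   String name = (String) v.get("name");
   String age = (String) v.get("age");
}

// DataflowRunner: nested records arrive as TableRow
for (TableRow v : (List<TableRow>) valuesList) {
   String name = (String) v.get("name");
   String age = (String) v.get("age");
}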

Have a look at BEAM-2767

The underlying cause is the encoding round trip that the DirectRunner performs between steps, which Dataflow usually does not perform. Records are read as type "TableRow", but they are encoded as a simple JSON map. Because the JSON coder does not know the types of the map's fields, it deserializes each nested record as a plain map type. Accessing the repeated record (or any record) as a Map works on both runners, because TableRow implements the Map interface.
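To make the round trip concrete, here is a minimal sketch that uses only java.util and does not touch Beam at all. FakeTableRow is a hypothetical stand-in for TableRow (a Map subclass), and decodeGeneric and roundTrip are made-up names that merely imitate a decoder with no type information for nested fields. It is an illustration of the effect, not the actual coder implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RoundTripSketch {
    /** Hypothetical stand-in for TableRow: like the real class, it is a Map. */
    static class FakeTableRow extends LinkedHashMap<String, Object> {}

    /** Rebuilds a value as a type-unaware decoder would: every map becomes a LinkedHashMap. */
    static Object decodeGeneric(Object value) {
        if (value instanceof Map) {
            Map<String, Object> copy = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                copy.put(String.valueOf(e.getKey()), decodeGeneric(e.getValue()));
            }
            return copy;
        }
        if (value instanceof List) {
            List<Object> copy = new ArrayList<>();
            for (Object item : (List<?>) value) {
                copy.add(decodeGeneric(item));
            }
            return copy;
        }
        return value;
    }

    /** Simulated round trip: only the top-level row gets its concrete type back. */
    static FakeTableRow roundTrip(FakeTableRow row) {
        FakeTableRow decoded = new FakeTableRow();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            decoded.put(e.getKey(), decodeGeneric(e.getValue()));
        }
        return decoded;
    }

    /** Builds a row shaped like the example in the question. */
    static FakeTableRow sampleRow() {
        FakeTableRow nested = new FakeTableRow();
        nested.put("nested_record", "nested");
        List<Object> values = new ArrayList<>();
        values.add(nested);
        FakeTableRow row = new FakeTableRow();
        row.put("id", "my_id");
        row.put("values", values);
        return row;
    }

    /** Returns the first element of the repeated "values" field. */
    static Object firstNested(FakeTableRow row) {
        return ((List<?>) row.get("values")).get(0);
    }

    public static void main(String[] args) {
        FakeTableRow row = sampleRow();
        Object before = firstNested(row);
        Object after = firstNested(roundTrip(row));
        System.out.println(before instanceof FakeTableRow); // true
        System.out.println(after instanceof FakeTableRow);  // false: a cast here would throw
        System.out.println(after instanceof Map);           // true: Map access still works
    }
}
```

After the simulated round trip the nested record is no longer the row type but is still a Map, which mirrors why casting to TableRow fails on one runner while Map access succeeds on both.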

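TableRow is a Map, so you can treat both cases as Map:

    String id = (String) row.get("id");
    // Unchecked cast: holds for TableRow and LinkedHashMap elements alike
    List<Map<String, Object>> values = (List<Map<String, Object>>) row.get("values");

    for (Map<String, Object> nested : values) {
        // more  logic
    }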
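If the unchecked cast feels too loose, one option is a small helper that checks each element before handing it back. The sketch below is only an assumption about how such a helper could look: NestedRecords and asRecords are hypothetical names, and treating a missing field as an empty list is a choice made here, not something the BigQuery connector prescribes. It depends only on java.util, so a TableRow element would pass the same instanceof check.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NestedRecords {
    /** Views a repeated record field as a list of maps, whatever the concrete map class. */
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> asRecords(Object field) {
        if (field == null) {
            return Collections.emptyList(); // assumption: a missing field means "no records"
        }
        if (!(field instanceof List)) {
            throw new IllegalArgumentException("Expected a repeated field but found: " + field);
        }
        List<Map<String, Object>> records = new ArrayList<>();
        for (Object item : (List<?>) field) {
            if (!(item instanceof Map)) {
                throw new IllegalArgumentException("Expected a record but found: " + item);
            }
            records.add((Map<String, Object>) item);
        }
        return records;
    }

    public static void main(String[] args) {
        // A plain LinkedHashMap stands in for whatever the runner delivers.
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("nested_record", "nested");
        List<Object> values = new ArrayList<>();
        values.add(nested);

        for (Map<String, Object> record : asRecords(values)) {
            System.out.println(record.get("nested_record"));
        }
    }
}
```

In a pipeline this would be called as asRecords(row.get("values")), so the same loop body runs regardless of which runner produced the row.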
