Flink Schema vs Table Schema

I am using the Flink SQL API and I am a bit lost among all the 'schema' types: TableSchema, Schema (from org.apache.flink.table.descriptors.Schema) and TypeInformation.

A TableSchema can be created from a TypeInformation, a TypeInformation can be created from a TableSchema, and a Schema can be created from a TableSchema.

But it looks like a Schema cannot be converted back to a TypeInformation or a TableSchema (?)

Why are there three different types of objects to store the same kind of information?

For example, let's say that I have a schema string coming from an Avro schema file, and that I want to add a new field to it. The only solution I have found is:

String mySchemaRaw = ...;
TypeInformation<Row> typeInfo = AvroSchemaConverter.convertToTypeInfo(mySchemaRaw);
Schema newSchema = new Schema().schema(TableSchema.fromTypeInfo(typeInfo));
newSchema = newSchema.field("newField", ...);

// Need the newSchema as a TableSchema

Is this the normal way to use these objects? (looks weird to me)

TypeInformation and TableSchema solve different problems. TypeInformation is physical information: it describes how to ship a record class (e.g. a Row or a POJO) from one operator to another.

TableSchema describes the schema of a table independently of the underlying per-record type. It is similar to the schema part of a CREATE TABLE name (a INT, b BIGINT) DDL statement. In SQL, one also does not define a table as CREATE TABLE name ROW(a INT, b BIGINT). Still, schema and row type are related, which is why converter methods are provided. The differences become bigger once concepts such as PRIMARY KEY are introduced.
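To make the distinction concrete, here is a minimal plain-Java sketch. It is a toy model and not Flink's API: RowType, LogicalSchema and all of their methods are hypothetical names invented for this illustration. It only shows the idea that a record type always maps to a schema, while going back from a schema to a record type drops table-level metadata such as a primary key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Toy model only: these classes are hypothetical and are NOT Flink's API. */
class SchemaVsTypeSketch {

    /** Physical view: field names and types of one record as it is shipped between operators. */
    static final class RowType {
        final List<String> names;
        final List<String> types;

        RowType(List<String> names, List<String> types) {
            this.names = Collections.unmodifiableList(new ArrayList<>(names));
            this.types = Collections.unmodifiableList(new ArrayList<>(types));
        }
    }

    /** Logical view: the same columns plus table-level metadata (here, a primary key). */
    static final class LogicalSchema {
        final List<String> names;
        final List<String> types;
        final List<String> primaryKey;

        private LogicalSchema(List<String> names, List<String> types, List<String> primaryKey) {
            this.names = Collections.unmodifiableList(new ArrayList<>(names));
            this.types = Collections.unmodifiableList(new ArrayList<>(types));
            this.primaryKey = Collections.unmodifiableList(new ArrayList<>(primaryKey));
        }

        /** A record type carries no key information, so the resulting key is empty. */
        static LogicalSchema fromRowType(RowType row) {
            return new LogicalSchema(row.names, row.types, Collections.<String>emptyList());
        }

        /** Returns a copy of this schema with one more column appended. */
        LogicalSchema withColumn(String name, String type) {
            List<String> newNames = new ArrayList<>(names);
            List<String> newTypes = new ArrayList<>(types);
            newNames.add(name);
            newTypes.add(type);
            return new LogicalSchema(newNames, newTypes, primaryKey);
        }

        /** Returns a copy of this schema with the given columns declared as primary key. */
        LogicalSchema withPrimaryKey(String... columns) {
            for (String column : columns) {
                if (!names.contains(column)) {
                    throw new IllegalArgumentException("Unknown column: " + column);
                }
            }
            return new LogicalSchema(names, types, Arrays.asList(columns));
        }

        /** Lossy conversion: the record layout has no place for the primary key. */
        RowType toRowType() {
            return new RowType(names, types);
        }
    }

    public static void main(String[] args) {
        RowType row = new RowType(Arrays.asList("a", "b"), Arrays.asList("INT", "BIGINT"));
        LogicalSchema schema = LogicalSchema.fromRowType(row)
                .withColumn("c", "STRING")
                .withPrimaryKey("a");
        System.out.println(schema.names + " key=" + schema.primaryKey);
        System.out.println(schema.toRowType().names);
    }
}
```

In this model, adding a column happens on the schema object itself and the record type is derived from it afterwards, which is the direction in which nothing has to be invented. Whether the real Flink classes offer an equivalent path depends on the Flink version in use.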

Schema is the current way of specifying non-SQL concepts such as time attributes and field mappings.
