Apache Flink: how to consume dynamic type in transformation (map, reduce, join, etc) functions
I created a custom CSV reader that produces a dynamic return type, specified with `env.readCsvFile(location).pojoType(dynClass, arr);`, where `dynClass` is created with ByteBuddy and `arr` is an array of column names. I then tried to map my POJO to a tuple with the following:
```java
public class PojoToTupleRichMapFunction extends RichMapFunction<I, O> implements ResultTypeQueryable {

    Class tupleClass = null;
    Class pojoClass = null;
    Config.Schema schema = null;
    transient List<Field> fields = null;

    PojoToTupleRichMapFunction(DynDataSet dynSet) {
        this.schema = dynSet.dataDef.schema;
        // Create a map from pojo to tuple
        this.tupleClass = O.getTupleClass(schema.columns.size());
        this.pojoClass = dynSet.recType;
    }

    @Override
    public void open(Configuration parameters) {
        fields = new ArrayList<>(schema.columns.size());
        for (int i = 0; i < schema.columns.size(); i++) {
            try {
                fields.add(pojoClass.getField(schema.columns.get(i).name));
            } catch (NoSuchFieldException | SecurityException ex) {
                Logger.getLogger(PojoToTupleRichMapFunction.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    }

    @Override
    public TupleTypeInfo getProducedType() {
        // build list of types
        List<BasicTypeInfo<?>> types = new ArrayList<>(schema.columns.size());
        for (int i = 0; i < schema.columns.size(); i++) {
            BasicTypeInfo bt = null;
            String typeName = schema.columns.get(i).type.getName();
            switch (typeName) {
                case "java.lang.Integer":
                    bt = BasicTypeInfo.INT_TYPE_INFO;
                    break;
                case "java.lang.String":
                    bt = BasicTypeInfo.STRING_TYPE_INFO;
                    break;
                case "java.lang.Long":
                    bt = BasicTypeInfo.LONG_TYPE_INFO;
                    break;
                case "java.lang.Short":
                    bt = BasicTypeInfo.SHORT_TYPE_INFO;
                    break;
                default:
                    Logger.getLogger(Config.class.getName()).log(Level.SEVERE, "Unknown type: {0}", typeName);
            }
            types.add(bt);
        }
        return new TupleTypeInfo(tupleClass, types.toArray(new BasicTypeInfo[0]));
    }

    @Override
    public O map(I pojo) throws Exception {
        O ret = (O) tupleClass.newInstance();
        for (int i = 0; i < schema.columns.size(); i++) {
            ret.setField(fields.get(i).get(pojo), i);
        }
        return ret;
    }
}
```
The challenge I am running into is this runtime error snippet:

```
org.apache.flink.api.common.functions.InvalidTypesException: Input mismatch: POJO type 'com.me.dynamic.FlinkPojo$ByteBuddy$zQ9VllB1' expected but was 'com.me.dynamic.I'.
```

The function declaration specifies the base type, but the actual input type is the dynamic subclass. The output type is supplied by `getProducedType`. How can I make a `MapFunction` handle a dynamic input type?
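For context on why supplying concrete type arguments helps: Flink's TypeExtractor recovers `I` and `O` by reflecting on the generic supertype of the function class, and a raw subclass has nothing left to reflect on after erasure. A minimal, Flink-free sketch of that mechanic (the class names here are hypothetical stand-ins, not Flink APIs):

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;

public class TypeErasureDemo {
    // A generic base class standing in for RichMapFunction<I, O>.
    public static class MapBase<I, O> {}

    // Raw subclass: the type arguments are erased; reflection sees nothing.
    public static class RawMap extends MapBase {}

    // Parameterized subclass: the arguments survive in the class metadata.
    public static class StringToIntMap extends MapBase<String, Integer> {}

    // Returns the supertype's actual type arguments, or null if they were erased.
    public static Type[] typeArgs(Class<?> c) {
        Type sup = c.getGenericSuperclass();
        if (sup instanceof ParameterizedType) {
            return ((ParameterizedType) sup).getActualTypeArguments();
        }
        return null; // raw supertype: the type info is gone, as in the Flink error
    }

    public static void main(String[] args) {
        // Parameterized subclass exposes [class java.lang.String, class java.lang.Integer]
        System.out.println(Arrays.toString(typeArgs(StringToIntMap.class)));
        // Raw subclass exposes nothing
        System.out.println(typeArgs(RawMap.class));
    }
}
```

This is the same reason the ByteBuddy trick in the answer works: it manufactures a subclass whose generic supertype carries concrete arguments, so the extractor has something to read.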
To provide at least one solution (perhaps not the best one), I changed the class definition to:
```java
public class PojoToTupleRichMapFunction<I extends FlinkPojo, O extends Tuple> extends RichMapFunction<I, O> implements ResultTypeQueryable {
}
```
Then I used ByteBuddy to subclass the compiled class, supplying concrete generic parameters:
```java
static private DataSet<?> mapPojoToTuple(DataSet ds, DynDataSet dynSet) {
    Class<?> clazz = new ByteBuddy()
            .subclass(TypeDescription.Generic.Builder
                    .parameterizedType(PojoToTupleRichMapFunction.class, dynSet.recType, Tuple.class)
                    .build())
            .make()
            .load(PojoToTupleRichMapFunction.class.getClassLoader())
            .getLoaded();
    Constructor<?> ctr = clazz.getConstructors()[0];
    RichMapFunction fcn = null;
    try {
        fcn = (RichMapFunction) ctr.newInstance(dynSet);
    } catch (InstantiationException | IllegalAccessException | IllegalArgumentException | InvocationTargetException ex) {
        Logger.getLogger(dyn_demo.class.getName()).log(Level.SEVERE, null, ex);
    }
    return ds.map(fcn);
}
```
This seems to keep Flink happy.
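Aside from the type-information plumbing, the heart of the `map()` body above is an ordinary reflective field copy: resolve the public fields once by column name (as `open()` does), then read them into positional slots. A minimal stdlib-only sketch of that step (`Rec` and the column names are hypothetical stand-ins for the ByteBuddy-generated POJO and its schema):

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReflectiveCopyDemo {
    // A stand-in POJO; in the question this class is generated by ByteBuddy.
    public static class Rec {
        public String name = "alice";
        public Integer age = 30;
    }

    // Resolve fields once by column name, mirroring open() in the question.
    public static List<Field> resolveFields(Class<?> pojoClass, String... columns)
            throws NoSuchFieldException {
        List<Field> fields = new ArrayList<>(columns.length);
        for (String col : columns) {
            fields.add(pojoClass.getField(col)); // getField sees public fields only
        }
        return fields;
    }

    // Copy field values into positional slots, mirroring how map() fills the Tuple.
    public static Object[] toRow(Object pojo, List<Field> fields) throws IllegalAccessException {
        Object[] row = new Object[fields.size()];
        for (int i = 0; i < fields.size(); i++) {
            row[i] = fields.get(i).get(pojo);
        }
        return row;
    }

    public static void main(String[] args) throws Exception {
        List<Field> fields = resolveFields(Rec.class, "name", "age");
        Object[] row = toRow(new Rec(), fields);
        System.out.println(Arrays.toString(row)); // [alice, 30]
    }
}
```

Caching the `Field` objects in `open()` rather than looking them up per record is the right call, since the lookup cost would otherwise be paid once per row.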