Apache Flink integration with Elasticsearch
I am trying to integrate Flink with Elasticsearch 2.1.1, using the following Maven dependency:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch2_2.10</artifactId>
<version>1.1-SNAPSHOT</version>
</dependency>
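This is my Java code. It reads events from a Kafka queue (that part works fine), but for some reason the events are not published to Elasticsearch, and no error is reported. If I change any of the Elasticsearch-related settings in the code below (port, hostname, cluster name, or index name), I see an error immediately. As it stands, however, it shows no error and creates no new documents in Elasticsearch.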
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// parse user parameters
ParameterTool parameterTool = ParameterTool.fromArgs(args);
DataStream<String> messageStream = env.addSource(new FlinkKafkaConsumer082<>(parameterTool.getRequired("topic"), new SimpleStringSchema(), parameterTool.getProperties()));
messageStream.print();
Map<String, String> config = new HashMap<>();
config.put(ElasticsearchSink.CONFIG_KEY_BULK_FLUSH_MAX_ACTIONS, "1");
config.put(ElasticsearchSink.CONFIG_KEY_BULK_FLUSH_INTERVAL_MS, "1");
config.put("cluster.name", "FlinkDemo");
List<InetSocketAddress> transports = new ArrayList<>();
transports.add(new InetSocketAddress(InetAddress.getByName("localhost"), 9300));
messageStream.addSink(new ElasticsearchSink<String>(config, transports, new TestElasticsearchSinkFunction()));
env.execute();
}
private static class TestElasticsearchSinkFunction implements ElasticsearchSinkFunction<String> {
private static final long serialVersionUID = 1L;
public IndexRequest createIndexRequest(String element) {
Map<String, Object> json = new HashMap<>();
json.put("data", element);
return Requests.indexRequest()
.index("flink").id("hash"+element).source(json);
}
@Override
public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
indexer.add(createIndexRequest(element));
}
}
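I ran the job on my local machine and debugged it. The only thing I had been missing was a properly configured logger, because most Elasticsearch problems are reported only in "log.warn" statements. The problem was an exception in "BulkRequestHandler.java" in the elasticsearch-2.2.1 client API, which threw the error "org.elasticsearch.action.ActionRequestValidationException: Validation Failed: 1: type is missing;". I had set the index on the request but not the type. I find this odd, since I would expect the client to care mainly about the index and to create the type by default.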
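To make the failure concrete, here is a dependency-free sketch of the kind of check that rejects a request that has an index but no type. `SketchIndexRequest`, `TypeMissingSketch`, and the type name `event-type` are hypothetical names written for this illustration; they are not the Elasticsearch client classes, and the message format is simply copied from the error quoted above. In the real sink, the corresponding fix would presumably be to call `.type(...)` on the request built in `createIndexRequest`, the way the other example in this thread does with `.type("viper-log")`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index request. NOT the Elasticsearch API.
final class SketchIndexRequest {
    private String index;
    private String type;
    private Map<String, Object> source;

    SketchIndexRequest index(String index) { this.index = index; return this; }
    SketchIndexRequest type(String type) { this.type = type; return this; }
    SketchIndexRequest source(Map<String, Object> source) { this.source = source; return this; }

    // Returns null when index, type and source are all set; otherwise a
    // message shaped like the error quoted in the post.
    String validate() {
        List<String> errors = new ArrayList<>();
        if (index == null) errors.add("index is missing");
        if (type == null) errors.add("type is missing");
        if (source == null) errors.add("source is missing");
        if (errors.isEmpty()) return null;
        StringBuilder sb = new StringBuilder("Validation Failed: ");
        for (int i = 0; i < errors.size(); i++) {
            sb.append(i + 1).append(": ").append(errors.get(i)).append(";");
        }
        return sb.toString();
    }
}

final class TypeMissingSketch {
    // Builds a request the way createIndexRequest does and validates it.
    // Passing type == null reproduces the request from the question.
    static String validationMessage(String index, String type, String element) {
        Map<String, Object> json = new HashMap<>();
        json.put("data", element);
        SketchIndexRequest request = new SketchIndexRequest().index(index).source(json);
        if (type != null) {
            request.type(type);
        }
        return request.validate();
    }

    public static void main(String[] args) {
        // Index only, as in the question: rejected.
        System.out.println(validationMessage("flink", null, "event"));
        // Index plus a (hypothetical) type name: accepted, so null is printed.
        System.out.println(validationMessage("flink", "event-type", "event"));
    }
}
```

Because the rejection happens when the bulk request is assembled and is only logged as a warning, a request like the first one never reaches the cluster, which would be consistent with seeing neither an error nor new documents.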
I found a good example of the Flink & Elasticsearch connector.
First, the Maven dependency:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-elasticsearch2_2.10</artifactId>
<version>1.1-SNAPSHOT</version>
</dependency>
Second, the example Java code:
public static void writeElastic(DataStream<String> input) {
Map<String, String> config = new HashMap<>();
// This instructs the sink to emit after every element, otherwise they would be buffered
config.put("bulk.flush.max.actions", "1");
config.put("cluster.name", "es_keira");
try {
// Add elasticsearch hosts on startup
List<InetSocketAddress> transports = new ArrayList<>();
transports.add(new InetSocketAddress("127.0.0.1", 9300)); // port is 9300 not 9200 for ES TransportClient
ElasticsearchSinkFunction<String> indexLog = new ElasticsearchSinkFunction<String>() {
public IndexRequest createIndexRequest(String element) {
String[] logContent = element.trim().split("\t");
Map<String, String> esJson = new HashMap<>();
esJson.put("IP", logContent[0]);
esJson.put("info", logContent[1]);
return Requests
.indexRequest()
.index("viper-test")
.type("viper-log")
.source(esJson);
}
@Override
public void process(String element, RuntimeContext ctx, RequestIndexer indexer) {
indexer.add(createIndexRequest(element));
}
};
ElasticsearchSink<String> esSink = new ElasticsearchSink<>(config, transports, indexLog);
input.addSink(esSink);
} catch (Exception e) {
System.out.println(e);
}
}
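The parsing step inside `createIndexRequest` above can be tried on its own, without Flink or Elasticsearch. The following is a minimal sketch under the assumption that each log line has the form `IP<TAB>info`. The class `LogLineSketch`, the method `toDocument`, and the guard that returns null for malformed lines are additions made for this illustration and are not part of the original example, which would throw an `ArrayIndexOutOfBoundsException` on a line without a tab.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper that isolates the tab-splitting logic of the sink function.
final class LogLineSketch {
    // Maps "IP<TAB>info" to the two fields the example writes to Elasticsearch.
    // Returns null when the line has fewer than two tab-separated fields.
    static Map<String, String> toDocument(String element) {
        String[] logContent = element.trim().split("\t");
        if (logContent.length < 2) {
            return null; // malformed line: the caller can skip it
        }
        Map<String, String> esJson = new HashMap<>();
        esJson.put("IP", logContent[0]);
        esJson.put("info", logContent[1]);
        return esJson;
    }

    public static void main(String[] args) {
        System.out.println(toDocument("10.0.0.1\tGET /index.html"));
        System.out.println(toDocument("no-tab-here"));
    }
}
```

In a sink function, one way to use such a helper would be to call `indexer.add(...)` only when the result is non-null, so that a single malformed line does not fail the job.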