
Reactive Spring Boot API wrapping Elasticsearch's async bulk indexing

I am developing a prototype for a new project. The idea is to provide a reactive Spring Boot microservice that bulk-indexes documents in Elasticsearch. Elasticsearch provides a High Level REST Client with an async method for bulk-processing index requests. The async method delivers its results through listener callbacks, as mentioned here. The callbacks receive the index responses (one per request) in batches. I am trying to send these responses back to the client as a Flux. I have come up with the following, based on this blog post.


@RestController // assumed: the annotation is not shown in the source
public class AppController {

    @RequestMapping(value = "/test3", method = RequestMethod.GET)
    public Flux<String> index3() {
        ElasticAdapter es = new ElasticAdapter();
        JSONObject json = new JSONObject();
        json.put("TestDoc", "Stack123");
        // Stream the per-document index results back to the caller
        Flux<String> fluxResponse = es.bulkIndex(json);
        return fluxResponse;
    }
}

class ElasticAdapter {
    String indexName = "test2";
    private final RestHighLevelClient client;
    private final ObjectMapper mapper;
    private int processed = 1;

    Flux<String> bulkIndex(JSONObject doc) {
        return bulkIndexDoc(doc)
                .doOnError(e -> System.out.print("Unable to index " + doc + ": " + e));
    }

    private Flux<String> bulkIndexDoc(JSONObject doc) {
        return Flux.create(sink -> {
            try {
                doBulkIndex(doc, bulkListenerToSink(sink));
            } catch (JsonProcessingException e) {
                sink.error(e); // assumed: the catch body is missing from the source
            }
        });
    }

    private void doBulkIndex(JSONObject doc, BulkProcessor.Listener listener) throws JsonProcessingException {
        System.out.println("Going to submit index request");
        BiConsumer<BulkRequest, ActionListener<BulkResponse>> bulkConsumer =
                (request, bulkListener) ->
                        client.bulkAsync(request, RequestOptions.DEFAULT, bulkListener);
        BulkProcessor.Builder builder = BulkProcessor.builder(bulkConsumer, listener);
        builder.setBulkActions(10); // flush a bulk request every 10 actions (setting quoted in the question text)
        BulkProcessor bulkProcessor = builder.build();
        // Submitting 5,000 index requests (repeating the same JSON)
        for (int i = 0; i < 5000; i++) {
            IndexRequest indexRequest = new IndexRequest(indexName, "person", i + 1 + "");
            String json = doc.toJSONString();
            indexRequest.source(json, XContentType.JSON);
            bulkProcessor.add(indexRequest); // assumed: this line is missing from the source
        }
        System.out.println("Submitted all docs");
    }

    private BulkProcessor.Listener bulkListenerToSink(FluxSink<String> sink) {
        return new BulkProcessor.Listener() {

            public void beforeBulk(long executionId, BulkRequest request) {
            }

            public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
                for (BulkItemResponse bulkItemResponse : response) {
                    JSONObject json = new JSONObject();
                    json.put("id", bulkItemResponse.getResponse().getId());
                    // The call is truncated in the source after "getResult"; toString() is assumed
                    json.put("status", bulkItemResponse.getResponse().getResult().toString());
                    // assumed: emitting and counting each item is missing from the source
                    sink.next(json.toJSONString());
                    processed++;
                }
                if (processed >= 5000) {
                    sink.complete(); // assumed: missing from the source
                }
            }

            public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
                sink.error(failure); // assumed: missing from the source
            }
        };
    }

    public ElasticAdapter() {
        // Logic to initialize Elasticsearch Rest Client
    }
}

I used FluxSink to create the Flux of responses to send back to the client. At this point, I have no idea whether this is correct.
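To make the bridging idea concrete without depending on Reactor or Elasticsearch, here is a minimal sketch that uses only the JDK's java.util.concurrent.Flow API. It is an analogue of the FluxSink approach, not the Reactor code itself: BatchListener and runBatches are hypothetical stand-ins for BulkProcessor.Listener and the bulk processor, and SubmissionPublisher plays the role of the sink. Each batch callback pushes its per-item results into the publisher, and closing the publisher signals completion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

final class CallbackBridgeSketch {

    /** Hypothetical stand-in for BulkProcessor.Listener: called once per finished batch. */
    interface BatchListener {
        void afterBatch(List<String> itemResults);
    }

    /** Hypothetical stand-in for the bulk processor: reports ids 1..total in batches of batchSize. */
    static void runBatches(int total, int batchSize, BatchListener listener) {
        List<String> batch = new ArrayList<>();
        for (int id = 1; id <= total; id++) {
            batch.add("{\"id\":\"" + id + "\",\"status\":\"CREATED\"}");
            if (batch.size() == batchSize || id == total) {
                listener.afterBatch(new ArrayList<>(batch));
                batch.clear();
            }
        }
    }

    /** Bridges the batch callbacks into a publisher and returns everything one subscriber received. */
    static List<String> collect(int total, int batchSize) {
        List<String> received = new ArrayList<>();
        CompletableFuture<Void> done = new CompletableFuture<>();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<String>() {
                @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
                @Override public void onNext(String item) { received.add(item); }
                @Override public void onError(Throwable t) { done.completeExceptionally(t); }
                @Override public void onComplete() { done.complete(null); }
            });
            // Each callback forwards its per-item results, one element at a time
            runBatches(total, batchSize, results -> results.forEach(publisher::submit));
        } // closing the publisher signals onComplete after the submitted items
        try {
            done.get(10, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("stream did not complete", e);
        }
        return received;
    }

    public static void main(String[] args) {
        List<String> results = collect(50, 10);
        System.out.println(results.size() + " results, first: " + results.get(0));
    }
}
```

Here completion comes from closing the publisher once every batch has been forwarded, rather than from counting processed items as the listener in the question does.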

My expectation is that the calling client should receive the responses in batches of 10 (because the bulk processor processes them in batches of 10: builder.setBulkActions(10);). I tried to consume the endpoint using the Spring WebFlux WebClient, but was unable to work it out. This is what I tried:


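public class FluxClient {
    public static void main(String[] args) {
        WebClient client = WebClient.create("http://localhost:8080");
        Flux<String> responseFlux = client.get()
                .uri("/test3") // the chain is cut off in the source after get(); the rest is an assumed completion
                .retrieve()
                .bodyToFlux(String.class);
        responseFlux.subscribe(System.out::println); // assumed
    }
}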

Nothing is printed on the console, contrary to what I expected. I also tried System.out.println(responseFlux.blockFirst());. It prints all the responses as a single batch at the end, not in batches.

If my approach is correct, what is the correct way to consume it? In the solution I have in mind, this client will reside in another web app.

Notes: My understanding of the Reactor API is limited. The Elasticsearch version used is 6.8.

I made the following changes to your code.

In ElasticAdapter:

public Flux<String> bulkIndex(JSONObject doc) {
    return bulkIndexDoc(doc)
            // Run the subscription (and the submitting loop inside Flux.create) on a separate thread
            .subscribeOn(Schedulers.elastic(), true)
            .doOnError(e -> System.out.print("Unable to index " + doc + ": " + e));
}

I invoked subscribeOn(Scheduler, requestOnSeparateThread) on the Flux. I learned about it from https://github.com/spring-projects/spring-framework/issues/21507.
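The effect of moving the emitting loop off the calling thread can be illustrated with a JDK-only analogue. This is not Reactor's subscribeOn; it is a sketch with hypothetical names (SeparateThreadProducerSketch, run, DONE) in which the submission loop runs on its own executor thread and hands items over through a queue, so the consumer can see the first item while the loop is still running. The latch exists only to make that ordering deterministic for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class SeparateThreadProducerSketch {

    static final String DONE = "__done__"; // hypothetical end-of-stream marker

    static final class Result {
        final List<String> items = new ArrayList<>();
        boolean firstSeenWhileProducing;
    }

    /** Produces item-1..item-total on a worker thread and consumes them on the calling thread. */
    static Result run(int total) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        CountDownLatch firstSeen = new CountDownLatch(1);
        AtomicBoolean producerFinished = new AtomicBoolean(false);
        ExecutorService worker = Executors.newSingleThreadExecutor();

        worker.submit(() -> {
            for (int i = 1; i <= total; i++) {
                queue.add("item-" + i);
                if (i == 1) {
                    // Pause until the consumer confirms it has received the first item
                    firstSeen.await(5, TimeUnit.SECONDS);
                }
            }
            producerFinished.set(true);
            queue.add(DONE);
            return null; // makes this a Callable, so await() may throw
        });

        Result result = new Result();
        try {
            while (true) {
                String item = queue.poll(5, TimeUnit.SECONDS);
                if (item == null || item.equals(DONE)) {
                    break;
                }
                if (result.items.isEmpty()) {
                    // Record whether the loop was still running when the first item arrived
                    result.firstSeenWhileProducing = !producerFinished.get();
                    firstSeen.countDown();
                }
                result.items.add(item);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            worker.shutdown();
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = run(5);
        System.out.println(r.items + " firstSeenWhileProducing=" + r.firstSeenWhileProducing);
    }
}
```

If the loop ran on the consuming thread instead, nothing could be consumed until the loop returned, which is one plausible reading of why the responses arrived as a single batch before the change.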

In FluxClient:

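Flux<String> responseFlux = client.get()
              .uri("/test3") // assumed: the source shows only the headers call and omits the rest of the chain
              .headers(httpHeaders -> httpHeaders.set("Accept", "text/event-stream"))
              .retrieve()
              .bodyToFlux(String.class);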

I added an "Accept" header with the value "text/event-stream" and delayed the Flux elements.

With the above changes, I was able to get the responses from the server in real time.
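As a rough illustration of why the "text/event-stream" media type helps here: with server-sent events, each element is written as its own "data:" frame terminated by a blank line, so a client can hand over every element as soon as its frame arrives. The sketch below, with hypothetical names (SseFramingSketch, encode, decode), shows only that framing. It is a simplified reading of the event-stream format that handles "data" fields and ignores the others, not Spring's actual encoder.

```java
import java.util.ArrayList;
import java.util.List;

final class SseFramingSketch {

    /** Encodes one payload as an event: one "data:" line per payload line, then a blank line. */
    static String encode(String payload) {
        StringBuilder sb = new StringBuilder();
        for (String line : payload.split("\n", -1)) {
            sb.append("data:").append(line).append('\n');
        }
        return sb.append('\n').toString();
    }

    /** Decodes a complete event-stream body back into payloads (data fields only). */
    static List<String> decode(String body) {
        List<String> payloads = new ArrayList<>();
        List<String> dataLines = new ArrayList<>();
        for (String line : body.split("\n", -1)) {
            if (line.isEmpty()) {
                // A blank line ends the current event
                if (!dataLines.isEmpty()) {
                    payloads.add(String.join("\n", dataLines));
                    dataLines.clear();
                }
            } else if (line.startsWith("data:")) {
                String value = line.substring(5);
                // A single leading space after the colon is not part of the value
                dataLines.add(value.startsWith(" ") ? value.substring(1) : value);
            }
        }
        return payloads;
    }

    public static void main(String[] args) {
        String body = encode("{\"id\":\"1\"}") + encode("{\"id\":\"2\"}");
        System.out.print(body);
        System.out.println(decode(body));
    }
}
```

Because every frame is self-delimiting, a reader does not need the whole body before it can emit the first element, which matches the real-time behaviour described in the answer.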

