
rxjs stream of elasticsearch scroll API yields empty result set

My goal is to transform the elasticsearch result into an rxjs stream, and I thought of doing so with the scroll API, fetching 1 data point on every call. However, it seems that my rxjs stream returns no results for the second elastic query (the nested searchElastic call).

Below is a sample of my code:

import * as Rx from 'rxjs';
import {elasticClient} from '../Helpers.ts';

function searchElastic({query, sort}) {
    const body: any = {
        size: 1,
        query,
        _source: { excludes: ['logbookType', 'editable', 'availabilityTag'] },
        sort
    };
    // keep the search results "scrollable" for 30 secs
    const scroll = '30s';

    return Rx.Observable
        .fromPromise(elasticClient.search({ index: 'data', body, scroll }))
        .concatMap(({_scroll_id, hits: {hits}}) => {
            const subject = new Rx.Subject();

            if(hits.length) {
                // initial data
                subject.onNext(hits[0]._source as ElasticDoc);
                console.log(hits[0]._id);

                const handleDoc = (err, res) => {
                    if(err) {
                        subject.onError(err);
                        return;
                    }

                    const {_scroll_id, hits: {hits}} = res;

                    if(!hits.length) {
                        subject.onCompleted();
                    } else {
                        subject.onNext(hits[0]._source as ElasticDoc);
                        console.log(hits[0]._id);
                        setImmediate(() =>
                            elasticClient.scroll({scroll, scrollId: _scroll_id},
                                handleDoc));
                    }
                };

                setImmediate(() =>
                    elasticClient.scroll({scroll, scrollId: _scroll_id},
                        handleDoc));
            } else {
                subject.onCompleted();
            }

            return subject.asObservable();
        });
}

function getEntries() {
    const entriesQuery = {
        query: {
            filtered: {
                filter: {
                    bool: {
                        must: [{
                            range: {
                                creationTimestamp: {
                                    gte: "2018-04-01T07:55:59.915Z",
                                    lte: "2018-04-01T07:57:08.915Z"
                                }
                            }
                        }, {
                            query: {
                                query_string: {
                                    query: "+type:*scan*"
                                }
                            }
                        }]
                    }
                }
            }
        },
        sort: [{
            creationTimestamp: {
                order: "asc"
            },
            id: {
                order: "asc"
            }
        }]
    };
    return searchElastic(entriesQuery)
        .concatMap(entry => {
            // all entries are logged correctly
            console.log(entry.id);
            // array that contains MongoDB _ids as strings
            const ancestors = entry.ancestors || [];

            // if no parents => doesn't match
            if(!ancestors.length) {
                return Rx.Observable.empty();
            }

            const parentsQuery = {
                query: {
                    filtered: {
                        filter: {
                            bool: {
                                must: [{
                                    range: {
                                        creationTimestamp: {
                                            gte: "2018-04-01T07:55:59.915Z",
                                            lte: "2018-04-01T07:57:08.915Z"
                                        }
                                    }
                                }, {
                                    query: {
                                        query_string: {
                                            query: "+type:*block* +name:*Report*"
                                        }
                                    }
                                }]
                            }
                        }
                    }
                },
                sort: [{
                    creationTimestamp: {
                        order: "asc"
                    },
                    id: {
                        order: "asc"
                    }
                }]
            };
            parentsQuery.query.filtered.filter.bool.must.push({
                terms: {
                    id: ancestors
                }
            });

            // fetch parent entries
            return searchElastic(parentsQuery)
                .count()
                .concatMap(count => {
                    // count is always 0 even though entries are logged
                    // in the searchElastic console.logs
                    console.log(entry.id, count);
                    return Rx.Observable.just(entry);
                });
        });
}

function executeQuery() {
    try {
        getEntries()
            .subscribe(
                (x) => console.log(x.id),
                err => console.error(err),
                () => {}
            )
    } catch(e) {
        console.error(e);
    }
}

Looks like it's an rxjs problem, since all ancestor entries get logged but count always returns 0.

PS: I'm using elasticsearch v1.7.

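After playing with a couple of rxjs examples with subjects, it seems like the subject is being fed (onNext, and for an empty result onCompleted) before an observer subscribes to it. A plain Subject does not replay earlier notifications, so the first hit, which is emitted synchronously inside concatMap before the subject is returned, never reaches the subscriber.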

Working example

// snippet was run against rx.all.min.js 2.2.28
var subject = new Rx.Subject();

var subscription = subject.subscribe(
    function(x) { console.log('onNext: ' + x); },
    function(e) { console.log('onError: ' + e.message); },
    function() { console.log('onCompleted'); });

subject.onNext(1);
// => onNext: 1

subject.onNext(2);
// => onNext: 2

subject.onCompleted();
// => onCompleted

Broken example

var subject = new Rx.Subject();

subject.onNext(1);      // no subscriber yet: the value is dropped
subject.onNext(2);      // dropped as well
subject.onCompleted();  // the subject is now stopped

var subscription = subject.subscribe(
    function(x) { console.log('onNext: ' + x); },
    function(e) { console.log('onError: ' + e.message); },
    function() { console.log('onCompleted'); });
// => onCompleted (only the completion reaches the late subscriber)
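The difference between the two examples can also be reproduced without RxJS. The following is a minimal sketch, assuming hand-rolled stand-ins named `TinySubject` and `TinyReplaySubject` (both hypothetical and not RxJS's implementation): the first drops anything sent before `subscribe`, the second buffers values so that a late subscriber still sees them.

```typescript
// Hypothetical stand-ins for illustration only; not RxJS's implementation.
type Listener<T> = {
    next: (value: T) => void;
    complete: () => void;
};

// Behaves like a plain subject: values sent before subscribe() are lost.
class TinySubject<T> {
    protected listeners: Listener<T>[] = [];
    protected done = false;

    next(value: T): void {
        if (this.done) { return; }
        this.listeners.forEach(l => l.next(value));
    }

    complete(): void {
        if (this.done) { return; }
        this.done = true;
        this.listeners.forEach(l => l.complete());
    }

    subscribe(listener: Listener<T>): void {
        if (this.done) {
            // a late subscriber only learns that the stream has ended
            listener.complete();
            return;
        }
        this.listeners.push(listener);
    }
}

// Buffers every value so that late subscribers still receive them.
class TinyReplaySubject<T> extends TinySubject<T> {
    private buffer: T[] = [];

    next(value: T): void {
        if (this.done) { return; }
        this.buffer.push(value);
        super.next(value);
    }

    subscribe(listener: Listener<T>): void {
        this.buffer.forEach(v => listener.next(v));
        super.subscribe(listener);
    }
}

// Emits two values and completes BEFORE anyone subscribes, then counts
// how many values a late subscriber receives.
function countLate(subject: TinySubject<number>): number {
    subject.next(1);
    subject.next(2);
    subject.complete();

    let received = 0;
    subject.subscribe({
        next: () => { received += 1; },
        complete: () => {}
    });
    return received;
}

const plainCount = countLate(new TinySubject<number>());
const replayCount = countLate(new TinyReplaySubject<number>());
console.log(plainCount, replayCount);
```

With the plain variant, a downstream count would see nothing, which matches the symptom in the question. Buffering is one way around that; delaying the emissions until after subscription is another.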

So I fixed it by changing searchElastic to the following (named searchElasticStream here):

function searchElasticStream({query, sort}) {
    const body: any = {
        size: 1,
        query,
        _source: { excludes: ['logbookType', 'editable', 'availabilityTag'] },
        sort
    };
    // keep the search results "scrollable" for 30 secs
    const scroll = '30s';

    return Rx.Observable
        .fromPromise(elasticClient.search({ index: 'data', body, scroll }))
        .flatMap(({_scroll_id, hits: {hits}}) => {
            const subject = new Rx.Subject();

            // this made the difference: defer every emission to a later tick,
            // so the subscriber is attached before the first onNext
            setImmediate(() => {
                if(hits.length) {
                    // initial data
                    subject.onNext(hits[0]._source as ElasticDoc);

                    const handleDoc = (err, res) => {
                        if(err) {
                            subject.onError(err);
                            return;
                        }

                        const {_scroll_id, hits: {hits}} = res;

                        if(!hits.length) {
                            subject.onCompleted();
                        } else {
                            subject.onNext(hits[0]._source as ElasticDoc);
                            setImmediate(() =>
                                elasticClient.scroll({scroll, scrollId: _scroll_id},
                                    handleDoc));
                        }
                    };

                    setImmediate(() =>
                        elasticClient.scroll({scroll, scrollId: _scroll_id},
                            handleDoc));
                } else {
                    subject.onCompleted();
                }
            });

            return subject.asObservable();
        });
}
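Another way to avoid the race, instead of delaying emissions with setImmediate, is to not start fetching until a subscriber exists. RxJS offers `Rx.Observable.create` for that purpose; the sketch below illustrates the idea with hand-rolled stand-ins so that it runs without dependencies. The names `ColdStream`, `makeFakeScroll`, and `scrollStream` are hypothetical, the fake source calls back synchronously (a real client would call back asynchronously), and none of this is the elasticsearch client's API.

```typescript
// Hypothetical stand-ins for illustration; not RxJS or the elasticsearch client.
interface Observer<T> {
    onNext: (value: T) => void;
    onError: (err: Error) => void;
    onCompleted: () => void;
}

// A "cold" stream is just a function that starts producing on subscribe.
type ColdStream<T> = (observer: Observer<T>) => void;

type PageCallback = (err: Error | null, hits: string[]) => void;

// Counts how many pages were requested, to show that fetching is lazy.
let fetchCalls = 0;

// Fake paged source: one hit per call, then an empty page.
// It calls back synchronously; a real scroll call would be asynchronous.
function makeFakeScroll(docs: string[]): (callback: PageCallback) => void {
    let position = 0;
    return callback => {
        fetchCalls += 1;
        const page = position < docs.length ? [docs[position]] : [];
        position += 1;
        callback(null, page);
    };
}

// Nothing is fetched until the returned function is called with an observer,
// so no notification can be emitted before a subscriber exists.
function scrollStream(docs: string[]): ColdStream<string> {
    return observer => {
        const scroll = makeFakeScroll(docs);
        const handlePage: PageCallback = (err, hits) => {
            if (err) { observer.onError(err); return; }
            if (!hits.length) { observer.onCompleted(); return; }
            observer.onNext(hits[0]);
            scroll(handlePage);
        };
        scroll(handlePage);
    };
}

// Usage: build the stream first, subscribe later.
const stream = scrollStream(['a', 'b', 'c']);
const callsBeforeSubscribe = fetchCalls;

const seen: string[] = [];
let completed = false;
stream({
    onNext: doc => { seen.push(doc); },
    onError: err => { throw err; },
    onCompleted: () => { completed = true; }
});

console.log(callsBeforeSubscribe, seen.join(','), completed);
```

Because the producer runs inside the subscribe call, the first hit cannot be emitted before the observer is attached, so a downstream count sees every document.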

