
Elasticsearch Scan&scroll with JEST API

I am currently working with JEST: https://github.com/searchbox-io/Jest

Is it possible to do scan&scroll with this API?

http://www.elasticsearch.org/guide/reference/api/search/search-type/

I am currently using the Search command:

Search search = new Search("{\"size\" : "+RESULT_SIZE+", \"query\":{\"match_all\":{}}}");

but I am worried about large result sets. If you use the Search command for this, how do you set the "search_type=scan&scroll=10m&size=50" arguments?

Is it possible to do scan&scroll with this API?

Yes, it is. My implementation works like this.

Start the scroll search on Elasticsearch:

    public SearchResult startScrollSearch(String type, Long size) throws IOException {

        String query = ConfigurationFactory.loadElasticScript("my_es_search_script.json");

        Search search = new Search.Builder(query)
                // Multiple indices or types can be added.
                .addIndex("myIndex")
                .addType(type)
                .setParameter(Parameters.SIZE, size)
                // Keep the scroll context alive for one minute.
                .setParameter(Parameters.SCROLL, "1m")
                .build();

        SearchResult searchResult = EsClientConn.getJestClient().execute(search);
        return searchResult;
    }

The SearchResult object returns the first (size) items of the search as usual, but it also carries a scrollId parameter, which is a reference to the remaining result set that Elasticsearch keeps in memory for you. Parameters.SCROLL defines how long this search is kept in memory.

To read the scrollId:

scrollId = searchResult.getJsonObject().get("_scroll_id").getAsString();

To read more items from the result set, use something like the following:

public JestResult readMoreFromSearch(String scrollId, Long size) throws IOException {

    SearchScroll scroll = new SearchScroll.Builder(scrollId, "1m")
            .setParameter(Parameters.SIZE, size)
            .build();

    JestResult searchResult = EsClientConn.getJestClient().execute(scroll);
    return searchResult;
}

Don't forget that each time you read from the result set, Elasticsearch returns a new scrollId.
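
To make the point about the changing scrollId concrete, here is a minimal sketch of the read loop. It does not call Jest directly: ScrollPage, ScrollSource and ScrollLoopSketch are hypothetical names introduced only for illustration, where start() stands in for a call like startScrollSearch(...) and next(scrollId) for a call like readMoreFromSearch(...). It also assumes a plain scroll search, where the first response already carries hits. With the scan search type the first response carries no hits, so the loop would have to begin with next(...) instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for one scroll response: the hits on this page
// plus the scroll id that came back with them.
final class ScrollPage {
    final String scrollId;
    final List<String> hits;

    ScrollPage(String scrollId, List<String> hits) {
        this.scrollId = scrollId;
        this.hits = hits;
    }
}

// Hypothetical abstraction over the two calls shown earlier:
// start() would wrap the initial search, next(scrollId) the follow-up read.
interface ScrollSource {
    ScrollPage start();
    ScrollPage next(String scrollId);
}

final class ScrollLoopSketch {
    // Reads pages until one comes back empty, always passing the scroll id
    // from the most recent response into the next request.
    static List<String> drain(ScrollSource source) {
        List<String> all = new ArrayList<>();
        ScrollPage page = source.start();
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            page = source.next(page.scrollId);
        }
        return all;
    }
}
```

In a real implementation, ScrollPage would be built from the JSON of each response (the "_scroll_id" field and the hits), and the scroll timeout ("1m" above) only needs to be long enough to cover the processing of a single page, not the whole result set.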

Please tell me if you have any questions.

Agreed, we need to catch up; however, please open an issue if you need a feature.

Please check https://github.com/searchbox-io/Jest/blob/master/jest/src/test/java/io/searchbox/core/SearchScrollIntegrationTest.java on master.

EDIT:

Originally, it didn't appear that JEST supported the "Scan" search type. In a wicked fast turnaround, it appears that JEST now supports Scan type searches! Props to @Ferhat for the quick turnaround! See JEST - SearchType.java.

Have you considered just using the ElasticSearch Transport client? I could understand if you like the JEST API a little better, but as new features roll out for ElasticSearch (Exhibit A: ElasticSearch 0.90 is fantastic!), you'll get to have them as soon as they pop out instead of waiting for JEST to catch up.

My $0.02.
