How to query Elasticsearch using a Java client without worrying about the client's Java heap memory
I am new to Elasticsearch. I read that we can query Elasticsearch through its REST API.
I was reading the following link: http://blogs.justenougharchitecture.com/using-jest-as-a-rest-based-java-client-with-elasticsearch/
Is this the right way to do it?
Also, I do not want to put a limit on the number of results that my search returns (it can return millions of records).
A ResultSet in Java works this way: the table might have millions of rows, but we can iterate one row at a time and process it without storing everything in the Java heap, so heap space is not a concern. I want to do something similar with Elasticsearch queries if possible: I want all the records matching the query, but without holding them all in memory while iterating over them.
Is it possible to do this with any Java client (via the REST API)? If not via the REST API, is there another way of solving this problem?
Thanks
First, if you use Java or another JVM language, you could also use the native client. Jest is a good option if you want to keep your dependencies small (the Java client is essentially the same as the complete server) or if you want to, or can only, access Elasticsearch via the HTTP interface and not via its binary interface.
Second, what you want to use is the scroll API: https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-search-scrolling.html (I didn't find a quick reference in the Jest documentation, though). It doesn't work exactly like ResultSet, but it allows you to iterate over all your results in chunks, so only one chunk has to sit in memory at a time.
An example, copied from the documentation:
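    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.search.SearchType;
    import org.elasticsearch.common.unit.TimeValue;
    import org.elasticsearch.index.query.QueryBuilder;
    import org.elasticsearch.search.SearchHit;

    QueryBuilder query = ...;
    // With SearchType.SCAN the initial response carries only the scroll id, no hits.
    SearchResponse scrollResp = client.prepareSearch(index)
            .setSearchType(SearchType.SCAN)
            .setScroll(new TimeValue(60000)) // keep the scroll context alive for 60 s
            .setQuery(query)
            .setSize(100) // hits per shard returned for each scroll
            .execute().actionGet();
    // Scroll until no hits are returned
    while (true) {
        scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
                .setScroll(new TimeValue(60000))
                .execute().actionGet();
        // getHits().getHits() is a SearchHit[]; an empty array means we are done
        if (scrollResp.getHits().getHits().length == 0) {
            break;
        }
        for (SearchHit hit : scrollResp.getHits().getHits()) {
            // Handle the hit...
        }
    }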
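
If you would like something closer to the row-at-a-time feel of ResultSet, the chunked loop above can be hidden behind a plain java.util.Iterator. What follows is only a minimal sketch using nothing but the JDK: Page, ChunkedIterator, fetchPage and fakeSource are hypothetical names made up for illustration, not part of Elasticsearch or Jest. In real use, fetchPage would presumably wrap the scroll call (native client or REST) and pass the scroll id around as the cursor; here an in-memory fake stands in for the cluster so the idea can be run on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

/** One chunk of results plus the cursor needed to request the next chunk. */
final class Page<T> {
    final List<T> hits;
    final String nextCursor;

    Page(List<T> hits, String nextCursor) {
        this.hits = hits;
        this.nextCursor = nextCursor;
    }
}

/** Hands out hits one by one while holding at most one page in memory. */
final class ChunkedIterator<T> implements Iterator<T> {
    // Hypothetical hook: given the previous cursor (null on the first call),
    // return the next page. An empty page means "no more results".
    private final Function<String, Page<T>> fetchPage;
    private Iterator<T> current = Collections.emptyIterator();
    private String cursor;
    private boolean exhausted;

    ChunkedIterator(Function<String, Page<T>> fetchPage) {
        this.fetchPage = fetchPage;
    }

    @Override
    public boolean hasNext() {
        // Fetch a new page only when the current one has been fully consumed.
        while (!current.hasNext() && !exhausted) {
            Page<T> page = fetchPage.apply(cursor);
            if (page.hits.isEmpty()) {
                exhausted = true;
            } else {
                current = page.hits.iterator();
                cursor = page.nextCursor;
            }
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}

final class ChunkedIterationDemo {
    /** Stand-in for a remote index: serves 0..total-1 in pages of pageSize. */
    static Function<String, Page<Integer>> fakeSource(int total, int pageSize) {
        return cursor -> {
            int from = (cursor == null) ? 0 : Integer.parseInt(cursor);
            int to = Math.min(from + pageSize, total);
            List<Integer> hits = new ArrayList<>();
            for (int i = from; i < to; i++) {
                hits.add(i);
            }
            return new Page<>(hits, String.valueOf(to));
        };
    }

    public static void main(String[] args) {
        Iterator<Integer> it = new ChunkedIterator<>(fakeSource(10, 4));
        long sum = 0;
        while (it.hasNext()) {
            sum += it.next();
        }
        System.out.println(sum); // 0 + 1 + ... + 9 = 45
    }
}
```

The calling code then looks like a normal while (it.hasNext()) loop, and memory use is bounded by the page size rather than by the total number of hits. Note that the scroll context on the server still has a timeout, so slow per-hit processing may need a longer keep-alive.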