
How to read from multiple Elasticsearch indices in Spark?

I need to read data from multiple Elasticsearch indices, all of which share the same data structure.

For example:

val df1 = spark.read.format("org.elasticsearch.spark.sql")
              .option("query", myquery)
              .option("pushdown", "true")
              .load("news_01/myitem")

val df2 = spark.read.format("org.elasticsearch.spark.sql")
              .option("query", myquery)
              .option("pushdown", "true")
              .load("news_02/myitem")

What if I am given an array of index names, such as ["news_01", "news_02"]?

How can I avoid creating df1 and df2 separately, as I do now?

Given that Elasticsearch lets you target multiple indices in a single search request, you could do something like:

val df = spark.read.format("org.elasticsearch.spark.sql")
              .option("query", myquery)
              .option("pushdown", "true")
              .load("news_01,news_02")
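Since the question starts from an array of index names, one option is to join that array into the comma-separated resource string the connector expects. This is a minimal sketch; the `spark` session and `myquery` are assumed to exist as in the snippets above:

```scala
// Array of index names, as described in the question.
val indices = Array("news_01", "news_02")

// Join them into a single comma-separated resource string.
val resource = indices.mkString(",")
println(resource) // news_01,news_02

// Then load them all with a single reader (assumes spark and myquery
// are in scope, and that the elasticsearch-spark connector is on the classpath):
// val df = spark.read.format("org.elasticsearch.spark.sql")
//               .option("query", myquery)
//               .option("pushdown", "true")
//               .load(resource)
```

This scales to any number of indices without declaring a DataFrame per index.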
