
EsHadoopIllegalArgumentException: Cannot detect ES version - Spark-ElasticSearch example

I am trying to run a simple example that writes data to ElasticSearch. However, I keep getting this error:

EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only
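
For reference, es.nodes.wan.only and the related connection settings named in the error are ordinary elasticsearch-hadoop configuration keys set on the SparkConf; a minimal sketch of where they go (the host and port below are placeholders, not values taken from this setup):

import org.apache.spark.SparkConf

object EsConfSketch {
  // Placeholder values: use the address at which Elasticsearch is actually
  // reachable from the Spark driver and executors.
  val sparkConf: SparkConf = new SparkConf()
    .setAppName("EsConfSketch")
    .set("es.nodes", "localhost")     // ES host(s), comma-separated
    .set("es.port", "9200")           // ES HTTP port
    .set("es.nodes.wan.only", "true") // talk only to the declared nodes (WAN/Docker/Cloud setups)
}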

My Spark and ElasticSearch dependencies:

scalaVersion := "2.11.5"

val sparkVersion = "2.3.0"

resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-sql" % sparkVersion,
  "org.apache.spark" %% "spark-streaming" % sparkVersion,
  "com.typesafe" % "config" % "1.3.0",
  "org.elasticsearch" %% "elasticsearch-spark-20" % "6.2.4"
)

Here is my sample code:

import org.apache.spark.SparkConf
import org.apache.spark.sql.{SQLContext, SparkSession}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.elasticsearch.spark._ // provides saveToEs on RDDs

object App {

  def main(args: Array[String]) {
    val sparkConf = new SparkConf()
      .setMaster(args(0))
      .setAppName("KafkaSparkStreaming")
    sparkConf.set("es.index.auto.create", "true")

    val sparkSession = SparkSession
      .builder()
      .config(sparkConf)
      .getOrCreate()

    val streamingContext = new StreamingContext(sparkSession.sparkContext, Seconds(3))
    val sparkContext = streamingContext.sparkContext
    sparkContext.setLogLevel("ERROR")

    val sqlContext = new SQLContext(sparkContext)

    val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
    val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")

    sparkContext.makeRDD(Seq(numbers, airports)).saveToEs("spark/docs")

    streamingContext.start()
    streamingContext.awaitTermination()
  }
}

I run ElasticSearch from a Docker image. Here is my docker-compose.yml file:

version: '3.3'
services:
  kafka:
      image: spotify/kafka
      ports:
        - "9092:9092"
      environment:
      - ADVERTISED_HOST=localhost
  elasticsearch:
      image: elasticsearch
  kibana:
      image: kibana
      ports:
        - "5601:5601"

What could be causing this exception? I would really appreciate some help.
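
Before looking at the connector itself, it can help to confirm that Elasticsearch answers over HTTP from the machine running the driver; a minimal sketch, assuming the cluster should be reachable at localhost:9200 (which requires the container to publish that port):

import scala.io.Source
import scala.util.{Failure, Success, Try}

object EsReachabilityCheck {
  def main(args: Array[String]): Unit = {
    // Placeholder URL: the container must publish port 9200 (e.g. "9200:9200"
    // in docker-compose) for this request to succeed from the host.
    val url = "http://localhost:9200"
    Try(Source.fromURL(url).mkString) match {
      case Success(body) => println(s"Elasticsearch answered:\n$body") // cluster info JSON
      case Failure(e)    => println(s"Could not reach $url: ${e.getMessage}")
    }
  }
}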

I ran into something similar when experimenting with Spark and Elasticsearch. Replacing the "elasticsearch-spark" dependency with "elasticsearch-hadoop", matching my Elasticsearch version, solved the problem.

import scala.collection.mutable

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.elasticsearch.spark.streaming._ // provides saveToEs on DStreams

val conf = new SparkConf().setAppName("Sample").setMaster("local[*]")
conf.set("es.index.auto.create", "true")

val sc = new SparkContext(conf)
val ssc = new StreamingContext(sc, Seconds(10))

val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")

val rdd = sc.makeRDD(Seq(numbers, airports))
val microbatches = mutable.Queue(rdd)

ssc.queueStream(microbatches).saveToEs("spark/docs")

ssc.start()
ssc.awaitTermination()

Dependency list:

"org.apache.spark" %% "spark-core" % "2.2.0",
"org.apache.spark" %% "spark-sql" % "2.2.0",
"org.apache.spark" %% "spark-streaming" % "2.2.0",
"org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.3.1",
"org.elasticsearch" %% "elasticsearch-hadoop" % "6.3.0",
  1. You can edit the Spark configuration by adding the ES host name (a combined sketch follows this list):

     sparkConf.set("es.index.auto.create", "true")
     sparkConf.set("es.nodes", "your_elasticsearch_ip")
     sparkConf.set("es.port", "9200")
     sparkConf.set("es.nodes.wan.only", "true")
  2. You can also try publishing the ES port in the docker-compose file:

     elasticsearch:
         image: elasticsearch
         ports:
           - "9200:9200"
  3. If that does not work, the problem may lie with the Spark connector itself, so you can redirect your calls to ES to your local machine.

     Add the following command in your docker-compose:

     elasticsearch:
         image: elasticsearch
         command: "apt install -y socat && socat tcp-listen:9200,fork tcp:your_elasticsearch_ip:9200 &"

     or

     command: "apt install -y socat && socat tcp-listen:9200,fork tcp:localhost:9200 &"

socat will forward your local port 9200 to port 9200 on your remote Elasticsearch.
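
Putting point 1 together with the original snippet, here is a minimal end-to-end sketch; the host value is a placeholder, and it assumes the elasticsearch-spark-20 (or elasticsearch-hadoop) connector is on the classpath:

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // brings saveToEs into scope for RDDs

object EsWriteSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[*]")
      .setAppName("EsWriteSketch")
      .set("es.index.auto.create", "true")
      .set("es.nodes", "your_elasticsearch_ip") // placeholder: the reachable ES host
      .set("es.port", "9200")
      .set("es.nodes.wan.only", "true")

    val sc = new SparkContext(conf)

    val numbers  = Map("one" -> 1, "two" -> 2, "three" -> 3)
    val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")

    // Writes the two documents to the "spark/docs" index, as in the question.
    sc.makeRDD(Seq(numbers, airports)).saveToEs("spark/docs")

    sc.stop()
  }
}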

