
Conflict between httpclient version and Apache Spark

I'm developing a Java application using Apache Spark. I use this version:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.2.2</version>
</dependency>

In my code, there is a transitive dependency:

<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
</dependency>

I package my application into a single JAR file. When deploying it on an EC2 instance using spark-submit, I get this error:

Caused by: java.lang.NoSuchFieldError: INSTANCE
    at org.apache.http.conn.ssl.SSLConnectionSocketFactory.<clinit>(SSLConnectionSocketFactory.java:144)
    at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.getPreferredSocketFactory(ApacheConnectionManagerFactory.java:87)
    at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.create(ApacheConnectionManagerFactory.java:65)
    at com.amazonaws.http.apache.client.impl.ApacheConnectionManagerFactory.create(ApacheConnectionManagerFactory.java:58)
    at com.amazonaws.http.apache.client.impl.ApacheHttpClientFactory.create(ApacheHttpClientFactory.java:50)
    at com.amazonaws.http.apache.client.impl.ApacheHttpClientFactory.create(ApacheHttpClientFactory.java:38)

This error clearly shows that SparkSubmit has loaded an older version of the same Apache httpclient library, and that is why this conflict happens.
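
A quick way to confirm which copies of httpclient the build pulls in is Maven's dependency tree (a diagnostic sketch, assuming Maven is the build tool as the pom.xml snippets suggest; note it only lists what the JAR bundles, while the older copy may also ship with Spark itself on the cluster):

# Show every path through which httpclient enters the build
mvn dependency:tree -Dincludes=org.apache.httpcomponents:httpclient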

What is a good way to solve this issue?

For some reason, I cannot upgrade Spark in my Java code. However, I could easily do that on the EC2 cluster. Is it possible to deploy my Java code on a cluster running a higher version, say 1.6.1?

As said in your post, Spark is loading an older version of httpclient. The solution is to use Maven's relocation facility to produce a neat, conflict-free project.

Here's an example of how to use it in your pom.xml file:

<project>
  <!-- Your project definition here, with the groupId, artifactId, and its dependencies -->
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.3</version>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <relocations>
                <relocation>
                  <pattern>org.apache.http.client</pattern>
                  <shadedPattern>shaded.org.apache.http.client</shadedPattern>
                </relocation>
              </relocations>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

</project>

This will relocate all classes under org.apache.http.client to shaded.org.apache.http.client, resolving the conflict.
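
To verify that the relocation took effect, you can list the contents of the shaded JAR after packaging (a quick sanity check; the JAR name below is a placeholder for your own artifact):

mvn clean package
# Relocated classes should now appear under shaded/org/apache/http/client
jar tf target/my-app-1.0.jar | grep shaded/org/apache/http/client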


Original post:

If this is simply a matter of transitive dependencies, you could just add this to your spark-core dependency to exclude the HttpClient used by Spark:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.2.2</version>
    <scope>provided</scope>
    <exclusions>
        <exclusion>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
        </exclusion>
    </exclusions>
</dependency>

I also added the provided scope to the dependency, as it will be provided by your cluster.

However, that might muck around with Spark's internal behaviour. If you still get an error after doing this, you could try using Maven's relocation facility, which should produce a neat, conflict-free project.

Regarding the fact that you can't upgrade Spark's version, did you use exactly this dependency declaration from mvnrepository?
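
For reference, if you are ever able to move to a newer Spark, the declaration only needs a version bump (1.6.1 for Scala 2.10 is available on Maven Central; keep the provided scope so the cluster's own Spark is used at runtime):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.1</version>
    <scope>provided</scope>
</dependency>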

Spark being backwards compatible, there shouldn't be any problem deploying your job on a cluster with a higher version.

