
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries. Spark Eclipse on Windows 7

I'm not able to run a simple Spark job in Scala IDE (Maven Spark project) installed on Windows 7.

The Spark core dependency has been added.

val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
val sc = new SparkContext(conf)
val logData = sc.textFile("File.txt")
logData.count()

Error:

16/02/26 18:29:33 INFO SparkContext: Created broadcast 0 from textFile at FrameDemo.scala:13
16/02/26 18:29:34 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:278)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:300)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:293)
    at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:362)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
    at com.org.SparkDF.FrameDemo$.main(FrameDemo.scala:14)
    at com.org.SparkDF.FrameDemo.main(FrameDemo.scala)

Here is a good explanation of your problem, along with the solution:

  1. Download winutils.exe from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe.
  2. Set up your HADOOP_HOME environment variable at the OS level, or programmatically (a sketch follows this list):

    System.setProperty("hadoop.home.dir", "full path to the folder with winutils");

  3. Enjoy
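
For the programmatic option in step 2, a minimal sketch in Scala might look like the following, assuming winutils.exe sits in C:\winutils\bin (an assumed layout; adjust the path to your own):

    import org.apache.spark.{SparkConf, SparkContext}

    object WinutilsDemo {
      def main(args: Array[String]): Unit = {
        // Must run before anything touches Hadoop: org.apache.hadoop.util.Shell
        // reads hadoop.home.dir in a static initializer (see the stack trace above).
        // Assumed layout: C:\winutils\bin\winutils.exe
        System.setProperty("hadoop.home.dir", "C:\\winutils")

        val conf = new SparkConf().setAppName("DemoDF").setMaster("local")
        val sc = new SparkContext(conf)
        val logData = sc.textFile("File.txt")
        println(logData.count())
        sc.stop()
      }
    }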

  1. Download winutils.exe
  2. Create a folder, say C:\winutils\bin
  3. Copy winutils.exe inside C:\winutils\bin
  4. Set the environment variable HADOOP_HOME to C:\winutils

Follow this:

  1. Create a bin folder in any directory (to be used in step 3).

  2. Download winutils.exe and place it in the bin directory.

  3. Now add System.setProperty("hadoop.home.dir", "PATH/TO/THE/DIR"); in your code.

If you see the issue below:

ERROR Shell: Failed to locate the winutils binary in the hadoop binary path

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

then do the following steps:

  1. Download winutils.exe from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe.
  2. Keep it under the bin folder of a folder you created, e.g. C:\Hadoop\bin.
  3. In the program, add the following line before creating the SparkContext or SparkConf: System.setProperty("hadoop.home.dir", "C:\\Hadoop");

On Windows 10 you should add two different entries:

(1) Add a new variable HADOOP_HOME with the value of your Hadoop folder (e.g. c:\Hadoop) under System Variables.

(2) Add/append a new entry "C:\Hadoop\bin" to the "Path" variable.

The above worked for me.

Setting the Hadoop_Home environment variable in system properties didn't work for me. But this did:

  • Set the Hadoop_Home in the Eclipse Run Configurations environment tab.
  • Follow the 'Windows Environment Setup' from here.

I got the same problem while running unit tests, and found a workaround that gets rid of this message:

    // Point hadoop.home.dir at the current working directory...
    File workaround = new File(".");
    System.getProperties().put("hadoop.home.dir", workaround.getAbsolutePath());
    // ...and create an empty ./bin/winutils.exe stub so the lookup succeeds.
    new File("./bin").mkdirs();
    new File("./bin/winutils.exe").createNewFile();

From: https://issues.cloudera.org/browse/DISTRO-544
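
In a Scala test suite the same trick might be wrapped like the sketch below; stubWinutils is a hypothetical helper name, and the empty stub only silences the lookup error, it does not provide any real winutils functionality:

    import java.io.File

    object WinutilsStub {
      // Hypothetical helper; call once before the first SparkContext is created,
      // e.g. from a test suite's beforeAll.
      def stubWinutils(): Unit = {
        val home = new File(".").getAbsoluteFile
        System.setProperty("hadoop.home.dir", home.getPath)
        val bin = new File(home, "bin")
        bin.mkdirs()                                  // ensure ./bin exists
        new File(bin, "winutils.exe").createNewFile() // empty placeholder is enough
      }
    }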

You can alternatively download winutils.exe from GitHub:

https://github.com/steveloughran/winutils/tree/master/hadoop-2.7.1/bin

Replace hadoop-2.7.1 with the version you want, and place the file in D:\hadoop\bin.

If you do not have access rights to the environment variable settings on your machine, simply add the line below to your code:

System.setProperty("hadoop.home.dir", "D:\\hadoop");

1) Download winutils.exe from https://github.com/steveloughran/winutils
2) Create a directory in Windows: "C:\winutils\bin"
3) Copy winutils.exe inside the above bin folder.
4) Set the environment property in the code:
  System.setProperty("hadoop.home.dir", "C:\\winutils");
5) Create a folder "file:///C:/temp" and give it 777 permissions.
6) Add the config property to the Spark session (see the sketch after this list): ".config("spark.sql.warehouse.dir", "file:///C:/temp")"
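
Putting steps 4-6 together, a sketch of the session setup (assuming Spark 2.x and the paths above) could look like this:

    import org.apache.spark.sql.SparkSession

    object WarehouseDemo {
      def main(args: Array[String]): Unit = {
        // Step 4: point hadoop.home.dir at the folder that contains bin\winutils.exe.
        System.setProperty("hadoop.home.dir", "C:\\winutils")

        // Steps 5-6: use the writable C:/temp folder as the SQL warehouse.
        val spark = SparkSession.builder()
          .appName("WarehouseDemo")
          .master("local[*]")
          .config("spark.sql.warehouse.dir", "file:///C:/temp")
          .getOrCreate()

        spark.range(5).show() // quick smoke test
        spark.stop()
      }
    }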
  • Download winutils.exe and hadoop.dll on your Windows machine.
  • Create the folder C:\hadoop\bin.
  • Copy winutils.exe and hadoop.dll into the newly created C:\hadoop\bin folder.
  • Set up the environment variable HADOOP_HOME=C:\hadoop (a quick sanity check follows this list).
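
As that sanity check, a small sketch (the object name is mine) can verify from the JVM that HADOOP_HOME is visible and that both binaries are in place:

    import java.io.File

    object HadoopHomeCheck {
      def main(args: Array[String]): Unit = {
        // Note: restart the IDE or shell after changing HADOOP_HOME so the JVM sees it.
        val home = sys.env.getOrElse("HADOOP_HOME", sys.error("HADOOP_HOME is not set"))
        for (name <- Seq("winutils.exe", "hadoop.dll")) {
          val f = new File(new File(home, "bin"), name)
          println(s"${f.getPath}: ${if (f.isFile) "found" else "MISSING"}")
        }
      }
    }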

On top of setting your HADOOP_HOME environment variable in Windows to C:\winutils, you also need to make sure you are the administrator of the machine. If not, and adding environment variables prompts you for admin credentials (even under USER variables), then these variables will only take effect once you start your command prompt as administrator.

I have also faced a similar problem, with the following details: Java 1.8.0_121, Spark spark-1.6.1-bin-hadoop2.6, Windows 10 and Eclipse Oxygen. When I ran my WordCount.java in Eclipse using HADOOP_HOME as a system variable, as mentioned in the previous post, it did not work. What worked for me is:

System.setProperty("hadoop.home.dir", "PATH/TO/THE/DIR"); System.setProperty(“hadoop.home.dir”,“PATH / TO / THE / DIR”);

where PATH/TO/THE/DIR/bin contains winutils.exe, whether you run within Eclipse as a Java application or via spark-submit from cmd using:

spark-submit --class groupid.artifactid.classname --master local[2] /path/to/the/jar/file/created/using/maven /path/to/a/demo/test/file /path/to/output/directory

Example: go to the bin location of your Spark installation and execute spark-submit as mentioned:

D:\BigData\spark-2.3.0-bin-hadoop2.7\bin>spark-submit --class com.bigdata.abdus.sparkdemo.WordCount --master local[1] D:\BigData\spark-quickstart\target\spark-quickstart-0.0.1-SNAPSHOT.jar D:\BigData\spark-quickstart\wordcount.txt

That's a tricky one... Your drive letter must be capital. For example "C:\..."
