Running a yarn job from a Java program using ProcessBuilder gives a "file does not exist" error
I am trying to run a yarn job from a Java wrapper program. The mapreduce jar takes two inputs:
Apart from these, I have an output directory. The ProcessBuilder code looks like this:
HEADER_PATH = INPUT_DIRECTORY+"/HEADER/*.tsv";
INPUT_FILES = INPUT_DIRECTORY+"/DATA/";
OUTPUT_DIRECTORY = OUTPUT_DIRECTORY+"/";
ProcessBuilder mapRProcessBuilder = new ProcessBuilder("yarn","jar",JAR_LOCATION,"-Dmapred.job.queue.name=name","-Dmapred.reduce.tasks=500",HEADER_PATH,INPUT_DIRECTORY,OUTPUT_DIRECTORY);
System.out.println(mapRProcessBuilder.command().toString());
Process mapRProcess = mapRProcessBuilder.start();
When I run it, I get the following error:
Exception in thread "main" java.io.FileNotFoundException: Requested file /input/path/dir1/HEADER/*.tsv does not exist.
But when I run the same command from the shell:
yarn jar jarfile.jar -Dmapred.job.queue.name=name -Dmapred.reduce.tasks=500 /input/path/dir1/HEADER/*.tsv /input/Dir /output/Dir/
It works fine.
What could be causing this when the command is run from Java?
The * is being treated as part of the literal string in this case rather than as a wildcard. ProcessBuilder starts the program directly and passes each argument to it verbatim, so no shell is involved to expand the pattern. Globbing therefore never expands it to the path names you want.
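To illustrate the role the shell plays here, the following is a minimal sketch that hands the whole command line to `sh -c`, so that a shell expands the wildcard before the program sees its arguments. This is an alternative that the answer itself does not propose. The names `ShellGlobRunner` and `runThroughShell` are hypothetical, and the sketch assumes a POSIX `sh` is available on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs a command line through "sh -c" so the shell expands globs.
final class ShellGlobRunner {

    // Returns the combined stdout/stderr lines of the command; throws on a non-zero exit.
    static List<String> runThroughShell(String commandLine)
            throws IOException, InterruptedException {
        // The entire command line is ONE argument to "sh -c"; the shell parses it
        // and expands wildcards such as *.tsv.
        ProcessBuilder builder = new ProcessBuilder("sh", "-c", commandLine);
        builder.redirectErrorStream(true); // merge stderr into stdout

        Process process = builder.start();
        List<String> lines = new ArrayList<>();
        // Drain the output so the child cannot block on a full pipe buffer.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command exited with status " + exitCode + ": " + lines);
        }
        return lines;
    }
}
```

With this sketch, the working command from the question could be passed as a single string, for example `ShellGlobRunner.runThroughShell("yarn jar jarfile.jar -Dmapred.job.queue.name=name -Dmapred.reduce.tasks=500 /input/path/dir1/HEADER/*.tsv /input/Dir /output/Dir/")`. Because the string is parsed by the shell, any path containing spaces or shell metacharacters would need to be quoted.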
If there is only one file in the directory, why not look up its path and pass that as the argument instead?
For example:
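// listFiles() returns File objects (list() returns only names); it is null if dir is not a directory
File dir = new File(INPUT_DIRECTORY + "/HEADER");
File[] headerFiles = dir.listFiles();
if (headerFiles != null && headerFiles.length > 0) {
    HEADER_PATH = headerFiles[0].getAbsolutePath();
}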
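If the directory may hold more than one matching file, the same idea can be extended by expanding the pattern in Java and passing every match as its own argument. The following is a minimal sketch, assuming the header files live on the local file system. `GlobExpander`, `expandGlob`, and `buildCommand` are hypothetical names; `Files.newDirectoryStream(Path, String)` is the standard NIO call that accepts a glob pattern.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: expands a glob in Java, since ProcessBuilder will not do it.
final class GlobExpander {

    // Returns the sorted absolute paths of the entries in dir whose names match the glob.
    static List<String> expandGlob(Path dir, String glob) throws IOException {
        List<String> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                matches.add(entry.toAbsolutePath().toString());
            }
        }
        Collections.sort(matches); // directory iteration order is unspecified
        return matches;
    }

    // Builds the argument list: fixed prefix, then expanded header paths, then the rest.
    static List<String> buildCommand(List<String> prefix, List<String> headers,
                                     String inputDir, String outputDir) {
        List<String> command = new ArrayList<>(prefix);
        command.addAll(headers);
        command.add(inputDir);
        command.add(outputDir);
        return command;
    }
}
```

The resulting list can be handed to `new ProcessBuilder(command)`, which accepts a `List<String>`. Whether the job accepts several header paths in that position depends on the jar, which the question does not show.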