Get input path from job conf in Hadoop
I am setting a path as the input location on the conf:
FileInputFormat.setInputPaths(conf, new Path("path/to/folder"));
How can I retrieve this location back from the conf? I am trying to implement my own RecordReader.
Thanks in advance.
The property set by this call is mapred.input.dir (named mapreduce.input.fileinputformat.inputdir in Hadoop 2 and later, where the old name is kept as a deprecated alias), so this should work for you:
conf.get("mapred.input.dir");
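If you would rather not hard-code a property name (it differs between Hadoop versions), FileInputFormat.getInputPaths reads the same setting back as Path objects. A minimal sketch, assuming the old org.apache.hadoop.mapred API with a JobConf and the Hadoop client libraries on the classpath; the class name InputPathRoundTrip and the roundTrip helper are made up for illustration:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.JobConf;

// Hypothetical helper class, not part of Hadoop.
public class InputPathRoundTrip {

    // Sets one input folder on a fresh JobConf and reads it back.
    public static Path[] roundTrip(String folder) {
        JobConf conf = new JobConf();
        FileInputFormat.setInputPaths(conf, new Path(folder));
        // Reads the paths back without naming the underlying property.
        return FileInputFormat.getInputPaths(conf);
    }

    public static void main(String[] args) {
        for (Path p : roundTrip("path/to/folder")) {
            System.out.println(p);
        }
    }
}
```

The paths are resolved against the job's working directory when they are set, so a relative path such as path/to/folder should come back as an absolute one.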
On a side note, your record reader should act on the input split it is given in the initialize(InputSplit, TaskAttemptContext) method, because the folder you pass to setInputPaths will actually resolve to a number of input splits, typically one for each file in the folder (and possibly multiple input splits for larger, splittable files).
FileInputFormat-based input formats pass a FileSplit to the initialize method, and you should be able to get the actual file to be processed from FileSplit.getPath().
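To make that concrete, here is a rough sketch of a record reader that takes its file from the split rather than from the conf. It assumes the new org.apache.hadoop.mapreduce API and that the input format extends FileInputFormat, so the cast to FileSplit is safe. PathAwareRecordReader is a made-up name, and the reader only emits the split's path as a single record instead of reading real data:

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Hypothetical reader: emits the path of its split as one record.
public class PathAwareRecordReader extends RecordReader<NullWritable, Text> {
    private Path file;
    private final Text value = new Text();
    private boolean done = false;

    @Override
    public void initialize(InputSplit split, TaskAttemptContext context)
            throws IOException, InterruptedException {
        // Assumption: the split comes from a FileInputFormat subclass.
        file = ((FileSplit) split).getPath();
    }

    @Override
    public boolean nextKeyValue() {
        if (done) {
            return false;
        }
        // A real reader would open 'file' and parse records here.
        value.set(file.toString());
        done = true;
        return true;
    }

    @Override
    public NullWritable getCurrentKey() {
        return NullWritable.get();
    }

    @Override
    public Text getCurrentValue() {
        return value;
    }

    @Override
    public float getProgress() {
        return done ? 1.0f : 0.0f;
    }

    @Override
    public void close() {
        // Nothing to release in this sketch.
    }
}
```

In a real reader you would open the file in initialize (for example through the file system obtained from the task's configuration) and honor the split's start offset and length, but the way the path is obtained stays the same.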