
Copy JSON file from Local to HDFS

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class HdfsWriter extends Configured implements Tool {
 public int run(String[] args) throws Exception {
  //String localInputPath = args[0];
  Path outputPath = new Path(args[0]); // ARGUMENT FOR OUTPUT_LOCATION
  Configuration conf = getConf();
  FileSystem fs = FileSystem.get(conf);
  OutputStream os = fs.create(outputPath);
  InputStream is = new BufferedInputStream(new FileInputStream("/home/acadgild/acadgild.txt")); // Read the local data set through a buffered input stream
  IOUtils.copyBytes(is, os, conf); // Copy the data from the local input stream to the HDFS output stream
  return 0;
 }

 public static void main(String[] args) throws Exception {
  int returnCode = ToolRunner.run(new HdfsWriter(), args);
  System.exit(returnCode);
 }
}

I need to move the data from local to HDFS.

I got the above code from another blog, but it's not working. Can anyone help me with this?

I also need to parse the JSON using MapReduce, group it by DateTime, and move it to HDFS.

  1. MapReduce is a distributed job processing framework.
  2. For each mapper, "local" means the local filesystem on the node on which that mapper is running.
  3. What you want is to read from the local filesystem on a given node, put the data onto HDFS, and then process it via MapReduce (see the sketch after this list).
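
A minimal sketch of that copy step using the Hadoop FileSystem API is shown below; the class name LocalToHdfsCopy and the two command-line arguments are placeholders, not something from the original post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LocalToHdfsCopy {
 public static void main(String[] args) throws Exception {
  // args[0] = path on the local filesystem, args[1] = destination path on HDFS (assumed)
  Configuration conf = new Configuration();
  FileSystem fs = FileSystem.get(conf);
  // copyFromLocalFile reads from the local filesystem of this node and writes into HDFS
  fs.copyFromLocalFile(new Path(args[0]), new Path(args[1]));
  fs.close();
 }
}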

There are multiple tools available for copying from the local filesystem of one node to HDFS:

  1. hdfs dfs -put <localPath> <hdfsPath> (shell command)
  2. Flume
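
For the second part of the question (parsing the JSON and grouping it by DateTime with MapReduce), a rough mapper sketch is below. It assumes one JSON object per input line, a field literally named "DateTime", and the Jackson library on the classpath; the class name and field name are assumptions, not taken from the original post.

import java.io.IOException;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class JsonByDateTimeMapper extends Mapper<LongWritable, Text, Text, Text> {
 private final ObjectMapper jsonParser = new ObjectMapper();

 @Override
 protected void map(LongWritable key, Text value, Context context)
   throws IOException, InterruptedException {
  // Assumes each input line holds a single JSON object with a "DateTime" field
  JsonNode record = jsonParser.readTree(value.toString());
  JsonNode dateTime = record.get("DateTime");
  if (dateTime != null) {
   // Emit the DateTime value as the key so the shuffle groups records by it;
   // the reducer then receives all records for one DateTime together
   context.write(new Text(dateTime.asText()), value);
  }
 }
}

In the reducer, the values for each DateTime key arrive together, so the grouped records can be written back to HDFS as the job output.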
