
Copying from one directory in HDFS to another directory in HDFS using JAVA

I am trying to copy data from one directory in HDFS to another directory in HDFS, but I am facing a few issues. Here is my code snippet:

  package com.ghs.misc;

  import java.io.IOException;
  import java.util.logging.Logger;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.FileUtil;
  import org.apache.hadoop.fs.LocatedFileStatus;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.RemoteIterator;

  public class Moving {

      private static final Logger LOGGER = Logger.getLogger(Moving.class.getName());

      public static void main(String[] args) throws IOException {
          Configuration conf = new Configuration();
          FileSystem fs = FileSystem.get(conf);
          LOGGER.info("Connected");
          Path source = new Path("/data_dev/deepak/src/raw/epic/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00/");
          Path target = new Path("/data_dev/deepak/dest/raw/epics/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00/");
          System.out.println(source);
          System.out.println(target);
          System.out.println("source" + fs.exists(source));
          System.out.println("source" + fs.exists(target));

          FileSystem srcfs = FileSystem.get(conf);
          FileSystem dstFS = FileSystem.get(conf);

          RemoteIterator<LocatedFileStatus> sourceFiles = srcfs.listFiles(source, false);
          LOGGER.info(sourceFiles.toString());
          LOGGER.info("source File System " + fs.toString());
          LOGGER.info("destniation File System" + dstFS.toString());
          if (!fs.exists(target)) {
              fs.create(target);
              LOGGER.info("created thr path");
          }

          if (sourceFiles != null) {
              while (sourceFiles.hasNext()) {
                  System.out.println(sourceFiles.toString());
                  Path srcfilepath = sourceFiles.next().getPath();
                  System.out.println(srcfilepath);
                  if (FileUtil.copy(srcfs, srcfilepath, dstFS, target, false, true, conf)) {
                      System.out.println("Copied Successfully");
                  } else {
                      System.out.println("Copy Failed");
                  }
              }
          }
          srcfs.close();
          dstFS.close();
          fs.close();
      }
  }

In the code above I create the target directory if it does not exist, so I get this error only when the target directory did not already exist.

hadoop jar Moving.jar
Dec 10, 2017 6:07:30 PM com.ghs.misc.Moving main
INFO: Connected
/data_dev/deepak/src/raw/epic/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00
/data_dev/deepak/dest/raw/epics/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00
sourcetrue
sourcefalse
Dec 10, 2017 6:07:30 PM com.ghs.misc.Moving main
INFO: org.apache.hadoop.fs.FileSystem$6@29a1c0b7
Dec 10, 2017 6:07:30 PM com.ghs.misc.Moving main
INFO: source File System DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_61931562_1(auth:KERBEROS)]]
Dec 10, 2017 6:07:30 PM com.ghs.misc.Moving main
INFO: destniation File SystemDFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_61931562_1(auth:KERBEROS)]]
Dec 10, 2017 6:07:30 PM com.ghs.misc.Moving main
INFO: created thr path
org.apache.hadoop.fs.FileSystem$6@29a1c0b7
/data_dev/deepak/src/raw/epic/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00/HQAQA.lzo
Copied Successfully
org.apache.hadoop.fs.FileSystem$6@29a1c0b7
/data_dev/deepak/src/raw/epic/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00/HQAQA.lzo.index
Copied Successfully
org.apache.hadoop.fs.FileSystem$6@29a1c0b7
/data_dev/deepak/src/raw/epic/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00/Test1.txt
Copied Successfully
org.apache.hadoop.fs.FileSystem$6@29a1c0b7
 /data_dev/deepak/src/raw/epic/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00/Test2.txt
Copied Successfully
org.apache.hadoop.fs.FileSystem$6@29a1c0b7
/data_dev/deepak/src/raw/epic/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00/Test3.txt
Copied Successfully
17/12/10 18:07:34 ERROR hdfs.DFSClient: Failed to close inode 364006128
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /data_dev/deepak/dest/raw/epics/cl_qanswer_qa/hdp_process_date=2017-07-25/hour=00/minute=00 (inode 364006128): File does not exist. Holder DFSClient_NONMAPREDUCE_61931562_1 does not have any open files.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3693)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3781)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3748)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:912)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:549)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)

    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1554)
    at org.apache.hadoop.ipc.Client.call(Client.java:1498)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy10.complete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:503)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
    at com.sun.proxy.$Proxy11.complete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2496)
    at org.apache.hadoop.hdfs.DFSOutputStream.closeImpl(DFSOutputStream.java:2472)
    at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2437)
    at org.apache.hadoop.hdfs.DFSClient.closeAllFilesBeingWritten(DFSClient.java:949)
    at org.apache.hadoop.hdfs.DFSClient.closeOutputStreams(DFSClient.java:981)
    at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:1211)
    at com.ghs.misc.Moving.main(Moving.java:67)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:233)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)

Any help would be appreciated. Thanks in advance!

I found the mistake. I was using filesystem.create() to create the target directory, but create() actually opens an FSDataOutputStream at the given path, i.e. it creates a file rather than a directory. That never-closed stream is what the client tried to finalize during fs.close(), which is what produced the LeaseExpiredException above. I changed the call to filesystem.mkdirs(targetpath) and it resolved my error; my code now works fine. Sorry for the silly mistake and for wasting your time.

if (!fs.exists(target))
{
    fs.mkdirs(target);      // was: fs.create(target), which creates a file, not a directory
    LOGGER.info("created the path");
}
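
For reference, below is a minimal self-contained sketch of the corrected routine. It is my own illustration rather than the original program: the class name HdfsCopy and the two command-line arguments are assumptions, and it expects a Hadoop client configuration on the classpath and an existing source directory.

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.FileUtil;
  import org.apache.hadoop.fs.LocatedFileStatus;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.fs.RemoteIterator;

  public class HdfsCopy {
      public static void main(String[] args) throws Exception {
          Configuration conf = new Configuration();
          Path source = new Path(args[0]);   // assumed: source dir passed as arg 0
          Path target = new Path(args[1]);   // assumed: target dir passed as arg 1

          // FileSystem is Closeable, so try-with-resources closes it exactly once.
          try (FileSystem fs = FileSystem.get(conf)) {
              // mkdirs() creates the directory (and any missing parents);
              // create() would instead open an FSDataOutputStream, i.e. a file.
              if (!fs.exists(target)) {
                  fs.mkdirs(target);
              }
              // Non-recursive listing of the files directly under source.
              RemoteIterator<LocatedFileStatus> files = fs.listFiles(source, false);
              while (files.hasNext()) {
                  Path src = files.next().getPath();
                  // deleteSource = false, overwrite = true
                  FileUtil.copy(fs, src, fs, target, false, true, conf);
              }
          }
      }
  }

A single FileSystem handle is enough here since source and target live in the same HDFS; FileSystem.get(conf) normally returns one cached instance anyway, so the three separate close() calls in the original snippet were all closing the same object.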
