
Can't retrieve files from Hadoop HDFS

I am learning how to read and write files in HDFS.

This is the code I use for reading:

import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {

    public static void main(String[] args) throws Exception {

        String uri = "/user/hadoop/file.txt";
        Configuration conf = new Configuration();
        conf.addResource(new Path("/usr/local/hadoop/etc/hadoop/core-site.xml"));
        conf.addResource(new Path("/usr/local/hadoop/etc/hadoop/hdfs-site.xml"));

        FileSystem fs = FileSystem.get(URI.create(uri), conf);

        InputStream in = null;
        try {
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}

The file is there:

[Screenshot: Hadoop cluster]

However, when I run the code in Eclipse, I get the following:

Exception in thread "main" java.io.FileNotFoundException: File /user/hadoop/file.txt does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:511)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:724)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:501)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:397)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:137)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:339)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:764)
at hadoop.FileSystemCat.main(FileSystemCat.java:22)

I have tried both file:///user/hadoop/file.txt and hdfs:///user/hadoop/file.txt as the path.

For the latter, the error is slightly different:

Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs

core-site.xml

<configuration>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://localhost/</value>
   </property>
</configuration>

hdfs-site.xml

<configuration>
   <property>
     <name>dfs.replication</name>
     <value>2</value>
   </property>

   <property>
     <name>dfs.namenode.name.dir</name>
     <value>file:///usr/local/hadoop_store/hdfs/namenode/</value>
   </property>

   <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:///usr/local/hadoop_store/hdfs/datanode/,file:///mnt/hadoop/hadoop_store/hdfs/datanode/</value>
   </property>

   <property>
     <name>dfs.webhdfs.enabled</name>
     <value>true</value>
   </property>
</configuration>

What is wrong?

Thanks.

You should change the line

FileSystem fs = FileSystem.get(URI.create(uri),conf);

to something like this:

FileSystem fs = FileSystem.get(URI.create("hdfs://localhost"), conf);

That should work if your uri path exists in HDFS.

To check whether your uri path exists in HDFS, run hadoop fs -ls / on the command line.
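
To see why the scheme matters, here is a small sketch that uses only java.net.URI (no Hadoop classes). It prints the scheme and authority carried by each path string from the question. A string with no scheme, such as /user/hadoop/file.txt, gives FileSystem.get nothing to select a file system by, so Hadoop falls back to the default file system from the configuration. If core-site.xml was not actually loaded, that default is the local file system, which would match the RawLocalFileSystem frames in the stack trace. The class name UriSchemeCheck and the helper schemeOf are made up for this illustration.

```java
import java.net.URI;

public class UriSchemeCheck {

    // Hypothetical helper: returns the scheme of a path string,
    // or "(none)" when the string carries no scheme at all.
    static String schemeOf(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "(none)" : scheme;
    }

    public static void main(String[] args) {
        // The path strings tried in the question, plus the one from this answer.
        String[] candidates = {
            "/user/hadoop/file.txt",
            "file:///user/hadoop/file.txt",
            "hdfs:///user/hadoop/file.txt",
            "hdfs://localhost/user/hadoop/file.txt"
        };
        for (String c : candidates) {
            URI u = URI.create(c);
            System.out.println(c + " -> scheme=" + schemeOf(c)
                    + ", authority=" + u.getAuthority());
        }
    }
}
```

Only the last form names both the hdfs scheme and the localhost authority explicitly, so it does not depend on which default file system the configuration ended up with.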

Add the XML files that contain the HDFS configuration parameters:

Configuration conf = new Configuration();
conf.addResource(new Path("your_hadoop_path/conf/core-site.xml"));
conf.addResource(new Path("your_hadoop_path/conf/hdfs-site.xml"));
FileSystem fs = FileSystem.get(URI.create(uri),conf);
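
Before relying on addResource, it may be worth confirming that the files really exist at those paths. As far as I know, Configuration does not complain by default when a resource file is missing; it just keeps its built-in defaults, and then the local file system is used. The following sketch uses only the JDK to report missing or unreadable files. The class name ConfigFileCheck and the helper missing are hypothetical, and the paths are the ones from the question, so adjust them to your installation.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ConfigFileCheck {

    // Hypothetical helper: returns the given paths that are not
    // readable regular files (missing, a directory, or unreadable).
    static List<String> missing(String... files) {
        List<String> result = new ArrayList<>();
        for (String f : files) {
            Path p = Paths.get(f);
            if (!Files.isRegularFile(p) || !Files.isReadable(p)) {
                result.add(f);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Paths taken from the question; they are assumptions about your setup.
        List<String> bad = missing(
                "/usr/local/hadoop/etc/hadoop/core-site.xml",
                "/usr/local/hadoop/etc/hadoop/hdfs-site.xml");
        if (bad.isEmpty()) {
            System.out.println("Both configuration files are readable.");
        } else {
            System.out.println("Missing or unreadable: " + bad);
        }
    }
}
```

If a file is reported here, fix the path first; otherwise the values in core-site.xml never reach the Configuration object.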

If you want to read the data of an HDFS file, this code will do it.

package com.yp.util;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadHadoopFileData {

    public static void main(String[] args) throws IOException {

        // Configuration is read from the Hadoop config files on the classpath
        Configuration conf = new Configuration();
        FileSystem hdfs = FileSystem.get(conf);

        // HDFS path of the file to read, passed as the first argument
        Path hdfsFile = new Path(args[0]);

        // try-with-resources closes the reader and the underlying HDFS stream
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(hdfs.open(hdfsFile)))) {
            String line;
            while ((line = br.readLine()) != null) {
                System.out.println(line);
            }
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}

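When you run it from the command line, hadoop takes care of all the environment settings.

The command to run the program above (assuming you built Read.jar and the HDFS file is part-r-00000):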

hadoop jar Read.jar com.yp.util.ReadHadoopFileData /MyData/part-r-00000
