
Is it possible to run Hadoop and copy a file from the local fs to HDFS in Java, without installing Hadoop on the file system?

I have NOT installed Hadoop on my Linux file system. I would like to run Hadoop and copy a file from the local file system to HDFS WITHOUT installing Hadoop on my Linux file system. I wrote the sample code below, but it fails with "Wrong FS, expected file:///". Any help with this?

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;

import java.net.URI;

/**
 * Created by Ashish on 23/4/15.
 */
public class SampleHadoop {

    public static void main(String[] args) throws Exception {
        try {
            Configuration configuration = new Configuration();
            // Connect to the remote HDFS NameNode.
            FileSystem fs = FileSystem.get(new URI("hdfs://192.168.1.170:54310/"), configuration);
            // Copy a file from the local file system into HDFS.
            fs.copyFromLocalFile(new Path("./part-m-00000"), new Path("hdfs://192.168.1.170:54310/user/hduser/samplefile"));
            fs.close();
        } catch (Exception ex) {
            System.out.println("Exception " + ex.toString());
        }
    }
}

pom.xml

<dependencies>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>9.3-1102-jdbc41</version>
    </dependency>
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
        <version>4.3.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>1.0.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.sqoop</groupId>
        <artifactId>sqoop-client</artifactId>
        <version>1.99.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.sqoop</groupId>
        <artifactId>sqoop</artifactId>
        <version>1.4.0-incubating</version>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.34</version>
    </dependency>
    <dependency>
        <groupId>org.apache.sqoop</groupId>
        <artifactId>sqoop-tools</artifactId>
        <version>1.99.4</version>
    </dependency>
    <dependency>
        <groupId>commons-httpclient</groupId>
        <artifactId>commons-httpclient</artifactId>
        <version>3.1</version>
    </dependency>
</dependencies>

I looked for all possible solutions and found the following:

...
Configuration conf = new Configuration();
conf.addResource(new Path("/home/user/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/home/user/hadoop/conf/hdfs-site.xml"));

BUT in my case I do not want to install Hadoop on my Linux file system, so I cannot specify a path like "/home/user/hadoop". I would prefer to make it run using only jar files.
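One jar-only alternative to loading those XML files is to set the NameNode address directly on the Configuration object. This is only a minimal sketch, assuming the Hadoop 1.x configuration key (matching the hadoop-client 1.0.4 dependency above); the IP and port are the ones from my code sample:

Configuration conf = new Configuration();
// "fs.default.name" is the Hadoop 1.x key; Hadoop 2.x and later use "fs.defaultFS".
conf.set("fs.default.name", "hdfs://192.168.1.170:54310");
FileSystem fs = FileSystem.get(conf);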

The right choice for your use case is the WebHDFS API. It allows systems running outside the Hadoop cluster to access and manipulate HDFS contents. It does not require the client systems to have the Hadoop binaries installed; you can manipulate a remote HDFS over HTTP using curl itself.

Please refer to:

https://hadoop.apache.org/docs/r1.2.1/webhdfs.html

http://hortonworks.com/blog/webhdfs-%E2%80%93-http-rest-access-to-hdfs/
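As an illustration of the REST flow described above, here is a minimal sketch of uploading a local file through WebHDFS from plain Java, with no Hadoop jars at all. It assumes WebHDFS is enabled on the cluster (dfs.webhdfs.enabled=true) and that the NameNode's HTTP port is the default 50070; the host, user, and paths are placeholders taken from your question.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WebHdfsUpload {
    public static void main(String[] args) throws Exception {
        String nameNode = "http://192.168.1.170:50070";   // assumed WebHDFS address of the NameNode
        String hdfsPath = "/user/hduser/samplefile";
        String localFile = "./part-m-00000";

        // Step 1: ask the NameNode for a write location; it replies with a
        // 307 redirect pointing at a DataNode, which must be followed manually.
        URL createUrl = new URL(nameNode + "/webhdfs/v1" + hdfsPath
                + "?op=CREATE&overwrite=true&user.name=hduser");
        HttpURLConnection nn = (HttpURLConnection) createUrl.openConnection();
        nn.setRequestMethod("PUT");
        nn.setInstanceFollowRedirects(false);
        String dataNodeUrl = nn.getHeaderField("Location");
        nn.disconnect();

        // Step 2: PUT the file contents to the DataNode URL returned above.
        HttpURLConnection dn = (HttpURLConnection) new URL(dataNodeUrl).openConnection();
        dn.setRequestMethod("PUT");
        dn.setDoOutput(true);
        try (OutputStream out = dn.getOutputStream()) {
            Files.copy(Paths.get(localFile), out);
        }
        System.out.println("DataNode responded with HTTP " + dn.getResponseCode()); // 201 on success
    }
}

This is the same two-step CREATE flow that the curl examples in the WebHDFS documentation perform over HTTP.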

You will need a Hadoop installation in order to copy files to and from HDFS.

If you have Hadoop installed on a remote system within the same network, you can copy remote HDFS files to your local file system (no Hadoop installation is required on the local system). Just replace the IP in your code with the remote system's IP.

Anyway, you will need at least one system with a Hadoop installation in order to use Hadoop functions.
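For completeness, a minimal sketch of that scenario, reusing the imports from your SampleHadoop class: it pulls a file from the remote HDFS down to the local file system, needing only the hadoop-client jars on the classpath. The IP, port, and paths are placeholders from your question.

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(new URI("hdfs://192.168.1.170:54310/"), conf);
// Copy from the remote HDFS to the local file system (the reverse of copyFromLocalFile).
fs.copyToLocalFile(new Path("/user/hduser/samplefile"), new Path("./samplefile-copy"));
fs.close();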
