I have NOT installed Hadoop on my Linux file system. I would like to run Hadoop and copy a file from the local file system to HDFS WITHOUT installing Hadoop locally. I have created sample code, but it fails with "Wrong FS, expected file:///". Any help with this?
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;

import java.net.URI;

/**
 * Created by Ashish on 23/4/15.
 */
public class SampleHadoop {
    public static void main(String[] args) {
        try {
            Configuration configuration = new Configuration();
            FileSystem fs = FileSystem.get(new URI("hdfs://192.168.1.170:54310/"), configuration);
            fs.copyFromLocalFile(new Path("./part-m-00000"),
                    new Path("hdfs://192.168.1.170:54310/user/hduser/samplefile"));
            fs.close();
        } catch (Exception ex) {
            System.out.println("Exception " + ex);
        }
    }
}
pom.xml:
<dependencies>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <version>9.3-1102-jdbc41</version>
    </dependency>
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
        <version>4.3.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>1.0.4</version>
    </dependency>
    <dependency>
        <groupId>org.apache.sqoop</groupId>
        <artifactId>sqoop-client</artifactId>
        <version>1.99.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.sqoop</groupId>
        <artifactId>sqoop</artifactId>
        <version>1.4.0-incubating</version>
    </dependency>
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.34</version>
    </dependency>
    <dependency>
        <groupId>org.apache.sqoop</groupId>
        <artifactId>sqoop-tools</artifactId>
        <version>1.99.4</version>
    </dependency>
    <dependency>
        <groupId>commons-httpclient</groupId>
        <artifactId>commons-httpclient</artifactId>
        <version>3.1</version>
    </dependency>
</dependencies>
I looked for all possible solutions and found the following:
...
Configuration conf = new Configuration();
conf.addResource(new Path("/home/user/hadoop/conf/core-site.xml"));
conf.addResource(new Path("/home/user/hadoop/conf/hdfs-site.xml"));
BUT in my case I do not want to install Hadoop on my Linux file system, so I cannot specify a path like "/home/user/hadoop". I would prefer to make this run using only jar files.
The right choice for your use case is the WebHDFS API. It allows systems running outside a Hadoop cluster to access and manipulate HDFS contents. It does not require the client system to have Hadoop binaries installed; you can manipulate remote HDFS over HTTP using curl itself.
Please refer to:
https://hadoop.apache.org/docs/r1.2.1/webhdfs.html
http://hortonworks.com/blog/webhdfs-%E2%80%93-http-rest-access-to-hdfs/
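For example, WebHDFS uploads a file with a two-step HTTP PUT, which can be sketched in plain JDK code with no Hadoop jars at all. This is a minimal sketch, not your cluster's exact setup: the NameNode address and target path are taken from the question, while the WebHDFS port 50070 and `user.name=hduser` are assumptions (WebHDFS must be enabled with `dfs.webhdfs.enabled=true` on the cluster):

```java
// Minimal sketch of the two-step WebHDFS CREATE flow, assuming WebHDFS is
// enabled on the cluster and listening on the default Hadoop 1.x port 50070.
public class WebHdfsUrls {

    // Build a WebHDFS REST URL: http://<host>:<port>/webhdfs/v1<path>?<query>
    static String webHdfsUrl(String host, int port, String hdfsPath, String query) {
        return "http://" + host + ":" + port + "/webhdfs/v1" + hdfsPath + "?" + query;
    }

    public static void main(String[] args) {
        // Step 1: send an HTTP PUT with an empty body to this URL. The NameNode
        // answers with a 307 redirect whose Location header names a DataNode.
        String createUrl = webHdfsUrl("192.168.1.170", 50070,
                "/user/hduser/samplefile", "op=CREATE&user.name=hduser&overwrite=true");
        System.out.println(createUrl);

        // Step 2: PUT the file bytes to the Location URL from step 1; a
        // 201 Created response means the file is now on HDFS. With curl:
        //   curl -i -X PUT "<createUrl>"                      (step 1, note Location)
        //   curl -i -X PUT -T ./part-m-00000 "<Location>"     (step 2)
    }
}
```

The same two PUTs can be issued from Java with java.net.HttpURLConnection (call setInstanceFollowRedirects(false) for step 1 so you can read the Location header yourself), so the whole upload needs nothing beyond the JDK.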
You will need a Hadoop installation in order to copy files to and from HDFS.
If there is a remote system on the same network with Hadoop installed, you can copy its HDFS files to your local filesystem (no Hadoop installation required on the local system); just replace the IP in your code with the remote system's IP.
In any case, you will need at least one system with a Hadoop installation to use Hadoop functions.