hadoop - Where are input/output files stored in hadoop and how to execute java file in hadoop?
Suppose I write a Java program and I want to run it in Hadoop, then
hadoop classname
The simplest answers I can think of to your questions are:
1) Anywhere
2,3,4) $HADOOP_HOME/bin/hadoop jar [path_to_your_jar_file]
A similar question was asked here: Executing helloworld.java in apache hadoop
It may seem complicated, but it's simpler than you might think!
Compile your map/reduce classes and your main class into a jar. Let's call this jar myjob.jar.
Put this jar on any machine that has the hadoop command line utility installed.
Then run:
hadoop jar myjob.jar
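The compile-and-package step above could be sketched roughly as follows. This is only an illustration under assumptions: the function names `run` and `build_jar`, the `DRY_RUN` and `HADOOP_CP` variables, and the `classes` directory are hypothetical and not from the original answer. It relies on `hadoop classpath` to find the Hadoop libraries and on the standard `javac` and `jar` tools.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Compile the .java files in a source directory against the Hadoop libraries
# and package the resulting classes into a jar (default name: myjob.jar).
# HADOOP_CP may be preset; otherwise it is taken from `hadoop classpath`.
build_jar() {
    src_dir="$1"
    jar_name="${2:-myjob.jar}"
    classes_dir="${3:-classes}"
    hadoop_cp="${HADOOP_CP:-$(hadoop classpath)}"
    run mkdir -p "$classes_dir" &&
    run javac -classpath "$hadoop_cp" -d "$classes_dir" "$src_dir"/*.java &&
    run jar cf "$jar_name" -C "$classes_dir" .
}
```

Calling `build_jar src` on a machine with Hadoop and a JDK installed should leave a myjob.jar in the current directory, ready to be passed to `hadoop jar`.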
Hope that helps.
The data should be saved in HDFS. You will probably want to load it into the cluster from your data source using something like Apache Flume.
The file can be placed anywhere, but the usual home directory is /user/hadoop/
SSH into the hadoop cluster headnode like a standard linux server.
To list your hadoop root hdfs:
hadoop fs -ls /
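Along the same lines, a small sketch of copying a local input file into HDFS and listing it. The helper names `run` and `stage_input`, the `DRY_RUN` variable, and the default directory /user/hadoop/input are assumptions made for illustration; `-mkdir -p` assumes Hadoop 2 or later.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Copy a local file into an HDFS directory, then list that directory.
# The default target /user/hadoop/input is an assumed location.
stage_input() {
    local_file="$1"
    hdfs_dir="${2:-/user/hadoop/input}"
    run hadoop fs -mkdir -p "$hdfs_dir" &&
    run hadoop fs -put "$local_file" "$hdfs_dir/" &&
    run hadoop fs -ls "$hdfs_dir"
}
```

With `DRY_RUN=1`, `stage_input data.txt` only prints the three `hadoop fs` commands, which is a cheap way to check the paths before touching the cluster.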
hadoop classname
You should be using the hadoop command to access your data and run your programs. Try:
hadoop help
hadoop jar MyJar.jar com.mycompany.MainDriver arg[0] arg[1] ...
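As a rough sketch of a full run, the snippet below submits the jar and then reads the results back from HDFS. It assumes the driver class treats its first two arguments as the HDFS input and output directories, which depends entirely on how your main method is written; `run`, `run_job`, and `DRY_RUN` are hypothetical names. The `part-*` pattern is quoted so that `hadoop fs` expands it against HDFS, not the local shell.

```shell
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Submit a job jar, then list and print the output files it wrote to HDFS.
# Assumes the driver takes <input dir> <output dir> as its arguments.
run_job() {
    jar_file="$1"
    main_class="$2"
    in_dir="$3"
    out_dir="$4"
    run hadoop jar "$jar_file" "$main_class" "$in_dir" "$out_dir" &&
    run hadoop fs -ls "$out_dir" &&
    run hadoop fs -cat "$out_dir/part-*"
}
```

For example: `run_job MyJar.jar com.mycompany.MainDriver /user/hadoop/input /user/hadoop/output`. Note that MapReduce jobs normally refuse to start if the output directory already exists.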