What are all the ways to run Scala code in Apache Spark?
I know of two ways to run Scala code in Apache Spark:
1- Using spark-shell
2- Building a jar file from the project and running it with spark-submit
Is there any other way to run Scala code in Apache Spark? For example, can I run a Scala object (e.g. object.scala) directly in Apache Spark?
Thanks
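import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Input files on HDFS; the glob matches every file under /input.
String sourcePath = "hdfs://hdfs-server:54310/input/*";

SparkConf conf = new SparkConf().setAppName("TestLineCount");
// Ship the jar that contains App to the executors.
conf.setJars(new String[] {
    App.class.getProtectionDomain().getCodeSource().getLocation().getPath()
});
// Connect to the standalone master directly, without spark-submit.
conf.setMaster("spark://spark-server:7077");
conf.set("spark.driver.allowMultipleContexts", "true");

JavaSparkContext sc = new JavaSparkContext(conf);
JavaRDD<String> log = sc.textFile(sourcePath);
// Placeholder filter that keeps every line.
JavaRDD<String> lines = log.filter(x -> true);
System.out.println(lines.count());
sc.stop();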
Scala version:
import org.apache.log4j.{Level, Logger}
import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // Silence Spark's and Akka's log output.
    Logger.getLogger("org").setLevel(Level.OFF)
    Logger.getLogger("akka").setLevel(Level.OFF)

    val logFile = "/tmp/logs.txt"
    // "local" runs Spark inside this JVM, so the object can be started
    // like any other Scala program.
    val conf = new SparkConf()
      .setAppName("Simple Application")
      .setMaster("local")
    val sc = new SparkContext(conf)

    // Read the file with at least 2 partitions and cache it.
    val logData = sc.textFile(logFile, 2).cache()
    println("line count: " + logData.count())
    sc.stop()
  }
}
For more details, see this blog post.
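The Java version above passes the application's own jar to setJars by asking the JVM where the App class was loaded from. As a minimal sketch of that lookup in isolation, using only the JDK: the class name JarLocator and the method locate are hypothetical names chosen for this illustration and are not part of Spark.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper for illustration; not part of Spark.
public class JarLocator {

    // Returns the jar file or class directory that cls was loaded from,
    // or null when the JVM exposes no code source (e.g. JDK core classes).
    static String locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return null;
        }
        URL url = source.getLocation();
        try {
            // Converting via URI and File decodes escapes such as %20,
            // which URL.getPath() leaves encoded.
            return new File(url.toURI()).getPath();
        } catch (URISyntaxException | IllegalArgumentException e) {
            // Not a plain file: URL; fall back to the raw path,
            // which is what the answer's code uses.
            return url.getPath();
        }
    }

    public static void main(String[] args) {
        String location = locate(JarLocator.class);
        System.out.println("Application classes loaded from: " + location);
    }
}
```

The returned string is the kind of value the Java example hands to conf.setJars. When the program is started from an IDE's class directory rather than from a packaged jar, the location is a directory, so building the jar first and passing its path is likely necessary.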