scala.collection.Seq doesn't work in Java
Using:

The Apache Spark Java API documentation for the Dataset class shows an example that uses the join method with a scala.collection.Seq parameter to specify the column names, but I'm not able to use it. The documentation provides the following example:
df1.join(df2, Seq("user_id", "user_name"))
Error: cannot find symbol method Seq(String)
My code:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import scala.collection.Seq;
public class UserProfiles {
    public static void calcTopShopLookup() {
        Dataset<Row> udp = Spark.getDataFrameFromMySQL("my_schema", "table_1");
        // Does not compile: Seq(...) is not a Java method call
        Dataset<Row> result = Spark.getSparkSession().table("table_2").join(udp, Seq("col_1", "col_2"));
    }
}
Seq(x, y, ...)
is the Scala way to create a sequence. Seq has a companion object with an apply method, which lets you avoid writing new each time. Java has no equivalent syntax, so calling Seq("col_1", "col_2") directly doesn't compile.
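In Java terms, a Scala companion object's apply is roughly a static factory method. A minimal pure-Java analogue (the Pair class here is hypothetical, just to illustrate the pattern):

```java
// Hypothetical Java analogue of a Scala companion object's apply():
// a static factory method so callers avoid writing `new` each time.
public class Pair<A, B> {
    public final A first;
    public final B second;

    private Pair(A first, B second) {
        this.first = first;
        this.second = second;
    }

    // Plays the role of Scala's Pair.apply(a, b)
    public static <A, B> Pair<A, B> of(A first, B second) {
        return new Pair<>(first, second);
    }

    public static void main(String[] args) {
        Pair<String, Integer> p = Pair.of("col_1", 1);
        System.out.println(p.first + ":" + p.second); // prints col_1:1
    }
}
```

The key difference is that Scala lets you call the factory as Pair(a, b); Java always requires the explicit method name, which is exactly why Seq("col_1", "col_2") fails in Java.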
It should be possible to write:
import scala.collection.JavaConversions;
import scala.collection.Seq;
import static java.util.Arrays.asList;
Dataset<Row> result = Spark.getSparkSession().table("table_2").join(udp, JavaConversions.asScalaBuffer(asList("col_1", "col_2")));
Or you can create your own small helper method:
public static <T> Seq<T> asSeq(T... values) {
return JavaConversions.asScalaBuffer(asList(values));
}
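With the helper above, the join becomes table.join(udp, asSeq("col_1", "col_2")). (Note that scala.collection.JavaConversions is deprecated since Scala 2.12; scala.collection.JavaConverters is the usual replacement.) The varargs-to-list mechanics can be checked without Spark or the Scala library on the classpath by substituting java.util.List for Seq; this is a sketch of the pattern, not the actual Scala conversion:

```java
import java.util.Arrays;
import java.util.List;

public class AsSeqDemo {
    // Same varargs pattern as asSeq above, but returning a plain
    // java.util.List so the example runs without scala-library.
    @SafeVarargs
    public static <T> List<T> asJavaList(T... values) {
        return Arrays.asList(values);
    }

    public static void main(String[] args) {
        List<String> cols = asJavaList("col_1", "col_2");
        System.out.println(cols); // prints [col_1, col_2]
    }
}
```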