Spark Datasets groupByKey doesn't work (Java)
I am trying to use Dataset's groupByKey method. I can't figure out the problem and can't find any working example that uses groupByKey.
So let me point out what I am looking for in the solution:
Here is what I did:
// Inner class
public static class Bean implements Serializable {
    private static final long serialVersionUID = 1L;
    private String k;
    private int something;

    public Bean(String name, int value) {
        k = name;
        something = value;
    }

    public String getK() { return k; }
    public int getSomething() { return something; }
    public void setK(String k) { this.k = k; }
    public void setSomething(int something) { this.something = something; }
}
//usage
List<Bean> debugData = new ArrayList<Bean>();
debugData.add(new Bean("Arnold", 18));
debugData.add(new Bean("Bob", 7));
debugData.add(new Bean("Bob", 13));
debugData.add(new Bean("Bob", 15));
debugData.add(new Bean("Alice", 27));
Dataset<Row> df = sqlContext.createDataFrame(debugData, Bean.class);
df.groupByKey(row -> {new Bean(row.getString(0), row.getInt(1));}, Encoders.bean(Bean.class)); //doesn't compile
The error I am getting:
Using Java 8 lambda
df.groupByKey(row -> {
return new Bean(row.getString(0), row.getInt(1));
}, Encoders.bean(Bean.class));
Using MapFunction
df.groupByKey(new MapFunction<Row, Bean>() {
    @Override
    public Bean call(Row row) throws Exception {
        return new Bean(row.getString(0), row.getInt(1));
    }
}, Encoders.bean(Bean.class));
This error arises because groupByKey has two overloaded implementations. One of these methods takes a MapFunction as its first argument and the other takes a Function1. Your lambda can be converted to either of them, so the compiler cannot choose between the two overloads. You should therefore explicitly declare which one you intend. Casting is an easy solution:
df.groupByKey((MapFunction<Row, Bean>) row -> new Bean(row.getString(0), row.getInt(1)),
        Encoders.bean(Bean.class));
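The same ambiguity can be reproduced without Spark, which may make it easier to see why the cast has to sit on the lambda itself. The sketch below is only an illustration: JavaStyleFn and ScalaStyleFn are hypothetical stand-ins for MapFunction and Function1 (not the real types), and keyOf is a made-up overloaded method playing the role of groupByKey.

```java
// Minimal sketch of overload ambiguity between two functional interfaces.
// JavaStyleFn, ScalaStyleFn and keyOf are hypothetical names for illustration only.
final class OverloadDemo {

    // Stand-in for a Java-facing functional interface such as MapFunction.
    @FunctionalInterface
    interface JavaStyleFn<T, R> {
        R call(T value);
    }

    // Stand-in for a Scala-facing functional interface such as Function1.
    @FunctionalInterface
    interface ScalaStyleFn<T, R> {
        R apply(T value);
    }

    // Two overloads that each accept a one-argument function.
    static <T, R> String keyOf(JavaStyleFn<T, R> fn, T input) {
        return "JavaStyleFn -> " + fn.call(input);
    }

    static <T, R> String keyOf(ScalaStyleFn<T, R> fn, T input) {
        return "ScalaStyleFn -> " + fn.apply(input);
    }

    public static void main(String[] args) {
        // An untyped lambda fits both overloads, so this line would not compile:
        // keyOf(s -> s.length(), "Bob");

        // Casting the lambda gives it a single target type and selects one overload.
        String viaCast = keyOf((JavaStyleFn<String, Integer>) s -> s.length(), "Bob");
        System.out.println(viaCast); // prints "JavaStyleFn -> 3"
    }
}
```

Note that the cast applies to the whole lambda expression, not to the value the lambda returns. Casting the returned object would leave the lambda itself untyped, and the call would still be ambiguous.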