How to use a Java custom comparator for ordering a Spark RDD

I have a class that implements a Comparator in this way:

import java.util.Comparator;
import java.util.LinkedHashSet;

public class MyObject<T>
{
    // Fields such as atribute and key are used below but are not shown in the question.

    public static class MyObjectComp<T> implements Comparator<MyObject<T>>
    {
        // Attribute indices to compare, in priority order
        private LinkedHashSet<Integer> attrList;

        public MyObjectComp (int[] intList)
        {
            this.attrList = new LinkedHashSet<Integer>();
            for (int idx: intList)
                attrList.add(idx);
        }

        public MyObjectComp (LinkedHashSet<Integer> attrList)
        {
            this.attrList = attrList;
        }

        @Override
        public int compare(MyObject<T> pf1, MyObject<T> pf2)
        {
            // Compare by list size for each attribute in turn
            for (Integer idx: attrList)
            {
                double pf1Norm = pf1.atribute.get(idx).myList.size();
                double pf2Norm = pf2.atribute.get(idx).myList.size();

                if (pf1Norm > pf2Norm)
                    return 1;
                else if (pf1Norm < pf2Norm)
                    return -1;
            }

            // All attributes tie: fall back to the key
            return (pf1.key > pf2.key) ? 1 : ((pf1.key < pf2.key) ? -1 : 0);
        }
    }
}

In another part of the code, written in Scala, I created an RDD of this MyObject. Now I need to order the elements of this RDD using MyObject's nested comparator class. How can I do that with a function like myRDD.sort()?

Something like this (note: I have not tried this code):

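// The Scala collection converters provide no .asScala for a Comparator,
// so wrap the comparator in an Ordering instead
implicit val moo: Ordering[MyObject[T]] =
  Ordering.comparatorToOrdering(new MyObjectComp[T]( /* ... */ ))

myRDD.sortBy( t => t )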
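
One thing to watch for: Spark typically has to serialize the ordering used by a sort in order to ship it to the executors, and MyObjectComp as posted does not implement java.io.Serializable. Below is a minimal plain-Java sketch of a serializable comparator following the same scheme. Item and ItemComp are hypothetical, simplified stand-ins for MyObject and MyObjectComp (the real fields are not shown in the question), and the round trip through Java serialization only approximates what Spark does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SerializableComparatorSketch {

    // Hypothetical stand-in for MyObject: a key plus one list size per attribute index.
    static class Item implements Serializable {
        final int key;
        final int[] sizes;

        Item(int key, int... sizes) {
            this.key = key;
            this.sizes = sizes;
        }
    }

    // Same comparison scheme as MyObjectComp, but also Serializable.
    static class ItemComp implements Comparator<Item>, Serializable {
        private final int[] attrList;

        ItemComp(int... attrList) {
            this.attrList = attrList.clone();
        }

        @Override
        public int compare(Item a, Item b) {
            // Compare attribute by attribute, in the given priority order
            for (int idx : attrList) {
                int c = Integer.compare(a.sizes[idx], b.sizes[idx]);
                if (c != 0) {
                    return c;
                }
            }
            // All attributes tie: fall back to the key
            return Integer.compare(a.key, b.key);
        }
    }

    // Serialize and deserialize the comparator, roughly as shipping it to another JVM would.
    static Comparator<Item> roundTrip(ItemComp comp) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(comp);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            Comparator<Item> copy = (Comparator<Item>) in.readObject();
            return copy;
        }
    }

    // Sort three sample items with the deserialized comparator and return their keys.
    static List<Integer> sortedKeys() throws IOException, ClassNotFoundException {
        List<Item> items = new ArrayList<>(Arrays.asList(
            new Item(1, 3, 0),
            new Item(2, 1, 5),
            new Item(3, 1, 5)));
        items.sort(roundTrip(new ItemComp(0, 1)));

        List<Integer> keys = new ArrayList<>();
        for (Item it : items) {
            keys.add(it.key);
        }
        return keys;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sortedKeys()); // prints [2, 3, 1]
    }
}
```

If the real classes can be changed, adding Serializable to MyObjectComp in the same way (LinkedHashSet is already serializable) should be enough for the comparator itself. Whether MyObject also needs it depends on how the RDD elements are serialized.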
