Is there a simple way in Java to get the difference between two collections using a custom equals function without overriding the equals?
I'm open to using a library. I just want something simple to diff two collections on a different criterion than the normal equals function.
Right now I use something like:
collection1.stream()
.filter(element -> !collection2.stream()
.anyMatch(element2 -> element2.equalsWithoutSomeField(element)))
.collect(Collectors.toSet());
and I would like something like:
Collections.diff(collection1, collection2, Foo::equalsWithoutSomeField);
(edit) More context:
I should have mentioned that I'm looking for something that already exists, not something to code myself. I might code a small utility from your ideas if nothing exists.
Also, real duplicates aren't possible in my case: the collections are Sets. However, duplicates according to the custom equals are possible and should not be removed by this operation. This seems to be a limitation of a lot of possible solutions.
We use similar methods in our project to shorten repetitive collection filtering. We started with some basic building blocks:
static <T> boolean anyMatch(Collection<T> set, Predicate<T> match) {
for (T object : set)
if (match.test(object))
return true;
return false;
}
Based on this, we can easily implement methods like noneMatch and more complicated ones like isSubset or your diff:
static <E> Collection<E> disjunctiveUnion(Collection<E> c1, Collection<E> c2, BiPredicate<E, E> match)
{
ArrayList<E> diff = new ArrayList<>();
diff.addAll(c1);
diff.addAll(c2);
diff.removeIf(e -> anyMatch(c1, e1 -> match.test(e, e1))
&& anyMatch(c2, e2 -> match.test(e, e2)));
return diff;
}
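As a rough sketch of how the other helpers mentioned above might be built on anyMatch: the names noneMatch, isSubset and difference and their bodies below are my own assumptions, not the answerer's actual code. The difference here is one-sided (c1 minus c2) and keeps elements of c1 that are duplicates of each other under the custom match, which is what the question asks for.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.BiPredicate;
import java.util.function.Predicate;

// Hypothetical helper class; method names and bodies are illustrative assumptions.
final class CollectionUtilsSketch {

    static <T> boolean anyMatch(Collection<T> set, Predicate<T> match) {
        for (T object : set) {
            if (match.test(object)) {
                return true;
            }
        }
        return false;
    }

    // True when no element satisfies the predicate.
    static <T> boolean noneMatch(Collection<T> set, Predicate<T> match) {
        return !anyMatch(set, match);
    }

    // True when every element of sub has a counterpart in sup under the custom criterion.
    static <E> boolean isSubset(Collection<E> sub, Collection<E> sup, BiPredicate<E, E> match) {
        return noneMatch(sub, e -> noneMatch(sup, s -> match.test(e, s)));
    }

    // One-sided difference: elements of c1 with no counterpart in c2.
    // Neither input is modified; custom-equal duplicates inside c1 are kept.
    static <E> List<E> difference(Collection<E> c1, Collection<E> c2, BiPredicate<E, E> match) {
        List<E> diff = new ArrayList<>(c1);
        diff.removeIf(e -> anyMatch(c2, e2 -> match.test(e, e2)));
        return diff;
    }

    public static void main(String[] args) {
        List<String> a = List.of("apple", "Apple", "banana");
        List<String> b = List.of("BANANA", "cherry");
        System.out.println(difference(a, b, String::equalsIgnoreCase)); // [apple, Apple]
        System.out.println(isSubset(List.of("APPLE"), a, String::equalsIgnoreCase)); // true
    }
}
```

Like the original helpers, this is quadratic in the collection sizes, so it is only a reasonable choice for small inputs.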
Note that there is certainly room for performance tuning. But keeping it separated into small methods helps in understanding and using them with ease. Used in code, they read quite nicely.
You would then use it as you already said:
CollectionUtils.disjunctiveUnion(collection1, collection2, Foo::equalsWithoutSomeField);
Taking Jose Da Silva's suggestion into account, you could even use Comparator to build your criteria on the fly:
// The comparator works on Foo, so both types are parameterized with Foo.
Comparator<Foo> special = Comparator.comparing(Foo::thisField)
    .thenComparing(Foo::thatField);
BiPredicate<Foo, Foo> specialMatch = (e1, e2) -> special.compare(e1, e2) == 0;
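For a self-contained picture of the Comparator idea, here is a minimal sketch. The Foo class, its fields, and the matching helper are hypothetical stand-ins I made up for illustration; only Comparator and BiPredicate come from the JDK.

```java
import java.util.Comparator;
import java.util.function.BiPredicate;

final class ComparatorMatchSketch {

    // Hypothetical stand-in for the question's Foo; the field names are assumptions.
    static final class Foo {
        private final String thisField;
        private final int thatField;
        private final String ignoredField; // deliberately left out of the criteria

        Foo(String thisField, int thatField, String ignoredField) {
            this.thisField = thisField;
            this.thatField = thatField;
            this.ignoredField = ignoredField;
        }

        String thisField() { return thisField; }
        int thatField() { return thatField; }
    }

    // Adapter: two elements "match" when the comparator ranks them as equal.
    static <T> BiPredicate<T, T> matching(Comparator<? super T> comparator) {
        return (a, b) -> comparator.compare(a, b) == 0;
    }

    static final BiPredicate<Foo, Foo> SPECIAL_MATCH = matching(
            Comparator.comparing(Foo::thisField).thenComparingInt(Foo::thatField));

    public static void main(String[] args) {
        Foo x = new Foo("k", 1, "left");
        Foo y = new Foo("k", 1, "right");
        Foo z = new Foo("k", 2, "left");
        System.out.println(SPECIAL_MATCH.test(x, y)); // true: only ignoredField differs
        System.out.println(SPECIAL_MATCH.test(x, z)); // false: thatField differs
    }
}
```

The resulting BiPredicate can be passed wherever the helpers above expect a match function.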
You can use UnifiedSetWithHashingStrategy from Eclipse Collections. UnifiedSetWithHashingStrategy allows you to create a Set with a custom HashingStrategy. HashingStrategy allows the user to use a custom hashCode() and equals(). The Object's hashCode() and equals() are not used.
Edit based on a requirement from the OP via comment:
You can use reject() or removeIf() depending on your requirement.
Code Example:
// Common code
Person person1 = new Person("A", "A");
Person person2 = new Person("B", "B");
Person person3 = new Person("C", "A");
Person person4 = new Person("A", "D");
Person person5 = new Person("E", "E");
MutableSet<Person> personSet1 = Sets.mutable.with(person1, person2, person3);
MutableSet<Person> personSet2 = Sets.mutable.with(person2, person4, person5);
HashingStrategy<Person> hashingStrategy =
HashingStrategies.fromFunction(Person::getLastName);
1) Using reject(): Creates a new Set which contains all the elements which do not satisfy the Predicate.
@Test
public void reject()
{
MutableSet<Person> personHashingStrategySet = HashingStrategySets.mutable.withAll(
hashingStrategy, personSet2);
// reject creates a new copy
MutableSet<Person> rejectSet = personSet1.reject(personHashingStrategySet::contains);
Assert.assertEquals(Sets.mutable.with(person1, person3), rejectSet);
}
2) Using removeIf(): Mutates the original Set by removing the elements which satisfy the Predicate.
@Test
public void removeIfTest()
{
MutableSet<Person> personHashingStrategySet = HashingStrategySets.mutable.withAll(
hashingStrategy, personSet2);
// removeIf mutates the personSet1
personSet1.removeIf(personHashingStrategySet::contains);
Assert.assertEquals(Sets.mutable.with(person1, person3), personSet1);
}
Answer before the requirement from the OP via comment (kept for reference in case others find it useful):
3) Using the Sets.differenceInto() API available in Eclipse Collections:
In the code below, set1 and set2 are the two sets which use Person's equals() and hashCode(). The differenceSet is a UnifiedSetWithHashingStrategy, so it uses the lastNameHashingStrategy to define uniqueness. Hence, even though set2 does not contain person3, person3 has the same lastName as person1, so the differenceSet contains only person1.
@Test
public void differenceTest()
{
MutableSet<Person> differenceSet = Sets.differenceInto(
HashingStrategySets.mutable.with(hashingStrategy),
set1,
set2);
Assert.assertEquals(Sets.mutable.with(person1), differenceSet);
}
Person class common to both code blocks:
public class Person
{
private final String firstName;
private final String lastName;
public Person(String firstName, String lastName)
{
this.firstName = firstName;
this.lastName = lastName;
}
public String getFirstName()
{
return firstName;
}
public String getLastName()
{
return lastName;
}
@Override
public boolean equals(Object o)
{
if (this == o)
{
return true;
}
if (o == null || getClass() != o.getClass())
{
return false;
}
Person person = (Person) o;
return Objects.equals(firstName, person.firstName) &&
Objects.equals(lastName, person.lastName);
}
@Override
public int hashCode()
{
return Objects.hash(firstName, lastName);
}
}
Javadocs: MutableSet, UnifiedSet, UnifiedSetWithHashingStrategy, HashingStrategy, Sets, reject, removeIf
Note: I am a committer on Eclipse Collections.
Comparing
You can achieve this without the use of any library, just using Java's Comparator.
For instance, with the following object:
public class AA {
private String a;
private Double b;
private String c;
private int d;
// getters and setters
}
You can use a comparator like:
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB)
.thenComparingInt(AA::getD);
This compares the fields a, b and the int d, skipping c.
The only problem here is that this won't work with null values.
Comparing nulls
One possible solution that allows fine-grained configuration, that is, allowing nulls only in specific fields, is a Comparator class similar to:
// Comparator for properties only; written to be used only with Comparator#comparing
public final class PropertyNullComparator<T extends Comparable<? super T>>
implements Comparator<Object> {
private PropertyNullComparator() { }
public static <T extends Comparable<? super T>> PropertyNullComparator<T> of() {
return new PropertyNullComparator<>();
}
@Override
public int compare(Object o1, Object o2) {
if (o1 != null && o2 != null) {
if (o1 instanceof Comparable) {
@SuppressWarnings({ "unchecked" })
Comparable<Object> comparable = (Comparable<Object>) o1;
return comparable.compareTo(o2);
} else {
// this will throw a ClassCastException when the object is not Comparable
@SuppressWarnings({ "unchecked" })
Comparable<Object> comparable = (Comparable<Object>) o2;
return comparable.compareTo(o1) * -1; // * -1 to keep order
}
} else {
return o1 == o2 ? 0 : (o1 == null ? -1 : 1); // nulls first
}
}
}
This way you can use a comparator specifying the allowed null fields.
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB, PropertyNullComparator.of())
.thenComparingInt(AA::getD);
If you don't want to define a custom comparator, you can use something like:
Comparator<AA> comparator = Comparator.comparing(AA::getA)
.thenComparing(AA::getB, Comparator.nullsFirst(Comparator.naturalOrder()))
.thenComparingInt(AA::getD);
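To see the null problem and the nullsFirst fix side by side, here is a small runnable sketch. The cut-down AA class (only fields a and b) and the constant names are assumptions made for the example; Comparator.nullsFirst and Comparator.naturalOrder are standard JDK methods.

```java
import java.util.Comparator;

final class NullSafeComparatorSketch {

    // Hypothetical bean mirroring the example's AA (only the fields needed here).
    static final class AA {
        private final String a;
        private final Double b;

        AA(String a, Double b) {
            this.a = a;
            this.b = b;
        }

        String getA() { return a; }
        Double getB() { return b; }
    }

    // Plain key comparison: calls compareTo on the extracted key, so a null b fails.
    static final Comparator<AA> PLAIN =
            Comparator.comparing(AA::getA).thenComparing(AA::getB);

    // Null-tolerant on b only: null sorts before any non-null value.
    static final Comparator<AA> NULL_SAFE = Comparator.comparing(AA::getA)
            .thenComparing(AA::getB, Comparator.nullsFirst(Comparator.naturalOrder()));

    public static void main(String[] args) {
        AA withNull = new AA("x", null);
        AA withValue = new AA("x", 1.0);
        try {
            PLAIN.compare(withNull, withValue);
        } catch (NullPointerException e) {
            System.out.println("plain comparator threw NullPointerException");
        }
        System.out.println(NULL_SAFE.compare(withNull, withValue) < 0);         // true
        System.out.println(NULL_SAFE.compare(withNull, new AA("x", null)) == 0); // true
    }
}
```

Note that field a is still not null-safe here; each nullable field needs its own nullsFirst (or nullsLast) wrapper.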
Difference method
The difference (A - B) method could be implemented using two TreeSets.
static <T> TreeSet<T> difference(Collection<T> c1,
Collection<T> c2,
Comparator<T> comparator) {
TreeSet<T> treeSet1 = new TreeSet<>(comparator); treeSet1.addAll(c1);
if (treeSet1.size() > c2.size()) {
treeSet1.removeAll(c2);
} else {
TreeSet<T> treeSet2 = new TreeSet<>(comparator); treeSet2.addAll(c2);
treeSet1.removeAll(treeSet2);
}
return treeSet1;
}
Note: a TreeSet makes sense here since we are talking about uniqueness with respect to a specific comparator. It could also perform better: the contains method of TreeSet is O(log(n)), compared to O(n) for a common ArrayList.
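To make the uniqueness point concrete, here is a minimal sketch (my own illustration, not part of the original answer) that uses the JDK's String.CASE_INSENSITIVE_ORDER as the custom comparator. The class and method names are hypothetical. It shows that a TreeSet decides membership through its comparator, which also means that elements that are equal under the comparator collapse into one entry.

```java
import java.util.Set;
import java.util.TreeSet;

final class TreeSetComparatorSketch {

    // Builds a TreeSet whose notion of "same element" is case-insensitive equality.
    static Set<String> caseInsensitiveSet(String... values) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String value : values) {
            set.add(value);
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = caseInsensitiveSet("apple", "Apple", "banana");
        System.out.println(set.size());             // 2: "Apple" is treated as a duplicate of "apple"
        System.out.println(set.contains("BANANA")); // true: lookup goes through the comparator
        System.out.println(set);                    // [apple, banana]
    }
}
```

This collapsing is the reason a TreeSet-based difference does not fit the case where duplicates under the custom equals must be preserved.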
Why is only one TreeSet used when treeSet1.size() > c2.size()? Because when the condition is not met, TreeSet#removeAll uses the contains method of the second collection. That second collection could be any Java collection, and its contains method is not guaranteed to work exactly the same as the contains of the first TreeSet (with the custom comparator).
Edit (given the additional context in the question)
Since collection1 is a set that can contain elements that are duplicates according to the custom equals (not the equals of the object), the solution already provided in the question can be used: it does exactly that, without modifying either input collection, and creates a new output set.
So you can create your own static function (at least I am not aware of a library that provides a similar method) and use a Comparator or a BiPredicate.
static <T> Set<T> difference(Collection<T> collection1,
        Collection<T> collection2,
        Comparator<T> comparator) {
    // Keep each element of collection1 that has no comparator-equal element in collection2.
    return collection1.stream()
        .filter(element1 -> !collection2.stream()
            .anyMatch(element2 -> comparator.compare(element1, element2) == 0))
        .collect(Collectors.toSet());
}
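A runnable version of this idea, with a usage example, might look like the sketch below. The class name and the sample data are assumptions; the case-insensitive comparator stands in for the custom equals. It shows that elements of the first collection that are duplicates of each other under the comparator both survive.

```java
import java.util.Collection;
import java.util.Comparator;
import java.util.Set;
import java.util.stream.Collectors;

final class StreamDifferenceSketch {

    // Elements of collection1 that have no comparator-equal element in collection2.
    static <T> Set<T> difference(Collection<T> collection1,
                                 Collection<T> collection2,
                                 Comparator<? super T> comparator) {
        return collection1.stream()
                .filter(e1 -> collection2.stream()
                        .noneMatch(e2 -> comparator.compare(e1, e2) == 0))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<String> first = Set.of("apple", "Apple", "banana");
        Set<String> second = Set.of("BANANA", "cherry");
        Set<String> diff = difference(first, second, String.CASE_INSENSITIVE_ORDER);
        System.out.println(diff.size());                                    // 2
        System.out.println(diff.contains("apple") && diff.contains("Apple")); // true
    }
}
```

The result is collected into a set that uses the elements' own equals, so "apple" and "Apple" remain distinct entries.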
Edit (to Eugene)
"Why would you want to implement a null safe comparator yourself"
At least to my knowledge there isn't a ready-made comparator for comparing fields that may simply be null. The closest that I know of (to replace my suggested PropertyNullComparator.of(); a clearer/shorter/better name can be used) is:
Comparator.nullsFirst(Comparator.naturalOrder())
So you would have to write that line for every field that you want to compare. Is this doable? Of course it is. Is it practical? I think not.
An easy solution: create a helper method.
static class ComparatorUtils {
public static <T extends Comparable<? super T>> Comparator<T> shnp() { // super short null comparator
return Comparator.nullsFirst(Comparator.<T>naturalOrder());
}
}
Does this work? Yes. Is it practical? It looks like it. Is it a great solution? That depends: many people consider the exaggerated (and/or unnecessary) use of helper methods an anti-pattern (see a good old article by Nick Malik). There are some reasons listed there, but in short, this is an OO language, so OO solutions are normally preferred to static helper methods.
"As stated in the documentation: Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface. Further, the same problem would arise in the other case, when size() > c.size() because ultimately this would still call equals in the remove method. So they both have to implement Comparator and equals consistently for this to work correctly"
The TreeSet javadoc says the following, but with a clear "if":
Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the Set interface
Then it says:
See Comparable or Comparator for a precise definition of consistent with equals
The Comparable javadoc says:
It is strongly recommended (though not required) that natural orderings be consistent with equals
Reading on in the javadoc (in the same paragraph), it says the following:
This is so because the Set interface is defined in terms of the equals operation, but a TreeSet instance performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Set interface.
From this last quote, and with some very simple debugging or just by reading the code, you can see that TreeSet uses an internal TreeMap, and that all its derived methods are based on the comparator, not the equals method.
"Why is this so implemented? because there is a difference when removing many elements from a little set and the other way around, as a matter of fact same stands for addAll"
If you go to the definition of removeAll you can see that its implementation is in AbstractSet; it is not overridden. This implementation uses the contains method of the argument collection when the argument is at least as large as the set. The behavior of that contains is uncertain: it isn't necessary (nor probable) that the received collection (e.g. list, queue, etc.) has or can define the same comparator.
Update 1: This JDK bug is being discussed (and considered for fixing) here: https://bugs.openjdk.java.net/browse/JDK-6394757
static <T> Collection<T> diff(Collection<T> minuend, Collection<T> subtrahend, BiPredicate<T, T> equals) {
Set<Wrapper<T>> w1 = minuend.stream().map(item -> new Wrapper<>(item, equals)).collect(Collectors.toSet());
Set<Wrapper<T>> w2 = subtrahend.stream().map(item -> new Wrapper<>(item, equals)).collect(Collectors.toSet());
w1.removeAll(w2);
return w1.stream().map(w -> w.item).collect(Collectors.toList());
}
static class Wrapper<T> {
T item;
BiPredicate<T, T> equals;
Wrapper(T item, BiPredicate<T, T> equals) {
this.item = item;
this.equals = equals;
}
@Override
public int hashCode() {
// all items have same hash code, check equals
return 1;
}
@Override
public boolean equals(Object that) {
return equals.test(this.item, ((Wrapper<T>) that).item);
}
}
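A self-contained, lightly adapted copy of this wrapper approach with a usage example is sketched below. The class name, the renamed matcher field, the instanceof guard and the sample data are my own additions for illustration. The example also shows a consequence relevant to the question: because the wrappers live in a Set, elements of the minuend that are duplicates under the custom equals collapse into one.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;
import java.util.stream.Collectors;

final class WrapperDiffSketch {

    static <T> List<T> diff(Collection<T> minuend, Collection<T> subtrahend, BiPredicate<T, T> matcher) {
        Set<Wrapper<T>> w1 = minuend.stream()
                .map(item -> new Wrapper<T>(item, matcher))
                .collect(Collectors.toSet());
        Set<Wrapper<T>> w2 = subtrahend.stream()
                .map(item -> new Wrapper<T>(item, matcher))
                .collect(Collectors.toSet());
        w1.removeAll(w2);
        return w1.stream().map(w -> w.item).collect(Collectors.toList());
    }

    // Delegates equality to the supplied predicate.
    static final class Wrapper<T> {
        final T item;
        final BiPredicate<T, T> matcher;

        Wrapper(T item, BiPredicate<T, T> matcher) {
            this.item = item;
            this.matcher = matcher;
        }

        @Override
        public int hashCode() {
            // Constant hash: every lookup falls back to equals, so performance degrades.
            return 1;
        }

        @Override
        @SuppressWarnings("unchecked")
        public boolean equals(Object that) {
            return that instanceof Wrapper && matcher.test(item, ((Wrapper<T>) that).item);
        }
    }

    public static void main(String[] args) {
        List<String> result = diff(
                List.of("apple", "Apple", "banana"),
                List.of("BANANA"),
                String::equalsIgnoreCase);
        // Only one of "apple"/"Apple" survives, because they are equal as wrappers.
        System.out.println(result.size()); // 1
    }
}
```

If custom-equal duplicates must be preserved, a filter-based approach that never puts the elements into a comparator- or wrapper-keyed set is the safer choice.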
pom.xml:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-collections4</artifactId>
<version>4.4</version>
</dependency>
code/test:
package com.my;
import lombok.Builder;
import lombok.Getter;
import lombok.ToString;
import org.apache.commons.collections4.CollectionUtils;
import org.apache.commons.collections4.Equator;
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.function.Function;
public class Diff {
public static class FieldEquator<T> implements Equator<T> {
private final Function<T, Object>[] functions;
@SafeVarargs
public FieldEquator(Function<T, Object>... functions) {
if (Objects.isNull(functions) || functions.length < 1) {
throw new UnsupportedOperationException();
}
this.functions = functions;
}
@Override
public boolean equate(T o1, T o2) {
if (Objects.isNull(o1) && Objects.isNull(o2)) {
return true;
}
if (Objects.isNull(o1) || Objects.isNull(o2)) {
return false;
}
for (Function<T, ?> function : functions) {
if (!Objects.equals(function.apply(o1), function.apply(o2))) {
return false;
}
}
return true;
}
@Override
public int hash(T o) {
if (Objects.isNull(o)) {
return -1;
}
int i = 0;
Object[] vals = new Object[functions.length];
for (Function<T, Object> function : functions) {
vals[i] = function.apply(o);
i++;
}
return Objects.hash(vals);
}
}
@SafeVarargs
private static <T> Set<T> difference(Collection<T> a, Collection<T> b, Function<T, Object>... functions) {
if ((Objects.isNull(a) || a.isEmpty()) && Objects.nonNull(b) && !b.isEmpty()) {
return new HashSet<>(b);
} else if ((Objects.isNull(b) || b.isEmpty()) && Objects.nonNull(a) && !a.isEmpty()) {
return new HashSet<>(a);
}
Equator<T> eq = new FieldEquator<>(functions);
Collection<T> res = CollectionUtils.removeAll(a, b, eq);
res.addAll(CollectionUtils.removeAll(b, a, eq));
return new HashSet<>(res);
}
/**
* Test
*/
@Builder
@Getter
@ToString
public static class A {
String a;
String b;
String c;
}
public static void main(String[] args) {
Set<A> as1 = new HashSet<>();
Set<A> as2 = new HashSet<>();
A a1 = A.builder().a("1").b("1").c("1").build();
A a2 = A.builder().a("1").b("1").c("2").build();
A a3 = A.builder().a("2").b("1").c("1").build();
A a4 = A.builder().a("1").b("3").c("1").build();
A a5 = A.builder().a("1").b("1").c("1").build();
A a6 = A.builder().a("1").b("1").c("2").build();
A a7 = A.builder().a("1").b("1").c("6").build();
as1.add(a1);
as1.add(a2);
as1.add(a3);
as2.add(a4);
as2.add(a5);
as2.add(a6);
as2.add(a7);
System.out.println("Set1: " + as1);
System.out.println("Set2: " + as2);
// Check A::getA, A::getB ignore A::getC
Collection<A> difference = difference(as1, as2, A::getA, A::getB);
System.out.println("Diff: " + difference);
}
}
result:
Set1: [Diff.A(a=2, b=1, c=1), Diff.A(a=1, b=1, c=1), Diff.A(a=1, b=1, c=2)]
Set2: [Diff.A(a=1, b=1, c=6), Diff.A(a=1, b=1, c=2), Diff.A(a=1, b=3, c=1), Diff.A(a=1, b=1, c=1)]
Diff: [Diff.A(a=1, b=3, c=1), Diff.A(a=2, b=1, c=1)]