Java Stream and Filter by string Levenshtein distance

I'm trying to figure out whether there is an elegant way to do the following using Java streams:

  1. take a list of Pojos where one of the fields is a String (e.g. surname)
  2. take a String that you want to search for (e.g. surnameTypedIn)
  3. find the Pojo in the list with the smallest Levenshtein distance to that String (I'm using the Apache Commons StringUtils.getLevenshteinDistance)
  4. return the whole Pojo, not just the surname or the distance

So far the only way I've been able to do it is to create an intermediate map at each level, which works but feels very dirty. Is there an accepted way to do this, e.g. by implementing a custom Collector or something like that?

Just create a Comparator<Pojo>:

// surnameTypedIn must be effectively final to be captured by the lambda
Comparator<Pojo> comparator =
    Comparator.comparingInt(
        p -> StringUtils.getLevenshteinDistance(p.surname(), surnameTypedIn));

Then use the Stream.min method:

Optional<Pojo> minPojo = listOfPojos.stream().min(comparator);

(You can inline the Comparator.comparingInt in the Stream.min call if you want; I just separated them for readability.)
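To try the comparator approach without pulling in Apache Commons, here is a minimal self-contained sketch. The Pojo class, its id field, the sample surnames, and the hand-written levenshtein helper are all assumptions made for illustration; the helper merely stands in for StringUtils.getLevenshteinDistance and is not the library's implementation.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class ClosestSurname {

    // Hypothetical stand-in for the Pojo in the question.
    static final class Pojo {
        private final String surname;
        private final int id;

        Pojo(String surname, int id) {
            this.surname = surname;
            this.id = id;
        }

        String surname() { return surname; }
        int id() { return id; }
    }

    // Textbook dynamic-programming Levenshtein distance using two rows.
    // Assumption: used here only as a substitute for
    // StringUtils.getLevenshteinDistance so the example has no dependencies.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1], prev[j]) + 1;
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the whole Pojo whose surname is closest to surnameTypedIn,
    // or an empty Optional when the list is empty.
    static Optional<Pojo> closest(List<Pojo> listOfPojos, String surnameTypedIn) {
        return listOfPojos.stream()
                .min(Comparator.comparingInt(p -> levenshtein(p.surname(), surnameTypedIn)));
    }

    public static void main(String[] args) {
        List<Pojo> listOfPojos = Arrays.asList(
                new Pojo("Smith", 1), new Pojo("Smythe", 2), new Pojo("Jones", 3));
        closest(listOfPojos, "Smit")
                .ifPresent(p -> System.out.println(p.id() + " " + p.surname()));
    }
}
```

Because the comparator recomputes the distance on every comparison, each element's distance is evaluated more than once on larger lists; for short surnames that is usually acceptable.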

Or, without streams:

Pojo minPojo = Collections.min(listOfPojos, comparator);

Note that Collections.min throws a NoSuchElementException if listOfPojos is empty, whereas Stream.min returns an empty Optional in that case.
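The difference in empty-list behaviour can be seen in a small sketch. The class and method names below are made up for illustration, and plain strings compared by length are used instead of Pojos to keep it short; the same pattern applies to any Comparator<Pojo>.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class EmptyListMin {

    // Stream.min: an empty list simply yields Optional.empty().
    static <T> Optional<T> minWithStream(List<T> items, Comparator<? super T> comparator) {
        return items.stream().min(comparator);
    }

    // Collections.min: guard against the empty case explicitly, because
    // Collections.min itself would throw NoSuchElementException.
    static <T> Optional<T> minWithCollections(List<T> items, Comparator<? super T> comparator) {
        if (items.isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Collections.min(items, comparator));
    }

    public static void main(String[] args) {
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        List<String> names = Arrays.asList("ccc", "a", "bb");
        List<String> none = Collections.emptyList();

        System.out.println(minWithStream(names, byLength));
        System.out.println(minWithStream(none, byLength));
        System.out.println(minWithCollections(none, byLength));
    }
}
```

Which variant to prefer is a matter of taste: the Optional forces the caller to handle the empty case, while the exception suits code where an empty list would be a programming error.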

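Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.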

 