
Scala: Compare all elements in a huge list

Please advise on an algorithm and implementation for comparing the elements of a very long list in Scala. I have a list of thousands of strings (from SQL) and I need to compare each element with every other element in the list.

As a result I need a list of tuples, List[(String, String, Boolean)], where the first two elements are the strings being compared and the third is the result of the comparison.

For a list of N elements, my algorithm so far is as follows:

  1. Take the head of the list
  2. Compare the head with the remaining N-1 elements in the list
  3. Build a new list from the tail of the old list and repeat the steps above on this new list of N-1 elements

Code:

  /**
   * Compare the head of the list with each remaining element in this list
   */
  def cmpel(
    fst: String, lst: List[String],
    result: List[(String, String, Boolean)]): List[(String, String, Boolean)] = {

    lst match {
      // Prepend each comparison, then reverse once at the end to restore order
      case next :: tail => cmpel(fst, tail, (fst, next, fst == next) :: result)
      case Nil => result.reverse
    }
  }

  /**
   * Compare list elements in all combinations of two
   */
  def cmpAll(lst: List[String],
    result: List[(String, String, Boolean)]): List[(String, String, Boolean)] = {
    lst match {
      case head :: tail => cmpAll(tail, result ++ cmpel(head, tail, List()))
      case Nil => result
    }
  }

  def main(args: Array[String]): Unit = {
    val lst = List[String]("a", "b", "b", "a")
    println(cmpAll(lst, List()))
  }

Result:

 List((a,b,false), (a,b,false), (a,a,true), (b,b,true), (b,a,false), (b,a,false))

Thanks!

You can use the tails and flatMap methods to write a more concise and idiomatic solution:

list.tails.flatMap {
  case x :: rest => rest.map { y =>
    (x, y, x == y)
  }
  case _ => List()
}.toList

The tails method returns an iterator over the results of repeatedly applying .tail to the list. The first element of the iterator is the list itself, then its tail, and so on, ending with the empty list.
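
To make the behaviour of tails concrete, here is a minimal sketch that runs the expression above on the sample list from the question. The names TailsDemo and comparePairs are hypothetical and only serve the illustration. The helper returns the Iterator produced by flatMap rather than calling .toList right away, on the assumption that for a very long list you may prefer to consume the pairs lazily.

```scala
object TailsDemo {
  // Hypothetical helper: wraps the tails/flatMap expression from the answer.
  // The result is an Iterator, so no pair is built until it is consumed.
  def comparePairs(list: List[String]): Iterator[(String, String, Boolean)] =
    list.tails.flatMap {
      // Pair the first element of each suffix with every later element
      case x :: rest => rest.map(y => (x, y, x == y))
      // The last suffix is the empty list and contributes nothing
      case _         => Nil
    }

  def main(args: Array[String]): Unit = {
    val lst = List("a", "b", "b", "a")

    // Every suffix of the list, ending with the empty list
    println(lst.tails.toList)
    // prints List(List(a, b, b, a), List(b, b, a), List(b, a), List(a), List())

    // Materialise all comparisons at once, as in the answer
    println(comparePairs(lst).toList)

    // Or consume the iterator lazily, e.g. count the matching pairs
    println(comparePairs(lst).count(_._3)) // prints 2
  }
}
```

A list of N elements yields N(N-1)/2 pairs, so for thousands of strings the materialised result holds millions of tuples; iterating over the result instead of calling .toList avoids keeping all of them in memory at once.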


 