Removing duplicates from a Ruby Array

Question

I can easily remove duplicates from an array with .uniq, but how would I do it without using the .uniq method?

Answer 1

a = [1, 1, 1, 2, 4, 3, 4, 3, 2, 5, 5, 6]

class Array
  def my_uniq
    self | []
  end
end

a.my_uniq
  #=> [1, 2, 4, 3, 5, 6]

This uses the method Array#|: "Set Union - Returns a new array by joining ary with other_ary, excluding any duplicates and preserving the order from the original array."
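The same operator also works without reopening the Array class. The following is a minimal sketch with made-up sample data (the variable names are illustrative only); it shows that the first occurrence of each element is kept and that the receiver is not modified.

```ruby
# Sample data chosen for illustration; not from the original answer.
words = %w[pear apple pear fig apple]

# Union with an empty array drops later duplicates and keeps
# elements in first-occurrence order.
deduped = words | []

p deduped       # prints ["pear", "apple", "fig"]
p words.length  # prints 5, the original array is unchanged
```

Array#| returns a new array, so this form is suitable when the original data must stay intact.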

Here is a benchmark of the various answers, along with Array#uniq.

require 'fruity'
require 'set'

def doit(n, m)
  arr = n.times.to_a
  arr = m.times.map { arr.sample }
  compare do
    uniq     { arr.uniq } 
    Schwern  { uniq = []; arr.sort.each { |e| uniq.push(e) if e != uniq[-1] }; uniq }
    Sharma   {b = []; arr.each{ |aa| b << aa unless b.include?(aa) }; b }
    Mihael   { arr.to_set.to_a }
    sawa     { arr.group_by(&:itself).keys }
    Cary     { arr | [] }
  end
end

doit(1_000, 500)
# Schwern is faster than uniq by 19.999999999999996% ± 10.0% (results differ)
# uniq is similar to Cary
# Cary is faster than Mihael by 10.000000000000009% ± 10.0%
# Mihael is similar to sawa
# sawa is faster than Sharma by 5x ± 0.1

doit(100_000, 50_000)
# Schwern is faster than uniq by 50.0% ± 10.0%               (results differ)
# uniq is similar to Cary
# Cary is similar to Mihael
# Mihael is faster than sawa by 10.000000000000009% ± 10.0%
# sawa is faster than Sharma by 310x ± 10.0

"Schwern" and "uniq" return arrays that contain the same elements but not in the same order (hence "results differ").
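To make the ordering difference concrete, here is a small sketch with illustrative data (the variable names are assumptions, not part of the benchmark). The sort-based approach yields sorted output, while Array#uniq keeps first-occurrence order.

```ruby
# Illustrative input; not taken from the benchmark above.
arr = [3, 1, 3, 2, 1]

# Sort-based deduplication: equal elements become adjacent after
# sorting, so only the last pushed element needs to be compared.
sorted_unique = []
arr.sort.each { |e| sorted_unique.push(e) if e != sorted_unique[-1] }

p arr.uniq       # prints [3, 1, 2]
p sorted_unique  # prints [1, 2, 3]

# Same elements, different order.
p arr.uniq.sort == sorted_unique  # prints true
```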

Here is the additional benchmark requested by @Schwern.

def doit1(n)
  arr = n.times.map { rand(n/10) }
  compare do
    uniq     { arr.uniq } 
    Schwern  { uniq = []; arr.sort.each { |e| uniq.push(e) if e != uniq[-1] }; uniq }
    Sharma   {b = []; arr.each{ |aa| b << aa unless b.include?(aa) }; b }
    Mihael   { arr.to_set.to_a }
    sawa     { arr.group_by(&:itself).keys }
    Cary     { arr | [] }
  end
end

doit1(1_000)
# Cary is similar to uniq
# uniq is faster than sawa by 3x ± 1.0
# sawa is similar to Schwern                     (results differ)
# Schwern is similar to Mihael                   (results differ)
# Mihael is faster than Sharma by 2x ± 0.1

doit1(50_000)
# Cary is similar to uniq
# uniq is faster than Schwern by 2x ± 1.0        (results differ)
# Schwern is similar to Mihael                   (results differ)
# Mihael is similar to sawa
# sawa is faster than Sharma by 62x ± 10.0

Answer 2

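The code for most Ruby methods can be found in the ruby-doc.org API documentation. If you mouse over a method's documentation, a "click to toggle source" button appears. The code is in C, but it is easy to follow.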

if (RARRAY_LEN(ary) <= 1)
    return rb_ary_dup(ary);

if (rb_block_given_p()) {
    hash = ary_make_hash_by(ary);
    uniq = rb_hash_values(hash);
}
else {
    hash = ary_make_hash(ary);
    uniq = rb_hash_values(hash);
}

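If the array has at most one element, a copy of it is returned. Otherwise the elements are turned into hash keys, and the hash is turned back into an array. By a documented quirk of Ruby hashes, "Hashes enumerate their values in the order that the corresponding keys were inserted", this technique preserves the original order of the elements in the Array. In other languages it may not.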
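The same idea can be sketched in plain Ruby. This is a simplified illustration rather than the actual C implementation, and the method name uniq_via_hash is hypothetical. It relies on the insertion-order guarantee quoted above.

```ruby
# Hypothetical helper, named for illustration only.
# Each element becomes a hash key; a repeated key simply overwrites
# the earlier entry, so duplicates collapse.
def uniq_via_hash(array)
  seen = {}
  array.each { |element| seen[element] = true }
  # Keys come back in insertion order, i.e. first-occurrence order.
  seen.keys
end

p uniq_via_hash([2, 3, 4, 5, 1, 2, 4, 5])  # prints [2, 3, 4, 5, 1]
```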

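Alternatively, use a Set. A set never contains duplicates. Loading set adds the method to_set to all Enumerable objects, which includes Arrays. However, a Set is usually implemented as a Hash, so you are doing the same thing. If you want a unique array and do not need the elements to be ordered, you should probably create a set instead and use that.

unique = array.to_set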
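A brief sketch of working with the set directly, using sample data chosen for illustration:

```ruby
require 'set'

# Sample data for illustration.
array = [2, 3, 4, 5, 1, 2, 4, 5]
unique = array.to_set

p unique.size         # prints 5
p unique.include?(4)  # prints true

# Adding an element that is already present has no effect.
unique << 2
p unique.size         # prints 5

# Convert back to an Array when one is needed.
p unique.to_a.sort    # prints [1, 2, 3, 4, 5]
```

Keeping the data in a Set avoids repeated deduplication when more elements are added later.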

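Alternatively, sort the Array and loop through it, pushing each element onto a new Array. If the last element of the new Array matches the current element, discard it.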

array = [2, 3, 4, 5, 1, 2, 4, 5]
uniq = []

# This copies the whole array and the duplicates, wasting
# memory.  And sort is O(nlogn).
array.sort.each { |e|
  uniq.push(e) if e != uniq[-1]
}

puts uniq.inspect
# prints [1, 2, 3, 4, 5]

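This approach should be avoided because it is slower and uses more memory than the other methods. The sort is what makes it slow. Sorting is O(n log n), meaning that as the array gets bigger, the time to sort grows faster than the array does. It also requires copying the whole array, duplicates included, unless you are willing to alter the original data by sorting in place with sort!.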

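The other methods are O(n) in time and O(n) in memory, meaning they scale linearly as the array gets bigger. They also do not have to copy the duplicates, which can use substantially less memory.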
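One way to get the linear behaviour described above is a single pass that remembers which elements have already been seen. The sketch below is an illustration, not one of the benchmarked answers; the method name uniq_with_seen_set is hypothetical. It uses Set#add?, which returns nil when the element is already present.

```ruby
require 'set'

# Hypothetical helper, named for illustration only.
# One pass over the input; each membership check is a hash lookup,
# so the total work grows linearly with the array size.
def uniq_with_seen_set(array)
  seen = Set.new
  # Set#add? returns the set (truthy) for a new element and nil for
  # a duplicate, so select keeps only first occurrences.
  array.select { |element| seen.add?(element) }
end

p uniq_with_seen_set([2, 3, 4, 5, 1, 2, 4, 5])  # prints [2, 3, 4, 5, 1]
```

Only the unique elements are stored in the helper set, and no sorted copy of the input is made.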

Answer 3

You can use #to_set.

Answer 4

array.group_by(&:itself).keys


Answer 5

You can also try this; see the following example.

a = [1, 1, 1, 2, 4, 3, 4, 3, 2, 5, 5, 6]

b = []

a.each{ |aa| b << aa unless b.include?(aa) }

# Inspecting b gives the following result:
# [1, 2, 4, 3, 5, 6]

Or you can also try the following.

a = [1, 1, 1, 2, 4, 3, 4, 3, 2, 5, 5, 6]

b = a & a

# OR

b = a | a

# Both return the following result:
# [1, 2, 4, 3, 5, 6]

5 solutions

Solution 1
5 2015-12-26 21:32:57

Solution 2
4 accepted 2015-12-26 20:28:47

Solution 3
3 2015-12-26 20:23:01

Solution 4
2 2015-12-26 20:44:26

Solution 5
1 2015-12-26 20:40:03
