
What is the fastest way to get the frequency of numbers in an array in C++?

My method creates a std::map<int, int> and populates it with each number and its frequency by iterating over the array once, but I'm wondering whether there is a quicker way that does not use a map.

std::unordered_map<int, int> can count frequencies as well. Its operator[] has this complexity (cppreference):

Average case: constant, worst case: linear in size.

Compared to

Logarithmic in the size of the container.

for std::map.
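As a minimal sketch of the two map-based approaches compared above (the function names count_with_map and count_with_unordered_map are made up for illustration, and the input is assumed to be a std::vector<int>):

```cpp
#include <map>
#include <unordered_map>
#include <vector>

// Ordered map: each insertion or lookup is logarithmic in the container size.
std::map<int, int> count_with_map(const std::vector<int>& values)
{
    std::map<int, int> freq;
    for (int v : values)
        ++freq[v];  // operator[] value-initializes a missing count to 0
    return freq;
}

// Hash map: constant time on average, linear in size in the worst case.
std::unordered_map<int, int> count_with_unordered_map(const std::vector<int>& values)
{
    std::unordered_map<int, int> freq;
    freq.reserve(values.size());  // optional: avoids rehashing while counting
    for (int v : values)
        ++freq[v];
    return freq;
}
```

Both functions return the same counts; only the iteration order of the result and the cost per update differ.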

When the maximum number is small, you can use an array and count directly:

 for (const auto& number : array) counter[number]++;
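A fuller sketch of that loop, assuming every value is non-negative and no larger than a known max_value (the helper name count_small_values and both parameters are hypothetical):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: counts values that are known to lie in [0, max_value].
// Passing a value outside that range is undefined behavior.
std::vector<std::size_t> count_small_values(const std::vector<int>& array, int max_value)
{
    std::vector<std::size_t> counter(static_cast<std::size_t>(max_value) + 1, 0);
    for (const auto& number : array)
        ++counter[static_cast<std::size_t>(number)];  // the value itself is the index
    return counter;
}
```

The counter needs max_value + 1 slots, so this only pays off when the value range is small compared to the available memory.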

Admittedly, all of this has already been said in the comments, so I'll add one more point: you need to measure. Complexity only describes asymptotic runtime; for a given input size, a std::map can actually be faster.
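One rough way to measure is a small timing helper built on std::chrono::steady_clock. The name time_ms is hypothetical, and this is only a sketch, not a substitute for a proper benchmarking setup:

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs f once and returns the elapsed time in milliseconds.
template <typename F>
double time_ms(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(f)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrap each candidate in a lambda, run it on input that resembles your real data, repeat several times, and compile with optimizations enabled before comparing the numbers.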

NOTE: ValueType and DifferenceType are defined as

template <std::input_iterator I>
using ValueType = typename std::iterator_traits<I>::value_type;

template <std::input_iterator I>
using DifferenceType = typename std::iterator_traits<I>::difference_type;

If the array is sorted, you can use std::equal_range to find the range of elements equal to x. With concepts you can write:

// ValueType<I2> behaves like std::pair<I, unsigned>
// [first, last) must be sorted with respect to r
// return value is d_first + (number of distinct values in [first, last))
// R is a relation over ValueType<I>; elements are compared using r
template <std::random_access_iterator I, std::forward_iterator I2,
          std::relation<ValueType<I>, ValueType<I>> R = std::less<ValueType<I>>>
requires(std::regular<ValueType<I>> &&
         std::is_constructible_v<ValueType<I2>, I, DifferenceType<I>>)
I2 frequency_sorted(I first, I last, I2 d_first, R r = R())
{
  while(first != last)
  {
    auto [left, right] = std::equal_range(first, last, *first, r);
    *d_first = {left, std::distance(left, right)};
    ++d_first;
    first = right;
  }
  return d_first;
}

If you have limited resources, you can truncate the result:

// ValueType<I2> behaves like std::pair<I, unsigned>
// [first, last) must be sorted with respect to r
// return value is a pair:
// - the first element is the start of the part of [first, last)
//   that has not been processed yet
// - the second element is
//   - d_last if the number of distinct values >= d_last - d_first
//   - d_first + (number of distinct values) otherwise
template <std::random_access_iterator I, std::forward_iterator I2,
          std::relation<ValueType<I>, ValueType<I>> R = std::less<ValueType<I>>>
requires(std::regular<ValueType<I>> &&
         std::is_constructible_v<ValueType<I2>, I, DifferenceType<I>>)
std::pair<I, I2>
frequency_sorted_truncate(I first, I last, I2 d_first, I2 d_last, R r = R())
{
  while(first != last && d_first != d_last)
  {
    auto [left, right] = std::equal_range(first, last, *first, r);
    *d_first = {left, std::distance(left, right)};
    ++d_first;
    first = right;
  }
  return {first, d_first};
}

Both functions accept any relation; the default, std::less, compares with operator<.

If your array is unsorted and large enough, it might be a good idea to simply sort it and then use the algorithm. Hashing might be tempting, but it causes cache misses and might not be as fast as you would expect. You can try both methods and measure which one is faster; you are welcome to tell me the result.
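A minimal sketch of the sort-then-count idea for an unsorted array, assuming the values are plain ints and that sorting a copy is acceptable (count_by_sorting is a hypothetical name, not one of the functions above):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts a copy of the input, then records each run of equal values
// as a (value, count) pair in ascending order of value.
std::vector<std::pair<int, std::size_t>> count_by_sorting(std::vector<int> values)
{
    std::sort(values.begin(), values.end());
    std::vector<std::pair<int, std::size_t>> freq;
    for (auto it = values.begin(); it != values.end();)
    {
        // upper_bound finds the end of the current run with a binary search
        auto run_end = std::upper_bound(it, values.end(), *it);
        freq.emplace_back(*it, static_cast<std::size_t>(run_end - it));
        it = run_end;
    }
    return freq;
}
```

The sort dominates at O(n log n), but it works on contiguous memory, which is the cache-friendliness argument made above. Whether it beats a hash map for your data is something only a measurement can tell.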

My compiler version is g++ 11.2.11, and I think the code compiles with any C++20 compiler. If you don't have one, simply replace the concepts with typename and drop the requires clause; I think you will then only need a C++17 compiler (because of the structured bindings).
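A sketch of what that C++17 fallback could look like for frequency_sorted. The name frequency_sorted_cpp17 is made up to keep it distinct from the C++20 version, and the requirements on the iterators are no longer checked by the compiler:

```cpp
#include <algorithm>
#include <functional>
#include <iterator>
#include <utility>

// C++17 sketch: concepts replaced by plain typename parameters.
// [first, last) must be sorted with respect to r, and the value type of I2
// must be assignable from {iterator, count}, e.g. std::pair<I, std::ptrdiff_t>.
template <typename I, typename I2,
          typename R = std::less<typename std::iterator_traits<I>::value_type>>
I2 frequency_sorted_cpp17(I first, I last, I2 d_first, R r = R())
{
    while (first != last)
    {
        // the run of elements equivalent to *first
        auto [left, right] = std::equal_range(first, last, *first, r);
        *d_first = {left, std::distance(left, right)};
        ++d_first;
        first = right;
    }
    return d_first;
}
```

The output range has to be large enough for one entry per distinct value; sizing it to the input length is always sufficient.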

Please tell me whether my code can be improved.
