
Hamming numbers for O(N) speed and O(1) memory

Original post: 2016-05-12 23:27:55 · Tags: algorithm, time-complexity, big-o, space-complexity, hamming-numbers

Disclaimer: there are many questions about this, but I didn't find any that require constant memory.

Hamming numbers are the numbers of the form 2^i*3^j*5^k, where i, j, k are natural numbers (including 0).

Is it possible to generate the Nth Hamming number in O(N) time and O(1) (constant) memory? By "generate" I mean exactly a generator, i.e. you can only output the result and cannot read back the previously generated numbers (in that case the memory would not be constant). But you can save some constant number of them.

The best algorithm with constant memory that I can see is no better than O(N log N), for example one based on a priority queue. But is there a mathematical proof that it is impossible to construct an O(N)-time algorithm?
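For reference, the priority-queue approach mentioned above could be sketched as follows. This is only an illustration of the O(N log N) time bound, and the name `hamming_heap` is made up for this example. Note that the heap and the `seen` set in this sketch both grow with N, so by itself it does not meet the constant-memory requirement.

```python
import heapq

def hamming_heap(n):
    """Return the first n Hamming numbers using a min-heap (illustrative sketch)."""
    heap = [1]
    seen = {1}          # guards against pushing the same product twice (e.g. 6 = 2*3 = 3*2)
    out = []
    while len(out) < n:
        x = heapq.heappop(heap)     # smallest number not yet output
        out.append(x)
        for p in (2, 3, 5):
            y = x * p
            if y not in seen:
                seen.add(y)
                heapq.heappush(heap, y)
    return out
```

Each of the N pops and up to 3N pushes costs O(log N) comparisons, which is where the N log N comes from.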

1 Answer

The first thing to consider here is the direct slice enumeration algorithm, which can be seen e.g. in this SO answer. It enumerates the triples (k,j,i) in the vicinity of a given base-2 logarithm value of a sequence member, so that target - delta < k*log2_5 + j*log2_3 + i < target + delta, progressively calculating the cumulative logarithm while picking j and k so that i is directly known.
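A minimal sketch of that band enumeration, assuming plain floating-point logarithms, might look like this. The names `band_triples`, `LOG2_3` and `LOG2_5` are chosen for this illustration and are not taken from the linked answer. Double precision will eventually misplace values that fall very close to the band edges for large targets.

```python
from math import floor, log2

LOG2_3 = log2(3)
LOG2_5 = log2(5)

def band_triples(target, delta):
    """Triples (k, j, i) with target - delta < k*log2(5) + j*log2(3) + i < target + delta,
    returned in increasing order of that logarithm."""
    lo, hi = target - delta, target + delta
    found = []
    k = 0
    while k * LOG2_5 < hi:
        j = 0
        while True:
            partial = k * LOG2_5 + j * LOG2_3   # logarithm contributed by 5^k * 3^j
            if partial >= hi:
                break
            # i is determined directly: smallest integer i >= 0 with partial + i > lo
            i = max(0, floor(lo - partial) + 1)
            while partial + i < hi:             # at most one i when delta <= 0.5
                found.append((partial + i, k, j, i))
                i += 1
            j += 1
        k += 1
    found.sort()                                # (j, k) enumeration yields them out of order
    return [(k, j, i) for _, k, j, i in found]
```

For example, `band_triples(5.0, 0.5)` covers the numbers whose base-2 logarithm lies between 4.5 and 5.5, and mapping each triple back through `2**i * 3**j * 5**k` gives 24, 25, 27, 30, 32, 36, 40, 45.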

It is thus an N^(2/3)-time algorithm producing N^(2/3)-wide slices of the sequence at a time (with k*log2_5 + j*log2_3 + i close to the target value, so these triples form the crust of the tetrahedron filled with the Hamming sequence triples ¹), meaning O(1) time per produced number, thus producing N sequence members in O(N) amortized time and O(N^(2/3)) space. That's no improvement over the baseline Dijkstra's algorithm ², which has the same complexities, even non-amortized, and with better constant factors.
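For comparison, the baseline three-pointer merge usually attributed to Dijkstra can be sketched as below; `hamming_merge` is a name picked for this example. This plain-list version keeps the whole sequence and therefore uses O(N) space. Entries before index `i5` are never read again, so discarding them is how the smaller space bound quoted above would be reached.

```python
def hamming_merge(n):
    """Return the first n Hamming numbers by merging 2*h, 3*h and 5*h (illustrative sketch)."""
    h = [1]
    i2 = i3 = i5 = 0                 # read positions into h for each multiplier
    while len(h) < n:
        x2, x3, x5 = 2 * h[i2], 3 * h[i3], 5 * h[i5]
        x = min(x2, x3, x5)
        h.append(x)
        # advance every pointer that produced x, so duplicates like 6 appear once
        if x == x2:
            i2 += 1
        if x == x3:
            i3 += 1
        if x == x5:
            i5 += 1
    return h[:n]
```

Each new number costs a constant number of multiplications and comparisons, with no amortization involved.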

To make it O(1)-space, the crust width will need to be narrowed as we progress along the sequence. But the narrower the crust, the more misses there will be when enumerating its triples, and this is pretty much the proof you asked for. A constant slice size means O(N^(2/3)) work per O(1)-sized slice, for an overall O(N^(5/3)) amortized time, O(1) space algorithm.
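One way to picture the constant-slice end point is a generator that rescans every (j,k) pair for each output number. The sketch below is an assumption-laden illustration rather than the linked answer's code: it uses exact integers instead of logarithms, and the names `next_hamming` and `hamming_stream` are invented here. It keeps only a handful of numbers at any time, ignoring the growth of the integers themselves.

```python
def next_hamming(x):
    """Smallest number of the form 2^i * 3^j * 5^k strictly greater than x (x >= 1)."""
    best = 1 << x.bit_length()          # smallest power of 2 above x, a valid candidate
    p5 = 1
    while p5 <= best:                   # k loop: powers of 5
        p = p5
        while p <= best:                # j loop: p = 3^j * 5^k
            # i is known directly: the smallest i with p * 2^i > x
            c = p << (x // p).bit_length()
            if c < best:
                best = c
            p *= 3
        p5 *= 5
    return best

def hamming_stream(n):
    """Yield the first n Hamming numbers, remembering only the latest one."""
    x = 1
    for _ in range(n):
        yield x
        x = next_hamming(x)
```

Almost all of the scanned (j,k) pairs are misses, since only one candidate per scan is kept, which is the cost the paragraph above describes.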

These are the two end points of this spectrum: from N^1 time, N^(2/3) space to N^0 space, N^(5/3) time, amortized.

¹ Here's the image from Wikipedia, with logarithmic vertical scale:

This essentially is a tetrahedron of Hamming sequence triples (i,j,k) stretched in space as (i*log2, j*log3, k*log5), seen from the side. The image is a bit askew if it is meant to be a true 3D picture.
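The exponents 2/3 and 5/3 can be read off this tetrahedron. As a rough asymptotic sketch that ignores lower-order terms (the symbol $h_N$ for the N-th Hamming number is introduced here for convenience):

```latex
% Number of triples inside the tetrahedron = its volume
\#\{(i,j,k) : i\ln 2 + j\ln 3 + k\ln 5 \le \ln x\}
  \;\approx\; \frac{(\ln x)^3}{6\,\ln 2\,\ln 3\,\ln 5}

% Setting this count equal to N gives the size of the N-th number
\ln h_N \;\approx\; \bigl(6\,\ln 2\,\ln 3\,\ln 5 \cdot N\bigr)^{1/3}

% Number of (j,k) pairs to scan = area of the triangular base
\#\{(j,k) : j\ln 3 + k\ln 5 \le \ln h_N\}
  \;\approx\; \frac{(\ln h_N)^2}{2\,\ln 3\,\ln 5} \;=\; \Theta\!\left(N^{2/3}\right)
```

Scanning $\Theta(N^{2/3})$ pairs once per constant-size slice, for N numbers in total, gives the $N \cdot N^{2/3} = N^{5/3}$ figure.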

edit: ² It seems I forgot that the slices have to be sorted, as they are produced out of order by the j,k enumerations. This changes the best complexity for producing the sequence's numbers in order via the slice algorithm to O(N^(2/3) log N) time per N^(2/3)-wide slice (that is, O(N log N) time for N numbers) with O(N^(2/3)) space, and makes Dijkstra's algorithm the winner there. It doesn't change the upper bound of O(N^(5/3)) time for the O(1)-sized slices, though.