BigInts seem slow in Julia

I'm really impressed with Julia, since it ran faster than D on a processor-intensive Project Euler question (#303, if anyone is interested).

What's weird is how slow BigInts in Julia seem to be. It's strange because I have read that their performance is quite good.

The following is a Julia program that calculates the number of partitions of 15,000 using Euler's recurrence formula.

function eu78()
  lim = 15000
  ps = zeros(BigInt, lim)

  function p(n)  #applies Euler recurrence formula
    if n < 0
      return BigInt(0)
    elseif n == 0
      return BigInt(1)
    elseif ps[n] > 0
      return ps[n]
    end
    s = BigInt(0)
    f = BigInt(-1)
    for k = 1 : n
      f *= -1
      t1 = (k * (3k - 1)) ÷ BigInt(2)
      t2 = (k * (3k + 1)) ÷ 2
      s += f * (p(n - t1) + p(n - t2))
    end
    ps[n] = s
  end

  for i = 1 : lim
    p(i)
  end
  println(ps[lim])
end

eu78()

It takes a whopping 3 min 43 s to generate the 132-digit answer.

The equivalent Python code run with pypy takes a mere 8 seconds.

What am I doing wrong?

BigInts are currently pretty slow in Julia. This is because each operation allocates a new BigInt, as opposed to Python, where all integers are BigInts and a fair amount of time has therefore been spent making sure that basic operations are fast. Python actually uses a hybrid approach in which small integer values are represented inline, and only values that get too large are represented as BigInts. The same could absolutely be done in Julia, but no one has implemented it yet, partly because standard Int values are machine integers, so the performance of BigInts is not critical. The real boost to BigInt performance will come from integrating Julia's BigInt type with the GC and allowing small BigInt values to be stack allocated and to live in registers. We're not quite there yet, however.
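To make the allocation point concrete, here is a minimal sketch that runs the same accumulation loop once with Int and once with BigInt and compares the bytes allocated. The helper name sum_to is hypothetical, and the exact byte counts will vary by Julia version and machine.

```julia
# Sum 1..n using an accumulator of type T (hypothetical helper for illustration).
function sum_to(::Type{T}, n::Int) where {T<:Integer}
    s = zero(T)
    for i in 1:n
        s += i   # with T == BigInt, every += produces a fresh heap-allocated BigInt
    end
    return s
end

# Warm up first so compilation is not counted in the measurements.
sum_to(Int, 10)
sum_to(BigInt, 10)

bytes_int = @allocated sum_to(Int, 10_000)
bytes_big = @allocated sum_to(BigInt, 10_000)

println("Int accumulator:    ", bytes_int, " bytes allocated")
println("BigInt accumulator: ", bytes_big, " bytes allocated")
```

The BigInt run should report allocations that grow with the number of iterations, while the Int run stays in registers.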

The following version runs in under 12 seconds on my machine with Julia 0.4:

const lim = 6*10^4
const ps = zeros(Int64, lim)
ps[1] = 1

function p(n)  #applies Euler recurrence formula for the number of partitions
    n < 0 && return 0
    n == 0 && return 1

    ps[n] > 0 && return ps[n]

    s, f = 0, -1

    for k = 1:n
      f *= -1
      t1 = (k * (3k - 1)) ÷ 2
      t2 = (k * (3k + 1)) ÷ 2
      s += f * (p(n - t1) + p(n - t2))
    end

    siz = 10^9
    ps[n] = mod(s, siz)
end

function eu78(lim=6*10^4)

    for i = 10:lim
        a = p(i)
        if mod(a, 1000000) == 0
            return (i, a)
        end
    end
end

@time eu78(10)  # warm-up

@time eu78(6*10^4)

The question is 5 years old, but I'm answering for future reference.

Unfortunately, BigInts are quite slow in Julia by default. Maybe they will get faster in the future with better compiler technology, but for now one needs to tweak the code manually to get better performance.

There are two things in your code that slow it down.

  1. First, as any guide on Julia optimization would suggest, one needs to reduce memory allocation. Constructing a BigInt allocates memory, so one should avoid constructing BigInts where possible. Hence in

    if n < 0
      return BigInt(0)

    the BigInt(0) should be replaced with big"0", or alternatively with a predefined constant ZERO = BigInt(0), to avoid creating a new zero every time.

    Another important issue is the use of in-place operators for BigInt arithmetic. When we write a += b, Julia creates a new BigInt c with value a + b and rebinds a to it. To avoid the allocation, we should use the in-place operator Base.GMP.MPZ.add!(a, b) instead, which adds the value of b directly to a.

    So instead of s += f * (p(n - t1) + p(n - t2)) one should write

    add!(s, p(n - t1))
    add!(s, p(n - t2))

    (or sub! if f is negative). Note that the two terms should be added separately, since p(n - t1) + p(n - t2) would still allocate a BigInt.

    With these two improvements, the code for lim = 15000 finishes in about 6 s, compared with about 160 s for the original (notice the significant drop in memory allocations; the improved run is listed first):

     6.020336 seconds (224.91 M allocations: 3.353 GiB, 1.34% gc time, 0.40% compilation time)
     162.123506 seconds (2.47 G allocations: 49.384 GiB, 28.53% gc time, 0.06% compilation time)
  2. As mentioned in the other answer, we should avoid nested functions if possible. I don't have a precise explanation (probably because Julia needs to create the scope for such a function), but if you move the function p outside of eu78 and pass the vector ps to it as an argument, the code finishes in just 2 s. This is as fast as C with the GMP big-number library (which is what Julia's BigInt uses under the hood), which is quite impressive. By the way, with pypy3 and the same algorithm I get a runtime of 26 s, so the 8 s claimed in 2016 looks very suspicious to me.

     1.910670 seconds (45.22 k allocations: 1.259 MiB)

And here is the code after modification.

import Base.GMP.MPZ: add!, sub!
function p(n, ps)  #applies Euler recurrence formula
  if n < 0
    return big"0"
  elseif n == 0
    return big"1"
  elseif ps[n] > 0
    return ps[n]
  end
  s = BigInt(0)
  for k = 1 : n
    t1 = (k * (3k - 1)) ÷ 2
    t2 = (k * (3k + 1)) ÷ 2
    if iseven(k)
      sub!(s, p(n - t1, ps))
      sub!(s, p(n - t2, ps))
    else
      add!(s, p(n - t1, ps))
      add!(s, p(n - t2, ps))
    end
  end
  ps[n] = s
end

function eu78()
  lim = 15000
  ps = zeros(BigInt, lim)
  for i = 1 : lim
    p(i, ps)
  end
  println(ps[lim])
end

@time eu78()
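One caveat with the in-place operators used above is that they mutate the object itself, so every other name bound to that object sees the change. Note also that Base.GMP.MPZ is, as far as I know, an internal module rather than documented public API, so its functions may change between Julia versions. A minimal sketch of the difference between += and add!, using made-up values:

```julia
import Base.GMP.MPZ: add!

b = big(32)

# Ordinary arithmetic: a new BigInt is allocated and `a` is rebound to it.
a = big(10)
a_old = a
a += b
println(a, " ", a_old)      # a_old still refers to the untouched original object

# In-place arithmetic: the existing object is overwritten.
x = big(10)
y = x                       # y is another name for the same BigInt
add!(x, b)                  # x = x + b, stored in the object x already refers to
println(x, " ", y, " ", x === y)
```

This is why the accumulator s in the modified code starts as a fresh BigInt(0) rather than as a value shared with the cache ps: only s is ever mutated.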

Thanks, Stefan, for that quick response.

I feel, though, that there is something else going on, since a Julia version without BigInts still runs several times slower than pypy.

So this is not so much an answer as a different question.

That Project Euler question asks for the first number whose partition count is a multiple of one million. So we only need a little over 6 significant digits, and it can be done using machine integers.

Here is that version in Julia, which completes in 2 min 44 s.

function eu78()
  lim = 6 * 10 ^ 4   # `const` dropped: Julia 1.x rejects const on local variables
  ps = zeros(Int64, lim)
  ps[1] = 1
  siz = 10 ^ 9

  function p(n)  #applies Euler recurrence formula for the number of partitions
    if n < 0
      return 0
    elseif n == 0
      return 1
    elseif ps[n] > 0
      return ps[n]
    end
    s, f = 0, -1
    for k = 1 : n
      f *= -1
      t1 = (k * (3k - 1)) ÷ 2
      t2 = (k * (3k + 1)) ÷ 2
      s += f * (p(n - t1) + p(n - t2))
    end
    ps[n] = mod(s, siz)
  end

 for i = 10 : lim
   a = p(i)
   if mod(a, 1000000) == 0
     println(i,'\n', a)
     break
   end
 end
end

eu78()

(Incidentally, thanks guys for using % in Julia to give the remainder rather than the more usual modulus. :) That cost me a whole evening trying to get it to return an answer, anything %$#%&! rather than nothing at all.)
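For anyone hitting the same thing, here is a small sketch of the difference, with made-up values. In Julia, % is the same as rem and takes the sign of the dividend, while mod takes the sign of the divisor, so the two only disagree when the signs of the operands differ.

```julia
s = -7      # a negative running sum, as can occur in the alternating recurrence
m = 3

r1 = s % m       # remainder: same sign as s
r2 = rem(s, m)   # identical to %
r3 = mod(s, m)   # modulus: always in 0:m-1 for positive m

println((r1, r2, r3))
```

In the partition code this matters because the running sum s can be negative, and a negative cached value would fail the ps[n] > 0 check.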

The Python version running in pypy completes in 41 s, a quarter of the time. (To be fair, running in python2 it takes 13 min 48 s.)

So what's the problem? The double recursion in line 20? How can I speed it up? Not that anyone reading this will care, but there is a one-minute rule on program execution in Project Euler.
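One thing that may be worth checking in the code above, independent of the integer type: the loop runs k over 1:n, although the generalized pentagonal numbers k(3k - 1)/2 exceed n once k is roughly sqrt(2n/3), after which every term is p(negative) = 0. The following is a minimal sketch, not the poster's code, of an iterative variant that stops the inner loop at that point and keeps only residues modulo m in machine integers. The name partitions_mod and its arguments are hypothetical.

```julia
# Partition numbers p(0), ..., p(n) reduced modulo m, via Euler's pentagonal recurrence.
# Hypothetical helper for illustration; returns a vector v with v[i + 1] == p(i) mod m.
function partitions_mod(n::Int, m::Int)
    v = zeros(Int, n + 1)
    v[1] = 1                           # p(0) = 1
    for i in 1:n
        s = 0
        k = 1
        while true
            g1 = k * (3k - 1) ÷ 2      # generalized pentagonal number
            g1 > i && break            # all remaining terms are p(negative) = 0
            g2 = k * (3k + 1) ÷ 2
            sgn = isodd(k) ? 1 : -1    # signs alternate +, -, +, ...
            s += sgn * v[i - g1 + 1]
            if g2 <= i
                s += sgn * v[i - g2 + 1]
            end
            k += 1
        end
        v[i + 1] = mod(s, m)           # mod rather than %, so residues stay non-negative
    end
    return v
end

v = partitions_mod(100, 10^6)
println(v[6])     # p(5)
println(v[11])    # p(10)
```

Because each residue is below m and only on the order of sqrt(n) terms are summed per entry, the running sum stays well within Int64 for the limits used in this thread.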
