基准测试JS和Python之间的性能差异

Question

We have an application that so far has been kept exclusively on the front end, but when testing it with large grid configurations we have hit several processing and memory constraints. 到目前为止，我们已经有一个应用程序仅保留在前端，但是在使用大型网格配置进行测试时，我们遇到了一些处理和内存限制。 Thus we're exploring the option of pushing resource demanding tasks to the backend. 因此，我们正在探索将资源需求任务推向后端的选项。

Thus I'm currently running some benchmarks to get some indications of the performance difference that we might see. 因此，我目前正在运行一些基准测试，以了解我们可能会看到的性能差异。 More specifically, given that a good portion of the performance bottleneck is on nested loops, I wrote a simple snippet in js and python to test the performance of handling nested loops and building an array of varying sizes. 更具体地说，鉴于性能瓶颈的很大一部分位于嵌套循环上，我在js和python中编写了一个简单的代码段，以测试处理嵌套循环和构建大小可变的数组的性能。

To my surprise, it looks like js is consistently faster regardless of anything permutations I have tested. 令我惊讶的是，无论我测试了什么排列，js看起来总是更快。

JS snippet: JS代码段：

var timeTotal = 0;
var benchmarkTests = 100;
var testSizes = [50, 100, 500, 1000, 3000];

for (var a = 0; a < testSizes.length; a++) {
  var minX = 0;
  var maxX = testSizes[a];
  var minY = 0;
  var maxY = testSizes[a];
  var cellDimensionsX = 0.991; // cell width
  var cellDimensionsY = 1.652; // cell height
  var cellDimensionsXHalf = cellDimensionsX / 2; // half cell width
  var cellDimensionsYHalf = cellDimensionsY / 2; // half cell height

  var maxCellsXCount = Math.floor(maxX / cellDimensionsX);
  var maxCellsYCount = Math.floor(maxY / cellDimensionsY);

  console.log("X:", maxCellsXCount, "| Y", maxCellsYCount, "| Total", maxCellsXCount * maxCellsYCount);

  for (var k = 0; k < benchmarkTests; k++) {
    var cellCoordsTime = new Date().getTime();

    var cellCoords = {};
    var index = 0;

    for (var i = 0; i < maxCellsXCount; i++) {
      var xCoord = (i * cellDimensionsX) + cellDimensionsXHalf;

      for (var d = 0; d < maxCellsYCount; d++) {
        cellCoords[index] = {
          x: xCoord,
          y: (d * cellDimensionsY) + cellDimensionsYHalf
        };
        index++;
      }
    }

    var thisTime = new Date().getTime() - cellCoordsTime;
    timeTotal += thisTime;
    // console.log('cellCoords', thisTime, 'for grid with', index, 'cells');
  }
  console.log('Testing with a', testSizes[a], '*', testSizes[a], 'grid area. Total time', timeTotal, '. Avg', (timeTotal / benchmarkTests), 'for', benchmarkTests, 'tests');
}

Python snippet: Python片段：

#!/usr/bin/python
import time

timeTotal = 0
benchmarkTests = 100

testSizes = [50, 100, 500, 1000, 3000]

for a in range(len(testSizes)):
    minX = 0
    maxX = testSizes[a]
    minY = 0
    maxY = testSizes[a]
    cellDimensionsX = 0.991 # cell width
    cellDimensionsY = 1.652 # cell height
    cellDimensionsXHalf = cellDimensionsX / 2 # half cell width
    cellDimensionsYHalf = cellDimensionsY / 2 # half cell height

    maxCellsXCount = int(maxX / cellDimensionsX)
    maxCellsYCount = int(maxY / cellDimensionsY)

    print("X: %s | Y %s | Total %s" % (maxCellsXCount, maxCellsYCount, maxCellsXCount * maxCellsYCount))

    for k in range(benchmarkTests):
        start = time.time()
        cellCoords = {}
        index = 0

        for i in range(maxCellsXCount):
            xCoord = (i * cellDimensionsX) + cellDimensionsXHalf
            for d in range(maxCellsYCount):
                cellCoords[index] = {'x': xCoord, 'y': (d * cellDimensionsY) + cellDimensionsYHalf}
                index += 1

        thisTime = (time.time() - start) * 1000;
        timeTotal = timeTotal + thisTime;
        # print("Elapsed Time: %s for grid with %s cells" % (thisTime, index))

    print("Testing with a %s*%s grid area. Total time %s. Avg %s for %s tests" % (testSizes[a], testSizes[a], timeTotal, (timeTotal / benchmarkTests), benchmarkTests))

When running these I get: 运行这些时，我得到：

JS: JS：

Testing with a 50 * 50 grid area. Total time 26 . Avg 0.26 for 100 tests
Testing with a 100 * 100 grid area. Total time 85 . Avg 0.85 for 100 tests
Testing with a 500 * 500 grid area. Total time 4539 . Avg 45.39 for 100 tests
Testing with a 1000 * 1000 grid area. Total time 23160 . Avg 231.6 for 100 tests
Testing with a 3000 * 3000 grid area. Total time 243760 . Avg 2437.6 for 100 tests

Python: 蟒蛇：

Testing with a 50*50 grid area. Total time 50.8642196655. Avg 0.508642196655 for 100 tests
Testing with a 100*100 grid area. Total time 262.931108475. Avg 2.62931108475 for 100 tests
Testing with a 500*500 grid area. Total time 6338.83333206. Avg 63.3883333206 for 100 tests
Testing with a 1000*1000 grid area. Total time 30769.4478035. Avg 307.694478035 for 100 tests
Testing with a 3000*3000 grid area. Total time 304995.391846. Avg 3049.95391846 for 100 tests

All the times are in ms and all the tests were ran on the same localhost. 所有时间均为毫秒，所有测试均在同一本地主机上运行。

I would have expected python to be significantly faster than js. 我本来希望python比js快得多。 Is there anything I'm missing? 有什么我想念的吗？

UPDATE #1 (Moved everything into a function | Overall gain ~17%): 更新＃1（将所有内容移至函数中|总增益约为17％）：

Testing with a 50*50 grid area. Total time 41.2473678589. Avg 0.412473678589 for 100 tests
Testing with a 100*100 grid area. Total time 174.555540085. Avg 1.74555540085 for 100 tests
Testing with a 500*500 grid area. Total time 5617.09475517. Avg 56.1709475517 for 100 tests
Testing with a 1000*1000 grid area. Total time 21199.390173. Avg 211.99390173 for 100 tests
Testing with a 3000*3000 grid area. Total time 255921.251535. Avg 2559.21251535 for 100 tests

UPDATE #2 (Swapped range for xrange | Overall additional gain after UPDATE #1 ~15% | Overall gain compared to the initial code ~30%): UPDATE＃2（xrange的交换范围| UPDATE＃1之后的整体附加增益〜15％|与初始代码相比的整体增益〜30％）：

Testing with a 50*50 grid area. Total time 38.7289524078. Avg 0.387289524078 for 100 tests
Testing with a 100*100 grid area. Total time 176.453590393. Avg 1.76453590393 for 100 tests
Testing with a 500*500 grid area. Total time 5346.49443626. Avg 53.4649443626 for 100 tests
Testing with a 1000*1000 grid area. Total time 21618.1008816. Avg 216.181008816 for 100 tests
Testing with a 3000*3000 grid area. Total time 213622.769356. Avg 2136.22769356 for 100 tests

UPDATE #3 (Swapped dict for list | Overall additional gain after UPDATE #2 ~35% | Overall gain compared to the initial code ~55%): UPDATE＃3（列表交换字典| UPDATE＃2之后的整体附加收益〜35％|与初始代码相比的整体收益〜55％）：

Testing with a 50*50 grid area. Total time 20.7185745239. Avg 0.207185745239 for 100 tests
Testing with a 100*100 grid area. Total time 100.9953022. Avg 1.009953022 for 100 tests
Testing with a 500*500 grid area. Total time 3033.61153603. Avg 30.3361153603 for 100 tests
Testing with a 1000*1000 grid area. Total time 12399.708271. Avg 123.99708271 for 100 tests
Testing with a 3000*3000 grid area. Total time 140118.921518. Avg 1401.18921518 for 100 tests

Matching JS with Python's UPDATE #3 (Swapped object for array | Overall loss compared to the initial code ~165%): 将JS与Python的UPDATE＃3匹配（数组的交换对象|与初始代码相比，总损失为〜165％）：

Testing with a 50 * 50 grid area. Total time 30 . Avg 0.3 for 100 tests
Testing with a 100 * 100 grid area. Total time 48 . Avg 0.48 for 100 tests
Testing with a 500 * 500 grid area. Total time 12694 . Avg 126.94 for 100 tests
Testing with a 1000 * 1000 grid area. Total time 81402 . Avg 814.02 for 100 tests
Testing with a 3000 * 3000 grid area. Total time 625615 . Avg 6256.15 for 100 tests

UPDATE #4 (Pulled Cython into the fight | Overall additional gain after UPDATE #3 ~26% | Overall gain compared to the initial code ~66%): UPDATE＃4（将Cython拉进战斗| UPDATE＃3之后的整体额外收益〜26％|与初始代码相比的整体收益〜66％）：

Testing with a 50*50 grid area. Total time 30. Avg 0.300 for 100 tests
Testing with a 100*100 grid area. Total time 68. Avg 0.680 for 100 tests
Testing with a 500*500 grid area. Total time 2475. Avg 24.750 for 100 tests
Testing with a 1000*1000 grid area. Total time 9924. Avg 99.240 for 100 tests
Testing with a 3000*3000 grid area. Total time 101697. Avg 1016.970 for 100 tests

UPDATE #5 (Typing variables | Overall additional gain after UPDATE #4 ~23% | Overall gain compared to the initial code ~74%): UPDATE＃5（键入变量| UPDATE＃4之后的整体附加收益〜23％|与初始代码相比的整体收益〜74％）：

Testing with a 50*50 grid area. Total time 5. Avg 0.048 for 100 tests
Testing with a 100*100 grid area. Total time 43. Avg 0.426 for 100 tests
Testing with a 500*500 grid area. Total time 1851. Avg 18.511 for 100 tests
Testing with a 1000*1000 grid area. Total time 8020. Avg 80.202 for 100 tests
Testing with a 3000*3000 grid area. Total time 78350. Avg 783.502 for 100 tests

(HOPEFULLY FINAL) UPDATE #6 (Cython related optimizations, including parallelizing it over 2 cores | Overall additional gain after UPDATE #5 ~88% | Overall gain compared to the initial code ~97%): （最终完成）更新6（与Cython相关的优化，包括在2个内核上并行化|更新5之后的总体附加收益〜88％|与初始代码相比的整体收益〜97％）：

Testing with a 50*50 grid area. Total time 0.668. Avg 0.007 for 100 tests
Testing with a 100*100 grid area. Total time 1.584. Avg 0.016 for 100 tests
Testing with a 500*500 grid area. Total time 57.374. Avg 0.574 for 100 tests
Testing with a 1000*1000 grid area. Total time 521.210. Avg 5.212 for 100 tests
Testing with a 3000*3000 grid area. Total time 10113.633. Avg 101.136 for 100 tests

The currently version is on average between 30 and 35 times faster compared to the original implementation. 当前版本平均比原始实现快30到35倍。 So calling it quits for the time being. 因此，暂时停止调用它。

FINAL UPDATE (Tested on hyperthreaded 8 Core machine (No longer localhost and no longer apples to apples comparison) | Further cython related optimizations, and caching the value of y | Overall additional gain after UPDATE #6 ~78% | Overall gain compared to the initial code ~99.3%): 最终更新（在超线程8核计算机上进行了测试（不再是本地主机，也不再是苹果与苹果的比较）|进一步与cython相关的优化，并缓存y的值|更新＃6之后的总体附加收益〜78％|与初始代码〜99.3％）：

Testing with a 50*50 grid area. Total time 0.498. Avg 0.005 for 100 tests
Testing with a 100*100 grid area. Total time 1.146. Avg 0.011 for 100 tests
Testing with a 500*500 grid area. Total time 22.856. Avg 0.229 for 100 tests
Testing with a 1000*1000 grid area. Total time 113.819. Avg 1.138 for 100 tests
Testing with a 3000*3000 grid area. Total time 2228.098ms. Avg 22.281 for 100 tests
Testing with a 10000*10000 grid area. Total time 29407.874ms. Avg 294.79 for 100 tests
Testing with a 20000*20000 grid area. Total time 157185.469ms. Avg 1571.855 for 100 tests

Answer 1

Using more appropriate data types for Python (a tuple instead of creating {x: .., y: ..} dicts, using a list instead of a dict with consecutive integer indices), using xrange instead of range, and wrapping everything into a function, giving the following code: 为Python使用更合适的数据类型（用元组代替范围，而不是创建{x: .., y: ..} dict，使用列表代替具有连续整数索引的dict），使用xrange代替range，并将所有内容包装到函数，给出以下代码：

import time


def foo():
    timeTotal = 0
    benchmarkTests = 100

    testSizes = [50, 100, 500, 1000, 3000]

    for a in range(len(testSizes)):
        minX = 0
        maxX = testSizes[a]
        minY = 0
        maxY = testSizes[a]
        cellDimensionsX = 0.991 # cell width
        cellDimensionsY = 1.652 # cell height
        cellDimensionsXHalf = cellDimensionsX / 2 # half cell width
        cellDimensionsYHalf = cellDimensionsY / 2 # half cell height

        maxCellsXCount = int(maxX / cellDimensionsX)
        maxCellsYCount = int(maxY / cellDimensionsY)

        print("X: %s | Y %s | Total %s" % (maxCellsXCount, maxCellsYCount, maxCellsXCount * maxCellsYCount))

        for k in xrange(benchmarkTests):
            start = time.time()
            # cellCoords = {}
            cellCoords = []

            for i in xrange(maxCellsXCount):
                xCoord = (i * cellDimensionsX) + cellDimensionsXHalf
                for d in xrange(maxCellsYCount):
                    # cellCoords[index] = {'x': xCoord, 'y': (d * cellDimensionsY) + cellDimensionsYHalf}
                    # cellCoords[index] = (xCoord, (d * cellDimensionsY) + cellDimensionsYHalf)
                    cellCoords.append((xCoord, (d * cellDimensionsY) + cellDimensionsYHalf))
                    # index += 1

            thisTime = (time.time() - start) * 1000;
            timeTotal = timeTotal + thisTime;
            # print("Elapsed Time: %s for grid with %s cells" % (thisTime, index))

        print("Testing with a %s*%s grid area. Total time %s. Avg %s for %s tests" % (testSizes[a], testSizes[a], timeTotal, (timeTotal / benchmarkTests), benchmarkTests))

foo()

gives about 50% speedup: 提速约50％：

X: 50 | Y 30 | Total 1500
Testing with a 50*50 grid area. Total time 33.9999198914. Avg 0.339999198914 for 100 tests
X: 100 | Y 60 | Total 6000
Testing with a 100*100 grid area. Total time 191.999912262. Avg 1.91999912262 for 100 tests
X: 504 | Y 302 | Total 152208
Testing with a 500*500 grid area. Total time 4790.99988937. Avg 47.9099988937 for 100 tests
X: 1009 | Y 605 | Total 610445
Testing with a 1000*1000 grid area. Total time 24529.9999714. Avg 245.299999714 for 100 tests
X: 3027 | Y 1815 | Total 5494005
Testing with a 3000*3000 grid area. Total time 201085.000038. Avg 2010.85000038 for 100 tests

compared to the original code on my machine: 与我机器上的原始代码相比：

X: 50 | Y 30 | Total 1500
Testing with a 50*50 grid area. Total time 94.0001010895. Avg 0.940001010895 for 100 tests
X: 100 | Y 60 | Total 6000
Testing with a 100*100 grid area. Total time 495.000123978. Avg 4.95000123978 for 100 tests
X: 504 | Y 302 | Total 152208
Testing with a 500*500 grid area. Total time 10732.0001125. Avg 107.320001125 for 100 tests
X: 1009 | Y 605 | Total 610445
Testing with a 1000*1000 grid area. Total time 49074.0001202. Avg 490.740001202 for 100 tests
X: 3027 | Y 1815 | Total 5494005
Traceback (most recent call last):
  File "pyperf-orig.py", line 31, in <module>
    cellCoords[index] = {'x': xCoord, 'y': (d * cellDimensionsY) + cellDimensionsYHalf}
MemoryError

Answer 2

One improvement is to put everything into a function, because, apparently, Python is not so good with global vars: 一种改进是将所有内容都放入一个函数中，因为显然，Python对于全局var并不是很好：

Running your code: 运行代码：

Testing with a 50*50 grid area. Total time 52.17409133911133. Avg 0.5217409133911133 for 100 tests
Testing with a 100*100 grid area. Total time 262.12358474731445. Avg 2.6212358474731445 for 100 tests
Testing with a 500*500 grid area. Total time 6206.289052963257. Avg 62.06289052963257 for 100 tests
Testing with a 1000*1000 grid area. Total time 30345.27611732483. Avg 303.4527611732483 for 100 tests

When your code is put inside a function, it becomes a lot faster: 当您将代码放入函数中时，它变得更快：

Testing with a 50*50 grid area. Total time 37.11581230163574. Avg 0.3711581230163574 for 100 tests
Testing with a 100*100 grid area. Total time 176.71799659729004. Avg 1.7671799659729004 for 100 tests
Testing with a 500*500 grid area. Total time 4659.825325012207. Avg 46.59825325012207 for 100 tests
Testing with a 1000*1000 grid area. Total time 23246.346712112427. Avg 232.46346712112427 for 100 tests

基准测试JS和Python之间的性能差异

问题描述

2 个解决方案

解决方案1
2 已采纳 2019-02-28 19:44:42

解决方案2
1 2019-02-28 19:22:19

基准测试JS和Python之间的性能差异

问题描述

2 个解决方案

解决方案1 2 已采纳 2019-02-28 19:44:42

解决方案2 1 2019-02-28 19:22:19

解决方案1
2 已采纳 2019-02-28 19:44:42

解决方案2
1 2019-02-28 19:22:19