
Why is this Python script running faster on the CPU than on the GPU?

I'm using the Python library somoclu to train a self-organizing map. The library allows users to perform the training either on the CPU (Intel Core i7-8700) or on the GPU (GeForce GTX 1080 Ti).

I noticed that the CPU was running the script faster than the GPU, so I ran a sweep varying the number of datapoints and the size of the map to see if at some point the GPU outperformed the CPU. This was the script:

import numpy as np
import somoclu
import time

m = 3 # Number of dimensions
points = [5000, 30000, 80000, 150000, 300000] # Number of datapoints
iterMax = 200 # Max number of iterations
mapSize = [4, 32, 64, 128] # Dimensions of SOM
np.random.seed(0)
#%% SOM
for n in points:
    for size in mapSize:
        y = np.random.rand(n,m) # Input data
        # With CPU
        t = time.clock() # Start time
        som = somoclu.Somoclu(size,
                              size,
                              compactsupport = False,
                              kerneltype = 0)
        som.train(y.astype(np.float32), epochs = iterMax)
        elapsedTime = time.clock() - t
        # With GPU
        t = time.clock() # Start time
        som = somoclu.Somoclu(size,
                              size,
                              compactsupport = False,
                              kerneltype = 1)
        som.train(y.astype(np.float32), epochs = iterMax)
        elapsedTime = time.clock() - t
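
(Note: time.clock() was deprecated in Python 3.3 and removed in Python 3.8, so the script above won't run as-is on newer interpreters. A variant of the same sweep using time.perf_counter() for wall-clock timing and writing the results to a CSV could look like the sketch below; the output filename and column layout are illustrative assumptions, not part of the original script.)

import csv
import time

import numpy as np
import somoclu

m = 3                                          # Number of dimensions
points = [5000, 30000, 80000, 150000, 300000]  # Number of datapoints
iterMax = 200                                  # Max number of iterations
mapSize = [4, 32, 64, 128]                     # Dimensions of SOM
np.random.seed(0)

with open("som_timings.csv", "w", newline="") as f:  # filename is an assumption
    writer = csv.writer(f)
    writer.writerow(["n", "size", "cpu_s", "gpu_s"])
    for n in points:
        for size in mapSize:
            y = np.random.rand(n, m).astype(np.float32)  # Input data
            row = [n, size]
            for kernel in (0, 1):                # kerneltype 0 = CPU, 1 = CUDA GPU
                t = time.perf_counter()          # Wall-clock start time
                som = somoclu.Somoclu(size,
                                      size,
                                      compactsupport = False,
                                      kerneltype = kernel)
                som.train(y, epochs = iterMax)
                row.append(time.perf_counter() - t)  # Elapsed seconds
            writer.writerow(row)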

I saved the times in a CSV, and this is what I got:

CPU time (s)        GPU time (s)
2.7632589999999997  5.935387999999999
60.340638           82.796062
228.292085          305.75625900000006
861.3243            1141.331934
11.692982999999913  24.568256999999903
330.17140100000006  443.82112400000005
1354.677431         1749.3110039999992
5559.308704         6990.034151000002
29.3726179999976    47.36881999999969
913.3250950000001   1163.5942189999987
3703.653313999999   4615.292857
14868.418703000003  18635.051464000004
37.40133600000263   68.64375999999902
1699.020611         2141.047305
6925.692426000009   8645.564134
27887.844171999997  illegal memory access was encountered

As you can see, the CPU outperforms the GPU in every single case (on top of that, the GPU version crashed when running the script with 150000 datapoints and a 64x64 map). How is this possible? What is the advantage of using the GPU to train the SOM, then?

EDIT:

I tried the same library in R, and in that language the GPU outperforms the CPU. So apparently it's just a Python issue, but I'm not enough of an expert in programming to figure out what is happening. I believe the kernel being run is the same, so it's just the interface that changes. Let's see if this helps somebody find out why, in Python, the CPU is faster than the GPU.

Answer:

According to Figure 5 in this paper on somoclu, the GPU was faster. However, the paper did not show extensive benchmarking. I can only suggest that for your machine, the CPU is more capable. But you could study the paper to run a more similar test for comparison.

To ensure replicability of the results, we benchmarked with publicly available cluster GPU instances provided by Amazon Web Services. The instance type was cg1.4xlarge (https://aws.amazon.com/ec2/instance-types/), equipped with 22 GiB of memory, two Intel Xeon X5570 quad-core CPUs, and two NVIDIA Tesla M2050 GPUs, running Ubuntu 12.04.

(16) Somoclu: An Efficient Parallel Library for Self-Organizing Maps. Available from: https://www.researchgate.net/publication/236635216_Somoclu_An_Efficient_Parallel_Library_for_Self-Organizing_Maps

It seems that both your CPU and your GPU are more powerful than the AWS benchmark machines.
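
If you want to run a test closer to the paper's setup, a rough sketch of a single larger-scale comparison is below; the map size, number of datapoints and epoch count are illustrative guesses on my part, not the paper's exact settings.

import time

import numpy as np
import somoclu

n_points, n_dim = 100000, 3      # Illustrative dataset size, not from the paper
n_rows = n_cols = 200            # A much larger map than in the sweep above
epochs = 10

data = np.random.rand(n_points, n_dim).astype(np.float32)

for kernel, label in ((0, "CPU"), (1, "GPU")):   # kerneltype 0 = CPU, 1 = CUDA GPU
    som = somoclu.Somoclu(n_cols, n_rows,
                          compactsupport = False,
                          kerneltype = kernel)
    t = time.perf_counter()      # Time only the training call itself
    som.train(data, epochs = epochs)
    print(label, "training time:", time.perf_counter() - t, "seconds")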

