Win32 event loop appearing to be the program bottleneck

I am making a game in Python with Pyglet. I have just finished the display part and am running into speed issues. Like a good person, I profiled and got the following (uninteresting bits excluded; currently it just redraws the screen with random magenta and white tiles when I push an arrow key):

    15085326 function calls (15085306 primitive calls) in 32.166 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   32.168   32.168 <string>:1(<module>)

   120139    0.499    0.000    0.686    0.000 allocation.py:132(alloc)
   120121    0.563    0.000    0.844    0.000 allocation.py:268(dealloc)

       99    0.743    0.008   20.531    0.207 engine.py:58(Update)

   237600    0.796    0.000   11.995    0.000 sprite.py:349(_set_texture)
   120121    0.677    0.000    9.062    0.000 sprite.py:365(_create_vertex_list)
   357721    1.487    0.000    3.478    0.000 sprite.py:377(_update_position)

   420767    0.786    0.000    2.054    0.000 vertexbuffer.py:421(get_region)
   715442    0.859    0.000    1.280    0.000 vertexbuffer.py:467(invalidate)


        1    9.674    9.674   32.168   32.168 win32.py:46(run)
      180    0.007    0.000    1.771    0.010 win32.py:83(_timer_func)


   237600    0.416    0.000   17.069    0.000 window.py:60(SetTile)
   237600    0.646    0.000    2.174    0.000 window.py:72(GetTileTexture)

Pretty much everything with a total time under 0.5 seconds has been removed. It was mostly stuff that couldn't be a problem.

This is the result of me hitting the keyboard for half a minute. For the most part, I could get 2 or 3 screen changes per second. I would personally like it to redraw as fast as I can hit the keyboard. Heck, my aim is a good 50-60 fps.

What worries me is that the win32 run function spends about 10 seconds in its own code rather than in subfunctions. It could be idle time (even though there is a pyglet idle), but wouldn't that be spent drawing?

The part I thought was slow, the window SetTile part, was actually fast. To deal with the tiles, I have a 2D list of sprites that represent them on screen, and I simply alter their images. I don't think that's an issue.

The other potential problem I saw was my Update: I have to iterate across ~2400 tiles each time it is called. However, it doesn't seem all that bad. Only about 0.7 seconds of its own time across the 99 calls in the profile.

I am starting to wonder whether this is a sign that Python is too slow for my needs. Then again, it shouldn't be. What I'm doing is not all that computationally heavy.

tl;dr Is the win32 event loop in Python my bottleneck, and what does that mean? If not, where might I have lost speed?

Code available if needed. I assume it's Pywin32 used by pyglet.

REVISED Answer: I deleted the columns that are worthless information, such as self time, call count, and per-call time. Then I arranged the rows in descending order by cumtime and discarded the small ones.

cumtime  filename:lineno(function)
 32.168  <string>:1(<module>)
 32.168  win32.py:46(run)
 20.531  engine.py:58(Update)
 17.069  window.py:60(SetTile)
 11.995  sprite.py:349(_set_texture)
  9.062  sprite.py:365(_create_vertex_list)
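
A listing sorted by cumtime can also be produced directly with the standard library rather than by hand. The following is a minimal sketch, assuming the profile is collected with cProfile; `update`, `set_tile`, and `profile_by_cumtime` are hypothetical stand-ins, not the question's actual code.

```python
import cProfile
import io
import pstats


def set_tile(grid, x, y, value):
    # Hypothetical stand-in for the question's window.SetTile.
    grid[y][x] = value


def update(grid):
    # Hypothetical stand-in for engine.Update: touch every tile once.
    for y, row in enumerate(grid):
        for x in range(len(row)):
            set_tile(grid, x, y, (x + y) % 2)


def profile_by_cumtime(func, *args, limit=10):
    """Run func under cProfile and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # strip_dirs() shortens file paths; "cumulative" sorts by the cumtime column.
    stats.strip_dirs().sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()


grid = [[0] * 60 for _ in range(40)]  # 2400 tiles, as in the question
report = profile_by_cumtime(update, grid)
print(report)
```

The `limit` argument plays the role of "discarding the small ones": only the top entries by cumulative time are printed.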

Cumtime means the total amount of time that a particular routine was on the call stack. So naturally some high-level routines were on the stack for all 32 seconds. Others were on the stack for a smaller fraction of the time. For example, _set_texture was active about 1/3 of the time (12.0 of 32.2 seconds), and _create_vertex_list was active a little under 1/3 of the time (9.1 seconds). That suggests vertices are being created a lot, rather than being re-used, so maybe you could save about 30% of the time by not recreating them.
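
One way to avoid recreating vertices, offered only as a sketch, is to skip the image assignment when a tile already shows the requested image. `FakeSprite` and `TileGrid` below are hypothetical stand-ins so the example runs without pyglet; the question does not show its SetTile code, and whether pyglet itself already skips unchanged images is not something the profile tells us.

```python
class FakeSprite:
    """Stand-in for a sprite; counts how often its image is reassigned."""

    def __init__(self, image):
        self._image = image
        self.image_assignments = 0

    @property
    def image(self):
        return self._image

    @image.setter
    def image(self, value):
        # With a real sprite, this is assumed to be where the expensive
        # work seen in the profile (_set_texture, _create_vertex_list) runs.
        self._image = value
        self.image_assignments += 1


class TileGrid:
    """2D list of sprites, as described in the question."""

    def __init__(self, width, height, default_image):
        self.sprites = [[FakeSprite(default_image) for _ in range(width)]
                        for _ in range(height)]

    def set_tile(self, x, y, image):
        """Assign image only if it differs; return True when it changed."""
        sprite = self.sprites[y][x]
        if sprite.image == image:
            return False
        sprite.image = image
        return True


grid = TileGrid(60, 40, "white")
# Redraw all 2400 tiles, but only the first column actually changes.
changed = sum(grid.set_tile(x, y, "magenta" if x == 0 else "white")
              for y in range(40) for x in range(60))
assignments = sum(s.image_assignments for row in grid.sprites for s in row)
print(changed, assignments)
```

How much this saves depends on how many tiles really change per update. With the random magenta and white test pattern from the question, roughly half the tiles would still be reassigned.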

But that's just a guess. There is no need to guess.

What you need to know is the fraction of time that statements (not just functions) in your code were active on the stack. You need to know that because if there is a performance problem, it is such a line of code.

Here's how the problem can be found if you have one.
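
As a rough illustration of measuring line-level stack residency, the sketch below samples the main thread's stack from a background thread and reports the fraction of samples in which each line appeared. It is an assumption-laden stand-in, not the answer's own method: `StackSampler` and `busy_work` are hypothetical names, and `sys._current_frames()` is a CPython facility meant for debugging.

```python
import collections
import sys
import threading
import time


class StackSampler:
    """Periodically records which source lines are on a thread's call stack."""

    def __init__(self, thread_id, interval=0.005):
        self.thread_id = thread_id
        self.interval = interval
        self.samples = 0
        self.line_hits = collections.Counter()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout, so this loops until stopped.
        while not self._stop.wait(self.interval):
            frame = sys._current_frames().get(self.thread_id)
            if frame is None:
                continue
            self.samples += 1
            lines = set()  # count each line once per sample, even if recursive
            while frame is not None:
                code = frame.f_code
                lines.add((code.co_filename, frame.f_lineno, code.co_name))
                frame = frame.f_back
            self.line_hits.update(lines)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()

    def report(self, limit=10):
        """Return (fraction of samples, function name, line number) tuples."""
        if not self.samples:
            return []
        return [(hits / self.samples, name, lineno)
                for (_filename, lineno, name), hits
                in self.line_hits.most_common(limit)]


def busy_work(duration=0.3):
    # Hypothetical workload standing in for the game's update and draw code.
    deadline = time.perf_counter() + duration
    total = 0
    while time.perf_counter() < deadline:
        total += sum(range(200))
    return total


with StackSampler(threading.get_ident()) as sampler:
    busy_work()

for fraction, name, lineno in sampler.report():
    print(f"{fraction:6.1%}  {name}:{lineno}")
```

A line that appears in a large fraction of samples is a candidate for the kind of costly statement described above. Lines near 100% are usually just the outer callers, so the interesting entries are the deepest ones with a high fraction.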

The profiler seems to be based on gprof, and here are some comments about that.
