being a long time C++ developer I just started working on algorithms in Python. I'm currently profiling my code to get a feel for how to program efficiently in Python. One thing in particular stood out to me for which I would be very glad to get an expert explanation.
I wrote this wrapper function for a ray-triangle intersection:
def rayIntersectsTriangle( origin , direction , meshData , poly , worldCoordinateVertices ):
return mathutils.geometry.intersect_ray_tri( worldCoordinateVertices[ meshData.loops[ poly.loop_start ].vertex_index ],
worldCoordinateVertices[ meshData.loops[ poly.loop_start + 1 ].vertex_index ],
worldCoordinateVertices[ meshData.loops[ poly.loop_start + 2 ].vertex_index ],
direction , origin ) != None
When profiling (using cProfile) code that executes this function a lot of times I have the following result:
ncalls tottime percall cumtime percall filename:lineno(function)
15694126 22.314 0.000 25.812 0.000 ****.py:176(rayIntersectsPoly)
[...]
15694126 3.497 0.000 3.497 0.000 {built-in method mathutils.geometry.intersect_ray_tri}
How come this wrapper adds so much overhead? The only thing I can really see going on here is the array element access. Coming from C++ this really confuses me :D
Any help on this would be super appreciated. I want to get my algorithms as fast as possible.
Thanks in advance! Cheers!
The time looks large in comparison because mathutils.geometry.intersect_ray_tri()
is so fast . That method is implemented in an extension and executes at native speed.
The Python time for the method goes to:
mathutils.geometry
and mathutils.geometry.intersect_ray_tri
and poly.loop_start
worldCoordinateVertices[ ... ]
. You could make it a little faster by caching the results of some of these in local names or default parameters:
def rayIntersectsTriangle(
origin, direction, meshData, poly, worldCoordinateVertices
_intersect_ray_tri=mathutils.geometry.intersect_ray_tri):
loop_start = poly.loop_start
meshData_loops = meshData.loops
return _intersect_ray_tri(
worldCoordinateVertices[meshData_loops[loop_start].vertex_index],
worldCoordinateVertices[meshData_loops[loop_start + 1].vertex_index],
worldCoordinateVertices[meshData_loops[loop_start + 2].vertex_index],
direction, origin) is not None
I also used is not None
; that's a pointer operation that is recommended for testing for the None
singleton.
This brings about 8 attribute lookups down to just 2, and removes the global name lookup for mathutils
.
Still, these are micro-optimisations, only do these if they really have an impact (eg the method is called a lot in your code). If this really is a bottleneck for you consider using Cython as an easy path towards turning this code into a compiled extension that also can operate at native speeds.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.