
Under what conditions does Metal shader code "crash"?

I'm developing a Metal-based app, and in some cases properly compiled and linked shader code will cause the application to simply crash without throwing any errors.

A "crash" consists of a halt in visual output (in some cases preceded by a short stutter of a couple of alternating frames), while the rest of the application otherwise proceeds normally. The Xcode performance monitoring utilities report 60fps but 0ms GPU latency, and CPU-side execution continues, with calls to the Metal API still completing successfully.

No errors are reported to the console.

This is extremely difficult to debug, as I have no indication of where in shader code the error is coming from. It would help if I knew under what conditions this is actually supposed to happen, so that I can have a good list of things to check. Otherwise I'm just shooting in the dark whenever this comes up.

The GPU can crash when you read or write off the end of an MTLBuffer, write off the end of an MTLTexture, or simply run too long. A watchdog timer resets the GPU if its work does not complete within a few seconds. Work on the GPU is not preemptively scheduled, so long-running work can make the device seem locked up by preventing basic GUI tasks from executing. If you have a long-running workload, split it up into many smaller kernels. To keep the interface responsive, keep each workload under 100 ms. To avoid video stuttering, aim for a consistent frame rate.
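As a rough illustration of splitting work up, here is a minimal sketch that encodes one large compute job as several small command buffers and logs any error each one reports when it finishes. The helper name `DispatchInChunks`, the buffer indices, and the idea of passing the first threadgroup index to the kernel are assumptions made for this example, not part of the original answer. The kernel is assumed to add that offset to its threadgroup position. Whether a given GPU fault or timeout shows up in `error` may depend on the device and OS version.

```objectivec
#import <Metal/Metal.h>

// Hypothetical helper: runs `totalGroups` threadgroups of work as a series of
// smaller command buffers, each covering at most `groupsPerChunk` threadgroups.
static void DispatchInChunks(id<MTLCommandQueue> queue,
                             id<MTLComputePipelineState> pipeline,
                             id<MTLBuffer> data,
                             NSUInteger totalGroups,
                             NSUInteger groupsPerChunk)
{
    if (groupsPerChunk == 0) { return; }  // avoid an endless loop
    MTLSize threadsPerGroup = MTLSizeMake(pipeline.threadExecutionWidth, 1, 1);

    for (NSUInteger first = 0; first < totalGroups; first += groupsPerChunk) {
        NSUInteger count = MIN(groupsPerChunk, totalGroups - first);
        uint32_t firstGroup = (uint32_t)first;  // assumed kernel argument at index 1

        id<MTLCommandBuffer> commandBuffer = [queue commandBuffer];
        id<MTLComputeCommandEncoder> encoder = [commandBuffer computeCommandEncoder];
        [encoder setComputePipelineState:pipeline];
        [encoder setBuffer:data offset:0 atIndex:0];
        [encoder setBytes:&firstGroup length:sizeof(firstGroup) atIndex:1];
        [encoder dispatchThreadgroups:MTLSizeMake(count, 1, 1)
                threadsPerThreadgroup:threadsPerGroup];
        [encoder endEncoding];

        // Log the error, if any, once this chunk has finished on the GPU.
        [commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> finished) {
            if (finished.status == MTLCommandBufferStatusError) {
                NSLog(@"Chunk starting at group %u failed: %@", firstGroup, finished.error);
            }
        }];
        [commandBuffer commit];
    }
}
```

Because each chunk is its own command buffer, a failure is at least narrowed down to one range of threadgroups, and no single submission has to run for long.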

I was having frequent crashes due to heavy Metal shaders as well and managed to fix it by throttling the dispatch rate. You can do this easily by measuring the runtime of the last "frame" and inserting a wait before every dispatch that is a fixed ratio of that runtime:

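// Sleep for RATIO times the duration of the previous frame before dispatching again.
[NSThread sleepForTimeInterval:_lastRunTime * RATIO];
NSDate *startTime = [NSDate date];
... [use Metal shaders] ...
// timeIntervalSinceNow is negative for a date in the past, so negate it.
_lastRunTime = -[startTime timeIntervalSinceNow];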

I set RATIO to 1.0, so the GPU is never busy more than 50% of the time. This obviously reduces the frame rate, but it beats random crashes. You can experiment with the ratio. The nice thing is that you don't have to worry about throttling too much or too little on different devices, since the wait is a ratio of the measured runtime.
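The 50% figure follows from a simple duty-cycle calculation. As a sketch, assume the GPU is busy for the whole measured frame time $T$ and that the frame time is roughly steady from one frame to the next. The symbols $T$ and $R$ (for RATIO) are introduced here only for illustration:

```latex
\text{GPU utilization} \;\le\; \frac{T}{T + R\,T} \;=\; \frac{1}{1 + R}
\qquad\Longrightarrow\qquad
R = 1 \;\Rightarrow\; 50\%, \quad
R = 0.5 \;\Rightarrow\; \approx 67\%, \quad
R = 3 \;\Rightarrow\; 25\%
```

So halving the ratio raises the ceiling to about two thirds, and tripling it lowers the ceiling to a quarter.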


 