
What is a simple example of replacing C code with assembly to improve performance?

I've heard that game developers will sometimes replace parts of inner loops with assembly code to improve performance.

What is a simple example of this?

Where does the assembly go? Just inline with the C code?

Thanks!

Edit: a code sample is greatly appreciated.

I'm not a game developer, but I write almost nothing but assembly code for a living (I'm a library writer). Generally this is for performance reasons, but I also do it to work around compiler bugs, or to use hardware features like condition flags that are actually easier to express in assembly than in C.

I'm usually writing complete functions in assembly, so I tend to write .s files that are assembled into object files and linked into an executable or library. People who just need to move a single loop into assembly often use inline assembly in their C source, which most compilers support via some sort of language extension (typically an asm or __asm keyword).

"Simple" examples are pretty rare, since if the problem were simple, the compiler would do a sufficiently good job and there would be no need for assembly.

Here are the assembly coding pros:

  • Assembly code can take advantage of a processor's unique instructions as well as various specialised hardware resources. On the other hand, C code is generic, and must support various hardware platforms. Thus, it is difficult for C to support platform-specific code.

  • The assembly programmer is usually very familiar with the application and can make assumptions that are unavailable to the compiler.

  • The assembly programmer can use human creativity; the compiler, advanced as it may be, is merely an automatic program.

On the other hand, here are the assembly coding cons:

  • The assembly programmer has to handle time-consuming machine-level issues such as register allocation and instruction scheduling. With C code, these issues are taken care of by the compiler.

  • Assembly coding requires specialised knowledge of the DSP architecture and its instruction set, whereas C coding only requires knowledge of the C language—which is rather common.

  • With assembly code, it is extremely difficult and time consuming to port applications from one platform to another. Porting is relatively easy for C applications.

from here

Here is a simple(ish) example - the Swap code for Watt-32.

In this case, __asm is used to integrate assembly code inline with the C/C++ code for performance. Since this is an alternative, cross-platform network stack, there are many points where getting as much performance as possible is important.
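As a rough illustration of the kind of byte-swap routine a network stack needs, here is a minimal sketch. It is not the actual Watt-32 source: the name swap32 is hypothetical, and it assumes GCC-style inline assembly (gcc or clang) on x86, falling back to plain C everywhere else.

```c
#include <stdint.h>

/* Reverse the byte order of a 32-bit value (e.g. host <-> network order).
 * swap32 is a hypothetical name used for illustration only. */
static inline uint32_t swap32(uint32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    /* The x86 BSWAP instruction reverses the bytes of a 32-bit register.
     * "+r" tells the compiler x is read and written, in any general register. */
    __asm__("bswap %0" : "+r"(x));
    return x;
#else
    /* Portable fallback for other compilers and architectures. */
    return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
           ((x << 8) & 0x00FF0000u) | (x << 24);
#endif
}
```

Calling swap32(0x11223344u) yields 0x44332211u on either path. GCC and clang also offer __builtin_bswap32, which is usually a better choice than hand-written assembly because the optimizer can see through it.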

I gave up assembly coding years ago, when I found that an optimizing C++ compiler would beat me hands down on performance. The people building the optimizer consider all sorts of things, like pipeline stalls and partially parallel execution of subsequent, independent code fragments (a good optimizer can rearrange your code a fair bit). The disadvantages of assembly code (hard to read, hard to debug, not portable) by far outweigh the advantages it had back in the days when compilers didn't have good optimizers.

If I were you, I wouldn't bother with assembly coding for normal programming tasks. Even if you could save a CPU clock cycle here or there, the effect on the overall performance of a complex application is negligible.

The syntax will depend on your compiler; I use gcc, and it supports a couple of different ways to inline assembler code.

Check this link for description and examples: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#s4
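To give a feel for gcc's extended asm syntax, here is a minimal sketch, assuming gcc or clang on x86 (other targets take the plain C path). The function name add_with_carry_out is hypothetical. It reads the CPU carry flag directly, which is the sort of condition-flag access that is awkward to express in plain C.

```c
#include <stdint.h>

/* Add two 32-bit values, store the result in *sum, and return 1 if the
 * addition carried out of bit 31, else 0. Hypothetical helper name. */
static inline int add_with_carry_out(uint32_t a, uint32_t b, uint32_t *sum)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned char carry;
    /* Template : outputs : inputs : clobbers
     * %0 = a (read/write), %1 = carry (write-only, byte register), %2 = b.
     * "cc" tells the compiler the condition flags are modified. */
    __asm__("addl %2, %0\n\t"
            "setc %1"
            : "+r"(a), "=q"(carry)
            : "r"(b)
            : "cc");
    *sum = a;
    return carry;
#else
    /* Portable fallback: unsigned wraparound means a carry occurred
     * exactly when the result is smaller than an operand. */
    *sum = a + b;
    return *sum < a;
#endif
}
```

The constraint strings are the important part: they describe to the compiler which registers the fragment reads and writes, so it can allocate registers around the assembly rather than guessing.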

You will find very little inline assembly in most modern games for the PC, Xbox 360 or PS3. Modern optimizing compilers do a fairly good job of instruction scheduling and register allocation, so the performance gain from writing inline assembly is rarely worth the effort any more. Inline assembly is not even supported for 64-bit code in Visual Studio.

Inline assembly used to be important for accessing hardware-specific instructions that the compiler would not use automatically. With modern compilers, intrinsics are the preferred way of accessing hardware-specific instructions. In games, intrinsics are often used in math-heavy code to access hardware-specific vector math instructions (SSE on the PC, VMX on the Xbox 360 and the PS3 PPU, or the SPU instruction set on the PS3 SPUs). Intrinsics are platform/compiler-specific extensions that look like regular C/C++ functions but map directly to single instructions on the underlying hardware. You can see the documentation for the x86 and x64 intrinsics in Visual Studio on MSDN.
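As a minimal sketch of what intrinsics look like in practice, the following adds two float arrays four elements at a time using SSE. The function name add_arrays and the HAVE_SSE macro are hypothetical; the intrinsics themselves (_mm_loadu_ps, _mm_add_ps, _mm_storeu_ps from xmmintrin.h) are the standard SSE ones. On targets without SSE the code falls back to a plain scalar loop.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>   /* SSE intrinsics */
#define HAVE_SSE 1       /* hypothetical feature macro for this sketch */
#endif

/* out[i] = a[i] + b[i] for i in [0, n). Hypothetical helper name. */
static void add_arrays(float *out, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#ifdef HAVE_SSE
    /* Process four floats per iteration; each intrinsic maps to a single
     * SSE instruction (unaligned load, packed add, unaligned store). */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop handles the tail (or everything when SSE is absent). */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Unlike inline assembly, the compiler still does the register allocation and scheduling here, which is a large part of why intrinsics have displaced it.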

You may still find some really performance-critical bits of code written in assembly in some games, but generally whole functions will be written in assembly rather than using bits of inline assembly in C/C++ code. I haven't seen any inline assembly in any of the PC/Xbox 360/PS3 game code I've worked on in the last 5 years or so.

Michael Abrash wrote a book called the Graphics Programming Black Book. It is definitely worth a read. You can get the PDFs for free online here.

Michael Abrash's classic Graphics Programming Black Book is a compilation of Michael's previous writings on assembly language and graphics programming (including from his "Graphics Programming" column in Dr. Dobb's Journal). Much of the focus of this book is on profiling and code testing, as well as performance optimization. It also explores much of the technology behind the Doom and Quake 3-D games, and 3-D graphics problems such as texture mapping, hidden surface removal, and the like. Thanks to Michael for making this book available.

Even if you do use a keyword like __asm to inline assembly code, there can be overhead involved. The compiler may first save your registers (all of them, or only the ones used by your inlined code, depending on how well the compiler optimizes), then insert your code, then restore those registers. I feel that unless inlining assembly code gives a SIGNIFICANT advantage, it should be avoided, given the tradeoff between maintainability, portability, correctness, readability and performance.


 