
What does brace "{" mean in AT&T assembly

I am using an Intel Xeon Phi. I compile my program like this:

icpc -mmic -S xxxx.cpp

There is some syntax in the generated assembly code that I don't understand.

     vgetmantpd $0, %zmm2, %zmm9{%k3}                        #85.59 c79
     vsubpd    %zmm11, %zmm10, %zmm12{%k3}                   #85.59 c83
     vpminsd   %zmm14{aaaa}, %zmm12, %zmm13                  #85.59 c87
     vcvtpd2ps {rz-sae}, %zmm9, %zmm6{%k3}                   #85.59 c91
     vpminud   %zmm14{bbbb}, %zmm13, %zmm15                  #85.59 c95

What do the braces "{" and "}" mean in %zmm12{%k3}? What is %k3? And what is %zmm14{bbbb}?

Michael is correct in all three points:

1) The {aaaa} and {bbbb} are operand qualifiers that direct each "lane" of the input register (zmm14, in both cases) to be "swizzled" in a particular manner. "{aaaa}" means the low-order element of each lane is replicated to all four elements of the lane; "{bbbb}" does the same with the second-lowest element. So if zmm14 contained, from high-order to low-order, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, 10, then zmm14{aaaa} would be 130, 130, 130, 130, 90, 90, 90, 90, 50, 50, 50, 50, 10, 10, 10, 10, and zmm14{bbbb} would be 140, 140, 140, 140, 100, 100, 100, 100, 60, 60, 60, 60, 20, 20, 20, 20. zmm14{dcba} is the default swizzle, i.e. the same as writing just zmm14, which is no swizzle at all.

2) The {%k3} operand qualifier is a write mask: it means change only those elements of the output register (zmm9, in the topmost instruction) for which the corresponding bit in the k3 mask register is set, and leave all other elements of zmm9 unchanged.

3) Michael is also totally on target that you really aren't going to be able to divine all this stuff on your own. You are going to need to study the architecture documents, because the Xeon Phi VPU architecture is quite a bit different from MMX and SSE: it introduces mask registers (which are used as predicates to control which elements are modified), swizzles, broadcasts, and up- and down-conversions. In the document Michael linked, the relevant chapter for an introduction to this level of the Xeon Phi architecture is chapter 7. Another document you might peruse is this one: http://software.intel.com/en-us/articles/intel-xeon-phi-coprocessor-vector-microarchitecture
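To make points 1) and 2) concrete, here is a rough scalar emulation in Python. It is only a sketch, not Xeon Phi code: the helper names swizzle and masked_write are made up for illustration, vectors are plain lists with the low-order element first, and the reading of the four-letter pattern (written high-to-low, so that {dcba} is the identity) is an assumption generalized from the {aaaa}, {bbbb} and {dcba} cases described above.

```python
# Scalar emulation only; not Xeon Phi code. Helper names are hypothetical.
# Vectors are Python lists with the low-order element first.

LETTER_TO_INDEX = {"a": 0, "b": 1, "c": 2, "d": 3}


def swizzle(elements, pattern):
    """Apply a four-letter swizzle pattern to every 4-element lane.

    The pattern is read high-to-low, as in the assembly syntax, so
    "dcba" is the identity and "aaaa" replicates the low-order element.
    """
    if len(pattern) != 4 or len(elements) % 4 != 0:
        raise ValueError("expected a 4-letter pattern and whole lanes")
    out = []
    for base in range(0, len(elements), 4):
        lane = elements[base:base + 4]
        # Reverse the pattern so results are emitted low-order first.
        out.extend(lane[LETTER_TO_INDEX[ch]] for ch in reversed(pattern))
    return out


def masked_write(dest, result, mask):
    """Merge `result` into `dest` only where the mask bit is set.

    Bit i of `mask` controls element i; other elements keep their old value.
    """
    return [r if (mask >> i) & 1 else d
            for i, (d, r) in enumerate(zip(dest, result))]


zmm14 = list(range(10, 170, 10))  # 10 (low-order) ... 160 (high-order)

# Print high-order first to match the listing in point 1).
print(list(reversed(swizzle(zmm14, "aaaa"))))
print(list(reversed(swizzle(zmm14, "bbbb"))))

old = [0.0] * 8                         # previous destination contents
new = [float(i + 1) for i in range(8)]  # full-width result of the operation
print(masked_write(old, new, 0b00000101))  # only elements 0 and 2 change
```

The last call mimics what a {%k3} qualifier does to an 8-element double-precision destination when k3 holds binary 00000101: elements 0 and 2 receive the new result and the rest keep their old values.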

Not mentioned in your question or in Michael's response is that the {rz-sae} instruction qualifier means that the instruction should round toward zero (rz) and suppress all arithmetic exceptions (sae), i.e. handle them silently.
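The rounding half of {rz-sae} can be illustrated for a double-to-single conversion like the vcvtpd2ps above. This is a minimal Python sketch, assuming IEEE 754 single precision via the standard struct module; the function names are hypothetical, and the exception-suppression half is not modeled at all.

```python
import struct


def to_f32_nearest(x):
    """Convert a double to single precision with the default rounding."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_f32_toward_zero(x):
    """Convert a double to single precision, rounding toward zero."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    nearest = struct.unpack("<f", struct.pack("<I", bits))[0]
    if abs(nearest) > abs(x):
        # The default rounding moved away from zero. Floats are stored as
        # sign and magnitude, so subtracting 1 from the bit pattern steps
        # one representable value back toward zero.
        bits -= 1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


x = 0.1
print(to_f32_nearest(x) > x)       # nearest single lies above 0.1
print(to_f32_toward_zero(x) < x)   # truncated single lies below 0.1
```

Values that are exactly representable in single precision, such as 0.5, come out the same either way; the two modes differ only when the double falls between two adjacent single-precision values.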

Regards, Brian R. Nickerson
