
Multiple assignment in one line

I just came across this statement in embedded C (dsPIC33):

sample1 = sample2 = 0;

Would this mean

sample1 = 0;

sample2 = 0;

Why do they type it this way? Is this good or bad coding?

Remember that the assignment operator is right-associative, and that an assignment is an ordinary expression with a value. So from the compiler's perspective the line

sample1 = sample2 = 0;

is the same as

sample1 = (sample2 = 0);

which is the same as

sample2 = 0;
sample1 = sample2;

That is, sample2 is assigned zero, then sample1 is assigned the value of sample2. In practice this is the same as assigning zero to both, as you guessed.
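A minimal sketch of that grouping. The helper name chain_to_zero is hypothetical and only serves to make the statement testable:

```c
/* Hypothetical helper: zeroes both targets with one chained assignment. */
void chain_to_zero(int *sample1, int *sample2)
{
    /* Parsed as *sample1 = (*sample2 = 0); the value of the inner
       assignment (0) becomes the right-hand side of the outer one. */
    *sample1 = *sample2 = 0;
}
```

Calling chain_to_zero(&a, &b) leaves both a and b at zero, whatever they held before.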

Formally, for two variables t and u of type T and U respectively

T t;
U u;

the assignment

t = u = X;

(where X is some value) is interpreted as

t = (u = X);

and is equivalent to a pair of independent assignments

u = X;
t = (U) X;

Note that the value of X is supposed to reach variable t "as if" it had passed through variable u first, but there is no requirement for it to literally happen that way. X simply has to be converted to the type of u before being assigned to t. The value does not have to be assigned to u first and then copied from u to t. The above two assignments are actually not sequenced and can happen in any order, meaning that

t = (U) X;
u = X;

is also a valid execution schedule for this expression. (Note that this sequencing freedom is specific to the C language, in which the result of an assignment is an rvalue. In C++ assignment evaluates to an lvalue, which requires "chained" assignments to be sequenced.)

There's no way to say whether it is a good or bad programming practice without seeing more context. In cases where the two variables are tightly related (like the x and y coordinates of a point), setting them to some common value using a "chained" assignment is actually perfectly good practice (I'd even say "recommended practice"). But when the variables are completely unrelated, mixing them in a single "chained" assignment is definitely not a good idea, especially if these variables have different types, which can lead to unintended consequences.
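To illustrate the different-types case, here is a small sketch. The helper chain_through_int is hypothetical, with U taken as int and T as double:

```c
/* Hypothetical helper: chains a double value through an int into a double. */
double chain_through_int(double x)
{
    int u;
    double t;

    /* Behaves like u = x; t = (int) x;
       the fractional part is dropped before t ever sees the value. */
    t = u = x;

    return t;
}
```

With x equal to 3.75, t ends up as 3.0 rather than 3.75, which is easy to miss when reading the one-line form.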

I think there is no good answer on C language without actual assembly listing :)

So for a simplistic program:

int main() {
        int a, b, c, d;
        a = b = c = d = 0;
        return a;
}

I got this assembly (Kubuntu, gcc 4.8.2, x86_64), with the -O0 option of course ;)

main:
        pushq   %rbp
        movq    %rsp, %rbp

        movl    $0, -16(%rbp)       ; d = 0
        movl    -16(%rbp), %eax     ; 

        movl    %eax, -12(%rbp)     ; c = d
        movl    -12(%rbp), %eax     ;

        movl    %eax, -8(%rbp)      ; b = c
        movl    -8(%rbp), %eax      ;

        movl    %eax, -4(%rbp)      ; a = b
        movl    -4(%rbp), %eax      ;

        popq    %rbp

        ret                         ; return %eax, ie. a

So gcc is actually chaining all the stuff.

The results are the same. Some people prefer chaining assignments if they are all to the same value. There is nothing wrong with this approach. Personally, I find this preferable if the variables have closely related meanings.

You can decide for yourself whether this way of coding is good or bad.

  1. Simply see the assembly code for the following lines in your IDE.

  2. Then change the code to two separate assignments, and see the differences.

In addition to this, you can also try turning off/on optimizations (both Size & Speed Optimizations) in your compiler to see how that affects the assembly code.

As noticed earlier,

sample1 = sample2 = 0;

is equal to

sample2 = 0;
sample1 = sample2;

The problem is that riscy asked about embedded C, which is often used to drive registers directly. Many microcontroller registers behave differently on read and write operations. So, in the general case, it is not the same as

sample2 = 0;
sample1 = 0;

For example, let UDR be a UART data register. Reading from UDR means getting the received value from the input buffer, while writing to UDR means putting the desired value into the transmit buffer and starting the transmission. In that case,

sample = UDR = 0;

can mean the following: a) transmit a value of zero over the UART ( UDR = 0; ) and b) read the input buffer and place the data into sample ( sample = UDR; ). Whether the register is actually read back in step b) depends on the compiler.

As you can see, the behavior of an embedded system can be much more complicated than the code's author expects. Use this notation carefully when programming MCUs.
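A sketch of the safer, explicit form, using a simulated register so it can run on a PC. The buffers and the functions udr_write and udr_read are hypothetical stand-ins for the hardware, not a real device API. The C standard permits, but does not require, a compiler to read a volatile object back after a chained assignment, so separate statements make the intended accesses unambiguous:

```c
#include <stdint.h>

/* Simulated UART with separate receive and transmit buffers
   (all names are hypothetical). */
static uint8_t rx_buffer = 0x5A;   /* pretend a byte has already arrived */
static uint8_t tx_buffer;

static void udr_write(uint8_t value) { tx_buffer = value; }  /* models UDR = value;  */
static uint8_t udr_read(void)        { return rx_buffer; }   /* models sample = UDR; */

/* Spells out the two hardware accesses as separate statements. */
uint8_t send_zero_then_receive(void)
{
    udr_write(0);         /* transmit 0 */
    return udr_read();    /* fetch the received byte, unrelated to the 0 just written */
}
```

In this simulation send_zero_then_receive() yields 0x5A, not 0, which is the gap between "assign zero to both" and what the register pair actually does.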

sample1 = sample2 = 0;

does mean

sample1 = 0;
sample2 = 0;  

if and only if sample2 is declared earlier.
You can't do it this way when sample2 has not been declared yet:

int sample1 = sample2 = 0; // sample2 must be declared before 0 can be assigned to it
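A short sketch of two forms that do compile. The function declare_and_zero and the extra variables sample3 and sample4 are hypothetical:

```c
/* Hypothetical helper showing two ways to get declared, zeroed variables. */
int declare_and_zero(void)
{
    int sample2;
    int sample1 = sample2 = 0;     /* fine: sample2 already exists here */

    int sample3 = 0, sample4 = 0;  /* or give each declarator its own initializer */

    return sample1 + sample2 + sample3 + sample4;
}
```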

Regarding coding style and various coding recommendations see here: Readability a=b=c or a=c; b=c;?

I believe that by using

sample1 = sample2 = 0;

some compilers will produce slightly faster assembly than with two separate assignments:

sample1 = 0;
sample2 = 0;

especially if you are initializing to a non-zero value, because the multiple assignment translates to:

sample2 = 0; 
sample1 = sample2;

So instead of two initializations you do one initialization and one copy. The speedup (if any) will be tiny, but in the embedded case every tiny bit counts!

As others have said, the result of this statement is deterministic. The right-to-left associativity of the = operator guarantees that it is grouped as sample1 = (sample2 = 0). In other words, sample1 receives the value that was assigned to sample2.

However, multiple assignments on one line are bad practice and banned by many coding standards (*). First of all, the construct is not particularly readable (or you wouldn't be asking this question). Second, it is dangerous. If we have for example

sample1 = func() + (sample2 = func());

then operator precedence determines how the expression is grouped (+ has higher precedence than =, hence the parentheses), so sample1 still receives a value that depends on the assignment to sample2. But unlike operator precedence, the order of evaluation of operands is not deterministic; it is unspecified behavior. We can't know whether the right-most function call is evaluated before the left-most one.

The compiler is free to translate the above to machine code like this:

int tmp1 = func();
int tmp2 = func();
sample2 = tmp2;
sample1 = tmp1 + tmp2;

If the code depends on the calls to func() being executed in a particular order, then we have created a nasty bug. It may work fine in one place in the program but break in another part of the same program, even though the code is identical, because the compiler is free to evaluate sub-expressions in any order it likes.
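A sketch of how to pin the order down by giving each call its own statement. The counter-based func() and the helper ordered_sum are hypothetical, chosen only so the call order is observable:

```c
static int call_count = 0;

/* Hypothetical func() with a side effect: returns 1, 2, 3, ... on successive calls. */
static int func(void) { return ++call_count; }

/* Forces the call order by putting each call in its own statement. */
int ordered_sum(int *sample2)
{
    int first  = func();   /* guaranteed to run first  */
    int second = func();   /* guaranteed to run second */

    *sample2 = second;
    return first + second; /* the value that would go into sample1 */
}
```

On the first use, sample2 receives 2 and the returned sum is 3, regardless of compiler or optimization level.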


(*) MISRA-C:2004 12.2, MISRA-C:2012 13.4, CERT-C EXP10-C.

Watch out for this special case. Suppose b is an array of structures of the form

{
    int foo;
}

and let i be an offset in b. Consider the function realloc_b() returning an int and performing the reallocation of array b. Consider this multiple assignment:

a = (b + i)->foo = realloc_b();

In my experience, (b + i) is evaluated first (the C standard leaves this order unspecified); call the resulting address b_i. Then realloc_b() is executed. Because realloc_b() may move b in memory, b_i may no longer point to allocated storage.

Variable a is assigned correctly, but (b + i)->foo is not, because b has been changed by the execution of the rightmost term of the assignment, i.e. realloc_b().

This may cause a segmentation fault, since b_i may refer to memory that is no longer allocated.

To be bug-free, and to have (b + i)->foo equal to a, the one-line assignment must be split into two assignments:

a = realloc_b();
(b + i)->foo = a;
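A runnable sketch of the two-step form. The struct name item, the global b_len, the body of realloc_b() (which here grows the array by one element and returns the new length) and the wrapper grow_and_store are all assumptions made for illustration:

```c
#include <stdlib.h>

struct item { int foo; };

static struct item *b = NULL;   /* hypothetical global array */
static size_t b_len = 0;        /* hypothetical element count */

/* Hypothetical realloc_b(): grows b by one element and returns
   the new length, or -1 if the allocation fails. */
static int realloc_b(void)
{
    struct item *grown = realloc(b, (b_len + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    b = grown;
    b_len += 1;
    return (int) b_len;
}

/* Two-step form: b is read only after realloc_b() has finished. */
int grow_and_store(size_t i)
{
    int a = realloc_b();
    if (a < 0 || i >= b_len)
        return -1;
    (b + i)->foo = a;
    return a;
}
```

Because (b + i) is computed in a statement that follows the call, it always uses the pointer value that realloc_b() left behind.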
