
Variables ending with “1” have the “1” removed within ILSpy. Why?

In an effort to explore how the C# compiler optimizes code, I've created a simple test application. With each test change, I've compiled the application and then opened the binary in ILSpy.

I just noticed something that seems weird to me. Presumably it is intentional, but I can't think of a good reason why the compiler would do this.

Consider the following code:

static void Main(string[] args)
{
    int test_1 = 1;
    int test_2 = 0;
    int test_3 = 0;

    if (test_1 == 1) Console.Write(1);
    else if (test_2 == 1) Console.Write(1);
    else if (test_3 == 1) Console.Write(2);
    else Console.Write("x");
}

Pointless code, but I had written this to see how ILSpy would interpret the if statements.

However, when I compiled and decompiled this code, I noticed something that had me scratching my head: my first variable, test_1, was renamed to test_! Is there a good reason why the C# compiler would do this?

For full inspection this is the output of Main() that I'm seeing in ILSpy.

private static void Main(string[] args)
{
    int test_ = 1; //Where did the "1" go at the end of the variable name???
    int test_2 = 0;
    int test_3 = 0;
    if (test_ == 1)
    {
        Console.Write(1);
    }
    else
    {
        if (test_2 == 1)
        {
            Console.Write(1);
        }
        else
        {
            if (test_3 == 1)
            {
                Console.Write(2);
            }
            else
            {
                Console.Write("x");
            }
        }
    }
}

UPDATE

After inspecting the IL, it appears this is an issue with ILSpy, not the C# compiler. Eugene Podskal has given a good answer to my initial comments and observations. However, I would still like to know whether this is a bug in ILSpy or intentional functionality.

This is probably a problem with the decompiler, because the IL is correct on .NET 4.5 with VS2013:

.entrypoint
  // Code size       79 (0x4f)
  .maxstack  2
  .locals init ([0] int32 test_1,
           [1] int32 test_2,
           [2] int32 test_3,
           [3] bool CS$4$0000)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.0

Edit: ildasm uses data from the .pdb file (see this answer) to get the correct variable names. Without the .pdb, the variables appear as V_0, V_1, V_2.

EDIT:

The variable name is mangled in NameVariables.cs, in this method:

public string GetAlternativeName(string oldVariableName)
{
    if (oldVariableName.Length == 1 && oldVariableName[0] >= 'i' && oldVariableName[0] <= maxLoopVariableName) {
        for (char c = 'i'; c <= maxLoopVariableName; c++) {
            if (!typeNames.ContainsKey(c.ToString())) {
                typeNames.Add(c.ToString(), 1);
                return c.ToString();
            }
        }
    }

    int number;
    string nameWithoutDigits = SplitName(oldVariableName, out number);

    if (!typeNames.ContainsKey(nameWithoutDigits)) {
        typeNames.Add(nameWithoutDigits, number - 1);
    }

    int count = ++typeNames[nameWithoutDigits];

    if (count != 1) {
        return nameWithoutDigits + count.ToString();
    } else {
        return nameWithoutDigits;
    }
}

The NameVariables class uses the this.typeNames dictionary to store variable names stripped of their trailing number (such variables mean something special to ILSpy, or perhaps even to IL, though I doubt it), each associated with a counter of its appearances in the method being decompiled.

This means that all three variables (test_1, test_2, test_3) end up in one slot ("test_"). For the first one, count will be 1, so this branch executes:

else {
    return nameWithoutDigits;
}

where nameWithoutDigits is test_. For test_2 and test_3 the counter is already 2 and 3, so they come back unchanged.
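To make the counter logic easier to follow, here is a small stand-alone sketch that replays it for the three variables. It is not ILSpy's code: NameSketch and Rename are names I made up, the single-letter loop-variable branch is left out, and SplitName (which is not shown above) is my assumption of what it does, namely strip trailing digits and report them as a number, defaulting to 1.

```csharp
using System;
using System.Collections.Generic;

static class NameSketch
{
    // Assumed behaviour of SplitName (not shown in the post):
    // "test_1" -> stem "test_", number 1; a name without digits -> number 1.
    public static string SplitName(string name, out int number)
    {
        int pos = name.Length;
        while (pos > 0 && name[pos - 1] >= '0' && name[pos - 1] <= '9')
            pos--;
        if (pos < name.Length && int.TryParse(name.Substring(pos), out number))
            return name.Substring(0, pos);
        number = 1;
        return name;
    }

    // Same counter logic as the second half of GetAlternativeName above.
    public static string Rename(Dictionary<string, int> typeNames, string oldName)
    {
        int number;
        string stem = SplitName(oldName, out number);
        if (!typeNames.ContainsKey(stem))
            typeNames.Add(stem, number - 1);   // "test_1" seeds the counter with 0
        int count = ++typeNames[stem];         // first call: 1, then 2, then 3
        return count != 1 ? stem + count.ToString() : stem;
    }

    static void Main()
    {
        var typeNames = new Dictionary<string, int>();
        foreach (string name in new[] { "test_1", "test_2", "test_3" })
            Console.WriteLine(name + " -> " + Rename(typeNames, name));
        // test_1 -> test_
        // test_2 -> test_2
        // test_3 -> test_3
    }
}
```

Only the first variable loses its suffix, because a count of 1 is the one case where the number is dropped.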

EDIT

First, thanks to @HansPassant and his answer for pointing out the fault in this post.

So, the source of the problem:

ILSpy is as smart as ildasm, because it also uses .pdb data (how else would it get the names test_1 and test_2 at all?). But its inner workings are tuned for assemblies without any debug-related info, so its handling of V_0, V_1, V_2 style variables behaves inconsistently when given the wealth of metadata from a .pdb file.

As I understand it, the culprit is an optimization to remove _0 from lone variables.

Fixing it will probably require passing the fact that .pdb data is in use into the variable-name generation code.
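As a rough illustration of that idea, and only that, here is a sketch in which the naming code is told whether a name came from debug symbols. None of this is ILSpy's API or its actual fix: PdbAwareNaming, PickName and the fromDebugSymbols flag are all hypothetical, and the fallback renumbering is simplified.

```csharp
using System;
using System.Collections.Generic;

static class PdbAwareNaming
{
    // Hypothetical helper: keeps a name verbatim when it came from the .pdb
    // and is still free; otherwise falls back to stem / stem2 / stem3 ...
    public static string PickName(HashSet<string> usedNames, string oldName, bool fromDebugSymbols)
    {
        // HashSet.Add returns false if the name is already taken.
        if (fromDebugSymbols && usedNames.Add(oldName))
            return oldName;

        // Generated name (or a clash): strip trailing digits and renumber.
        string stem = oldName.TrimEnd("0123456789".ToCharArray());
        if (usedNames.Add(stem))
            return stem;
        for (int n = 2; ; n++)
        {
            string candidate = stem + n;
            if (usedNames.Add(candidate))
                return candidate;
        }
    }

    static void Main()
    {
        var used = new HashSet<string>();
        // Names read from the .pdb survive untouched.
        Console.WriteLine(PickName(used, "test_1", true));   // test_1
        Console.WriteLine(PickName(used, "test_2", true));   // test_2
        // Names without debug info are still renumbered.
        Console.WriteLine(PickName(used, "num1", false));    // num
        Console.WriteLine(PickName(used, "num1", false));    // num2
    }
}
```

The point of the flag is just that user-chosen names and generated placeholders go through different paths; how ILSpy would actually thread that information through is not something I have checked.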

Well, it is a bug. Not much of a bug, and it is fairly unlikely that anybody ever filed a bug report for it. Do note that Eugene's answer is very misleading. ildasm.exe is smart enough to know how to locate the PDB file for an assembly and retrieve its debugging info, which includes the names of local variables.

This is not normally a luxury available to a disassembler. Those names are not actually present in the assembly itself, and a disassembler invariably has to make do without the PDB. You can see this in ildasm.exe as well: just delete the .pdb files in the obj\Release and bin\Release directories and the output looks like this:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       50 (0x32)
  .maxstack  2
  .locals init (int32 V_0,
           int32 V_1,
           int32 V_2)
  IL_0000:  ldc.i4.1
  // etc...

Names like V_0, V_1, etcetera are of course not great; a disassembler usually comes up with something better, something like "num".

So it's kinda clear where the bug in ILSpy is located: it too reads the PDB file, but fumbles the symbol it retrieves. You could file the bug with the vendor, though it's pretty unlikely they'll treat it as high priority.


 