Digging in MSIL
As part of my work I sometimes have to verify that an installer contains rebuilt code only for some of the submodules. To do this I use the ildasm.exe tool to disassemble the assemblies and then run a text comparison of the IL output.
While examining the output I came across some odd IL that I couldn't map directly to the source code and that changed every time the code was rebuilt, even though the source code had not changed. The telltale sign of this output was always a type named <PrivateImplementationDetails>{SOME GUID}.
To understand what this was, I searched for anything that could explain it and tell me how to verify that the source code really had not changed. Google disappointed me and gave no answer.
Instead, a colleague and I started writing code samples to produce IL with this <PrivateImplementationDetails> output. After some binary searching through code known to produce it, we found a code structure, or at least one of them, that reproduces it.
The following sample contains code that does and code that does not produce <PrivateImplementationDetails> in the IL output.
```csharp
class CharClass
{
    char[] char_1 = new[] { 'a', 'b' };
    char[] char_2 = new[] { 'c', 'd', 'e' };
}
```

The resulting IL code looks like the following (excluding the metadata).
```
.class private auto ansi beforefieldinit CharClass
       extends [mscorlib]System.Object
{
  .field private char[] char_1
  .field private char[] char_2
  .method public hidebysig specialname rtspecialname
          instance void  .ctor() cil managed
  {
    // Code size       55 (0x37)
    .maxstack  4
    .locals init ([0] char[] CS$0$0000)
    IL_0000:  ldarg.0
    IL_0001:  ldc.i4.2
    IL_0002:  newarr     [mscorlib]System.Char
    IL_0007:  stloc.0
    IL_0008:  ldloc.0
    IL_0009:  ldc.i4.0
    IL_000a:  ldc.i4.s   97
    IL_000c:  stelem.i2
    IL_000d:  ldloc.0
    IL_000e:  ldc.i4.1
    IL_000f:  ldc.i4.s   98
    IL_0011:  stelem.i2
    IL_0012:  ldloc.0
    IL_0013:  stfld      char[] CharClass::char_1
    IL_0018:  ldarg.0
    IL_0019:  ldc.i4.3
    IL_001a:  newarr     [mscorlib]System.Char
    IL_001f:  dup
    IL_0020:  ldtoken    field valuetype '<PrivateImplementationDetails>{F325A222-827B-4DCA-B78A-B2EC85FEBAD3}'/'__StaticArrayInitTypeSize=6' '<PrivateImplementationDetails>{F325A222-827B-4DCA-B78A-B2EC85FEBAD3}'::'$$method0x6000001-1'
    IL_0025:  call       void [mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [mscorlib]System.Array,
                                                                                                     valuetype [mscorlib]System.RuntimeFieldHandle)
    IL_002a:  stfld      char[] CharClass::char_2
    IL_002f:  ldarg.0
    IL_0030:  call       instance void [mscorlib]System.Object::.ctor()
    IL_0035:  nop
    IL_0036:  ret
  } // end of method CharClass::.ctor
} // end of class CharClass

.class private auto ansi '<PrivateImplementationDetails>{F325A222-827B-4DCA-B78A-B2EC85FEBAD3}'
       extends [mscorlib]System.Object
{
  .custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = ( 01 00 00 00 )
  .class explicit ansi sealed nested private '__StaticArrayInitTypeSize=6'
         extends [mscorlib]System.ValueType
  {
    .pack 1
    .size 6
  } // end of class '__StaticArrayInitTypeSize=6'
  .field static assembly valuetype '<PrivateImplementationDetails>{F325A222-827B-4DCA-B78A-B2EC85FEBAD3}'/'__StaticArrayInitTypeSize=6' '$$method0x6000001-1' at I_00002050
} // end of class '<PrivateImplementationDetails>{F325A222-827B-4DCA-B78A-B2EC85FEBAD3}'

// =============================================================
.data cil I_00002050 = bytearray (
                 63 00 64 00 65 00)                                // c.d.e.
```
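As the IL shows, there is a clear difference between initializing a char array with 2 elements and with 3. The two-element array is filled one element at a time with stelem.i2, whereas the three-element array is filled by a single call to RuntimeHelpers.InitializeArray from a 6-byte data blob that belongs to the <PrivateImplementationDetails> class. This goes for most value types tested (summary at the end).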
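To make the data blob less mysterious, here is a minimal sketch that decodes the six bytes from the .data directive back into characters. The class and method names (BlobDemo, Decode) are made up for illustration, and the byte-for-byte copy assumes a little-endian machine. It only mimics the effect of the initialization; it is not what the runtime actually executes.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class BlobDemo
{
    // The bytes copied from the .data directive in the IL above.
    public static readonly byte[] Blob = { 0x63, 0x00, 0x64, 0x00, 0x65, 0x00 };

    // Reinterprets the raw bytes as UTF-16 code units.
    // Assumption: little-endian platform, so 63 00 becomes 0x0063 ('c').
    public static char[] Decode(byte[] blob)
    {
        char[] result = new char[blob.Length / sizeof(char)];
        Buffer.BlockCopy(blob, 0, result, 0, blob.Length);
        return result;
    }
}
```

On a little-endian machine, new string(BlobDemo.Decode(BlobDemo.Blob)) gives "cde", which also explains the name __StaticArrayInitTypeSize=6: three chars of two bytes each.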
Expanding the search to reference types, the following was observed.
```csharp
class ObjectClass
{
    object[] Object_1 = new[] { new object(), new object() };
    object[] Object_2 = new[] { new object(), new object(), new object() };
}
```

Results in
```
.class private auto ansi beforefieldinit ObjectClass
       extends [mscorlib]System.Object
{
  .field private object[] Object_1
  .field private object[] Object_2
  .method public hidebysig specialname rtspecialname
          instance void  .ctor() cil managed
  {
    // Code size       76 (0x4c)
    .maxstack  4
    .locals init ([0] object[] CS$0$0000)
    IL_0000:  ldarg.0
    IL_0001:  ldc.i4.2
    IL_0002:  newarr     [mscorlib]System.Object
    IL_0007:  stloc.0
    IL_0008:  ldloc.0
    IL_0009:  ldc.i4.0
    IL_000a:  newobj     instance void [mscorlib]System.Object::.ctor()
    IL_000f:  stelem.ref
    IL_0010:  ldloc.0
    IL_0011:  ldc.i4.1
    IL_0012:  newobj     instance void [mscorlib]System.Object::.ctor()
    IL_0017:  stelem.ref
    IL_0018:  ldloc.0
    IL_0019:  stfld      object[] ObjectClass::Object_1
    IL_001e:  ldarg.0
    IL_001f:  ldc.i4.3
    IL_0020:  newarr     [mscorlib]System.Object
    IL_0025:  stloc.0
    IL_0026:  ldloc.0
    IL_0027:  ldc.i4.0
    IL_0028:  newobj     instance void [mscorlib]System.Object::.ctor()
    IL_002d:  stelem.ref
    IL_002e:  ldloc.0
    IL_002f:  ldc.i4.1
    IL_0030:  newobj     instance void [mscorlib]System.Object::.ctor()
    IL_0035:  stelem.ref
    IL_0036:  ldloc.0
    IL_0037:  ldc.i4.2
    IL_0038:  newobj     instance void [mscorlib]System.Object::.ctor()
    IL_003d:  stelem.ref
    IL_003e:  ldloc.0
    IL_003f:  stfld      object[] ObjectClass::Object_2
    IL_0044:  ldarg.0
    IL_0045:  call       instance void [mscorlib]System.Object::.ctor()
    IL_004a:  nop
    IL_004b:  ret
  } // end of method ObjectClass::.ctor
} // end of class ObjectClass
```

Clearly, reference types do not get the same treatment as value types.
What is this <PrivateImplementationDetails> anyway?
To the best of my knowledge, and this is only an educated guess from what I have observed, these private implementation details are just that: something the C# compiler does behind your back, presumably to cut down on stack usage.
What types get this treatment?
As far as I have observed, and again this is not meant to be an exhaustive list, the numeric value types behave this way: byte, sbyte, short, ushort, int, uint, long, ulong, float and double, plus char as shown above. With reference types this compiler trick doesn't seem to be invoked.
It's also worth noting that even though enums are value types, they are treated like reference types in this case. Strings, known to get special treatment elsewhere, also behave like plain reference types here.
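If you want to check these observations yourself, a sample along the following lines should do. The type and field names (Color, TypeSamples, Ints, Colors, Strings) are made up for illustration, and the comments only restate what I observed with the compilers I tested; other compiler versions may well behave differently.

```csharp
// Hypothetical sample types for reproducing the observations above.
enum Color { Red, Green, Blue }

class TypeSamples
{
    // Numeric value type: observed to be initialized from a data blob.
    public int[] Ints = new[] { 1, 2, 3 };

    // Enum: a value type, but observed to be stored element by element.
    public Color[] Colors = new[] { Color.Red, Color.Green, Color.Blue };

    // String: observed to be stored element by element, like any reference type.
    public string[] Strings = new[] { "a", "b", "c" };
}
```

Compile it into an assembly and look at the constructor with ildasm.exe, just as with CharClass and ObjectClass.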
When can I expect this behavior from the compiler?
I have only seen <PrivateImplementationDetails> being generated by the C# compiler. I tried writing the equivalent code in VB.NET, and it just replicates the way C# handles 2 items, storing each element via the stack. In C# I tried versions 2.0 and 4; both versions produce the same IL code.
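Coming back to the original problem of comparing IL dumps between builds: one possible workaround is to mask the GUID before running the text comparison. The sketch below is only that, a sketch. IlNormalizer and MaskGuids are hypothetical names, and it assumes the GUID is the only part of these lines that differs between builds, which I have not verified for every case.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
static class IlNormalizer
{
    // Matches the compiler-generated type name followed by its per-build GUID
    // in the usual 8-4-4-4-12 hexadecimal layout.
    private static readonly Regex GuidSuffix = new Regex(
        @"<PrivateImplementationDetails>\{[0-9A-F]{8}(?:-[0-9A-F]{4}){3}-[0-9A-F]{12}\}",
        RegexOptions.IgnoreCase);

    // Replaces every per-build GUID with a fixed placeholder so that two
    // dumps of otherwise identical code compare as equal.
    public static string MaskGuids(string ilText)
    {
        return GuidSuffix.Replace(ilText, "<PrivateImplementationDetails>{GUID}");
    }
}
```

Run both ildasm outputs through MaskGuids and compare the results instead of the raw text.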