Microsoft.NET

Expertise in .NET Technologies

Primitive, Reference and Value Types

Posted by Ravi Varma Thumati on March 16, 2009

Introduction

Many .NET programmers are still unsure of the difference between primitive, reference and value types, or of how to use each of them correctly. Misusing a type can lead to subtle bugs that are very hard to find, so it is important to understand the kinds of types that the .NET Framework supports. In this article, I am going to talk about these different types briefly.

Primitive types

Data types that the compiler supports through a keyword, and that map directly to types in the base class library, are called primitive types. For example, in C# the type “int” maps to System.Int32, “short” maps to System.Int16, and so on. Like every other data type in .NET, they ultimately derive from the System.Object class. The following two classes are equivalent (in C#):

// Class implicitly derives from System.Object
class Car
{
}
/***************************************************/

// Class explicitly derives from System.Object
class Car : System.Object
{
}

There is no difference between them: the upper class inherits from System.Object implicitly, whereas the lower one inherits explicitly. To verify this for the upper class, set a breakpoint on the line that declares the variable representing the instance and run the debugger. Press F10 to step over that line, then open the Locals window and expand the variable; in my case, the variable is obj.

There you can see the System.Object inheritance.

 

The following two declarations are equivalent, too (Managed C++):

int n = 0;               // use the C++ keyword
System::Int32 n = 0;     // use the .NET Framework type name
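
Both equivalences can also be checked at run time instead of in the debugger. The following C# program is only a small sketch of that idea; the TypeCheck class and its method names are made up for illustration, and Car mirrors the empty class shown earlier.

```csharp
using System;

// Empty class, as in the example above; it derives from System.Object implicitly.
class Car
{
}

static class TypeCheck
{
    // True when the C# keyword and the framework type name denote the same type.
    public static bool IntIsInt32()
    {
        return typeof(int) == typeof(System.Int32);
    }

    // True when the direct base class of Car is System.Object.
    public static bool CarDerivesFromObject()
    {
        return typeof(Car).BaseType == typeof(object);
    }

    static void Main()
    {
        Console.WriteLine(IntIsInt32());           // True
        Console.WriteLine(CarDerivesFromObject()); // True
    }
}
```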

Value types

Value types inherit from the System.ValueType class, which in turn inherits from System.Object. However, you cannot inherit directly from the System.ValueType class yourself. If you try to inherit explicitly from System.ValueType in Managed C++, you’ll get the C3838 compiler error. Value types have several special properties:

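  • A value type variable holds the data itself. Local variables and parameters of a value type normally live on the stack (we’ll discover that shortly), whereas a value type field of a reference type is stored inline in that object on the managed heap.
  • Value type objects have two representations: an unboxed form and a boxed form.
  • Value types are accessed directly, with no reference to follow, which means that creating one does not require a heap allocation with the new operator.
  • Value types are managed types; their fields are initialised to 0 when they are created.
  • Unboxed value type instances are not tracked individually by the Garbage Collector; their storage goes away with the stack frame or the object that contains them. Boxed instances are ordinary heap objects and are collected like any other.
  • Value types are implicitly sealed, which means that no class can derive from them.
  • In C#, structs are always value types, and a struct local variable is allocated on the stack.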
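
Some of the properties listed above can be observed from C# directly. The sketch below is an illustration under my own assumptions: Student is a cut-down, one-field version of the type used later in this article, and ValueTypeDemo and its methods are hypothetical names.

```csharp
using System;

// A struct is a value type; this is a reduced version of the Student example.
struct Student
{
    public int Age;
}

static class ValueTypeDemo
{
    // The fields of a default-initialised struct are zero.
    public static int DefaultAge()
    {
        Student s = new Student(); // for a struct, 'new' does not allocate on the managed heap
        return s.Age;
    }

    // Assignment copies the whole value, so the two variables are independent.
    public static int AgeAfterCopy()
    {
        Student a = new Student();
        a.Age = 19;
        Student b = a; // copies the data
        b.Age = 30;    // changes only the copy
        return a.Age;
    }

    // Every struct derives from System.ValueType and is sealed.
    public static bool IsSealedValueType()
    {
        Type t = typeof(Student);
        return t.IsValueType && t.IsSealed && t.BaseType == typeof(ValueType);
    }

    static void Main()
    {
        Console.WriteLine(DefaultAge());        // 0
        Console.WriteLine(AgeAfterCopy());      // 19
        Console.WriteLine(IsSealedValueType()); // True
    }
}
```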

Let’s see how a value type is passed to a function. The following is a Managed C++ Console Application source file:

#include "stdafx.h"

#using <mscorlib.dll>
using namespace System;

// __value declares a value type; a local variable of this type lives on the stack.

__value class Student
{
public:
    int age;
    int month;
    int day;
    int year;
};

// Function that takes Student as parameter.
void Take (Student obj)
{
// It does nothing, just needed for debugging.
}

void main()
{
  Student Nad = {19, 7, 14, 1986};
  Take (Nad); // <--- Set a breakpoint here.
}

Now open the Disassembly window and you will see the following output:

00000031          lea eax,[ebp-10h]
    ; loads the address of the instance (Nad) into the EAX register;
    ; note that the address is a small offset from EBP, the frame pointer,
    ; which means that Nad resides in the current stack frame.

00000034         push dword ptr [eax+0Ch] ; push year   (1986)
00000037         push dword ptr [eax+8]   ; push day    (14)
0000003a         push dword ptr [eax+4]   ; push month  (7)
0000003d         push dword ptr [eax]     ; push age    (19)
0000003f         call dword ptr ds:[009E5C7Ch] ; call Take
00000045         xor esi,esi 


Now you can see how the instance (Nad) was passed by value: each member of the class is copied onto the stack before the function is called. Copying is cheap for a small type like this one, but the cost grows with the size of the type and with how often the instance is passed as a parameter to methods.
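
The same copy is visible from C# without a disassembler. In this sketch, which is only an illustration, Take is given a body that modifies its parameter (the original function does nothing), and PassByValueDemo is a hypothetical name.

```csharp
using System;

struct Student
{
    public int Age;
}

static class PassByValueDemo
{
    // Receives a copy of the caller's struct; the change stays local to this method.
    static void Take(Student obj)
    {
        obj.Age = 99;
    }

    public static int AgeAfterCall()
    {
        Student nad = new Student();
        nad.Age = 19;
        Take(nad);      // the whole struct is copied for the call
        return nad.Age; // the caller's value is unchanged
    }

    static void Main()
    {
        Console.WriteLine(AgeAfterCall()); // 19
    }
}
```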

Reference types

Reference types inherit from System.Object, either directly or through other classes, but never through System.ValueType. They have properties that value types lack:

  • Reference types are stored on the managed heap, thus they are under the control of the Garbage Collector.
  • Two or more reference type variables can refer to a single object in the heap, allowing operations on one variable to affect the object referenced by the other variable.

The variable representing the instance contains a pointer to the instance of the class; that pointer is dereferenced whenever you access any of its members. In C#, classes are always reference types and are created on the managed heap.
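
The sharing described above can be shown in a few lines of C#. This is a sketch for illustration only; Student is reduced to one field and AliasDemo is a made-up name.

```csharp
using System;

// A class is a reference type; instances live on the managed heap.
class Student
{
    public int Age;
}

static class AliasDemo
{
    // Two variables referring to one object: a change through b is seen through a.
    public static int AgeSeenThroughFirstVariable()
    {
        Student a = new Student();
        a.Age = 19;
        Student b = a; // copies the reference, not the object
        b.Age = 30;    // modifies the single shared object
        return a.Age;
    }

    // True when both variables refer to the same heap object.
    public static bool SameObject()
    {
        Student a = new Student();
        Student b = a;
        return object.ReferenceEquals(a, b);
    }

    static void Main()
    {
        Console.WriteLine(AgeSeenThroughFirstVariable()); // 30
        Console.WriteLine(SameObject());                  // True
    }
}
```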

Let’s see how a reference type is passed to a function. The following is the same example as above, with Student declared as a reference type:

#include "stdafx.h"

#using <mscorlib.dll>
using namespace System;

// __gc declares a reference type; its instances are created on the managed heap.

__gc class Student
{
public:
    int age;
    int month;
    int day;
    int year;
};

// Function that takes Student as parameter.
void Take (Student *obj)
{
// It does nothing, just needed for debugging.
}

void main()
{
 Student *Nad = new Student; // create instance
 Nad->age = 19; Nad->month = 7; Nad->day = 14; Nad->year = 1986;

 Take (Nad); // <--- Set a breakpoint here.
}

Now open the Disassembly window and you will see the following output:

 ; Here ESI holds pointer to the instance of the class.

 ; members initialization:
00000020   mov dword ptr [esi+4],13h    ; initialize age (19)
00000027   mov dword ptr [esi+8],7      ; initialize month (7)
0000002e   mov dword ptr [esi+0Ch],0Eh  ; initialize day (14)
00000035   mov dword ptr [esi+10h],7C2h ; initialize year (1986)

0000003c   mov ecx,esi ; <-- here pointer is passed to ECX before calling function.
0000003e   call dword ptr ds:[009E5C7Ch] ; call Take
00000044   xor edi,edi

Now you can see how the instance was passed to the function by reference: none of the members were copied to the stack as in the value type’s case. Only the pointer to the instance was loaded into ECX before calling the function. This resembles the fastcall calling convention, because the function expects its parameter in a register instead of on the stack. If the function were designed to access any of the class members, it would dereference the ECX value first.
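
In C# terms, the callee works on the caller’s object through a copy of the pointer. The following sketch illustrates that under my own assumptions: Take is given a body here (the original does nothing), and PassReferenceDemo is a hypothetical name.

```csharp
using System;

class Student
{
    public int Age;
}

static class PassReferenceDemo
{
    static void Take(Student obj)
    {
        obj.Age = 99;        // follows the reference: changes the caller's object
        obj = new Student(); // rebinds only the callee's copy of the reference
        obj.Age = 1;         // affects the new object, not the caller's
    }

    public static int AgeAfterCall()
    {
        Student nad = new Student();
        nad.Age = 19;
        Take(nad);      // only the reference is passed
        return nad.Age;
    }

    static void Main()
    {
        Console.WriteLine(AgeAfterCall()); // 99
    }
}
```

The first assignment in Take is visible to the caller; reassigning the parameter itself is not, because the pointer was copied for the call.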

Figure: ECX points to the newly created instance on the managed heap.

What is Boxing?

Boxing is a mechanism for converting value types to reference types. Boxing wraps a value type in a box so that it can be used where an object reference is needed. Consider the following lines of code (Managed C++):

 int x=10;
 Console::WriteLine (S"The value of x is {0}", x);

You’ll get the C2665 compiler error because the method expects a reference type as the second parameter. To fix this, apply the __box keyword to the second argument:

 Console::WriteLine (S"The value of x is {0}", __box(x));

What happens when you use the “__box” keyword?

  • Memory is allocated on the managed heap.
  • The value of the value type is copied into the memory allocated.
  • The address of that memory, which now holds the newly created managed object, is returned.

Note that some compilers, such as the C# compiler, box implicitly by emitting the necessary code for you. For example, the following C# lines of code compile fine:

int x = 10;
Console.WriteLine("The value of x is {0}", x);

In the preceding example, the C# compiler detected that I was passing a value type to a method that expects a reference type, and it boxed the value automatically.
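
Because boxing copies the value into a new heap object, the box is independent of the original variable. The C# sketch below illustrates this; BoxingDemo and its method names are made up for the example.

```csharp
using System;

static class BoxingDemo
{
    // The box keeps the value it was created with.
    public static int BoxedValueAfterChange()
    {
        int x = 10;
        object o = x;  // implicit boxing: the value 10 is copied to the managed heap
        x = 20;        // changes the variable, not the boxed copy
        return (int)o;
    }

    // Each boxing operation allocates a separate object.
    public static bool TwoBoxesAreDistinct()
    {
        int x = 10;
        object a = x;
        object b = x;
        return !object.ReferenceEquals(a, b);
    }

    static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange()); // 10
        Console.WriteLine(TwoBoxesAreDistinct());   // True
    }
}
```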

What is Unboxing?

Unboxing is the reverse mechanism: it converts a boxed value, held through an object reference, back to its value type. Consider the following lines of code (Managed C++):

#include "stdafx.h"

#using <mscorlib.dll>
using namespace System;

// __value declares a value type; a local variable of this type lives on the stack.

__value class Student
{
public:
    int age;
    int month;
    int day;
    int year;
};

void main()
{
 Student Nad = {19, 7, 14, 1986};     // create instance (on the stack)
 Object *obj = __box(Nad);            // box Nad into obj

 // Unbox obj. You can also use
 // "*dynamic_cast <__box Student*> (obj)"
 Student b = *(__box Student*) obj;
 Console::WriteLine (b.age);        // display age
}

In Managed C++, a boxed object of a __value class can be unboxed by using the dynamic_cast operator, or by using an old C-style cast as I did above, to obtain a __gc pointer to the object that is stored in the “box” on the Common Language Runtime heap. The __gc pointer can then be dereferenced to obtain a copy of the object of the __value class.

In C#, boxing is implicit, and unboxing takes only an explicit cast:

 Int32  x = 10; // create variable
 Object o = x;  // box x
 Int32  d = (Int32) o; // unbox "o"
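
One detail worth knowing about the cast: a box can only be unboxed to the exact value type it contains. The following C# sketch shows what happens otherwise; UnboxingDemo and its method names are hypothetical.

```csharp
using System;

static class UnboxingDemo
{
    // Unboxing a boxed Int32 as Int64 fails at run time.
    public static bool UnboxToWrongTypeThrows()
    {
        object o = 10; // boxed Int32
        try
        {
            long wrong = (long)o;     // unboxing requires the exact type
            Console.WriteLine(wrong); // not reached
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unbox to the real type first, then convert.
    public static long UnboxThenConvert()
    {
        object o = 10;
        return (long)(int)o;
    }

    static void Main()
    {
        Console.WriteLine(UnboxToWrongTypeThrows()); // True
        Console.WriteLine(UnboxThenConvert());       // 10
    }
}
```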

Conclusion

In C#, classes are reference types whereas structures are value types. The C# compiler boxes implicitly and unboxes with a plain cast, whereas in Managed C++ both are written out explicitly, using __box and a cast such as dynamic_cast, respectively. Value types act like primitive types and, when small, are lighter weight than reference types. Reference types are what collections such as ArrayList and Hashtable store (a value type is boxed when it is added to one), and they are the better choice for objects that are passed frequently as parameters to methods.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
