Is C# pass by reference or pass by value?

One of the first things we learn with our first programming language is the difference between passing by value and passing by reference. We learn that, in .NET, structs are passed “by value” and classes are passed “by reference”. Ask anyone who is a reasonably experienced C# programmer and they will probably tell you that “Structs are passed by value, classes are passed by reference”, as if repeating a mantra.

We say this because when we pass a struct to a method and modify it there, we are modifying a copy, not the original. With classes, however, the reverse is true: if we modify the object, we are modifying the original, because we are working through a reference and both variables point to the same object on the heap. For example:

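// Person is a class (a reference type) with a Name property.
var a = new Person { Name = "Alex" };
var b = a; // copies the reference, not the object

b.Name = "Sam"; // changes the one object that both a and b refer to

Console.WriteLine(a.Name); // Prints "Sam"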
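The struct half of the mantra can be sketched in the same way, this time with a method call. This is only a minimal illustration: `PointStruct`, `PointClass` and `Bump` are names I have made up for the purpose.

```csharp
using System;

// Hypothetical types for illustration only.
public struct PointStruct { public int X; }
public class PointClass { public int X; }

public static class CopyDemo
{
    // p is a copy of the caller's struct, so the change is lost on return.
    public static void Bump(PointStruct p) { p.X++; }

    // p is a copy of the caller's reference; both refer to the same object.
    public static void Bump(PointClass p) { p.X++; }

    public static void Main()
    {
        var s = new PointStruct { X = 1 };
        var c = new PointClass { X = 1 };

        Bump(s);
        Bump(c);

        Console.WriteLine(s.X); // Prints 1: only the copy was changed
        Console.WriteLine(c.X); // Prints 2: the shared object was changed
    }
}
```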

This is all true and well understood, but it is not what “passing by reference” actually means. It may sound like splitting hairs, but what we have here is “passing a reference by value”; the confusion arises from the fact that the values being passed are themselves references.

This is confusing because high-level OO languages such as C# deliberately hide the existence of references from you. You, as the programmer, never need to explicitly dereference a pointer, and the syntax of method calls and variable assignments gives no indication that the value is a reference.

Funnily enough, C# does actually allow passing by reference through the ref keyword, but most C# programmers (those who don’t work in performance-critical domains) won’t be very familiar with it.

Allow me to explain with a simple example; let’s ignore the two cases of structs and classes for a moment and imagine we’re working in a language that’s just like C# except that it only has structs (value types); there is no object heap at all. This will allow us to look at variable passing in isolation. Now, imagine we have written two methods in our language, F and G, and we want to call G from F, passing a value as we do so:

void F()
{
   int x = 5;

   G(x);
}

void G(int y)
{
   // ...
}

The stack for these two methods might look something like this:

In order to pass the value 5 between our two methods F and G we need to somehow transfer this value to the new stack frame. The two choices we have are to either (a) copy the value from the old stack frame into a new stack frame:

or (b) create a reference from the variable in the new stack frame to the variable in the old stack frame:

Scenario (a) is what we call pass-by-value (the value is copied into the new stack frame) and scenario (b) is pass-by-reference; that is, passing a variable into a method call creates a reference to the original variable instead of performing a copy.
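Real C# offers both options for value types, so the two scenarios can be sketched with nothing but an int. The method names `ByValue` and `ByReference` are mine, chosen purely for illustration.

```csharp
using System;

public static class PassingDemo
{
    // Scenario (a): y is a fresh copy of the argument.
    public static void ByValue(int y) { y = 10; }

    // Scenario (b): y is an alias for the caller's variable.
    public static void ByReference(ref int y) { y = 10; }

    public static void Main()
    {
        int x = 5;

        ByValue(x);
        Console.WriteLine(x); // Prints 5: the callee only changed its copy

        ByReference(ref x);
        Console.WriteLine(x); // Prints 10: the callee wrote through to x
    }
}
```

Notice that there is no class, and therefore no heap object, anywhere in this sketch.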

Note: I am talking exclusively about passing arguments to methods in my examples, but the same logic applies to simple variable assignment as well.

Let’s pause for a moment and take in the fact that we’ve introduced the concepts of pass-by-value and pass-by-reference even though we’re talking about a toy language that has no reference types. This is an important realisation: that passing by reference is a completely distinct concept from that of a reference type.

Now that we’ve reached this stage of enlightenment we can proceed with the introduction of reference types into our theoretical language. Let’s see how the presence of an object heap changes our diagram:

You can see that the variable x, although it still lives on the stack, is now just a reference to an object on the heap. What happens when we pass this variable to the call of the second function G?

The reference is copied! This is pass-by-value. Because the value that we’re passing is a reference, it’s understandable that people think this is pass-by-reference. “So”, you may ask, “what is passing a reference type by reference?” Let’s take the exact same process we used for value types and apply it to the reference:

It’s a double reference, that is: a reference to a reference to an object. Again, C# completely hides this from you by automatically dereferencing when you make use of the variable, such as in this example:

public void M(ref Person p)
{
    Console.WriteLine(p.Name); // <-- `p` is automatically dereferenced here
}
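The practical difference between passing a reference by value and passing it by reference only shows up when the method reassigns its parameter. Here is a minimal sketch; the `Person` definition and the two `Replace` methods are my own assumptions, written just to make the point.

```csharp
using System;

// Hypothetical definition of Person, assumed for this sketch.
public class Person { public string Name { get; set; } = ""; }

public static class ReassignDemo
{
    // p is a copy of the caller's reference: reassigning it is invisible to the caller.
    public static void ReplaceByValue(Person p)
    {
        p = new Person { Name = "Sam" };
    }

    // p is an alias for the caller's variable: reassigning it changes what the caller sees.
    public static void ReplaceByRef(ref Person p)
    {
        p = new Person { Name = "Sam" };
    }

    public static void Main()
    {
        var x = new Person { Name = "Alex" };

        ReplaceByValue(x);
        Console.WriteLine(x.Name); // Prints "Alex"

        ReplaceByRef(ref x);
        Console.WriteLine(x.Name); // Prints "Sam"
    }
}
```

In both methods `p.Name` reads the same way; only the `ref` version lets the method swap out the object the caller's variable refers to.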

Now that we’ve been through all the scenarios, let’s consolidate our diagrams into a matrix so we can see the four different scenarios laid out:

As far as I can tell the confusion has arisen because of a lack of common understanding of what pass-by-reference means. This is something that even seasoned developers can’t quite agree on. The situation is not helped by the fact that Microsoft’s own documentation seems to use the term pass by reference in place of “passing a reference”.

Whenever someone asks me whether C# is pass-by-reference or pass-by-value I tailor my answer to the person asking. To a novice I will say “pass-by-reference” because they’re unlikely to understand the necessary nuance of the real answer.

If an experienced but non-C# dev asks me I normally hedge my bets by saying “C# is pass-reference-by-value”. That normally clears things up by making them ask for more info!
