Understanding async code with .NET Reflector

Yesterday, I gave an overview of the problems that async solves, and how it actually solves them. As we saw in that post, async is remarkably cunning, and can become remarkably tricky to follow when applied in a real application, which naturally makes any decompilation a bit of a challenge. If you haven’t read the previous post on how async works already, I recommend you at least skim through it so that you know what examples we’ll be working with, and so that we’re all working with the same understanding of state machines.

Now, with that introduction done, we should have a look at some real code.

We are currently implementing support for the decompilation of async inside .NET Reflector, so all the code you’ll see in this post will be generated by Reflector, both before and after the translation phase that deals with the async state machines.

Be aware that the code we’ll see comes from more than one implementation of async: the code in the .NET 4.5 framework libraries was compiled differently from code built with the Dev11 C# 5 compiler. For our Reflector work we are currently concentrating on the former, mainly because the fundamental part of the work (the processing of the state machine) is the same in both cases, and the framework libraries contain hundreds of potential test cases.

Real code

If you look at the ReadAsyncInternal method in the .NET 4.5 version of mscorlib, you’ll see that it has the following code:

[Image: the ReadAsyncInternal method in the .NET 4.5 version of mscorlib]

Remember, from our earlier example of the simple string-reader program, the compiler generates a class which implements the state machine, setting the various fields corresponding to the arguments and the local variables, which get lifted into this class. The <>t__builder is an instance of AsyncTaskMethodBuilder, which is used to do the plumbing between the client and the implementation of the async method, and we’ll see it used in the MoveNext method.

The compiler-generated class has the following form:

[Image: the compiler-generated class]

Notice the state value field <>1__state, and the field <>4__this, which allows access back to the original instance of StreamReader that constructed this instance of the state machine.
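To make the shape of that class a little more concrete, here is a hand-written sketch of the same arrangement. It is not Reflector output: the names Greeter, CountAsync, CountStateMachine, Owner and Text are invented for illustration, and ordinary field names stand in for <>1__state, <>t__builder and <>4__this, which aren’t legal C# identifiers. The state numbering (0 for not started, -1 for finished) is also just an assumption of this sketch.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical stand-in for StreamReader: the type that owns the async method.
public sealed class Greeter
{
    private readonly string prefix;

    public Greeter(string prefix) { this.prefix = prefix; }

    // Plays the role of the stub that replaces the original method body:
    // it builds the state machine, fills in its fields and starts it.
    public Task<int> CountAsync(string text)
    {
        var machine = new CountStateMachine();
        machine.Owner = this;   // stands in for <>4__this
        machine.Text = text;    // the argument, lifted into a field
        machine.State = 0;      // stands in for <>1__state
        machine.Builder = AsyncTaskMethodBuilder<int>.Create(); // stands in for <>t__builder
        machine.MoveNext();     // run until the method finishes (no awaits here)
        return machine.Builder.Task;
    }

    private sealed class CountStateMachine
    {
        public int State;
        public AsyncTaskMethodBuilder<int> Builder;
        public Greeter Owner;
        public string Text;

        public void MoveNext()
        {
            // The back-reference gives access to the owner's private state.
            int result = Owner.prefix.Length + Text.Length;
            State = -1;                 // finished
            Builder.SetResult(result);  // completes the Task handed to the caller
        }
    }
}

public static class Program
{
    public static void Main()
    {
        Task<int> task = new Greeter("Hello, ").CountAsync("world");
        Console.WriteLine(task.Result); // prints 12
    }
}
```

Because the builder is a mutable struct, the sketch keeps it in a field and always uses it through that field, never through a copy.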

For us, the really interesting part is the code in the MoveNext method:

[Image: the code in the MoveNext method]

The outer Try has a body that ends with the code:

[Image: the outer Try in the MoveNext method]

Let’s take a quick walk through the implications of this.

Stepping through the code

In the exception case we set the state to -1, which is a state that causes the MoveNext to return instantly while remaining in that state. In the success case, we store the function result value using the SetResult, which ends up setting the value into the Task associated with this invocation of the method. Likewise, the SetException sets the Exception property of the Task object, taking care to specially handle the OperationCanceledException, which is used by the Task library to communicate that a synchronous cancellation request was successfully honoured. After this, anyone waiting on the Task will be notified that they can continue running via the usual implementation in the Task class.
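The effect of SetResult and SetException on the Task can be tried out directly, without any compiler-generated code. The sketch below is an assumption-laden miniature: BuilderDemo and Run are invented names, and the state bookkeeping is left out so that only the outer try/catch and the builder calls remain.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class BuilderDemo
{
    // Runs 'body' the way the outer try/catch of a MoveNext would, and
    // returns the Task that the builder hands back to the caller.
    public static Task<int> Run(Func<int> body)
    {
        AsyncTaskMethodBuilder<int> builder = AsyncTaskMethodBuilder<int>.Create();
        int result;
        try
        {
            result = body();
        }
        catch (Exception e)
        {
            // Exception case: the Task ends up faulted, or cancelled if the
            // exception is an OperationCanceledException.
            builder.SetException(e);
            return builder.Task;
        }
        // Success case: the value is stored in the Task.
        builder.SetResult(result);
        return builder.Task;
    }
}

public static class Program
{
    public static void Main()
    {
        Task<int> ok = BuilderDemo.Run(() => 42);
        Task<int> failed = BuilderDemo.Run(() => { throw new InvalidOperationException("boom"); });
        Task<int> cancelled = BuilderDemo.Run(() => { throw new OperationCanceledException(); });

        Console.WriteLine(ok.Status);        // RanToCompletion
        Console.WriteLine(failed.Status);    // Faulted
        Console.WriteLine(cancelled.Status); // Canceled
    }
}
```

The third case is the special handling mentioned above: the builder turns an OperationCanceledException into a cancelled Task rather than a faulted one.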

It’s fairly easy to follow the code around while keeping an eye on the state transitions, but the most interesting part is what happens when we call out to some asynchronous method, such as in the following code fragment:

[Image: calling out to some asynchronous method]

Our method is calling into the ReadAsync method:

[Image: calling into the ReadAsync method]

This returns a Task which will eventually contain a count of the number of bytes read, potentially also modifying the arrays that were passed in as arguments.

We now get a slight complication. In order to control whether the synchronization context and other information need to be preserved, the Task is passed through the ConfigureAwait method before its GetAwaiter method is called. The result of the call to GetAwaiter() supports the Await pattern: it is expected to have certain methods implemented on it, much like an enumerable is expected to have a GetEnumerator method. Let’s take a look at how this process can proceed.

First we have the quick path: when IsCompleted returns true to say that the result is already available, we can jump straight to label 03AD. Otherwise, things get a little more complicated: we need to save any state we’re going to need later into fields of the current object, pass a delegate to the awaiter’s OnCompleted so that the MoveNext method will be called again, set the state field to the next state, and return. At some point, assuming no exceptions, we’ll be called back, and the state value will ensure that we reach label 0390, where we can get back the values that we saved away and then continue to the same place that we’d have reached on the fast path.

At that location, we’ll get the result of the asynchronous operation using GetResult and carry on as if we had been running all the time. Notice that all of the thread transitions (if any are required) are hidden away inside the object that is returned by GetAwaiter, which is responsible for getting us to the right place before it calls the delegate that was passed into OnCompleted.
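The fast path and the slow path described above can be sketched by hand in a few lines. This is an illustration written for this post, not the decompiled ReadAsyncInternal code: AddOneStateMachine, Start and Source are invented names, the states (0 for start, 1 for waiting, -1 for done) are an assumption, and there is a single await rather than several.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public sealed class AddOneStateMachine
{
    public int State;                           // 0 = start, 1 = waiting, -1 = done
    public AsyncTaskMethodBuilder<int> Builder;
    public Task<int> Source;                    // the argument, lifted into a field
    // Saved in a field so that it survives the return from MoveNext.
    private ConfiguredTaskAwaitable<int>.ConfiguredTaskAwaiter awaiter;

    public static Task<int> Start(Task<int> source)
    {
        var machine = new AddOneStateMachine();
        machine.Source = source;
        machine.Builder = AsyncTaskMethodBuilder<int>.Create();
        // Precaution: ask for the Task before running, so that the builder has
        // created it before a continuation can run on another thread.
        Task<int> task = machine.Builder.Task;
        machine.MoveNext();
        return task;
    }

    public void MoveNext()
    {
        int result;
        try
        {
            if (State == 0)
            {
                awaiter = Source.ConfigureAwait(false).GetAwaiter();
                if (!awaiter.IsCompleted)
                {
                    // Slow path: record where to resume, ask to be called
                    // back when the operation completes, and return.
                    State = 1;
                    awaiter.OnCompleted(MoveNext);
                    return;
                }
            }
            else if (State != 1)
            {
                return; // -1: already finished, stay in that state
            }

            // Both paths meet here: fetch the result and carry on.
            result = awaiter.GetResult() + 1;
        }
        catch (Exception e)
        {
            State = -1;
            Builder.SetException(e);
            return;
        }
        State = -1;
        Builder.SetResult(result);
    }
}

public static class Program
{
    public static void Main()
    {
        // Fast path: the task is already complete when we look at it.
        Console.WriteLine(AddOneStateMachine.Start(Task.FromResult(1)).Result); // prints 2

        // Slow path: the task completes after MoveNext has returned.
        var pending = new TaskCompletionSource<int>();
        Task<int> task = AddOneStateMachine.Start(pending.Task);
        Console.WriteLine(task.IsCompleted); // prints False
        pending.SetResult(41);
        Console.WriteLine(task.Result);      // prints 42
    }
}
```

Which thread runs the second call to MoveNext is entirely up to the awaiter, which is the point made above about thread transitions being hidden inside the object returned by GetAwaiter.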

Not one compiler, but two

As I mentioned earlier, part of the challenge in understanding all this comes from the fact that we’re targeting at least two subtly different compilers. Thus far we’ve been looking at the results of the compiler used to generate the .NET 4.5 framework libraries, so how does this all change if you use the compiler that ships with the latest Dev11 beta?

First there is now an interface, IAsyncStateMachine, that constrains the interface between the state machine and the code that uses it:

[Image: the IAsyncStateMachine interface]

This is implemented by the compiler-generated type, which is now a struct instead of a class:

[Image: IAsyncStateMachine implemented by the compiler-generated type]

The initialization code also looks a little different:

[Image: IAsyncStateMachine initialization code]

The body of the MoveNext is still surrounded by a Try Catch block as before, though the code to fire off an asynchronous call now has slightly fewer lines to it, using the builder to do more of the work. As before, if the operation doesn’t instantly complete, the state is set, and we set up a notification from the called method so that we’ll be brought back after the operation completes. In order to do this, we need to pass in the awaiter and the state machine object by using the ref modifier on the call:

[Image: passing in the awaiter using the ref modifier]

The idea is the same as in the previous implementation, though the mechanics of capturing the relevant synchronization and execution contexts are a little different. If you want to see what I mean, take a look at the code in the AsyncTaskMethodBuilder class in System.Runtime.CompilerServices in the .NET 4.5 version of mscorlib.dll.
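For comparison, the struct-based shape can also be written by hand against the public builder API. Again this is a sketch rather than decompiled output: AddOneStruct, Start and Source are invented names, and the state values (-1 while running, 0 while waiting, -2 when done) are an assumption of the example.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public struct AddOneStruct : IAsyncStateMachine
{
    public int State;                           // -1 = running, 0 = waiting, -2 = done
    public AsyncTaskMethodBuilder<int> Builder;
    public Task<int> Source;                    // the argument, lifted into a field
    private TaskAwaiter<int> awaiter;

    public static Task<int> Start(Task<int> source)
    {
        var machine = new AddOneStruct();
        machine.Source = source;
        machine.Builder = AsyncTaskMethodBuilder<int>.Create();
        machine.State = -1;
        // The builder makes the first call to MoveNext for us.
        machine.Builder.Start(ref machine);
        return machine.Builder.Task;
    }

    public void MoveNext()
    {
        int result;
        try
        {
            if (State != 0)
            {
                awaiter = Source.GetAwaiter();
                if (!awaiter.IsCompleted)
                {
                    State = 0;
                    // Both the awaiter and the state machine go in by ref, so
                    // the builder can arrange for MoveNext to be called again.
                    Builder.AwaitUnsafeOnCompleted(ref awaiter, ref this);
                    return;
                }
            }
            result = awaiter.GetResult() + 1;
        }
        catch (Exception e)
        {
            State = -2;
            Builder.SetException(e);
            return;
        }
        State = -2;
        Builder.SetResult(result);
    }

    public void SetStateMachine(IAsyncStateMachine stateMachine)
    {
        Builder.SetStateMachine(stateMachine);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(AddOneStruct.Start(Task.FromResult(1)).Result); // prints 2

        var pending = new TaskCompletionSource<int>();
        Task<int> task = AddOneStruct.Start(pending.Task);
        pending.SetResult(41);
        Console.WriteLine(task.Result); // prints 42
    }
}
```

Compared with the class-based version, the registration of the callback has moved out of the state machine and into the builder, which is why this MoveNext has fewer lines at the point of the asynchronous call.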

By looking out for the pattern of async code, Reflector can fold the code back into the form that uses async/await. So, for example, the code above will be displayed as:

[Image: async decompilation]

Summary

Async is implemented by a compiler transform that rewrites the method as a state machine. This requires a few support classes in the runtime library, and a set of methods that must be implemented on types that support asynchronous calls following the Await pattern. No changes are required to the CLR, though some support has been added inside mscorlib, and extra methods have been added to the Task<…> type to support the calls of the Awaiter pattern.
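To show that the pattern really is just a set of methods, here is a minimal sketch of a type that can be awaited without being a Task. Ready, ReadyAwaiter and PatternDemo are hypothetical names made up for the example, and the awaitable is deliberately trivial: it is always complete.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// A hypothetical awaitable: all it needs is a GetAwaiter method.
public struct Ready
{
    private readonly int value;
    public Ready(int value) { this.value = value; }

    public ReadyAwaiter GetAwaiter() { return new ReadyAwaiter(value); }
}

// The awaiter supplies IsCompleted, OnCompleted and GetResult.
public struct ReadyAwaiter : INotifyCompletion
{
    private readonly int value;
    public ReadyAwaiter(int value) { this.value = value; }

    public bool IsCompleted { get { return true; } }

    public int GetResult() { return value; }

    // Not reached in this sketch because IsCompleted is always true,
    // but the pattern requires it to exist.
    public void OnCompleted(Action continuation) { continuation(); }
}

public static class PatternDemo
{
    public static async Task<int> DoubleAsync(int n)
    {
        int value = await new Ready(n); // compiles because Ready follows the pattern
        return value * 2;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(PatternDemo.DoubleAsync(21).Result); // prints 42
    }
}
```

This is the same kind of arrangement that lets foreach work with any type offering a suitable GetEnumerator method.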

From the point of view of a tool like Reflector, having the compiler do a transform, rather than implementing the feature as something built into the CLR, unsurprisingly makes life harder for a decompiler. There’s no metadata to say some set of IL instructions are here as the result of the translation of an await, so instead the decompiler needs to look for the pattern of calls that is generated by the C# compiler. This is exactly the kind of thing that is already done to recognise lambda expressions and iterator blocks, which are also implemented by the compiler rather than the CLR. Given that, as we saw in the previous post, a basic state machine can be lashed together using lambda expressions and iterator blocks, the similarities in both the implementation and our solution are no surprise.

Async really seems to have hit the sweet spot for a technology that allows you to fairly transparently run non-blocking code on a single thread, and user code looks very much like it would for a straightforward blocking implementation. In fact, you can almost write it that way first, and then change it into non-blocking code by just changing the method’s return type to Task<…>, marking it with async, and then using await with the asynchronous versions of the methods that you call.
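As a small illustration of that conversion, here is a blocking method next to its async counterpart. LineCounter and its two methods are invented for this sketch; ReadLine and ReadLineAsync are the real TextReader methods.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class LineCounter
{
    // The straightforward blocking version.
    public static int CountLines(TextReader reader)
    {
        int count = 0;
        while (reader.ReadLine() != null)
        {
            count++;
        }
        return count;
    }

    // The same logic made non-blocking: the return type becomes Task<int>,
    // the method is marked async, and ReadLine becomes an awaited ReadLineAsync.
    public static async Task<int> CountLinesAsync(TextReader reader)
    {
        int count = 0;
        while (await reader.ReadLineAsync() != null)
        {
            count++;
        }
        return count;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(LineCounter.CountLines(new StringReader("a\nb\nc")));             // prints 3
        Console.WriteLine(LineCounter.CountLinesAsync(new StringReader("a\nb\nc")).Result); // prints 3
    }
}
```

The loop and the local variable are untouched, and it is the compiler that lifts them into the state machine.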

Naturally, it isn’t quite that simple in practice. For example, having a function return too early might not interact well with constructs such as locks and exception handlers, and potentially offers the chance of re-entrancy in cases that were previously safe. Nevertheless, async is not as opaque as it perhaps first seems, and you can learn a huge amount by using .NET Reflector to start investigating C# 5 as soon as possible.
