April 17th, 2011


c# 4.0: dynamic

dynamic x = 2;
var y = x * 2;

how it works, part I:
* C# compiler emits two entities to reify this construct: a callsite, and a binder.
* The callsite has type CallSite<Func<CallSite, object, int, object>> (the dynamic first argument and the dynamic return value get reified as instances of Object).
* The binder is an instance of type CSharpBinaryOperationBinder that gets instantiated with ExpressionType.Multiply and several bookkeeping flags.
* In Reflector, in place of x * 2, one can see the following: callSite.Target(callSite, x, 2). A fuller sketch of the generated code follows this list.
* Target is a delegate field; it holds a compiled implementation of "x * 2".
* Upon the first call to the callsite, there is no compiled implementation yet, so the binder generates one.
* Upon subsequent calls, Target is reused, which makes dynamic code almost as fast as statically compiled code.
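
To make part I concrete, here is a hedged sketch of roughly what csc emits for the two-line snippet above. It's simplified: the real generated code uses compiler-mangled names and a nested static container class; Program and s_site are illustrative.

using System;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;
using Microsoft.CSharp.RuntimeBinder;

class Program
{
    // the callsite; the dynamic argument and the dynamic result are typed as object
    static CallSite<Func<CallSite, object, int, object>> s_site;

    static void Main()
    {
        object x = 2;   // "dynamic x" compiles down to a plain object local

        if (s_site == null)
        {
            // the binder: a CSharpBinaryOperationBinder configured with
            // ExpressionType.Multiply plus bookkeeping flags about the operands
            s_site = CallSite<Func<CallSite, object, int, object>>.Create(
                Binder.BinaryOperation(
                    CSharpBinderFlags.None,
                    ExpressionType.Multiply,
                    typeof(Program),
                    new[]
                    {
                        CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null),
                        CSharpArgumentInfo.Create(
                            CSharpArgumentInfoFlags.UseCompileTimeType |
                            CSharpArgumentInfoFlags.Constant, null)
                    }));
        }

        // this is what stands in place of "x * 2" in Reflector
        var y = s_site.Target(s_site, x, 2);
        Console.WriteLine(y);   // 4
    }
}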

Now let's find out how exactly the binder generates an implementation for the callsite.

how it works, part II:
* Target gets generated by UpdateDelegates.UpdateAndExecute2<T0, T1, TRet>(CallSite callSite, T0 arg0, T1 arg1), which matches Target's delegate type Func<CallSite, T0, T1, TRet>. By the way, if you take a look at the UpdateDelegates class, you'll notice shitloads of UpdateAndExecute methods (covering actions and funcs of 0..10 arguments, 20 methods total). Pretty gross!
* UpdateAndExecute is a huge method that does lots of things. We'll look into them later; for the moment it's sufficient to know that UAE boils down to calling binder.BindCore(callSite, new[] { arg0, arg1 }).
* Binders are completely customizable, i.e. you can override the Bind method with any logic you wish; however, standard binders (e.g. those generated by the C# compiler) are all subclasses of DynamicMetaObjectBinder. Further in this section I'll describe only standard binders - keep this in mind.
* Before proceeding, a standard binder converts all arguments to DynamicMetaObjects: 1) if argX implements IDynamicMetaObjectProvider, the binder calls its GetMetaObject method, 2) otherwise, it just wraps argX in a default implementation of DynamicMetaObject.
* After all arguments become DMOs, the binder calls Bind(DMO target, DMO[] args). Yeah, here we've got an asymmetry: the first argument is treated specially. In fact, it will be the one and only responsible for dispatching the dynamic call. Within our context, this asymmetry implies that "x * 2" and "2 * x" will be processed differently. Keep this in mind when writing your dynamic stuff.
* Since our binder is a subclass of BinaryOperationBinder, its Bind method just calls target.BindBinaryOperation(this, args[0]). Notice that args[0] here is the arg1 from the call to binder.BindCore.
* If x were an IDMOP, its DMO could implement BindBinaryOperation to execute arbitrary code (see the sketch after this list). However, since x is 2, its DMO is just a stub that provides default implementations for the BindXXX methods. These default implementations simply call the corresponding binder.FallbackXXX method, handing the target and the arguments back to the binder.
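
Here is a minimal sketch of that asymmetry in action: an IDMOP whose DMO handles BindBinaryOperation itself. LoggingNumber and its Meta class are made-up names; only the IDynamicMetaObjectProvider/DynamicMetaObject API is real.

using System;
using System.Dynamic;
using System.Linq.Expressions;

class LoggingNumber : IDynamicMetaObjectProvider
{
    public int Value { get; private set; }
    public LoggingNumber(int value) { Value = value; }

    public DynamicMetaObject GetMetaObject(Expression parameter)
    {
        return new Meta(parameter, this);
    }

    private class Meta : DynamicMetaObject
    {
        public Meta(Expression expression, LoggingNumber value)
            : base(expression, BindingRestrictions.Empty, value) { }

        public override DynamicMetaObject BindBinaryOperation(
            BinaryOperationBinder binder, DynamicMetaObject arg)
        {
            // arbitrary code can run at bind time
            Console.WriteLine("binding " + binder.Operation);

            // build "(object)(((LoggingNumber)this).Value <op> (int)arg)"
            var left = Expression.Property(
                Expression.Convert(this.Expression, typeof(LoggingNumber)), "Value");
            var right = Expression.Convert(arg.Expression, typeof(int));
            var body = Expression.Convert(
                Expression.MakeBinary(binder.Operation, left, right), typeof(object));

            // restrictions say for which runtime types this rule stays valid
            var restrictions = BindingRestrictions
                .GetTypeRestriction(this.Expression, typeof(LoggingNumber))
                .Merge(arg.Restrictions);

            return new DynamicMetaObject(body, restrictions);
        }
    }
}

// dynamic x = new LoggingNumber(2);
// var y = x * 2;   // dispatched through our BindBinaryOperation
// var z = 2 * x;   // now 2 is the target: its default DMO falls back to the C# binder,
//                  // which finds no int * LoggingNumber operator and throws at runtime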

Ain't you getting dizzy with all these redirects? Take your time and trace the program flow.

how it works, part III:
* We've stopped at CSharpBinaryOperationBinder.FallbackBinaryOperation(DMO target, DMO arg); let's continue.
* This method is a gateway into Microsoft.CSharp, which implements a full-fledged C# compiler at runtime.
* The binder first calls RuntimeBinder.BindXXX, which returns a result of type EXPR.
* Now the binder calls CreateExpressionTree over the EXPR and returns the result.
* After all the wrapper methods unwind and return, we arrive back at binder.BindCore.
* BindCore compiles the expression tree into a delegate and returns it to UpdateAndExecute (a sketch of the kind of expression-tree result a fallback produces follows this list).
* UpdateAndExecute takes the compiled delegate and stores it in callSite.Target.
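
The EXPR machinery of Microsoft.CSharp is internal, but the shape of the final result is plain DLR: a fallback returns a DynamicMetaObject wrapping an expression tree plus restrictions, and that is what BindCore compiles. Here's a hedged sketch with a hand-rolled binder; MultiplyBinder is an illustrative name, not the real CSharpBinaryOperationBinder.

using System;
using System.Dynamic;
using System.Linq.Expressions;
using System.Runtime.CompilerServices;

// what a binary-operation fallback boils down to: return an expression tree
// guarded by restrictions; the DLR then compiles it into the callsite's Target
class MultiplyBinder : BinaryOperationBinder
{
    public MultiplyBinder() : base(ExpressionType.Multiply) { }

    public override DynamicMetaObject FallbackBinaryOperation(
        DynamicMetaObject target, DynamicMetaObject arg, DynamicMetaObject errorSuggestion)
    {
        // the "implementation": (object)((TTarget)target * (TArg)arg);
        // assumes the two operands have a matching * operator
        var body = Expression.Convert(
            Expression.Multiply(
                Expression.Convert(target.Expression, target.LimitType),
                Expression.Convert(arg.Expression, arg.LimitType)),
            typeof(object));

        // the restrictions: this rule is valid only for these runtime types
        var restrictions = BindingRestrictions
            .GetTypeRestriction(target.Expression, target.LimitType)
            .Merge(BindingRestrictions.GetTypeRestriction(arg.Expression, arg.LimitType));

        return new DynamicMetaObject(body, restrictions);
    }
}

class MultiplyBinderDemo
{
    static void Main()
    {
        var site = CallSite<Func<CallSite, object, object, object>>.Create(new MultiplyBinder());
        Console.WriteLine(site.Target(site, 6, 7));   // 42, compiled by BindCore on the first call
    }
}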

In this smooth picture, I've omitted a very important detail: callsites are often used in multiple contexts. For example, our x could come from a dynamic parameter of the method, and the caller of that method could pass anything to it, e.g. a double or a String. That's why callsites need to handle different argument types. An obvious way of doing that is to recompile Target every time the callsite is invoked in a different context, but that would incur a significant performance hit, which is why caching is needed. The DLR features three levels of caching.
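
For illustration, here is one callsite getting hit with several invocation contexts (Twice is a made-up helper; the dynamic multiplication inside it compiles down to a single callsite):

using System;

class Contexts
{
    // a single dynamic callsite is emitted for "d * 2" inside this method
    static dynamic Twice(dynamic d) { return d * 2; }

    static void Main()
    {
        Console.WriteLine(Twice(21));     // (int, int): a rule is compiled and cached
        Console.WriteLine(Twice(1.5));    // (double, int): a different rule; the old one stays cached
        // Console.WriteLine(Twice("a")); // string * int: the binder throws RuntimeBinderException
    }
}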

how it works, part IV:
* L1 is the Target itself. It holds a compiled implementation of a dynamic operation for a single invocation context and looks like: "if (x is int && y is int) return x * y; else return UpdateDelegates.UpdateAndExecute2(callSite, x, y);" (a fuller sketch follows this list). They say that Target is a monomorphic cache, since it holds at most one entry.
* L2 is the Rules[] array of the callsite. It can store up to 10 most recent results of compiling a dynamic operation for different invocation contexts (e.g. different multiplication code for (int, int) and (double, int)). They say that Rules is an MRU cache, since its most recently used entry is always pushed to the first position of the array (so that subsequent lookups check most recently used entries first).
* L3 is the Cache dictionary of the binder. It can store up to 100 compilation results that may come from different call sites.
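
A slightly fuller, still hedged sketch of the L1 rule shape; the real thing is a compiled expression tree rather than hand-written C#, and Rule is an illustrative name.

using System;
using System.Runtime.CompilerServices;

static class GeneratedRules
{
    // roughly the shape of callSite.Target for the (int, int) invocation context
    public static object Rule(CallSite site, object x, object y)
    {
        if (x is int && y is int)                 // the binding restrictions...
            return (object)((int)x * (int)y);     // ...guard the specialized implementation

        // restrictions failed: hand control back to the update machinery,
        // which ends up in UpdateDelegates.UpdateAndExecute2
        var typed = (CallSite<Func<CallSite, object, object, object>>)site;
        return typed.Update(site, x, y);
    }
}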

extension points:
* If you wish to use the pretty syntax of C#, you'll have to live with the callsites and binders generated by csc. Most of the time that will be the case; however, if you develop a custom programming language with the DLR, it seems reasonable to build your own infrastructure of binders.
* A canonical way to provide custom dynamic lookup logic is to implement an IDMOP that returns a custom DMO that implements all the stuff that you fancy. Remember that here you will need to build expression trees by hand, so this task is not for the faint of heart. That's why BCL provides two shortcuts.
* The first shortcut involves inheriting from DynamicObject. DO provides a basic implementation of the IDMOP+DMO boilerplate and features several virtual methods named TryXXX (e.g. TryBinaryOperation or TryInvoke). These TryXXX methods deal with regular .NET objects, and the machinery behind DO converts those to expression trees and vice versa.
* The second shortcut is even more lightweight and involves ExpandoObject. An expando is reminiscent of JavaScript objects: one can dynamically expand it with new properties and methods, enumerate those, and even treat the expando as an IDictionary<String, Object>. Both shortcuts are sketched below.
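
A minimal sketch of both shortcuts; Doubler is a made-up DynamicObject subclass, while the TryBinaryOperation signature and the ExpandoObject tricks are the real API.

using System;
using System.Collections.Generic;
using System.Dynamic;
using System.Linq.Expressions;

// shortcut #1: inherit from DynamicObject and override TryXXX with plain .NET logic
class Doubler : DynamicObject
{
    public override bool TryBinaryOperation(
        BinaryOperationBinder binder, object arg, out object result)
    {
        if (binder.Operation == ExpressionType.Multiply)
        {
            result = Convert.ToInt32(arg) * 2;    // whatever logic you fancy
            return true;
        }
        result = null;
        return false;                             // fall back to the default binding
    }
}

class ShortcutsDemo
{
    static void Main()
    {
        dynamic d = new Doubler();
        Console.WriteLine(d * 21);                // 42, via TryBinaryOperation

        // shortcut #2: ExpandoObject, a JavaScript-style property bag
        dynamic expando = new ExpandoObject();
        expando.Greeting = "hello";
        expando.Shout = (Func<string>)(() => ((string)expando.Greeting).ToUpper());
        Console.WriteLine(expando.Shout());       // HELLO

        // the same expando viewed as a dictionary
        foreach (var pair in (IDictionary<string, object>)expando)
            Console.WriteLine(pair.Key + " = " + pair.Value);
    }
}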

c# 5.0: async/await

in brief:
* methods marked as async return Task or Task<T> (async void is also allowed, e.g. for event handlers).
* one can use the await operator inside bodies of these methods.
* the await XXX operator is roughly equivalent to call/cc(k => XXX.GetAwaiter().Await(k)).
* notice the Await stuff. it effectively operates with two entities: XXX (the work at hand) and k (the continuation).
* Await can do whatever it wishes, e.g. run XXX on another thread, and call k upon completion (a minimal example follows this list).
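
a minimal sketch, assuming the Async CTP's helper library (TaskEx.Run ships with the CTP; HeavyComputation is a made-up placeholder):

using System;
using System.Threading.Tasks;

class AsyncSketch
{
    static int HeavyComputation() { return 21; }   // stands in for real work

    // an async method: returns Task<int>, may use await inside
    public static async Task<int> ComputeAsync()
    {
        // await ~ call/cc: control returns to the caller here, and the rest
        // of the method becomes the continuation k
        int n = await TaskEx.Run(() => HeavyComputation());
        return n * 2;   // this line is already part of k
    }
}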

control flow:
* recall call/cc semantics: immediately upon being called, it yields control to the first non-async frame up the stack.
* you want to say that the async method returns shortly after being called? yep.
* if it's a button click, then okay, but what if I need that method's result? recall the signature of an async method - it returns Task. you just call the Wait method of the Task and then do whatever you wish with the result, or use one of the task combinators and propagate the result down the promise chain (see the sketch after this list).
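
both options in one hedged sketch (ComputeAsync mirrors the made-up method from the previous sketch; TaskEx.Run is the CTP helper):

using System;
using System.Threading.Tasks;

class Caller
{
    // same shape as the ComputeAsync sketched above
    static async Task<int> ComputeAsync()
    {
        return await TaskEx.Run(() => 42);
    }

    static void Main()
    {
        Task<int> task = ComputeAsync();          // returns almost immediately

        // option 1: block until the result is available
        task.Wait();
        Console.WriteLine(task.Result);           // 42

        // option 2: use a combinator and push the result down the promise chain
        ComputeAsync()
            .ContinueWith(t => Console.WriteLine(t.Result))
            .Wait();                              // only so this console demo doesn't exit early
    }
}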

nitpicks:
* the async keyword was introduced to contain the asynchrony plague within bounds; otherwise, using an async method would require rewriting all the methods further up the call stack.
* the await XXX operator is not exactly equivalent to the expression above. it involves a BeginAwait/EndAwait method pair (a toy desugaring is sketched after this list); see details for more info.
* GetAwaiter and BeginAwait are resolved similarly to LINQ methods - both instance and extension methods will do; dynamic seems to work as well, at least in recent CTPs.
* an async method might take a long time to return, since: 1) it might contain long-running code that precedes the first await, 2) BeginAwait might decide to return false and the subsequent EndAwait call might just block before returning the result.
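
a hedged, hand-desugared toy of the CTP pattern (ToyAwaitable/ToyAwaiter are made up; the real compiler rewrite produces a state machine, not straight-line code):

using System;
using System.Threading;

class ToyAwaiter
{
    public bool BeginAwait(Action continuation)
    {
        // pretend the work is pending: schedule it and promise to call the continuation
        ThreadPool.QueueUserWorkItem(_ => { Thread.Sleep(100); continuation(); });
        return true;    // true = "I'll call you back"; false would mean "already done"
    }

    public int EndAwait() { return 42; }   // fetch the result (may block in real awaiters)
}

class ToyAwaitable
{
    public ToyAwaiter GetAwaiter() { return new ToyAwaiter(); }
}

class DesugaringDemo
{
    static void Main()
    {
        var xxx = new ToyAwaitable();

        // hand-desugared "var r = await xxx; Console.WriteLine(r);"
        var awaiter = xxx.GetAwaiter();
        Action continuation = () => Console.WriteLine(awaiter.EndAwait());
        if (!awaiter.BeginAwait(continuation))
            continuation();                // already completed: continue synchronously

        Console.ReadLine();                // keep the process alive for the demo
    }
}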

details:
* the first stab (with codegen example): http://stackoverflow.com/questions/5132873/asynchrony-in-c-5-0-how-does-eric-lipperts-example-work
* eric lippert's series (deep dive): http://blogs.msdn.com/b/ericlippert/archive/tags/c_2300_+5-0/
* what will this code print (check your understanding): https://msmvps.com/blogs/jon_skeet/archive/2010/10/30/c-5-async-investigating-control-flow.aspx
* the spec: http://go.microsoft.com/fwlink/?LinkId=204845