High speed applications – Parallelism in .NET part 3, TPL

[The post has been edited 2016/march/24. Some rephrasing, more explanations and more examples].

Welcome back again!

In case you’ve missed them or want to read them again, these are the links to Part 1 and Part 2 of the series.

It’s nice to see that you are interested in knowing more about parallelism in .NET. In this post we are going to learn about using the…

Task Parallel Library (TPL)

TPL is the preferred way to develop asynchronous and parallel applications in .NET. It was introduced in .NET 4, a lot of improvements were made in 4.5 and a few more in 4.6. Originally I thought I would first explain how TPL works without async/await and then show you how much easier it is with them, but I changed my mind and will go through both in this first post about TPL.

Task-based Asynchronous Pattern (TAP) (and scheduling a task)

Working with TPL, we should comply with the Task-based Asynchronous Pattern, TAP (for even more information see reference 1). We use this pattern both for computational tasks (CPU and memory) and for I/O (input / output) tasks.

All methods that schedule something to execute asynchronously, or call something asynchronously through TPL, should have their name suffixed with Async, for example RegularMethodNameAsync, where RegularMethodName is what you would call the method if it were not asynchronous. We are going to start off by demonstrating how to schedule a task using TPL. The easiest way to start a task is to call Task.Run, which in its simplest overload takes a delegate or method as parameter and returns a Task or Task&lt;T&gt;.
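The original code screenshot is missing from this copy, so here is a sketch of what the GetStringSync / GetStringAsync pair might look like (the method bodies are assumptions; only the method names and signatures come from the text below):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class TaskGetStringExample
{
    // Synchronous version: the string is produced on the calling thread.
    public static string GetStringSync()
    {
        return $"Hello from thread {Thread.CurrentThread.ManagedThreadId}";
    }

    // Asynchronous version: Task.Run schedules the work on a thread pool
    // thread and immediately returns a Task<string> representing the result.
    public static Task<string> GetStringAsync()
    {
        return Task.Run(() => $"Hello from thread {Thread.CurrentThread.ManagedThreadId}");
    }

    public static void Main()
    {
        Console.WriteLine(GetStringSync());    // runs and completes synchronously
        Task<string> task = GetStringAsync();  // returns at once, work is pending
        Console.WriteLine(task.Result);        // blocks until the task completes
    }
}
```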

 

 

[Screenshot: TaskGetStringExample execution output]

GetStringSync is called and executed synchronously. GetStringAsync is called synchronously, but the string is returned asynchronously from a thread pool thread. As you noticed, GetStringAsync is declared with return type Task&lt;string&gt;, string being the created task’s return value.

The method returns a task that may or may not be completed when GetStringAsync returns. All methods that start a task, or call another method that does, should be named “{NameOfMethod}Async” AND they should return a Task. The only exceptions to this should be the framework methods, like Task.WhenAll or Task.Delay (which we will cover later). Methods that in a synchronous world would return void now return Task. The ones that would return a value or an instance of an object now return Task&lt;TType&gt;.

We can also create a task by instantiating a new Task and call the Start method:
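The code for this example is also missing here; a minimal sketch of the pattern described in the text might look like this:

```csharp
using System;
using System.Threading.Tasks;

class ManualTaskStartExample
{
    static void Main()
    {
        // Creating the task does not schedule it; Start does.
        var task = new Task<string>(() => "Hello from a manually started task");
        task.Start();                   // queues the task on the default task scheduler
        Console.WriteLine(task.Result); // blocks until the task completes
    }
}
```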

All tasks returned from public methods should be started, completed, canceled or faulted. More information about task statuses in a later post.

Calling the Result property of a Task that is not completed, like we did in this example, will result in the current thread sleeping until the task is completed. Internally this results in a call to Task.Wait() before the result is returned. Task.Wait() is as bad as Thread.Sleep, since it blocks the current thread and can cause context switching (see Part 1). How do we avoid this? I’m glad you asked :). The answer is continuations.

Continuations

When we have a reference to a Task we can schedule work to happen when the task is done. To schedule a continuation for a task, we call the ContinueWith method on that task’s reference. Take a look at this example:
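The example screenshot is gone from this copy; based on the description below (a continuation on Task.Delay inside GetStringAsync, and a continuation in Main that writes the result), it might have looked roughly like this:

```csharp
using System;
using System.Threading.Tasks;

class ContinuationExample
{
    // Returns a task that completes with a string after at least five seconds.
    // Task.Delay gives us a task to continue from without blocking any thread.
    static Task<string> GetStringAsync()
    {
        return Task.Delay(5000)
                   .ContinueWith(delayTask => "Hello from a continuation");
    }

    static void Main()
    {
        Task<string> task = GetStringAsync();

        // The lambda parameter (getStringAsyncTask) is a reference to the
        // completed antecedent task.
        task.ContinueWith(getStringAsyncTask =>
            Console.WriteLine(getStringAsyncTask.Result));

        Console.ReadLine(); // keep the process alive while the continuation runs
    }
}
```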

We schedule continuations with Task.ContinueWith.

The lambda parameter (getStringAsyncTask) is a reference to the task that was finished.

We’ve added two continuations to the code. In Main we have scheduled a new task to execute when the result from GetStringAsync is available, writing the returned string from GetStringAsync to the console. The second one is in GetStringAsync, where we’ve added a continuation to Task.Delay. Task.Delay returns a task that is completed after at least x milliseconds (5000 in our case), and it does not cause any unnecessary context switching. Besides demonstration purposes, there are a couple of nice usages for Task.Delay which we will go into later in the series.

A task can have any number of continuations that can execute in parallel. How about this:
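Again the code screenshot is missing; a sketch consistent with the description below (1 + 50 continuations, with a local copy of the loop variable) could be:

```csharp
using System;
using System.Threading.Tasks;

class MultipleContinuationsExample
{
    static Task<string> GetStringAsync() =>
        Task.Delay(5000).ContinueWith(delayTask => "Hello");

    static void Main()
    {
        Task<string> getStringAsyncTask = GetStringAsync();

        // Continuation 1: write the result.
        getStringAsyncTask.ContinueWith(task => Console.WriteLine(task.Result));

        // 50 more continuations, free to run in parallel.
        for (int i = 0; i < 50; i++)
        {
            // Copy i to a local so each closure captures its own value;
            // otherwise every continuation would see the final value of i.
            var currentContinuation = i;
            getStringAsyncTask.ContinueWith(task =>
                Console.WriteLine($"Continuation {currentContinuation} on thread " +
                                  Environment.CurrentManagedThreadId));
        }

        Console.ReadLine();
    }
}
```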

We’ve added 1 + 50 continuations to the task returned by GetStringAsync. The reason for using “var currentContinuation = i;” is that the variable i is used in a closure (more information in a future post about closures). In short, the variable i is referenced in the delegate (the code block we want to execute), and the delegate will use the value the variable has when it executes rather than the value it had in our iteration. In this example all continuations would write “Continuation 50”, since the value of i would be 50 before the first continuation executed. The same problem can occur with all delegates / closures, not just the ones executed on the thread pool.

Here’s an example of how it can look when executing the example:

[Screenshot: multiple continuations executing on different thread pool threads]

As we can see, the continuations are not executed on the same thread pool thread.

We can hint to the task scheduler (more about the task scheduler later in the series) that we would like a continuation to run on the same thread as its antecedent task. The antecedent task is the task we added the continuation to, in this case getStringAsyncTask. So let’s try it out and have the continuations execute synchronously on the antecedent task’s thread pool thread, by passing TaskContinuationOptions.ExecuteSynchronously to ContinueWith.
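With the screenshots gone, here is a sketch of the adjusted loop, assuming the same names as before:

```csharp
using System;
using System.Threading.Tasks;

class ExecuteSynchronouslyExample
{
    static void Main()
    {
        Task<string> getStringAsyncTask =
            Task.Delay(1000).ContinueWith(delayTask => "Hello");

        for (int i = 0; i < 50; i++)
        {
            var currentContinuation = i;
            getStringAsyncTask.ContinueWith(task =>
                Console.WriteLine($"Continuation {currentContinuation} on thread " +
                                  Environment.CurrentManagedThreadId),
                // Hint: run on the antecedent task's thread instead of
                // queuing a new work item for each continuation.
                TaskContinuationOptions.ExecuteSynchronously);
        }

        Console.ReadLine();
    }
}
```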

And as we can see below, all continuations are now executed on the same thread pool thread as their antecedent task.

[Screenshot: multiple continuations executing on the same thread pool thread]

This can be useful if the continuations are rather small.

Continuations are not scheduled for execution until their antecedent task is completed, which means the work item queues (see post 2 for more information) are not filled with work items until the continuations are really required to execute.

We can also hint the task scheduler that we would rather have a continuation scheduled on the global work item queue (see post 2 for more information), like this:
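The code is missing here as well; the hint in question is TaskContinuationOptions.PreferFairness, so the continuation might be scheduled like this:

```csharp
using System;
using System.Threading.Tasks;

class PreferFairnessContinuationExample
{
    static void Main()
    {
        Task<string> getStringAsyncTask =
            Task.Delay(1000).ContinueWith(delayTask => "Hello");

        // PreferFairness asks the scheduler to place the continuation on the
        // global work item queue rather than a thread's local queue.
        getStringAsyncTask.ContinueWith(task => Console.WriteLine(task.Result),
            TaskContinuationOptions.PreferFairness);

        Console.ReadLine();
    }
}
```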

And the result is:

[Screenshot: multiple continuations scheduled with the fairness hint]

All tasks scheduled from a non thread pool thread will be placed in the global work item queue. All tasks scheduled from a thread pool thread will by default be placed in that thread’s local work item queue (see post 2 for more information). Just like we hinted the task scheduler that we wanted to have our continuations on the global work item queue, we can schedule any task to the global queue:
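A sketch of scheduling a task straight to the global queue, using the TaskCreationOptions counterpart of the fairness hint:

```csharp
using System;
using System.Threading.Tasks;

class GlobalQueueExample
{
    public static void Main()
    {
        // Task.Run offers no scheduling hints, so we use Task.Factory.StartNew.
        Task<string> task = Task.Factory.StartNew(
            () => "Hello from the global work item queue",
            TaskCreationOptions.PreferFairness);

        Console.WriteLine(task.Result);
    }
}
```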

Notice how we use Task.Factory.StartNew instead of Task.Run. Task.Run was introduced in .NET 4.5 and is a little easier to use, but it lacks some of the more advanced features that you may want. We can also hint the task scheduler that a task is long running, which currently makes it spawn a new thread. Spawning threads should of course be avoided, but if you really need a long running method, we use the same pattern and library.
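A sketch of the long running hint (the loop body is a placeholder of my own):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class LongRunningExample
{
    static void Main()
    {
        // LongRunning tells the scheduler this task will occupy its thread for
        // a long time; the current implementation spawns a dedicated thread
        // instead of tying up a thread pool thread.
        Task pump = Task.Factory.StartNew(() =>
        {
            while (true)
            {
                // ... poll a queue, pump messages, etc. (placeholder work)
                Thread.Sleep(1000);
            }
        }, TaskCreationOptions.LongRunning);

        Console.ReadLine();
    }
}
```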

Async/Await

Now, this is all well and looks pretty nice, but it makes you think about coding in a very different way from traditional synchronous code. I think it’s crucial for us to know what’s going on behind the scenes if we are to write good parallel / asynchronous code, and now that we have some basic knowledge about it, we can start using the much easier compiler support for async / await added in C# 5.

async is a keyword we decorate our methods with to tell the compiler we want the await support in the method.

await is the keyword to tell the compiler to create a continuation for all code after the await.

Now, let’s rewrite the previous example code using async/await!
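The rewritten code is missing from this copy; an async/await version of the earlier GetStringAsync example, matching the MainAsync pattern described below, might look like this:

```csharp
using System;
using System.Threading.Tasks;

class AsyncAwaitExample
{
    public static async Task<string> GetStringAsync()
    {
        // The compiler turns everything after the await into a continuation.
        await Task.Delay(5000);
        return "Hello from async/await";
    }

    static async Task MainAsync()
    {
        string result = await GetStringAsync();
        Console.WriteLine(result);
    }

    static void Main()
    {
        // Main cannot be async, so we wait here - and only here.
        MainAsync().Wait();
    }
}
```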

I don’t know what you think, but this code is much more like the one we’re used to writing and the C# compiler generates the code to create continuations for us.

The Main method cannot be decorated with async, so I’ve created a MainAsync.

In the Main method we use MainAsync().Wait(). We are always able to wait for tasks, but this should be avoided, except maybe in a program’s Main method. Of course, if you start porting your code to TPL/TAP you will in some cases have to use Wait() or .Result until all your code is asynchronous.

Working with I/O tasks

As we spoke about in post 2, we use the same pattern for working with I/O tasks as for CPU/memory tasks. As you remember, we want to avoid wasting resources waiting for other tasks and I/O, and continuations are great for that. We’ve looked at creating and continuing CPU/memory tasks.

A lot of the I/O classes in the .NET Framework have support for TPL.

Let’s have a look at a couple of examples where we use I/O tasks!

Async SqlConnection/SqlCommand/SqlDataReader

Consider the following helper class to get data from a database to a DataTable. Notice that I’ve used CancellationTokens. They are a way to cooperatively cancel asynchronous operations. We will cover them in a future post.
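The helper class itself is missing from this copy; a sketch of what it might look like, using the asynchronous ADO.NET methods (the class name and connection details are my assumptions, GetDataTableAsync is the name used in the text):

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

public static class DatabaseHelper
{
    // Opens a connection, executes the query and loads the rows into a
    // DataTable, without blocking a thread while waiting on the database.
    public static async Task<DataTable> GetDataTableAsync(
        string connectionString, string query, CancellationToken cancellationToken)
    {
        // using-blocks work as expected, as long as we await every task that
        // needs the connection/command/reader before the block ends.
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(query, connection))
        {
            await connection.OpenAsync(cancellationToken);
            using (var reader = await command.ExecuteReaderAsync(cancellationToken))
            {
                var table = new DataTable();
                table.Load(reader); // reading the buffered rows is synchronous
                return table;
            }
        }
    }
}
```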

As you can see – when we use async / await, we can use using-statements, which makes life much easier for us! Just make sure that you await all asynchronous tasks that need the object being disposed before the end of the using-block. We are also able to use try/catch/finally in asynchronous methods. Exceptions in TPL are covered in more detail in a future post.

To get a DataTable from a database we simply call GetDataTableAsync:
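The call site is missing here; assuming a GetDataTableAsync(connectionString, query, cancellationToken) helper as described above the text, it might look like this (connection string and query are placeholders):

```csharp
static async Task PrintCustomerCountAsync()
{
    var cancellation = new CancellationTokenSource(TimeSpan.FromSeconds(30));

    DataTable customers = await GetDataTableAsync(
        "Server=.;Database=Shop;Integrated Security=true", // placeholder values
        "SELECT * FROM Customers",
        cancellation.Token);

    Console.WriteLine($"Loaded {customers.Rows.Count} rows");
}
```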

Async support in Entity framework
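This section's code is missing from this copy; Entity Framework 6 ships async query operators such as ToListAsync in System.Data.Entity, so a sketch (ShopContext and Customer are hypothetical types) might be:

```csharp
using System.Collections.Generic;
using System.Data.Entity; // EF6 async extension methods live here
using System.Threading.Tasks;

public class CustomerRepository
{
    // Queries the database asynchronously; no thread is blocked while
    // the database is doing its work.
    public async Task<List<Customer>> GetCustomersAsync()
    {
        using (var context = new ShopContext()) // hypothetical DbContext
        {
            return await context.Customers.ToListAsync();
        }
    }
}
```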

Async File I/O
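This section's code is also missing; a sketch of asynchronous file reading with FileStream and StreamReader (the helper name is my own):

```csharp
using System.IO;
using System.Threading.Tasks;

public static class FileHelper
{
    // Reads a whole text file without blocking a thread on disk I/O.
    public static async Task<string> ReadAllTextAsync(string path)
    {
        // useAsync: true makes the FileStream use overlapped (asynchronous) I/O.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize: 4096,
                                           useAsync: true))
        using (var reader = new StreamReader(stream))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```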

ASP.NET MVC And TPL

When we develop web applications or Web APIs with ASP.NET MVC 4 we have the option to make all public controller methods asynchronous – and we really should. This is an example of the default HomeController, but async. Since ASP.NET is implemented using TPL and the thread pool, we are from this very moment obligated to make all ASP.NET MVC 4+ controller methods asynchronous. There is no turning back :).
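The controller code is missing here; an async version of the default HomeController might look roughly like this (the awaited work is a placeholder):

```csharp
using System.Threading.Tasks;
using System.Web.Mvc;

public class HomeController : Controller
{
    // Returning Task<ActionResult> lets ASP.NET release the request thread
    // back to the pool while we await asynchronous work.
    public async Task<ActionResult> Index()
    {
        await Task.Delay(1); // placeholder for real asynchronous work
        return View();
    }
}
```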

And to add a little logic to it from the EF example:
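A sketch of a controller that loads its model asynchronously with EF (ShopContext and Customer are hypothetical, as in the EF sketch earlier):

```csharp
using System.Data.Entity;
using System.Threading.Tasks;
using System.Web.Mvc;

public class CustomersController : Controller
{
    public async Task<ActionResult> Index()
    {
        using (var context = new ShopContext()) // hypothetical DbContext
        {
            // Load the model in the controller, asynchronously,
            // so the view never has to call back into data access.
            var customers = await context.Customers.ToListAsync();
            return View(customers);
        }
    }
}
```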

In addition to the single responsibility principle (SRP) – this is another good reason to avoid calling methods from your views.

WebApi methods are very similar:
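A corresponding Web API sketch, again with hypothetical ShopContext and Customer types:

```csharp
using System.Collections.Generic;
using System.Data.Entity;
using System.Threading.Tasks;
using System.Web.Http;

public class CustomersApiController : ApiController
{
    // GET api/customersapi
    public async Task<IEnumerable<Customer>> Get()
    {
        using (var context = new ShopContext()) // hypothetical DbContext
        {
            return await context.Customers.ToListAsync();
        }
    }
}
```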

I think you get the hang of it!

In the next post, we’ll dig deeper into TPL and we will start looking at data consistency / locking!!

If you have any questions, you like or hate the things I write about, please leave a comment.

Cheers
Erik Bergman

 

References
  1. Task-based Asynchronous Pattern
