Examining Parallel LINQ (PLINQ) and the Task Parallel Library (TPL) in .NET 4.0 – Part 1

I hadn’t had a chance to play with PLINQ or the TPL prior to .NET 4.0; I started examining them in the Release Candidate. Recently, I’ve spent some time playing with both and evaluating where they make sense to use and where they can be overkill. I will summarize my findings in a few posts. This is the first part.

I started by writing a simple program to analyze both PLINQ and TPL’s Parallel.For. That seemed like a good way to get a feel for the performance.

The first version of the program is:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Diagnostics;
    using System.Threading.Tasks;

    namespace ParallelTests
    {
        public class Program
        {
            private const int COLLECTION_ITEM_COUNT = 1024;
            private const int TPL_ITERATION_COUNT = 1000;

            public static void Main(string[] args)
            {
                // Prepare data collection
                List<KeyValuePair<int, int>> randomData = PrepareData();
                Console.WriteLine("Running SEQ test...");
                RunSequentialTest(randomData);
                Console.WriteLine("Running PAR test...");
                RunParallelTest(randomData);
                Console.WriteLine("Done.");
            }

            // Builds a list of (index, pseudo-random number) pairs.
            // Seeding Random with the index makes the data repeatable across runs.
            private static List<KeyValuePair<int, int>> PrepareData()
            {
                List<KeyValuePair<int, int>> randomData = new List<KeyValuePair<int, int>>();
                for (int i = 0; i < COLLECTION_ITEM_COUNT; i++)
                {
                    randomData.Add(new KeyValuePair<int, int>(i, new Random(i).Next()));
                }
                return randomData;
            }

            // The unit of work for both For tests: builds the LINQ query
            // TPL_ITERATION_COUNT times. The parameter i is not used.
            // Note: LINQ queries are deferred, so the query is built here but never enumerated.
            private static void DoWork(long i, List<KeyValuePair<int, int>> randomData)
            {
                for (int j = 0; j < TPL_ITERATION_COUNT; j++)
                {
                    var data = from randomNumber in randomData
                               select new
                               {
                                   randomNumber.Key,
                                   randomNumber.Value
                               };
                }
            }

            #region Sequential
            // Times the sequential LINQ query, then the sequential for loop.
            private static void RunSequentialTest(List<KeyValuePair<int, int>> randomData)
            {
                Stopwatch sw = new Stopwatch();
                sw.Start();

                RunLINQSequentialTest(randomData);

                sw.Stop();
                Console.WriteLine("PLINQ/SEQ: {0} msecs", sw.ElapsedMilliseconds);

                sw.Restart();

                RunSequentialForTest(randomData);

                sw.Stop();
                Console.WriteLine("TPL/SEQ: {0} msecs", sw.ElapsedMilliseconds);
            }

            // Plain LINQ to Objects query (built, not enumerated).
            private static void RunLINQSequentialTest(List<KeyValuePair<int, int>> randomData)
            {
                var data = from randomNumber in randomData
                           select new
                           {
                               randomNumber.Key,
                               randomNumber.Value
                           };
            }

            // Calls DoWork once per collection item on the calling thread.
            private static void RunSequentialForTest(List<KeyValuePair<int, int>> randomData)
            {
                for (int i = 0; i < COLLECTION_ITEM_COUNT; i++)
                {
                    DoWork(i, randomData);
                }
            }
            #endregion

            #region Parallel
            // Times the PLINQ query, then Parallel.For.
            private static void RunParallelTest(List<KeyValuePair<int, int>> randomData)
            {
                Stopwatch sw = new Stopwatch();
                sw.Start();

                RunLINQAsParallelTest(randomData);

                sw.Stop();
                Console.WriteLine("PLINQ/PAR: {0} msecs", sw.ElapsedMilliseconds);

                sw.Restart();

                RunParallelForTest(randomData);

                sw.Stop();
                Console.WriteLine("TPL/PAR: {0} msecs", sw.ElapsedMilliseconds);
            }

            // Same query as the sequential version, opted into PLINQ via AsParallel()
            // (built, not enumerated).
            private static void RunLINQAsParallelTest(List<KeyValuePair<int, int>> randomData)
            {
                var data = from randomNumber in randomData.AsParallel()
                           select new
                           {
                               randomNumber.Key,
                               randomNumber.Value
                           };
            }

            // Calls DoWork once per collection item, letting the TPL spread
            // the iterations across worker threads.
            private static void RunParallelForTest(List<KeyValuePair<int, int>> randomData)
            {
                Parallel.For(0, COLLECTION_ITEM_COUNT, (i) => DoWork(i, randomData));
            }
            #endregion
        }
    }

Running this simple version resulted in the following output:

[Image: console output of the test run]
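One thing worth keeping in mind when reading these numbers: LINQ and PLINQ queries use deferred execution, so a query that is assigned to a variable but never enumerated does not run its projection. The following is a minimal sketch, not part of the program above, of how the two query tests could be made to actually execute by materializing the results with ToList(). The class name DeferredExecutionSketch and the helpers BuildData, RunSequential, and RunParallel are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Hypothetical helper: builds the same kind of list as PrepareData() in the post.
    public static List<KeyValuePair<int, int>> BuildData(int count)
    {
        var data = new List<KeyValuePair<int, int>>(count);
        for (int i = 0; i < count; i++)
        {
            data.Add(new KeyValuePair<int, int>(i, new Random(i).Next()));
        }
        return data;
    }

    // Builds the sequential query, then forces it to run by copying
    // the projected items into a list. Returns the number of items produced.
    public static int RunSequential(List<KeyValuePair<int, int>> source)
    {
        var query = from item in source
                    select new { item.Key, item.Value };
        return query.ToList().Count;
    }

    // Same query through PLINQ; ToList() forces the parallel query to execute.
    public static int RunParallel(List<KeyValuePair<int, int>> source)
    {
        var query = from item in source.AsParallel()
                    select new { item.Key, item.Value };
        return query.ToList().Count;
    }

    public static void Main()
    {
        var data = BuildData(1024);

        var sw = Stopwatch.StartNew();
        int sequentialCount = RunSequential(data);
        sw.Stop();
        Console.WriteLine("LINQ enumerated: {0} items, {1} msecs",
            sequentialCount, sw.ElapsedMilliseconds);

        sw.Restart();
        int parallelCount = RunParallel(data);
        sw.Stop();
        Console.WriteLine("PLINQ enumerated: {0} items, {1} msecs",
            parallelCount, sw.ElapsedMilliseconds);
    }
}
```

With a projection this cheap and only 1024 items, the cost of partitioning and merging in PLINQ may well outweigh any gain, so the sketch is meant to show where the work happens rather than to predict which variant is faster.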

Running the program under the new Concurrency Profiler in Visual Studio 2010 can give us a better idea of how the gains were achieved.

To do that, perform the following:

1) If running Visual Studio 2010 on Windows Vista or Windows 7, you will need to run Visual Studio elevated to be able to run the profiler.

2) Choose the “Launch Performance Wizard…” option from the Analyze menu.

[Image: the “Launch Performance Wizard…” option in the Analyze menu]

3) Select the Concurrency radio button and check the “Visualize the behavior of a multithreaded application” checkbox.

[Image: the Performance Wizard with the Concurrency option selected]

Notice how the profiler has changed from the one in VS2008. It now includes memory allocation sampling as well as a concurrency profiler and visualizer.

4) In the next two steps of the wizard, select your application for profiling and check the box to have the profiler launched after the wizard finishes.

5) After the application runs, the profiling results will be analyzed, and a report will be generated:

[Image: the generated profiling report]

6) Click on the “CPU Utilization” link/image:

[Image: the CPU Utilization view]

7) Notice that most of the utilization by our program throughout the test was on the first CPU core. The second core was hardly, if ever, utilized by our program.

8) Click on the Threads tab to see the thread utilization by our program:

[Image: the Threads view]

This is an amazing amount of information, presented in an easy-to-understand graph along with some help text. This just rocks!

9) Finally, click on the Cores tab:

[Image: the Cores view]

This tab provides great detail about the program’s execution on each of the two cores on my laptop.
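As a rough programmatic complement to the Threads view, the sketch below counts how many Parallel.For iterations each managed thread ends up running. It is an illustration only: the class name ThreadUsageSketch and the method CountIterationsPerThread are hypothetical, and managed thread IDs say nothing about which core a thread was scheduled on, which is exactly the information the profiler adds.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadUsageSketch
{
    // Runs the given number of Parallel.For iterations and records,
    // per managed thread ID, how many iterations that thread executed.
    public static ConcurrentDictionary<int, int> CountIterationsPerThread(int iterations)
    {
        var perThread = new ConcurrentDictionary<int, int>();

        Parallel.For(0, iterations, i =>
        {
            int threadId = Thread.CurrentThread.ManagedThreadId;
            // Insert 1 for a thread seen for the first time, otherwise increment its count.
            perThread.AddOrUpdate(threadId, 1, (id, count) => count + 1);
        });

        return perThread;
    }

    public static void Main()
    {
        Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);

        foreach (var entry in CountIterationsPerThread(1024))
        {
            Console.WriteLine("Thread {0}: {1} iterations", entry.Key, entry.Value);
        }
    }
}
```

The number of threads and the split between them will vary from run to run and from machine to machine, since the TPL scheduler decides how to partition the loop.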

I remember attending one of the early demos of the Concurrency Visualizer tool back in 2008, when I was still at Microsoft. At the time it ran on top of Visual Studio 2008 (on Vista and Windows 7), and I remember thinking “this is really cool”. To now see it seamlessly integrated into VS2010 and made so simple to use is just thrilling.

Visual Studio 2010 is probably the best Visual Studio release ever. Having worked in DevDiv, I can appreciate the amount of effort, talent, and dedication that was put into this release. And the results make it well worth it.

In the next post, I will go over more details discussing the program and the results as well as examining other variations of the program and how they affect the results.

One final note: If you haven’t had a chance to test-drive VS2010, please do so now. You will not regret it. You can download a trial version from here.

Published 04-13-2010 6:05 AM by Mohammad Jalloul