Introduction
Most .NET developers today are familiar with LINQ, the technology that brought functional programming ideas into the object-oriented environment. Parallel LINQ, or ‘PLINQ’, takes LINQ to the next level by adding intuitive parallel capabilities onto an already powerful framework.
PLINQ is a query execution engine that accepts any LINQ-to-Objects or LINQ-to-XML query and automatically uses multiple processors or cores for execution when they are available. The change in programming model is tiny, meaning you don’t need to be a concurrency guru to use it.
Using PLINQ is almost exactly like using LINQ-to-Objects and LINQ-to-XML. You can use any of the operators available through C# 3.0 syntax or the System.Linq.Enumerable class, including OrderBy, Join, Select, Where, and so on.
LINQ-to-SQL and LINQ-to-Entities queries are still executed by their respective databases and query providers, so PLINQ does not offer a way to parallelize those queries. If you wish to process the results of those queries in memory, including joining the output of several heterogeneous queries, then PLINQ can be quite useful.
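As an illustration of that last point, here is a minimal sketch of joining two result sets in memory with PLINQ. The data and the InMemoryJoinDemo class are hypothetical stand-ins for rows that have already been fetched from two different providers; they are not part of the original samples. Note that both sides of the join are converted with AsParallel().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InMemoryJoinDemo
{
    // Hypothetical rows: customer ID -> first name (already loaded into memory).
    static readonly KeyValuePair<int, string>[] Customers =
    {
        new KeyValuePair<int, string>(1, "Sandeep"),
        new KeyValuePair<int, string>(2, "Dharmik")
    };

    // Hypothetical rows: customer ID -> order amount (already loaded into memory).
    static readonly KeyValuePair<int, int>[] Orders =
    {
        new KeyValuePair<int, int>(1, 250),
        new KeyValuePair<int, int>(2, 75),
        new KeyValuePair<int, int>(1, 40)
    };

    public static string[] JoinInParallel()
    {
        // Both data sources are ParallelQuery instances, so the join runs under PLINQ.
        var joined = from c in Customers.AsParallel()
                     join o in Orders.AsParallel() on c.Key equals o.Key
                     select c.Value + ":" + o.Value;

        // Sort at the end so the output does not depend on partition timing.
        return joined.OrderBy(s => s, StringComparer.Ordinal).ToArray();
    }

    public static void Main()
    {
        foreach (string row in JoinInParallel())
        {
            Console.WriteLine(row);
        }
    }
}
```

The final OrderBy is only there to make the printed result repeatable; without it the joined rows may come back in any order.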
Using the AsParallel Method
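The AsParallel method is the doorway to PLINQ. It converts a data sequence into a ParallelQuery. Because the source of the query is then a ParallelQuery, the query operators resolve to their PLINQ implementations in System.Linq.ParallelEnumerable rather than to the sequential ones in System.Linq.Enumerable, so the switch to PLINQ execution happens automatically. You are likely to use the AsParallel method every time you use PLINQ.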
Sequential LINQ execution:
// Customer is assumed to be a simple class with ID, FirstName and LastName properties.
var customers = new[] {
    new Customer { ID = 1, FirstName = "Sandeep", LastName = "Ramani" },
    new Customer { ID = 2, FirstName = "Dharmik", LastName = "Chotaliya" },
    new Customer { ID = 3, FirstName = "Nisar", LastName = "Kalia" },
    new Customer { ID = 4, FirstName = "Ravi", LastName = "Mapara" },
    new Customer { ID = 5, FirstName = "Hardik", LastName = "Mistry" },
    new Customer { ID = 6, FirstName = "Sandy", LastName = "Ramani" }
};
var results = from c in customers
              where c.FirstName.StartsWith("San")
              select c;
Parallel LINQ execution:
// Same data as the sequential sample, so the two result sets can be compared.
var customers = new[] {
    new Customer { ID = 1, FirstName = "Sandeep", LastName = "Ramani" },
    new Customer { ID = 2, FirstName = "Dharmik", LastName = "Chotaliya" },
    new Customer { ID = 3, FirstName = "Nisar", LastName = "Kalia" },
    new Customer { ID = 4, FirstName = "Ravi", LastName = "Mapara" },
    new Customer { ID = 5, FirstName = "Hardik", LastName = "Mistry" },
    new Customer { ID = 6, FirstName = "Sandy", LastName = "Ramani" }
};
var results = from c in customers.AsParallel()
              where c.FirstName.StartsWith("San")
              select c;
With the simple addition of the AsParallel() extension method, the .NET runtime will automatically parallelize the operation across multiple cores. In fact, PLINQ will take full responsibility for partitioning your data into multiple chunks that can be processed in parallel.
PLINQ partitioning is outside the scope of this article, but if you’re curious about its inner workings, a blog post from Microsoft’s own Parallel Programming team does a great job of explaining the details.
When you run the two sample queries above, you should get the same results, but possibly in a different order. The first sample is an example of sequential LINQ execution, while the second is an example of Parallel LINQ execution.
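To try the comparison yourself, the two queries can be wrapped in a small console program. This is only a sketch: the Customer class below is an assumed definition (the samples above never show it), and the PlinqOrderDemo class and its method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the Customer type used in the samples.
public class Customer
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class PlinqOrderDemo
{
    public static Customer[] SampleCustomers()
    {
        return new[] {
            new Customer { ID = 1, FirstName = "Sandeep", LastName = "Ramani" },
            new Customer { ID = 2, FirstName = "Dharmik", LastName = "Chotaliya" },
            new Customer { ID = 3, FirstName = "Nisar", LastName = "Kalia" },
            new Customer { ID = 4, FirstName = "Ravi", LastName = "Mapara" },
            new Customer { ID = 5, FirstName = "Hardik", LastName = "Mistry" },
            new Customer { ID = 6, FirstName = "Sandy", LastName = "Ramani" }
        };
    }

    // Sequential LINQ: results always follow the order of the source array.
    public static List<int> SequentialIds(Customer[] customers)
    {
        return (from c in customers
                where c.FirstName.StartsWith("San")
                select c.ID).ToList();
    }

    // PLINQ: same matches, but the order is not guaranteed.
    public static List<int> ParallelIds(Customer[] customers)
    {
        return (from c in customers.AsParallel()
                where c.FirstName.StartsWith("San")
                select c.ID).ToList();
    }

    public static void Main()
    {
        Customer[] customers = SampleCustomers();
        Console.WriteLine("Sequential: " + string.Join(", ", SequentialIds(customers)));
        Console.WriteLine("Parallel:   " + string.Join(", ", ParallelIds(customers)));
    }
}
```

With a collection this small the two lines will often look identical; the difference in ordering tends to show up only with larger inputs.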
Limitations
1. PLINQ only works against local collections. This means that if you’re using LINQ providers over remote data, such as LINQ to SQL or the ADO.NET Entity Framework, then you’re out of luck for this version.
2. Since PLINQ chunks the collection into multiple partitions and processes them in parallel, the results of a PLINQ query may not be in the same order as the results of the same query executed serially. However, you can work around this by introducing the AsOrdered() method into your query, which makes PLINQ preserve the order of the source sequence in the results. Keep in mind that AsOrdered() does incur a performance hit for large collections, which can erase many of the performance gains of parallelizing your query in the first place.
Preserving the Order of PLINQ Query Results Using the AsOrdered Method
var results = from c in customers.AsParallel().AsOrdered()
              where c.FirstName.StartsWith("San")
              select c;
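The effect of AsOrdered is easier to observe on a larger numeric sequence than on a handful of customers. The following sketch (the AsOrderedDemo class and OrderedEvens method are illustrative names, not part of the article's samples) filters a range in parallel with AsOrdered and checks that the result matches the sequential result element by element.

```csharp
using System;
using System.Linq;

public static class AsOrderedDemo
{
    // Filters 0..count-1 in parallel while preserving the source order.
    public static int[] OrderedEvens(int count)
    {
        return Enumerable.Range(0, count)
            .AsParallel()
            .AsOrdered()
            .Where(i => i % 2 == 0)
            .ToArray();
    }

    public static void Main()
    {
        int[] parallel = OrderedEvens(1000);
        int[] sequential = Enumerable.Range(0, 1000).Where(i => i % 2 == 0).ToArray();

        // SequenceEqual compares element by element, so order matters here.
        Console.WriteLine(parallel.SequenceEqual(sequential)); // prints True
    }
}
```

Removing the AsOrdered() call keeps the same set of values but no longer guarantees that the comparison succeeds.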
Controlling Parallelism
1. Forcing Parallel Execution
In some cases, PLINQ may decide that your query is better dealt with sequentially. You can control this by using the WithExecutionMode extension method, which is applied to the ParallelQuery type. The WithExecutionMode method takes a value from the ParallelExecutionMode enumeration. There are two such values: Default (let PLINQ decide what to do) and ForceParallelism (use PLINQ even if the overhead of parallel execution is likely to outweigh the benefits).
Here is sample code that demonstrates the use of this method:
var results = from c in customers.AsParallel()
                  .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
              where c.FirstName.StartsWith("San")
              select c;
2. Limiting the Degree of Parallelism
You can request that PLINQ limit the number of partitions that are processed simultaneously by using the WithDegreeOfParallelism extension method, which operates on the ParallelQuery type. This method takes an int argument that states the maximum number of partitions that should be processed at once; this is known as the degree of parallelism. Setting the degree of parallelism doesn’t force PLINQ to use that many. It just sets an upper limit. PLINQ may decide to use fewer than you have specified or, if you have not used the WithExecutionMode method, may decide to execute the query sequentially.
Here is sample code that demonstrates the use of this method:
var results = from c in customers.AsParallel().WithDegreeOfParallelism(2)
              where c.FirstName.StartsWith("San")
              select c;
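The two settings described in this section can also be chained on the same query. The sketch below is an illustration only: the ParallelOptionsDemo class, the CountMatches method and the plain string array are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Linq;

public static class ParallelOptionsDemo
{
    // Counts names starting with "San", forcing parallel execution
    // and capping the degree of parallelism at maxDegree.
    public static int CountMatches(string[] names, int maxDegree)
    {
        return names.AsParallel()
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .WithDegreeOfParallelism(maxDegree)
            .Count(n => n.StartsWith("San"));
    }

    public static void Main()
    {
        string[] names = { "Sandeep", "Dharmik", "Nisar", "Ravi", "Hardik", "Sandy" };
        Console.WriteLine(CountMatches(names, 2)); // prints 2
    }
}
```

Both methods return a ParallelQuery, which is why they can be chained directly after AsParallel() and before the query operators.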
3. Generating and Using a Parallel Sequence
IEnumerable<int> evens = ParallelEnumerable.Range(0, 50000)
    .Where(i => i % 2 == 0)
    .Select(i => i);
The above code uses the Range method to create a sequence of 50,000 integers starting at zero. The first argument to the method is the first value in the sequence; the second is the number of values you require. Because ParallelEnumerable.Range already returns a ParallelQuery<int>, no cast or AsParallel call is needed: the Where and Select calls that follow bind to the PLINQ operators. Using Enumerable.Range instead would produce an ordinary IEnumerable<int>, and the query would then execute sequentially unless AsParallel were added.
4. Generating and Using a Repeating Sequence
int sum = ParallelEnumerable.Repeat(1, 50000)
.Select(i => i)
.Sum();
The static Repeat method takes an object and a count and creates a sequence where the object is repeated the specified number of times.
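As a quick check of what Repeat produces, the following sketch sums a repeated sequence of ones; the total equals the count passed in. The RepeatDemo class and SumOfOnes method are illustrative names only.

```csharp
using System;
using System.Linq;

public static class RepeatDemo
{
    // Builds a parallel sequence containing the value 1 'count' times and sums it.
    public static int SumOfOnes(int count)
    {
        return ParallelEnumerable.Repeat(1, count).Sum();
    }

    public static void Main()
    {
        Console.WriteLine(SumOfOnes(50000)); // prints 50000
    }
}
```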