Language Integrated Query
Influenced by: SQL, Haskell
Language Integrated Query (LINQ, pronounced "link") is a Microsoft .NET Framework component that adds native data querying capabilities to .NET languages, although ports exist for Java[1], PHP and JavaScript.
LINQ defines a set of method names (called standard query operators, or standard sequence operators), along with translation rules from so-called query expressions to expressions using these method names, lambda expressions and anonymous types. These can, for example, be used to project and filter data in arrays, enumerable classes, XML (LINQ to XML), relational databases, and third party data sources. Other uses, which utilize query expressions as a general framework for readably composing arbitrary computations, include the construction of event handlers[2] or monadic parsers.[3]
Many of the concepts that LINQ has introduced were originally tested in Microsoft's Cω research project. LINQ was released as a part of .NET Framework 3.5 on November 19, 2007.
Figure: Architecture of LINQ in the .NET Framework
Standard Query Operators
The descriptions below assume that the operators are applied to collections.
The set of query operators defined by LINQ is exposed to the user as the Standard Query Operator API. The query operators supported by the API are:[4]
- Select
The Select operator performs a projection on the collection to select interesting aspects of the elements. The user supplies an arbitrary function, as a delegate or lambda expression, which projects the data members.
- Where
The Where operator allows the definition of a set of predicate rules that are evaluated for each object in the collection, while objects that do not match the rule are filtered away. The predicate is supplied to the operator as a delegate.
- SelectMany
For a user-provided mapping from collection elements to collections, semantically two steps are performed. First, every element is mapped to its corresponding collection. Second, the result of the first step is flattened by one level. Note: Select and Where are both implementable in terms of SelectMany, as long as singleton and empty collections are available. The translation rules mentioned above still make it mandatory for a LINQ provider to provide the other two operators.
- Sum / Min / Max / Average
These operators optionally take a lambda that retrieves a certain numeric value from each element in the collection and uses it to find the sum, minimum, maximum or average values of all the elements in the collection, respectively. Overloaded versions take no lambda and act as if the identity is given as the lambda.
- Aggregate
A generalized Sum / Min / Max. This operator takes a lambda that specifies how two values are combined to form an intermediate or the final result. Optionally, a starting value can be supplied, enabling the result type of the aggregation to be arbitrary. Furthermore, a finalization function, taking the aggregation result to yet another value, can be supplied.
- Join / GroupJoin
- The Join operator performs an inner join on two collections, based on matching keys for objects in each collection. It takes two functions as delegates, one for each collection, that it executes on each object in the collection to extract the key from the object. It also takes another delegate via which the user specifies which data elements, from the two matched elements, should be used to create the resultant object. The GroupJoin operator performs a group join. Like the Select operator, the results of a join are instantiations of a different class, with all the data members of both the types of the source objects, or a subset of them.
- Take / TakeWhile
- The Take operator selects the first n objects from a collection, while the TakeWhile operator, which takes a predicate, selects objects from the start of the collection for as long as they match the predicate.
- Skip / SkipWhile
- The Skip and SkipWhile operators are complements of Take and TakeWhile: Skip skips the first n objects of a collection, and SkipWhile skips leading objects for as long as they match a predicate. Both return the remaining objects.
- OfType
- The OfType operator is used to select the elements of a certain type.
- Concat
- The Concat operator concatenates two collections.
- OrderBy / ThenBy
- The OrderBy operator is used to specify the primary sort ordering of the elements in a collection according to some key. The default ordering is ascending; to reverse the order, the OrderByDescending operator is used. ThenBy and ThenByDescending specify subsequent orderings of the elements. The function to extract the key value from the object is specified by the user as a delegate.
- Reverse
- The Reverse operator reverses a collection.
- GroupBy
- The GroupBy operator takes a delegate that extracts a key value and returns a collection of IGrouping<Key, Values> objects, one for each distinct key value. The IGrouping objects can then be used to enumerate all the objects for a particular key value.
- Distinct
- The Distinct operator removes duplicate elements from a collection. An overload accepts an IEqualityComparer<T> that defines when two elements count as equal.
- Union / Intersect / Except
- These operators are used to perform a union, intersection and difference operation on two sequences, respectively.
- SequenceEqual
- The SequenceEqual operator determines whether all elements in two collections are equal and in the same order.
- First / FirstOrDefault / Last / LastOrDefault
- These operators optionally take a predicate. The First operator returns the first element for which the predicate yields true, or throws an exception if nothing matches. The FirstOrDefault operator is like the First operator except that it returns the default value for the element type (usually a null reference) if nothing matches the predicate. The Last operator retrieves the last element to match the predicate, or throws an exception if nothing matches. The LastOrDefault operator returns the default element value if nothing matches.
- Single
- The Single operator takes a predicate and returns the element that matches the predicate. An exception is thrown if no element or more than one element matches the predicate.
- ElementAt
- The ElementAt operator retrieves the element at a given index in the collection.
- Any / All / Contains
- The Any operator checks whether any element in the collection matches the predicate. It does not select the element, but returns true for a match. The All operator checks whether all elements match the predicate. The Contains operator checks whether the collection contains a given value.
- Count
- The Count operator counts the number of elements in the given collection.
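A few of the operators described above can be sketched against an in-memory array. The Employee type, the Staff data and all method names below are hypothetical and exist only for illustration; the operator calls themselves are the standard System.Linq ones. The two ViaSelectMany helpers illustrate the note that Where and Select can be expressed through SelectMany once singleton and empty collections are available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OperatorSketch
{
    // Hypothetical sample type, used only for this illustration.
    public sealed class Employee
    {
        public Employee(string name, string dept, int salary)
        {
            Name = name; Dept = dept; Salary = salary;
        }
        public string Name { get; }
        public string Dept { get; }
        public int Salary { get; }
    }

    public static readonly Employee[] Staff =
    {
        new Employee("Ann", "Dev", 50),
        new Employee("Bob", "Dev", 40),
        new Employee("Cleo", "Ops", 30),
    };

    // Where filters with a predicate, Select projects each remaining element.
    public static List<string> WellPaidNames() =>
        Staff.Where(e => e.Salary >= 40).Select(e => e.Name).ToList();

    // Where via SelectMany: a match maps to a singleton collection,
    // a non-match to an empty one; flattening drops the empties.
    public static IEnumerable<T> WhereViaSelectMany<T>(
        IEnumerable<T> source, Func<T, bool> predicate) =>
        source.SelectMany(x => predicate(x) ? new[] { x } : Enumerable.Empty<T>());

    // Select via SelectMany: every element maps to a singleton collection.
    public static IEnumerable<R> SelectViaSelectMany<T, R>(
        IEnumerable<T> source, Func<T, R> selector) =>
        source.SelectMany(x => new[] { selector(x) });

    // Aggregate with a starting value, a combining function and a
    // finalization function that turns the int total into a string.
    public static string TotalSalaryLabel() =>
        Staff.Aggregate(0, (sum, e) => sum + e.Salary, sum => "total=" + sum);

    // GroupBy yields one IGrouping per distinct key; Count runs per group.
    public static Dictionary<string, int> HeadcountByDept() =>
        Staff.GroupBy(e => e.Dept).ToDictionary(g => g.Key, g => g.Count());
}
```

For example, OperatorSketch.HeadcountByDept() maps "Dev" to 2 and "Ops" to 1 for the sample data.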
The Standard Query Operator API also specifies certain operators that convert a collection into another type:[4]
- AsEnumerable: converts the collection to IEnumerable<T> type.
- AsQueryable: converts the collection to IQueryable<T> type.
- ToArray: converts the collection to an array.
- ToList: converts the collection to IList<T> type.
- ToDictionary: converts the collection to IDictionary<K, T> type, indexed by the key K.
- ToLookup: converts the collection to ILookup<K, T> type, indexed by the key K.
- Cast: converts a non-generic IEnumerable collection to one of IEnumerable<T> by casting each element to type T. Throws an exception for incompatible types.
- OfType: converts a non-generic IEnumerable collection to one of IEnumerable<T>. Only elements of type T are included.
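The difference between Cast and OfType, and between ToDictionary and ToLookup, is easiest to see on a small mixed collection. The following is a minimal sketch; the class and method names are hypothetical, and the sample data is made up for the illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class ConversionSketch
{
    // A non-generic collection that holds elements of mixed types.
    public static ArrayList Mixed() => new ArrayList { 1, "two", 3 };

    // OfType<int> silently drops the elements that are not ints.
    public static List<int> OnlyInts() => Mixed().OfType<int>().ToList();

    // Cast<int> is evaluated lazily, so the InvalidCastException for the
    // string element only surfaces when the sequence is enumerated.
    public static bool CastThrows()
    {
        try
        {
            Mixed().Cast<int>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // ToDictionary needs a unique key per element.
    public static Dictionary<int, string> ByLength(IEnumerable<string> words) =>
        words.ToDictionary(w => w.Length);

    // ToLookup allows several elements to share one key.
    public static ILookup<char, string> ByInitial(IEnumerable<string> words) =>
        words.ToLookup(w => w[0]);
}
```

With the words "apple", "avocado" and "bean", ByInitial groups the first two under the key 'a', whereas ToDictionary would throw if two words produced the same key.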
Language extensions
While LINQ is primarily implemented as a library for .NET Framework 3.5, it also defines optional language extensions that make queries a first-class language construct and provide syntactic sugar for writing queries. These language extensions were initially implemented in C# 3.0, VB 9.0 and Oxygene, with other languages like F# and Nemerle having announced preliminary support. The language extensions include:[5]
- Query syntax: A language is free to choose a query syntax that it will recognize natively. These language keywords must be translated by the compiler to appropriate LINQ method calls.
- Implicitly typed variables: This enhancement allows variables to be declared without specifying their types. The languages C# 3.0 and Oxygene declare them with the var keyword. In VB 9.0, the Dim keyword without type declaration accomplishes the same. Such objects are still strongly typed; for these objects the compiler infers the types of variables via type inference, which allows the results of the queries to be specified and defined without declaring the type of the intermediate variables.
- Anonymous types: Anonymous types allow classes that contain only data-member declarations to be inferred by the compiler. This is useful for the Select and Join operators, whose result types may differ from the types of the original objects. The compiler uses type inference to determine the fields contained in the classes and generates accessors for these fields (in C#, the resulting properties are read-only).
- Object Initializer: Object initializers allow an object to be created and initialized in a single scope, as required for Select and Join operators.
- Lambda expressions: Lambda expressions denote delegates or expression trees, allowing predicates and extraction functions to be written inline with queries.
For example, in the query to select all the objects in a collection with SomeProperty less than 10,
int someValue = 5;
var results = from c in SomeCollection
              where c.SomeProperty < someValue * 2
              select new {c.SomeProperty, c.OtherProperty};
foreach (var result in results)
{
    Console.WriteLine(result);
}
the types of the variables result, c and results are all inferred by the compiler in accordance with the signatures of the methods eventually used. The basis for choosing the methods is the translation of the query into a form free of query expressions:
int someValue = 5;
var results =
    SomeCollection
        .Where(c => c.SomeProperty < someValue * 2)
        .Select(c => new {c.SomeProperty, c.OtherProperty});
foreach (var result in results)
{
    Console.WriteLine(result.ToString());
}
LINQ Providers
The C# 3.0 specification defines a so-called Query Expression Pattern along with translation rules from a LINQ expression to an expression in a subset of C# 3.0 without LINQ expressions. The translation thus defined is actually un-typed, which, in addition to lambda expressions being interpretable as either delegates or expression trees, allows for a great degree of flexibility for libraries wishing to expose parts of their interface as LINQ expression clauses. For example, LINQ to Objects works on IEnumerable<T>s and with delegates, whereas LINQ to SQL makes use of the expression trees.
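Because the translation is purely syntactic, a type does not have to implement IEnumerable<T> to be usable in a query expression; it only needs methods with suitable names and shapes. The following sketch assumes a hypothetical Maybe<T> container (not part of .NET) that exposes Select and Where, which is enough for a from ... where ... select query to compile against it.

```csharp
using System;

// Hypothetical container holding zero or one value. It is not an
// IEnumerable<T>; it only offers methods named Select and Where.
public sealed class Maybe<T>
{
    public bool HasValue { get; }
    public T Value { get; }

    private Maybe(bool hasValue, T value)
    {
        HasValue = hasValue;
        Value = value;
    }

    public static Maybe<T> Some(T value) => new Maybe<T>(true, value);
    public static Maybe<T> None() => new Maybe<T>(false, default(T));

    // Target of the "select" clause.
    public Maybe<R> Select<R>(Func<T, R> selector) =>
        HasValue ? Maybe<R>.Some(selector(Value)) : Maybe<R>.None();

    // Target of the "where" clause.
    public Maybe<T> Where(Func<T, bool> predicate) =>
        HasValue && predicate(Value) ? this : None();
}

public static class PatternSketch
{
    // The compiler rewrites this query into
    // input.Where(n => n > 0).Select(n => "positive " + n).
    public static Maybe<string> Describe(Maybe<int> input) =>
        from n in input
        where n > 0
        select "positive " + n;
}
```

Note that no using directive for System.Linq is needed here: the query binds to the instance methods of Maybe<T>, not to the standard query operators.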
The expression trees are at the core of the LINQ extensibility mechanism, by which LINQ can be adapted for many data sources. The expression trees are handed over to LINQ Providers, which are data source-specific implementations that adapt the LINQ queries to be used with the data source. If they choose so, the LINQ Providers analyze the expression trees contained in a query in order to generate essential pieces needed for the execution of a query. This can be SQL fragments or any other completely different representation of code as further manipulatable data. LINQ comes with LINQ Providers for in-memory object collections, SQL Server databases, ADO.NET datasets and XML documents. These different providers define the different flavors of LINQ:
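How a provider might turn an expression tree into a query fragment can be sketched in a few lines. This is a toy translator under stated assumptions, not how LINQ to SQL is actually implemented: the Customer type, the ToSqlFragment name and the supported subset (field access on the lambda parameter, constants, and the operators =, <, >, AND, OR) are all chosen for illustration. String constants are not escaped, so the output is not safe for use in real SQL.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionSketch
{
    // Hypothetical entity type used only for this illustration.
    public sealed class Customer
    {
        public int CustID;
        public string CustName = "";
    }

    // Receives the predicate as data (an expression tree), not as
    // compiled code, and walks it to build a SQL-like fragment.
    public static string ToSqlFragment(Expression<Func<Customer, bool>> predicate)
    {
        return Visit(predicate.Body);
    }

    private static string Visit(Expression e)
    {
        switch (e)
        {
            case BinaryExpression b:
                return "(" + Visit(b.Left) + " " + Operator(b.NodeType) + " " + Visit(b.Right) + ")";
            case MemberExpression m when m.Expression is ParameterExpression:
                // Access to a member of the lambda parameter becomes a column name.
                return m.Member.Name;
            case ConstantExpression c:
                // No escaping is done here; illustration only.
                return c.Value is string s ? "'" + s + "'" : "" + c.Value;
            default:
                // Captured variables, method calls etc. are outside this sketch.
                throw new NotSupportedException(e.NodeType.ToString());
        }
    }

    private static string Operator(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            case ExpressionType.Equal: return "=";
            case ExpressionType.GreaterThan: return ">";
            case ExpressionType.LessThan: return "<";
            default: throw new NotSupportedException(type.ToString());
        }
    }
}
```

Calling ExpressionSketch.ToSqlFragment(c => c.CustID > 10 && c.CustName == "Ann") yields ((CustID > 10) AND (CustName = 'Ann')). The same lambda assigned to a Func<Customer, bool> would instead be compiled to a delegate and could only be invoked, not inspected.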
LINQ to Objects
The LINQ to Objects provider is used for in-memory collections, using the local query execution engine of LINQ. The code generated by this provider refers to the implementation of the standard query operators as defined on the Sequence pattern and allows IEnumerable<T> collections to be queried locally. The current implementation of LINQ to Objects uses, for example, O(n) linear search for simple lookups and is not optimised for complex queries.[6]
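The linear-search behaviour can be made visible by counting how often a predicate is invoked. The sketch below uses hypothetical helper names; it shows that First examines elements one by one, and that a caller who needs repeated lookups can build an index (here a HashSet<int>) once instead.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupSketch
{
    // Counts how many elements First examines before it finds the target.
    // First throws InvalidOperationException if the target is absent.
    public static int ElementsExamined(IEnumerable<int> data, int target)
    {
        int examined = 0;
        data.First(x => { examined++; return x == target; });
        return examined;
    }

    // Building the set costs one pass over the data; afterwards,
    // Contains does not scan the original sequence again.
    public static HashSet<int> BuildIndex(IEnumerable<int> data) =>
        new HashSet<int>(data);
}
```

For a sequence of the integers 0 to 999, finding 999 with First examines all 1000 elements, while finding 0 examines only one.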
LINQ to XML (formerly called XLINQ)
The LINQ to XML provider converts an XML document to a collection of XElement objects, which are then queried against using the local execution engine that is provided as a part of the implementation of the standard query operator.[7]
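A minimal sketch of such a query follows. The XML document, its element and attribute names, and the TitlesSince method are made up for the illustration; XElement.Parse, Elements, Attribute, Element and the explicit conversions to int and string are part of System.Xml.Linq.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlSketch
{
    // Hypothetical sample document.
    public const string Xml =
        "<books>" +
        "<book year='2005'><title>Beta</title></book>" +
        "<book year='2010'><title>Alpha</title></book>" +
        "<book year='1999'><title>Gamma</title></book>" +
        "</books>";

    // Parses the document into XElement objects and queries them with
    // the ordinary in-memory operators.
    public static List<string> TitlesSince(int year) =>
        (from b in XElement.Parse(Xml).Elements("book")
         where (int)b.Attribute("year") >= year
         orderby (string)b.Element("title")
         select (string)b.Element("title")).ToList();
}
```

With the sample document, TitlesSince(2000) returns the titles "Alpha" and "Beta", in that order.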
LINQ to SQL (formerly called DLINQ)
The LINQ to SQL provider allows LINQ to be used to query SQL Server databases, including SQL Server Compact databases. Since SQL Server data may reside on a remote server, and because SQL Server has its own query engine, LINQ to SQL does not use the query engine of LINQ. Instead, it converts a LINQ query to a SQL query that is then sent to SQL Server for processing.[8] However, since SQL Server stores the data as relational data and LINQ works with data encapsulated in objects, the two representations must be mapped to one another. For this reason, LINQ to SQL also defines a mapping framework. The mapping is done by defining classes that correspond to the tables in the database and contain all or a subset of the columns in the table as data members.[9] The correspondence, along with other relational model attributes such as primary keys, is specified using LINQ to SQL-defined attributes. For example,
[Table(Name="Customers")]
public class Customer
{
    [Column(IsPrimaryKey = true)]
    public int CustID;

    [Column]
    public string CustName;
}
This class definition maps to a table named Customers, and the two data members correspond to two columns. The classes must be defined before LINQ to SQL can be used. Visual Studio 2008 includes a mapping designer that can be used to create the mapping between the data schemas in the object and relational domains. It can automatically create the corresponding classes from a database schema, as well as allow manual editing to create a different view by using only a subset of the tables or columns in a table.[9]
The mapping is implemented by the DataContext, which takes a connection string to the server and can be used to generate a Table<T>, where T is the type to which the database table will be mapped. The Table<T> encapsulates the data in the table and implements the IQueryable<T> interface, so that the expression tree is created, which the LINQ to SQL provider handles. It converts the query into T-SQL and retrieves the result set from the database server. Since the processing happens at the database server, local methods, which are not defined as a part of the lambda expressions representing the predicates, cannot be used. However, it can use the stored procedures on the server. Any changes to the result set are tracked and can be submitted back to the database server.[9]
LINQ to DataSets
The LINQ to SQL provider works only with Microsoft SQL Server databases; to support any generic database, LINQ also includes LINQ to DataSets, which uses ADO.NET to handle the communication with the database. Once the data is in ADO.NET DataSets, LINQ to DataSets executes queries against these datasets.[10]
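Once the data is in a DataTable, it can be queried as sketched below. The table is filled by hand here so that the example needs no database; in practice it would typically be populated through an ADO.NET data adapter. The class and method names are hypothetical, while AsEnumerable and Field<T> are the extension methods that LINQ to DataSet provides in the System.Data namespace.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class DataSetSketch
{
    // Builds an in-memory table; the rows are made-up sample data.
    public static DataTable BuildCustomers()
    {
        var table = new DataTable("Customers");
        table.Columns.Add("CustID", typeof(int));
        table.Columns.Add("CustName", typeof(string));
        table.Rows.Add(1, "Ann");
        table.Rows.Add(2, "Bob");
        table.Rows.Add(3, "Cleo");
        return table;
    }

    // AsEnumerable exposes the rows as IEnumerable<DataRow>;
    // Field<T> reads a column value with the requested type.
    public static List<string> NamesWithIdAbove(DataTable table, int minId) =>
        (from row in table.AsEnumerable()
         where row.Field<int>("CustID") > minId
         select row.Field<string>("CustName")).ToList();
}
```

The query runs locally against the rows already loaded into the table; nothing is sent back to a database.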
Other providers
The LINQ providers can be implemented by third parties for various data sources as well. Several database server specific providers are available from the database vendors. Some of the popular providers include:
- Data Services: LINQ to ADO.NET Data Services[11]
- dotConnect: LINQ to Oracle, MySQL, PostgreSQL, and SQLite
- Entity Framework: LINQ to Entities[12]
- SSAS Entity Framework Provider[13]: LINQ to OLAP cubes in SSAS.
- Windows Search: LINQ to System Search[14]
- Google search: LINQ to Google[15]
- DbLinq: LINQ to MySQL, PostgreSQL, Oracle, Ingres, SQLite and Microsoft SQL Server[16]
- NHibernate: LINQ to NHibernate[17][18]
- DataObjects.NET: LINQ to DataObjects.NET[19]
- LLBLGen Pro: LINQ to LLBLGen[20]
- OpenMapi: LINQ to MAPI[21]
- CSV: LINQ to CSV[22]
- Twitter: LINQ to Twitter[23]
- db4o: LINQ to db4o[24]
- Wikipedia: LINQ to Wikipedia[25]
- LINQ to XSD: LINQ to XML Schema[26]
Performance
Some benchmarks of simple use cases suggest that LINQ to Objects carries a large overhead compared to equivalent hand-written code.[27]
The performance of LINQ to XML and LINQ to SQL relative to ADO.NET depends on the use case.[28]
PLINQ
Version 4 of the .NET Framework includes PLINQ, or Parallel LINQ, a parallel execution engine for LINQ queries. Preview releases defined the IParallelEnumerable<T> interface; if the source collection implemented this interface, the parallel execution engine was invoked. In the released version, a query opts in by calling AsParallel(), which returns a ParallelQuery<T>. The PLINQ engine can execute parts of a query concurrently on multiple threads, which can provide faster results.[29]
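In the released .NET Framework 4, a query is usually handed to PLINQ by calling AsParallel() on the source. The following is a minimal sketch with hypothetical method names; whether it actually runs faster than the sequential version depends on the workload and the machine.

```csharp
using System.Linq;

public static class ParallelSketch
{
    // The projection may run on several threads; Sum combines the
    // partial results, so the outcome does not depend on the order.
    public static long SumOfSquares(int n) =>
        Enumerable.Range(1, n).AsParallel().Select(i => (long)i * i).Sum();

    // By default PLINQ does not guarantee result order; AsOrdered asks
    // it to preserve the order of the source sequence.
    public static int[] EvenNumbersInOrder(int n) =>
        Enumerable.Range(1, n).AsParallel().AsOrdered()
                  .Where(i => i % 2 == 0).ToArray();
}
```

For example, SumOfSquares(1000) evaluates to 333833500, the same value the sequential query would produce.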
Other language implementations
- Saffron is an extension to Java incorporating SQL-like relational expressions. Relations can be in-memory collections, database tables, or other data sources. It was developed independently of LINQ in 2001 by Julian Hyde, who later authored the Mondrian OLAP server.
- jLinq is a fully extensible JavaScript library for performing LINQ-style queries on arrays of objects.
- JSINQ is Kai Jäger's JavaScript implementation of LINQ to Objects. It also provides a compiler that translates LINQ-style query expressions into JavaScript code.
- JSLINQ is another JavaScript library for performing LINQ-style queries on data.
- Chris Pietschmann's LINQ to JavaScript is a LINQ implementation that extends JavaScript's Array object with LINQ capabilities.
- PHPLinq is Maarten Balliauw's PHP implementation of LINQ.
- Quaere is a Java implementation of LINQ.
- JaQue is a typesafe Java implementation of LINQ.
- JaQu is a Java implementation of LINQ.
- Querydsl is a typesafe Java implementation of LINQ.
- SBQL4J is a Java extension with LINQ-like capabilities, based on the Stack-Based Approach. It adds type-safe queries to Java.
- hxLINQ is a haXe port of Chris Pietschmann's LINQ to JavaScript.
- asq is a Python implementation of LINQ-to-objects and Parallel LINQ-to-objects (PLINQ).
- Embarcadero Prism, also known as Delphi Prism, supports LINQ.
References
- ^ "Pure Java library with LINQ-style implementation". Retrieved 2011-10-08.
- ^ "Rx framework".
- ^ "Monadic Parser Combinators using C#3". Retrieved 2009-11-21.
- ^ a b "Standard Query Operators". Microsoft. Retrieved 2007-11-30.
- ^ "LINQ Framework". Retrieved 2007-11-30.
- ^ "Performance Engineering for LINQ". Retrieved 2008-08-16.
- ^ ".NET Language-Integrated Query for XML Data". Retrieved 2007-11-30.
- ^ "LINQ to SQL". Retrieved 2007-11-30.
- ^ a b c "LINQ to SQL: .NET Language-Integrated Query for Relational Data". Retrieved 2007-11-30.
- ^ "LINQ to DataSets". Retrieved 2007-11-30.
- ^ "LINQ to ADO.NET Data Services". Retrieved 2007-12-11.
- ^ "ADO.NET Entity Framework Overview". Retrieved 2007-12-11.
- ^ "SSAS Entity Framework Provider for LINQ to MDX / SSAS OLAP cubes".
- ^ "System Search to LINQ".
- ^ "Glinq".
- ^ "DbLinq Project: Linq Provider for MySql, Oracle and PostgreSQL". Retrieved 2008-04-29.
- ^ "LINQ to NHibernate".
- ^ "Linq to NHibernate Progress Report - A Christmas Gift?".
- ^ "LINQ in DataObjects.Net".
- ^ "LINQ to LLBLGEN".
- ^ "LINQ to MAPI".
- ^ "LINQ to CSV".
- ^ "LINQ to Twitter".
- ^ "LINQ to db4o".
- ^ "LINQ to Wikipedia".
- ^ "LINQ to XSD".
- ^ Vider, Guy (2007-12-21). "LINQ Performance Test: My First Visual Studio 2008 Project". Retrieved 2009-02-08.
- ^ Kshitij, Pandey (2008-05-25). "Performance comparisons LinQ to SQL, ADO, C#". Retrieved 2009-02-08.
- ^ "Programming in the Age of Concurrency: Concurrent Programming with PFX". Retrieved 2007-10-16.
External links
- Official Microsoft LINQ Project
- 101 C# LINQ Samples
- 101 Visual Basic LINQ Samples
- LINQ to XML Documentation
- Microsoft LINQ forum
- LINQ to Objects for the .NET developer
- LINQ page on NetFXGuide.com
- LINQ wiki
- LINQ books
- Continuous LINQ
- LINQ To Sharepoint
- LINQ To Active Directory
- LINQ for Novell.Directory.Ldap
- LinqToWikipedia
- LINQ Tutorials and Active Articles
- Looking to LINQ - Will Microsoft's Language Integrated Query transform programmatic data access?
- Obtics (Observable Object LINQ)
- LINQ to SNMP
- Different Ways Of Retrieving Data From Collections
- Future of LINQ to SQL
- LINQ Exchange - Learn LINQ and Lambda Expressions
- MoreLINQ - Extensions to LINQ to Objects by Jon Skeet
- 50 LINQ Examples, Tips and How To's