News
Photos
Articles
Components
Applications
Kleinkunst

.NET - Functions in the Entity Framework 4.0: part 3

Over the last few weeks I've been writing a series of articles that cover some great improvements in the Entity Framework 4.0 In this third article I will explore some SqlClr features and show how to call custom .NET functions and aggregates in LINQ to Entities queries.


SqlClr

SQL CLR or the SQL Common Language Runtime is a technology which is used to host .NET managed code in Microsoft SQL Server. It was introduced in SQL Server 2005. SqlClr can be used to write stored procedures, user defined functions, user defined types, triggers, ... in a .NET language instead of T-SQL. I prefer to implement SqlClr C# user defined functions to execute complex calculations, parse strings with regular expressions or search with fuzzy string algorithms (Levenshtein, Sift, ...).

In this article I will guide you how to implement your own .NET functions and aggregates and how to map them to Model Defined Functions.

Creating SqlClr user defined functions

First of all you have to start with a new project. Select the Visual C# SQL CLR Database Project template. Make sure to set the .NET environment to .NET 3.5 because SQL Server 2008 SE does not support the CLR 4.0!

Visual C# SQL CLR Database Project template

A test script, pre deployment and post deployment script will be generated.

Visual C# SQL CLR Database Project

Before you can use the CLR in SQL Server you need to execute a small script to enable this feature. You can add this statement in the PreDeployScript.sql file so it will be executed when deploying.

EXEC sp_configure 'clr enabled', 1 RECONFIGURE

Now you have to create a static class with static functions. These functions should be decorated with the SqlFunction attribute that can be found in the Microsoft.SqlServer.Server namespace. All input parameters and the return values should use SqlClr compliant datatypes like SqlString, SqlBoolean, SqlInt32, ...

I have created two very simple SqlClr functions in the MySqlClrFunctions class. The first function calls the Contains method on a given string. And the GetQuantityBottles function uses a regular expression to parse the QuantityPerUnit string and return the number of bottles.

namespace ScipBe.SqlClr
{
    public partial class MySqlClrFunctions
    {
        [SqlFunction()]
        public static SqlBoolean ContainsBottles(SqlString quantityPerUnit)
        {
            return quantityPerUnit.Value.Contains("bottles");
        }
 
        [SqlFunction()]
        public static SqlString GetQuantityBottles(SqlString quantityPerUnit)
        {
            var pattern = @"(?<Quantity>.*)bottles";
            var match = Regex.Match(quantityPerUnit.Value, pattern);
            if (match.Success)
            {
                return match.Groups["Quantity"].Value;
            }
 
            return string.Empty;
        }
    }
}

In case you were wondering why to parse the QuantityPerUnit property, this an example of the content of the QuantityPerUnit field in the Northwind Products table.

Northwind Products QuantityPerUnit


Deploying SqlClr functions

Deploying a SqlClr assembly is very easy. Just right click on the project and choose Deploy.

Configuration option 'clr enabled' changed from 0 to 1. 
Run the RECONFIGURE statement to install. 
Dropping [ScipBe.SqlClr].[SqlAssemblyProjectRoot]...   
Dropping [dbo].[GetQuantityBottles]...   
Dropping [dbo].[ContainsBottles]...   
Dropping [ScipBe.SqlClr]...   
Creating [ScipBe.SqlClr]...   
Adding files to assembly [ScipBe.SqlClr]   
Creating [dbo].[GetQuantityBottles]...   
Creating [dbo].[ContainsBottles]...   
Creating [ScipBe.SqlClr].[SqlAssemblyProjectRoot]...   
Creating [dbo].[GetQuantityBottles].[SqlAssemblyFile]...   
Creating [dbo].[GetQuantityBottles].[SqlAssemblyFileLine]...   
Creating [dbo].[ContainsBottles].[SqlAssemblyFile]...   
Creating [dbo].[ContainsBottles].[SqlAssemblyFileLine]...   
The transacted portion of the database update succeeded. 

If you take a closer look to the Northwind database with SQL Management Studio you will see that our assembly has been added and referenced by the database.

SqlClr reference

When you take a look at the details of the Scalar-valued Functions, you will see that T-SQL user defined functions have been generated that refer to the .NET functions in our assembly.

ALTER FUNCTION [dbo].[ContainsBottles]
  (@quantityPerUnit [nvarchar](4000)) 
  RETURNS [bit] 
  WITH EXECUTE AS CALLER AS  
  EXTERNAL NAME [ScipBe.SqlClr].[ScipBe.SqlClr.MySqlClrFunctions].[ContainsBottles]

 

Calling the SqlClr functions in SQL

UDF's can be used in SELECT and WHERE statements and calling our SqlClr UDF's in a SQL statement works just the same as calling T-SQL UDF's.

SELECT 
  p.ProductID,
  p.ProductName,
  p.QuantityPerUnit,
  dbo.GetQuantityBottles(p.QuantityPerUnit) as Quantity 
FROM Products p WHERE    
  dbo.ContainsBottles(p.QuantityPerUnit) = 1

Northwind Products QuantityPerUnit

 

Calling the SqlClr functions in LINQ to Entities

Now we can follow the steps that I described in previous articles to map UDF's to Model Defined Functions. First these SqlClr user defined functions should be imported in the storage model of the Entity Data Model.

Then we have to define Model Defined Functions which refer to the UDF's in the storage model.

public static class MySqlLinqFunctions
{
    [EdmFunction("NorthwindModel.Store", "ContainsBottles")]
    public static bool ContainsBottles(string quantityPerUnit)
    {
        throw new NotSupportedException("This function can only be used in a LINQ to Entities query");
    }
 
    [EdmFunction("NorthwindModel.Store", "GetQuantityBottles")]
    public static string GetQuantityBottles(string quantityPerUnit)
    {
        throw new NotSupportedException("This function can only be used in a LINQ to Entities query");
    }
}

And finally we can create LINQ to Entities queries.

var products =
    from p in context.Products
    where MySqlLinqFunctions.ContainsBottles(p.QuantityPerUnit)
    select new
    {
        p.ProductID,
        p.ProductName,
        p.QuantityPerUnit,
        Quantity =MySqlLinqFunctions.GetQuantityBottles(p.QuantityPerUnit)
    };

As you can see, it is quite easy to implement custom .NET functions, add them as UDF's in the database and map these UDF's to Model Defined Functions in the Entity Framework. The real beauty of this approach is that you can call all kind of .NET logic in LINQ to Entities queries and that these queries will be executed in the database.

 

Sharing functions between LINQ to Objects and LINQ to Entities

A last improvement I want to show is how to reuse the same code in LINQ to Objects queries an LINQ to Entities queries. Therefore you have to implement all logic in a standard .NET static class and include it in your main project or one of its libraries. It created a MyLinqFunctions class with static methods.

public partial class MyLinqFunctions
{
    public static bool ContainsBottles(string quantityPerUnit)
    {
        return quantityPerUnit.Contains("bottles");
    }
 
    public static string GetQuantityBottles(string quantityPerUnit)
    {
        var pattern = @"(?<Quantity>.*)bottles";
        var match = Regex.Match(quantityPerUnit, pattern);
        if (match.Success)
        {
            return match.Groups["Quantity"].Value;
        }
 
        return string.Empty;
    }
}

Add this file in your SqlClr project but make sure to add it as a shared link.

Shared link

Then call the MyLinqFunctions methods in the MySqlClrFunctions methods in the SqlClr project.

public partial class MySqlClrFunctions
{
    [SqlFunction()]
    public static SqlBoolean ContainsBottles(SqlString quantityPerUnit)
    {
        return MyLinqFunctions.ContainsBottles(quantityPerUnit.Value);
    }
 
    [SqlFunction()]
    public static SqlString GetQuantityBottles(SqlString quantityPerUnit)
    {
        return MyLinqFunctions.GetQuantityBottles(quantityPerUnit.Value);
    }
}

Now this .NET logic can be used in either a LINQ to Objects or LINQ to Entities query.

var productsLinqToEntities =
    from p in context.Products
    where MySqlLinqFunctions.ContainsBottles(p.QuantityPerUnit)
    select new
    {
        p.ProductID,
        p.ProductName,
        p.QuantityPerUnit,
        Quantity = MySqlLinqFunctions.GetQuantityBottles(p.QuantityPerUnit)
    };
 
var productsLinqToObjects =
    from p in context.Products.ToList()
    where MyLinqFunctions.ContainsBottles(p.QuantityPerUnit)
    select new
    {
        p.ProductID,
        p.ProductName,
        p.QuantityPerUnit,
        Quantity = MyLinqFunctions.GetQuantityBottles(p.QuantityPerUnit)
    };

Of course always try to implement LINQ to Entities queries because these queries will be executed in the database which will make them a lot faster.

 

Creating SqlClr User Defined aggregates

I was also looking for a way to concatenate a list of strings. In LINQ to Objects this can be easily done by calling the Aggregate<T> extension method but there is no equivalent for LINQ to Entities.

var productNames = context.Products.Select(p => p.ProductName)
   .AsEnumerable().Aggregate((a, b) => a + ", " + b);

I tried a lot of things with SqlClr functions but it seems that it is not possible to use collection/table-valued input parameters. Finally I found a nice example on the MSDN website on how to create SqlClr User Defined Aggregates. The aggregate class should be decorated with the SqlUserDefinedAggregate attribute and it should implement the IBinarySerialize interface. Four methods are required: Init, Accumulate, Merge, and Terminate. Following StringAggregate class will concatenate all strings in the collection. Just add this class in the SqlClr project, build it and deploy it.

[Serializable]
[SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize = 8000)]
public class StringAggregate : IBinarySerialize
{
    /// <summary> 
    /// Variable holds intermediate result of the concatenation 
    /// </summary> 
    private StringBuilder intermediateResult;
 
    /// <summary> 
    /// Initialize the internal data structures 
    /// </summary> 
    public void Init()
    {
        intermediateResult = new StringBuilder();
    }
 
    /// <summary> 
    /// Accumulate the next value, nop if the value is null 
    /// </summary> 
    /// <param name="value"></param> 
    public void Accumulate(SqlString value)
    {
        if (value.IsNull)
        {
            return;
        }
 
        intermediateResult.Append(value.Value).Append(", ");
    }
 
    /// <summary> 
    /// Merge the partially computed aggregate with this aggregate. 
    /// </summary> 
    /// <param name="other"></param> 
    public void Merge(StringAggregate other)
    {
        intermediateResult.Append(other.intermediateResult);
    }
 
    /// <summary> 
    /// Called at end of aggregation, to return results. 
    /// </summary> 
    public SqlString Terminate()
    {
        var output = string.Empty;
        if (intermediateResult != null && intermediateResult.Length > 0)
        {
            output = intermediateResult.ToString(0, intermediateResult.Length - 2);
        }
 
        return new SqlString(output);
    }
 
    public void Read(BinaryReader r)
    {
        intermediateResult = new StringBuilder(r.ReadString());
    }
 
    public void Write(BinaryWriter w)
    {
        w.Write(intermediateResult.ToString());
    }
}

 

Calling the SqlClr aggregates in SQL

After deploying the aggregate to the database it can be used in SQL queries.

SELECT
  c.CategoryName,
  Products = dbo.StringAggregate(p.ProductName)
FROM Categories c, Products p
WHERE p.CategoryID = c.CategoryID
GROUP BY c.CategoryName

AggregateString SQL example

 

Calling the SqlClr aggregates in LINQ to Entities

The Update Model from Database wizard of the Entity Data Model will recognize the Aggregate as a stored procedure and because of this it can be imported in the storage model (SSDL). There is only one problem. The import of aggregates in EF4 does not work correctly. If you take a look at the generated function import in the XML of the SSDL file, then you will see that the Aggregate property is set to False and that the parameter Type is nvarchar. This is a bug and I already submitted it to Microsoft.

<Function Name="StringAggregate" ReturnType="nvarchar" Aggregate="false" 
  BuiltIn="false" NiladicFunction="false" 
  IsComposable="true" ParameterTypeSemantics="AllowImplicitConversion" Schema="dbo">
  <Parameter Name="value" Type="nvarchar" Mode="In" />
</Function> 

You should correct this manually. Set Aggregate to True and the Type should be Collection(nvarchar).

<Function Name="StringAggregate" ReturnType="nvarchar" Aggregate="true" 
  BuiltIn="false" NiladicFunction="false" 
  IsComposable="true" ParameterTypeSemantics="AllowImplicitConversion" Schema="dbo">
  <Parameter Name="value" Type="Collection(nvarchar)" Mode="In" />
</Function> 

If you don't modify it, LINQ to Entities queries will not work and throw "Column 'dbo.Categories.CategoryName' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause." exceptions.

Now you can create a Model Defined Function by mapping the aggregate with the EdmFunction attribute. Make sure to implement it as an extension method of IEnumerable<string>.

public static class MySqlLinqFunctions
{
    [EdmFunction("NorthwindModel.Store", "StringAggregate")]
    public static string StringAggregate(this IEnumerable<string> source)
    {
        throw new NotSupportedException("This function can only be used in a LINQ to Entities query");
    }

And finally this aggregate can be used in LINQ to Entities queries. Really cool, isn't it?

var categories =
    from c in context.Categories
    select new
    {
        c.CategoryID,
        c.CategoryName,
        ProductNames = c.Products.Select(p => p.ProductName).StringAggregate()
    };

Be careful when updating your Entity Data Model because you should modify the imported aggregates in the storage model manually after each update. Hopefully this bug will be solved in a future service pack.

 

I hope you liked these 3 articles about stored procedures, user defined functions, model defined functions and SqlClr functions and aggregates in the Entity Framework 4.0. Model Defined Functions are really a great asset to LINQ to Entities and I hope you can take advantage of these examples.