Using Expression Trees and Lambda Expressions to perform CAML queries – Part 1

I have a problem.
I live for writing “good looking” and “easy to read” (maintainable) code “despite” using SharePoint as a backend framework. That combination is sometimes not so easy though.
Imagine this example of such an occasion:

I want to be able to write queries to execute in my C# project that uses SharePoint as the backend. SPSiteDataQuery is the preferred option and such a query is constructed using CAML (Collaborative Application Markup Language), a special form of query language used in SharePoint.
The problem with CAML is that it is somewhat hard to understand and maintain. Imagine that I want to find all items that are of a certain type (in SharePoint a content type), e.g. a book in my backend system.

Before going further, here is a list of the other posts in the series:

Let's continue…
Normally you construct a sort of business layer (some might use a sort of repository if they follow the ideas of domain-driven design, DDD) with a method that takes a string containing the CAML, something like this:

public static IList<SPListItem> FindAll(string caml)...

When we call that method it looks something like this to find items of a certain content type:

FindAll("<Where><BeginsWith><FieldRef Name=\"ContentTypeId\" /><Value Type=\"Text\">0x010101008103BAF1CBB541CB9F9A6DF623AEFEDD</Value></BeginsWith></Where>")

That isn't a pretty looking piece of code if you ask me, it's a definite code smell.
Why?
Well, first of all if you aren't fluent in CAML you won't have a chance of reading that and understanding it right away. OK, so this example isn't very hard, but imagine that you had a more complex expression with a couple of ANDs and ORs in it.
Some might say that we could encapsulate the CAML using something like CamlDotNet to construct the query. That's great, I would respond, but the problem is still there (that is just a fancier way of constructing CAML). CAML is bad when you would like your code to be easy to read, maintainable, and expressive.

For a while I have looked at LinqToSharePoint to solve this for me, but I still had some problems with it. First of all, I think it is too large for my needs: I only want to use something like LINQ (lambda expressions) to ask the questions and then translate them into CAML. LinqToSharePoint is great (and probably will be even greater in 2010) but in this situation it doesn't quite do the trick. The big advantage of LINQ (and Expression Trees) is that it is a uniform (and easy to read) way to construct a filter (some might call that a query) over a collection of items.

So after reading up on Expression Tree parsing, here's a list of some of the blog posts I have found:

…I started to construct my own parser, starting with the new method signature that will look like this (which now makes it possible for me to call this function with an expression):

public static IList<SharePointItem> FindAll(Expression<Func<SharePointItem, bool>> expression)

Note: SharePointItem is just a class that I have constructed to encapsulate some of the stuff I need to know about a SPListItem:

    class SharePointItem
    {
        public string ContentTypeId { get; set; }
        public int ListItemId { get; set; }
        public Guid SiteId { get; set; }
        public Guid ListId { get; set; }
    }

The Func<SharePointItem, bool> delegate encapsulates a method that takes a SharePointItem as an input parameter and returns a bool.

The following lambda expression will meet that expectation:

SPItem => SPItem.ContentTypeId.StartsWith("0x010101008103BAF1CBB541CB9F9A6DF623AEFEDD")

The lambda expression SPItem => SPItem.ContentTypeId.StartsWith(contentTypeId) is really just a fancy way of writing a method that looks like this:

public static bool ContentTypeIs(SharePointItem item, string contentTypeId)
{
    return item.ContentTypeId.StartsWith(contentTypeId);
}
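To see why wrapping the Func in Expression<…> matters, here is a minimal sketch showing that the lambda is then kept as a tree of nodes that can be inspected instead of compiled code. The InspectDemo class, its Describe helper and the cut-down SharePointItem are stand-ins I made up for the illustration, they are not part of the parser:

```csharp
using System;
using System.Linq.Expressions;

// Cut-down stand-in for the SharePointItem class shown above.
class SharePointItem
{
    public string ContentTypeId { get; set; }
}

static class InspectDemo
{
    // Returns "<node type> <method> <member> <argument>" for a lambda of the
    // form item => item.SomeProperty.SomeMethod("literal").
    public static string Describe(Expression<Func<SharePointItem, bool>> tree)
    {
        var call = (MethodCallExpression)tree.Body;           // .StartsWith(...)
        var member = (MemberExpression)call.Object;           // item.ContentTypeId
        var argument = (ConstantExpression)call.Arguments[0]; // the string literal

        return string.Format("{0} {1} {2} {3}",
            tree.Body.NodeType, call.Method.Name, member.Member.Name, argument.Value);
    }

    static void Main()
    {
        Console.WriteLine(Describe(item => item.ContentTypeId.StartsWith("0x0101")));
        // prints: Call StartsWith ContentTypeId 0x0101
    }
}
```

Everything the translation to CAML needs (the method name, the property name and the value) is available as plain data on the nodes.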

OK, so now I think I have removed the smell and the code is again easy to read and maintain.
The issue now is how to translate an expression tree into CAML. The answer is really quite easy (in theory at least): you only have to recursively visit the different nodes in the expression tree and translate them into the corresponding CAML.
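As a warm-up, here is a small sketch of what "recursively visit the nodes" means. It does not produce any CAML, it only records the node types in the order they are visited. The TreeWalk class and its method names are hypothetical, and only the node kinds used in this post are followed:

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

static class TreeWalk
{
    // Depth-first walk that records each node type it passes.
    static void Walk(Expression node, List<string> visited)
    {
        if (node == null) return;
        visited.Add(node.NodeType.ToString());

        var binary = node as BinaryExpression;
        var call = node as MethodCallExpression;
        var member = node as MemberExpression;

        if (binary != null)
        {
            Walk(binary.Left, visited);
            Walk(binary.Right, visited);
        }
        else if (call != null)
        {
            Walk(call.Object, visited);
            foreach (Expression argument in call.Arguments) Walk(argument, visited);
        }
        else if (member != null)
        {
            Walk(member.Expression, visited);
        }
    }

    public static string Visited(Expression<Func<string, bool>> filter)
    {
        var visited = new List<string>();
        Walk(filter.Body, visited);
        return string.Join(", ", visited.ToArray());
    }

    static void Main()
    {
        Console.WriteLine(Visited(s => s.StartsWith("a") || s.Length > 3));
        // prints: OrElse, Call, Parameter, Constant, GreaterThan, MemberAccess, Parameter, Constant
    }
}
```

The parser below follows the same pattern, except that each visit returns an XElement instead of adding a name to a list.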

Let's look at the code for the ExpressionToCamlParser.
First of all the parser will have a method called Translate:

private XElement query = new XElement("Where");

public virtual string Translate(Expression expression)
{
   query.Add(Visit(expression));
   return query.ToString(SaveOptions.DisableFormatting);
}

Note: You will see that I have "hardcoded" the starting point to "Where". For this example that is fine but feel free to change the implementation to fit your needs. Remember that I am taking baby steps here and this is not meant to be a full-blown CAML parser just yet.

I add the result of the call to Visit as a child node of "Where". The Visit method looks like this:

protected virtual XElement Visit(Expression expression)
{
  if (expression == null)
  {
    return null;
  }

  switch (expression.NodeType)
  {
     case ExpressionType.Call:
       return VisitMethodCall(expression as MethodCallExpression);
     case ExpressionType.MemberAccess:
       return VisitMemberAccess(expression as MemberExpression);
     case ExpressionType.Constant:
       return VisitConstant(expression as ConstantExpression);
     case ExpressionType.And:
     case ExpressionType.AndAlso:
     case ExpressionType.Or:
     case ExpressionType.OrElse:
     case ExpressionType.LessThan:
     case ExpressionType.LessThanOrEqual:
     case ExpressionType.GreaterThan:
     case ExpressionType.GreaterThanOrEqual:
     case ExpressionType.Equal:
     case ExpressionType.NotEqual:
       return VisitBinary(expression as BinaryExpression);
     default:
       return VisitUnknown(expression);
  }
}

You will notice that I have taken care of binary expressions (AND, OR, EQ and so on), constants ( == 1), MemberAccess (I will come to that), and method calls ( .StartsWith() and so on). As I pointed out, these are baby steps so it will not work for all types of expression (not yet at least).

Let's first look at binary expressions:

protected virtual XElement VisitBinary(BinaryExpression binary)
{
  XElement node = ParseNodeType(binary.NodeType);

  XElement left = Visit(binary.Left);
  XElement right = Visit(binary.Right);

  if (left != null && right != null)
  {
     node.Add(left, right);
  }

  return node;
}

So, I first take care of parsing the node type (code below) and then I visit (this is the recursion) the left part of the tree and then the right part. Then I add the left and right results to the node that I just parsed. This takes care of the AND or OR that have left and right branches (you will see an example of this later on).

protected virtual XElement ParseNodeType(ExpressionType type)
{
   XElement node;

   switch (type)
   {
      case ExpressionType.AndAlso:
      case ExpressionType.And:
        node = new XElement("And");
        break;
      case ExpressionType.Or:
      case ExpressionType.OrElse:
        node = new XElement("Or");
        break;
      case ExpressionType.Equal:
        node = new XElement("Eq");
        break;
      // Visit routes NotEqual here as well, so it needs a mapping too.
      case ExpressionType.NotEqual:
        node = new XElement("Neq");
        break;
      case ExpressionType.GreaterThan:
        node = new XElement("Gt");
        break;
      case ExpressionType.GreaterThanOrEqual:
        node = new XElement("Geq");
        break;
      case ExpressionType.LessThan:
        node = new XElement("Lt");
        break;
      case ExpressionType.LessThanOrEqual:
        node = new XElement("Leq");
        break;
      default:
        throw new Exception(string.Format("Unhandled expression type: '{0}'", type));
   }

   return node;
}
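CAML's And and Or elements take exactly two operands, which fits nicely with the left/right shape of a BinaryExpression. The following sketch (NestingDemo and its helpers are made up for the illustration, and the operands are just shown as "_") prints how the C# compiler nests a filter that mixes && and ||:

```csharp
using System;
using System.Linq.Expressions;

static class NestingDemo
{
    // Maps a few binary node types to their CAML element names.
    static string CamlName(ExpressionType type)
    {
        switch (type)
        {
            case ExpressionType.AndAlso: return "And";
            case ExpressionType.OrElse: return "Or";
            case ExpressionType.Equal: return "Eq";
            case ExpressionType.GreaterThan: return "Gt";
            case ExpressionType.LessThan: return "Lt";
            default: throw new NotSupportedException(type.ToString());
        }
    }

    // Renders only the binary structure; every operand is shown as "_".
    public static string Shape(Expression expression)
    {
        var binary = expression as BinaryExpression;
        if (binary == null) return "_";

        return CamlName(binary.NodeType)
            + "(" + Shape(binary.Left) + ", " + Shape(binary.Right) + ")";
    }

    static void Main()
    {
        Expression<Func<int, bool>> filter = x => x > 1 && x < 5 || x == 9;
        Console.WriteLine(Shape(filter.Body));
        // prints: Or(And(Gt(_, _), Lt(_, _)), Eq(_, _))
    }
}
```

Since operator precedence is already resolved in the tree, the recursion in VisitBinary gets the nesting right without any extra work.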

Let's continue with what happens when a call is made to .StartsWith (as it is inside my query):

protected virtual XElement VisitMethodCall(MethodCallExpression methodcall)
{
  XElement node;
  XElement left = Visit(methodcall.Object);
  XElement right = Visit(methodcall.Arguments[0]);

  switch (methodcall.Method.Name)
  {
     case "Contains":
       node = new XElement("Contains");
       break;
     case "StartsWith":
       node = new XElement("BeginsWith");
       break;
     default:
       throw new Exception(string.Format("Unhandled method call: '{0}'", methodcall.Method.Name));
  }

  if (left != null && right != null)
  {
     node.Add(left, right);
  }

  return node;

}

It's really much the same as what happens in the binary translation. I visit the left and right subtrees, then I check which method is called (and whether I can handle it), and lastly the left and right results are added to the new node.

OK, now let's look at how we get the FieldRef and Value out of the expression tree:

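protected virtual XElement VisitMemberAccess(MemberExpression member)
{
   var expr = member.Expression;

   // expr is null for static members. Those, and members of a constant
   // (such as a captured local variable), are evaluated to get the actual value.
   if (expr == null || expr.NodeType == ExpressionType.Constant)
   {
      LambdaExpression lambda = Expression.Lambda(member);
      Delegate fn = lambda.Compile();
      return VisitConstant(Expression.Constant(fn.DynamicInvoke(null), member.Type));
   }
   else
   {
      // Anything else is treated as a property of the item, i.e. a field name.
      return new XElement("FieldRef", new XAttribute("Name", member.Member.Name));
   }
}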

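Firstly, note that when the member belongs to a constant (in SPItem => SPItem.ContentTypeId.StartsWith(contentTypeId) that is the captured contentTypeId variable), we have to wrap that member access in a lambda expression, compile it and invoke it to get the actual value.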
More on this can be found here.
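A minimal sketch of that trick in isolation: a local variable used inside a lambda shows up in the tree as a MemberAccess on a Constant (the compiler-generated closure object), and wrapping it in a parameterless lambda lets us evaluate it. ClosureDemo and its Evaluate helper are hypothetical names for this example:

```csharp
using System;
using System.Linq.Expressions;

static class ClosureDemo
{
    // Evaluates a sub-expression that does not depend on the lambda parameter.
    public static object Evaluate(Expression expression)
    {
        LambdaExpression lambda = Expression.Lambda(expression);
        return lambda.Compile().DynamicInvoke();
    }

    static void Main()
    {
        string contentTypeId = "0x0101";
        Expression<Func<string, bool>> filter = s => s.StartsWith(contentTypeId);

        var call = (MethodCallExpression)filter.Body;
        var argument = (MemberExpression)call.Arguments[0];

        Console.WriteLine(argument.NodeType);            // prints: MemberAccess
        Console.WriteLine(argument.Expression.NodeType); // prints: Constant
        Console.WriteLine(Evaluate(argument));           // prints: 0x0101
    }
}
```

Compiling a lambda for every value is not cheap, so for a parser that runs often it may be worth looking at reading the field through reflection instead. For this demo the simple approach is fine.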

A call to VisitConstant is made inside the method; it can also be called from the Visit method:

protected virtual XElement VisitConstant(ConstantExpression constant)
{
  return new XElement("Value", ParseValueType(constant.Type), constant.Value);
}
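Here is a small stand-alone sketch of what that XElement constructor call produces. ValueDemo and BuildValue are hypothetical names, and the type name is passed in as a plain string instead of going through ParseValueType:

```csharp
using System;
using System.Xml.Linq;

static class ValueDemo
{
    // Builds a CAML Value element: the attribute becomes Type="...",
    // the last argument becomes the element content.
    public static string BuildValue(string type, object value)
    {
        var element = new XElement("Value", new XAttribute("Type", type), value);
        return element.ToString(SaveOptions.DisableFormatting);
    }

    static void Main()
    {
        Console.WriteLine(BuildValue("Text", "0x0101"));
        // prints: <Value Type="Text">0x0101</Value>

        // LINQ to XML writes a DateTime in the XML schema (ISO 8601 style) format.
        Console.WriteLine(BuildValue("DateTime", new DateTime(2010, 1, 2)));
    }
}
```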

ParseValueType is just a helper method to construct the attribute and looks like this (for now at least):

protected virtual XAttribute ParseValueType(Type type)
{
   string name = "Text";

   switch (type.Name)
   {
     case "DateTime":
       name = "DateTime";
       break;
     case "String":
       name = "Text";
       break;
     default:
       throw new Exception(string.Format("Unhandled value type parser for: '{0}'", type.Name));
   }

   return new XAttribute("Type", name);
}
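If you want to try the idea without a SharePoint server at hand, the pieces above can be condensed into a console-sized sketch. This is a simplified variant and not the ExpressionToCamlParser itself: MiniCamlTranslator, the Title property and the sample values are made up, every value is written as Type="Text", and only a handful of node types are handled:

```csharp
using System;
using System.Linq.Expressions;
using System.Xml.Linq;

// Stand-in for SharePointItem; Title is a hypothetical extra field.
class SharePointItem
{
    public string ContentTypeId { get; set; }
    public string Title { get; set; }
}

static class MiniCamlTranslator
{
    public static string Translate(Expression<Func<SharePointItem, bool>> filter)
    {
        var where = new XElement("Where", Visit(filter.Body));
        return where.ToString(SaveOptions.DisableFormatting);
    }

    static XElement Visit(Expression e)
    {
        switch (e.NodeType)
        {
            case ExpressionType.AndAlso: return Binary("And", (BinaryExpression)e);
            case ExpressionType.OrElse: return Binary("Or", (BinaryExpression)e);
            case ExpressionType.Equal: return Binary("Eq", (BinaryExpression)e);
            case ExpressionType.Call: return Call((MethodCallExpression)e);
            case ExpressionType.MemberAccess: return Member((MemberExpression)e);
            case ExpressionType.Constant: return Value(((ConstantExpression)e).Value);
            default: throw new NotSupportedException("Unhandled node: " + e.NodeType);
        }
    }

    static XElement Binary(string name, BinaryExpression binary)
    {
        return new XElement(name, Visit(binary.Left), Visit(binary.Right));
    }

    static XElement Call(MethodCallExpression call)
    {
        string name;
        switch (call.Method.Name)
        {
            case "StartsWith": name = "BeginsWith"; break;
            case "Contains": name = "Contains"; break;
            default: throw new NotSupportedException("Unhandled method: " + call.Method.Name);
        }
        return new XElement(name, Visit(call.Object), Visit(call.Arguments[0]));
    }

    static XElement Member(MemberExpression member)
    {
        // A property of the lambda parameter is a field reference...
        if (member.Expression != null && member.Expression.NodeType == ExpressionType.Parameter)
            return new XElement("FieldRef", new XAttribute("Name", member.Member.Name));

        // ...anything else (e.g. a captured variable) is evaluated to a value.
        return Value(Expression.Lambda(member).Compile().DynamicInvoke());
    }

    static XElement Value(object value)
    {
        return new XElement("Value", new XAttribute("Type", "Text"), value);
    }

    static void Main()
    {
        string prefix = "0x0101";
        Console.WriteLine(Translate(
            item => item.ContentTypeId.StartsWith(prefix) && item.Title == "Dune"));
        // prints: <Where><And><BeginsWith><FieldRef Name="ContentTypeId" /><Value Type="Text">0x0101</Value></BeginsWith><Eq><FieldRef Name="Title" /><Value Type="Text">Dune</Value></Eq></And></Where>
    }
}
```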

So does it work?

The entire demo code for the Find method looks like this:

        public static IList<SharePointItem> FindAll(Expression<Func<SharePointItem, bool>> expression)
        {
            List<SharePointItem> items = new List<SharePointItem>();

            ExpressionToCamlParser parser = new ExpressionToCamlParser();
            string caml = parser.Translate(expression.Body);

            SPSiteDataQuery query = new SPSiteDataQuery
            {
                Query = caml,
                Lists = "<Lists MaxListLimit=\"0\" BaseType=\"1\" />",
                Webs = "<Webs Scope=\"SiteCollection\" />"
            };

            using (SPSite sitecollection = new SPSite("http://MOSSDEV/sites/books"))
            using (SPWeb site = sitecollection.OpenWeb())
            {
                var result = site.GetSiteData(query);

                foreach (DataRow row in result.AsEnumerable())
                {

                    int itemId = int.Parse(row.Field<string>("ID"));
                    Guid listId = new Guid(row.Field<string>("ListID"));
                    Guid siteId = new Guid(row.Field<string>("WebID"));

                    items.Add(new SharePointItem
                    {
                        ContentTypeId = Book.ContentTypeId,
                        ListId = listId,
                        SiteId = siteId,
                        ListItemId = itemId
                    });
                }
            }

            return items.AsReadOnly();
        }

And the output:

Note: I also print the actual CAML in the output; the line that does that isn't shown in the code above.

So, it does work (with this simple little sample). In the next post I will make it work on some more and tougher CAML and probably refactor my parser a bit. Remember, I said baby steps…

Stay tuned…
