linq2couch (look, ma, no puns!)
23 Sep 2009 11:39 AM
Both my work and side projects lately have taken me in for a closer look at CouchDB (http://couchdb.apache.org/). I'll assume that if you're reading this, you've run into it in the past, and will forgo an explanation of what couchdb is.
One of the ndjango contributors had mentioned in passing that it would be nice to have a .net library that would wrap couchdb in an IQueryable, which got me thinking - what would a LINQ implementation for couch look like? Obviously, a full-on "orm" implementation would be impractical - the whole point of couch is that querying is done using a limited set of operations on predefined views. The couchdb wiki even goes so far as to say that development of views in couchdb should be treated like schema in a conventional database: do it at the start of the project, don't change it later. Changing a view once the database already has several million documents in it becomes an expensive endeavor. That said, there is still merit in a simplified, type-checked querying approach.
I started off using a great open source .net couch wrapper - Divan, and forked it (you can find mine here, but Göran has already pulled my changes into the official repository), and then used the LINQ provider code from LINQ in Action as a starting template.
Divan already has a full implementation of the couch view api, so this became a simple matter of mapping a LINQ expression tree onto that api. Using the ExpressionVisitor concept from LINQ in Action, I ended up decomposing the expression tree like this:
    /// Recursively processes the expression tree
    protected virtual void VisitExpression(Expression expression)
    {
        switch (expression.NodeType)
        {
            case ExpressionType.AndAlso:
                hasAnd = true;
                VisitBinary((BinaryExpression)expression);
                break;
            case ExpressionType.OrElse:
                hasOr = true;
                VisitBinary((BinaryExpression)expression);
                break;
            case ExpressionType.GreaterThan:
            case ExpressionType.LessThan:
                // startKey/endKey are inclusive, so strict comparisons have no direct equivalent
                throw new NotSupportedException(expression.NodeType + " is not a supported expression");
            case ExpressionType.Equal:
                if (hasAnd)
                    throw new NotSupportedException("'and' operations cannot be performed on key sets. All key sets are 'or' operations.");
                if (startKeySet || endKeySet)
                    throw new NotSupportedException("Key and range operations cannot be combined in a single query");
                CallIfPresent((BinaryExpression)expression, (val) => keys.Add(val));
                break;
            case ExpressionType.LessThanOrEqual:
                if (endKeySet || hasOr)
                    throw new NotSupportedException("Range queries over multiple ranges are not supported");
                if (keys.Count > 0)
                    throw new NotSupportedException("Key and range operations cannot be combined in a single query");
                CallIfPresent((BinaryExpression)expression, (val) => Query.EndKey(val));
                endKeySet = true;
                break;
            case ExpressionType.GreaterThanOrEqual:
                if (startKeySet || hasOr)
                    throw new NotSupportedException("Range queries over multiple ranges are not supported");
                if (keys.Count > 0)
                    throw new NotSupportedException("Key and range operations cannot be combined in a single query");
                CallIfPresent((BinaryExpression)expression, (val) => Query.StartKey(val));
                startKeySet = true;
                break;
            case ExpressionType.Lambda:
                VisitExpression(((LambdaExpression)expression).Body);
                break;
            default:
                if (expression is MethodCallExpression)
                    VisitMethodCall((MethodCallExpression)expression);
                break;
        }
    }
A few things to point out here:
- Range operations get translated to key ranges in couchdb parlance. That means the expression "where a.b >= 5" will be translated to a "startKey" parameter. This also means that a straight GreaterThan or LessThan won't be supported, because the startKey/endKey parameters in couch are inclusive. Also, notice the check for whether an equality operation has been recorded. If it has, we won't allow a range operation, as you can't combine a set of keys with a key range.
- Equality operations get grouped together, and then translated to either a GET with a key="" parameter or a POST with a list of keys. Again, note that you can't mix equality operators and range operators.
- Boolean operators are also restricted by first usage. You can use ORs to build a key list, and you can use ANDs to build a range (a >= b and a <= c). ANDs and ORs can't be mixed, just like equality and ranges.
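The three rules above can be condensed into a small standalone sketch. This is not the linq2couch or Divan code: `ViewParams`, `Car` and `WhereTranslator` are hypothetical names invented for the illustration, and the sketch assumes the value being compared always sits on the right-hand side of each comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for the view parameters; not Divan's actual API.
public sealed class ViewParams
{
    public List<object> Keys { get; } = new List<object>();
    public object StartKey { get; set; }
    public object EndKey { get; set; }
}

// Hypothetical document type, used only by the example.
public sealed class Car
{
    public string Make { get; set; }
    public int HorsePowers { get; set; }
}

public sealed class WhereTranslator
{
    private readonly ViewParams result = new ViewParams();
    private bool hasAnd, hasOr;

    // Translates a 'where' predicate into view parameters, or throws
    // NotSupportedException when the predicate breaks one of the rules.
    public static ViewParams Translate<T>(Expression<Func<T, bool>> predicate)
    {
        var translator = new WhereTranslator();
        translator.Visit(predicate.Body);
        return translator.result;
    }

    private void Visit(Expression expression)
    {
        var binary = expression as BinaryExpression;
        if (binary == null)
            throw new NotSupportedException(expression.NodeType + " is not a supported expression");

        switch (binary.NodeType)
        {
            case ExpressionType.AndAlso:
                hasAnd = true;
                Visit(binary.Left);
                Visit(binary.Right);
                break;
            case ExpressionType.OrElse:
                hasOr = true;
                Visit(binary.Left);
                Visit(binary.Right);
                break;
            case ExpressionType.Equal:
                // Equalities build a key set: only ORs, never mixed with a range.
                if (hasAnd || result.StartKey != null || result.EndKey != null)
                    throw new NotSupportedException("Keys cannot be combined with 'and' or with a range");
                result.Keys.Add(Evaluate(binary.Right));
                break;
            case ExpressionType.GreaterThanOrEqual:
                // Inclusive lower bound: only ANDs, never mixed with keys.
                if (hasOr || result.Keys.Count > 0 || result.StartKey != null)
                    throw new NotSupportedException("Only a single range is supported, without keys");
                result.StartKey = Evaluate(binary.Right);
                break;
            case ExpressionType.LessThanOrEqual:
                // Inclusive upper bound, same restrictions as above.
                if (hasOr || result.Keys.Count > 0 || result.EndKey != null)
                    throw new NotSupportedException("Only a single range is supported, without keys");
                result.EndKey = Evaluate(binary.Right);
                break;
            default:
                // Includes GreaterThan and LessThan, which have no inclusive equivalent.
                throw new NotSupportedException(binary.NodeType + " is not a supported expression");
        }
    }

    // Evaluates the value side of a comparison (a constant or a captured variable).
    private static object Evaluate(Expression valueSide)
    {
        return Expression.Lambda(valueSide).Compile().DynamicInvoke();
    }
}
```

With these definitions, `WhereTranslator.Translate<Car>(c => c.HorsePowers >= 100 && c.HorsePowers <= 200)` fills in `StartKey` and `EndKey`, `c => c.HorsePowers == 176 || c.HorsePowers == 177` fills in `Keys`, and mixing the two forms throws `NotSupportedException`.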
So what we end up with is the ability to specify a couchdb view api call using LINQ syntax. This certainly won't allow you to make arbitrary calls or generate views, but it does make your c# apps that interact with couchdb a bit more homogeneous:
Before
    var hps = new int[] {176, 177};
    var twoMoreCars = query
        .Keys(hps)
        .GetResult()
        .ValueDocuments<Car>();
    foreach (var c in twoMoreCars)
        Console.WriteLine(c.Make + " " + c.Model + " with " + c.HorsePowers + "HPs");
After
    var hps = new int[] {176, 177};
    var twoMoreCars = from c in linqCars
                      where hps.Contains(c.HorsePowers)
                      select c.Make + " " + c.Model + " with " + c.HorsePowers + "HPs";
    foreach (var twoCar in twoMoreCars)
        Console.WriteLine(twoCar);
A small digression
While putting this together, I ran into an interesting question. How do you apply the select clause to the result set? The simple answer is to build a facade enumerator for the IQueryable to return. The problem is what that enumerator should actually return: we need to somehow go from the expression tree within the select clause to a value that can be handed back. What I did in linq2couch was to extract the Select expression as part of my parsing:
    /// Processes a "MethodCall" node
    private void VisitMethodCall(MethodCallExpression expression)
    {
        ...
        else if ((expression.Method.DeclaringType == typeof(Queryable)) &&
                 (expression.Method.Name == "Select"))
        {
            SelectExpression = expression;
            VisitExpression(expression.Arguments[0]);
        }
        ...
    }
Then, generate a delegate out of it:
    public TransformingEnumerator(IEnumerator e, MethodCallExpression transformer)
    {
        this.e = e;
        var t = (UnaryExpression)transformer.Arguments[1];
        this.transformer = ((LambdaExpression)t.Operand).Compile();
    }
And then apply it as objects are pulled out:
    public TReturn Current
    {
        get { return (TReturn)transformer.DynamicInvoke(e.Current); }
    }
The trick is that the Func that the select wraps is Func<T1, T2>, where the inbound parameter is of whatever type the queryable is made to be, and the return value is whatever the expression evaluates to, so the whole thing lines up quite nicely.
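To see those pieces working together outside of linq2couch, here is a minimal self-contained sketch. `SelectSketch`, `ExtractSelector` and `Project` are names made up for this illustration, and an in-memory `AsQueryable()` source stands in for the couch-backed queryable. It relies on `Queryable.Select` wrapping the lambda in a quote node, which is why the second argument is cast to `UnaryExpression` first.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class SelectSketch
{
    // Pulls the lambda out of an outermost Queryable.Select call and compiles it.
    public static Delegate ExtractSelector(IQueryable query)
    {
        var call = query.Expression as MethodCallExpression;
        if (call == null ||
            call.Method.DeclaringType != typeof(Queryable) ||
            call.Method.Name != "Select")
            throw new ArgumentException("The outermost call is not Queryable.Select");

        // Arguments[0] is the source; Arguments[1] is the quoted selector lambda.
        var quote = (UnaryExpression)call.Arguments[1];
        return ((LambdaExpression)quote.Operand).Compile();
    }

    // Applies the compiled selector lazily, as items are pulled from the source.
    public static IEnumerable<TReturn> Project<TReturn>(IEnumerable source, Delegate selector)
    {
        foreach (var item in source)
            yield return (TReturn)selector.DynamicInvoke(item);
    }
}
```

For example, given `var q = new[] { 1, 2, 3 }.AsQueryable().Select(x => "n" + x);`, passing `SelectSketch.ExtractSelector(q)` to `SelectSketch.Project<string>(new object[] { 1, 2, 3 }, ...)` yields "n1", "n2" and "n3". `DynamicInvoke` is slower than a typed delegate call, but it avoids having to know T1 at compile time.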
Thanks for the link to Divan… nice example for LINQ use as well. We are looking at CouchDb as another backing store for our hierarchical data framework project. One of my devs is a big Ruby fan and has had good experiences with CouchDb and is looking forward to folding that experience into our .NET world.
again - glad you've found it useful. divan is an excellent library with a growing number of contributors. we've built out http://friendsell.com using it and have had nothing but good luck (so far *crosses fingers*)