Wednesday, November 2, 2011

Best open-source .NET text file differencing library

After a fairly thorough search for open-source .NET libraries for text file (e.g. source code) differencing, I've concluded that there are only two serious contenders: DiffPlex and google-diff-match-patch.
Both look like they would meet my needs.

DiffPlex is C# only, whereas google-diff-match-patch contains equivalent implementations in Java, JavaScript, C++, Objective-C, and more. So if you like the idea of learning an API once and then using it directly in, say, iOS projects or web browsers (JavaScript), google-diff-match-patch is for you.

The DiffPlex API seems a little nicer if all you want is a simple diff.

As its name suggests, google-diff-match-patch also supports producing and processing patch files, so if you need those extra features, google-diff-match-patch wins again.

DiffPlex contains what appears to be a very nice and simple API for driving diff viewers. google-diff-match-patch offers something similar, even if its API is not as nice.

Both support a line-by-line mode.
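To make "line-by-line mode" concrete, here is a rough, language-neutral sketch in Python. It uses the standard library's difflib rather than either of the two libraries, so it only shows the idea (compare whole lines and report each one as equal, deleted, or inserted), not the actual API of DiffPlex or google-diff-match-patch. The line_diff helper and the sample text are made up for illustration.

```python
import difflib


def line_diff(old_text, new_text):
    """Return (op, line) pairs, where op is 'equal', 'delete', or 'insert'.

    Hypothetical helper for illustration only; it is not part of DiffPlex
    or google-diff-match-patch.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # Compare lists of whole lines instead of individual characters.
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend(("equal", line) for line in old_lines[i1:i2])
        else:
            # 'replace', 'delete', and 'insert' opcodes are flattened into
            # deletions from the old text followed by insertions from the new.
            result.extend(("delete", line) for line in old_lines[i1:i2])
            result.extend(("insert", line) for line in new_lines[j1:j2])
    return result


old = "int a = 1;\nint b = 2;\nreturn a + b;\n"
new = "int a = 1;\nint b = 3;\nreturn a + b;\n"

for op, line in line_diff(old, new):
    print(op, line)
```

A changed line shows up as a delete followed by an insert, which is roughly the information a side-by-side diff viewer needs.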

google-diff-match-patch has a nice feature where it can simplify diffs down from "perfect" diffs to more semantically meaningful diffs. It calls this a "cleanup" operation, and depending on your needs, that could be a deciding feature. My immediate needs are so simple that even cleanup isn't relevant, but if it's relevant for you, google-diff-match-patch might be the way to go (unless DiffPlex has a similar feature that I missed, but I'm pretty sure it doesn't).

In short, it seems both are suitable. DiffPlex has a nicer API for the world of .NET (e.g. C#-style naming conventions used throughout), whilst google-diff-match-patch has more features. For my needs (an open-source, native .NET differencing library), both libraries look very suitable, and DiffPlex looks a little easier to learn and use (not that either is hard). But I think in the end I'm going to start with google-diff-match-patch, on account of the multiple platforms it supports with a uniform API, and the cleanup facility, which whilst not relevant immediately is perfect for something I'm planning to do in the future...

If you know of any other serious contenders, let me know, but I'm only interested in native .NET open-source libraries that can be downloaded and used without modification (so that excludes repurposing code from open-source diff viewers). I did review a few options on Code Project, but nothing there convinced me that their performance would be any better than that of the two projects I shortlisted, and I expect the two shortlisted projects to have much better ongoing support.

Posted largely for my own future reference, but also to help other wandering developers. :o)