The best-known open source search engine, Apache Lucene/Solr, has a rival in Elasticsearch, which is also based on Apache Lucene. Or maybe it doesn’t. I’m not convinced that there’s an actual battle going on here, beyond the fact that the commercial companies formed to support each technology (Lucidworks and Elasticsearch [the company]) are obviously competitors. Let’s look at the evidence:
- Elasticsearch contains (by some measures) 64 years of effort, Solr only 55 years… a point to Elasticsearch!
- Elasticsearch commits are 31% down on last year, Solr commits are 85% up… a point to Solr!
- There are more books about Solr than Elasticsearch… a point to Solr!
- Elasticsearch, sorry elasticsearch, has a cool lower-case logo and a fancy website… a point to Elasticsearch!
This is, of course, before we get to any actual technical differences in performance, scalability, ease of use and so on, which are probably a lot more important than the list above. There are vocal critics and supporters of each project on Twitter and other media, but the great thing in our view is that there is a choice of two such excellent search technologies, both open source. For real-world applications one can try both at little cost and choose whichever is most appropriate (there are even proven migration routes between the two; we’ve helped one client with this process).
Charlie, I’m curious about the ‘years’ statistic. Can you elaborate?
Apparently Ohloh uses the COCOMO model which (according to Wikipedia) “computes software development effort (and cost) as a function of program size. Program size is expressed in estimated thousands of source lines of code”. So a) it’s a little surprising that Elasticsearch has more lines of code than Solr, and b) bigger certainly isn’t better when it comes to software, as anyone who has dealt with an old, bloated codebase will tell you.
My point was that Ohloh’s statistics, although interesting, probably aren’t that useful in determining which is the “better” open source project in this instance. However, certain people have been linking to these pages and implying that they are, which is rather silly.
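For anyone wondering how a ‘years of effort’ figure can fall out of a line count, here is a rough Python sketch of the Basic COCOMO formula. The coefficients 2.4 and 1.05 are the textbook “organic mode” values and are an assumption here, since the exact parameters Ohloh applies aren’t stated above; the function names are made up for illustration.

```python
# Basic COCOMO, organic mode: effort (person-months) = a * KLOC ** b.
# a=2.4 and b=1.05 are the textbook organic-mode coefficients (assumed here;
# the parameters Ohloh actually uses are not given in the post).

def effort_person_years(kloc, a=2.4, b=1.05):
    """Estimated effort in person-years for `kloc` thousand lines of code."""
    person_months = a * kloc ** b
    return person_months / 12.0


def kloc_for_effort(person_years, a=2.4, b=1.05):
    """Invert the formula: the KLOC that would yield the given effort."""
    return (person_years * 12.0 / a) ** (1.0 / b)


if __name__ == "__main__":
    # Work backwards from the two headline figures quoted in the post.
    for years in (64, 55):
        kloc = kloc_for_effort(years)
        print(f"{years} person-years ~ {kloc:.0f} KLOC under these assumptions")
```

Line count is the only input, so anything that adds lines (bundled libraries, copied code) pushes the ‘years’ figure up without saying anything about quality.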
If you count code cut and pasted from Solr’s analyzers, and the inclusion of libraries like Joda-Time, xtream and Guava, as LOC, then sure, ES has way more LOC than Solr, hence more years of dev…
Just my two cents.