Lucene/Solr committer, Mahout co-creator, LucidWorks co-founder and general all-round search expert Grant Ingersoll visited us last week on his way to the SIGIR conference in Dublin. We visited the European Bioinformatics Institute on the Wellcome Trust Genome Campus to hear about some fascinating projects using Lucene/Solr to index genomes, phenomes and proteins, and for Grant to give a talk on recent developments in both Lucene/Solr and Mahout. It was gratifying that over 50 people turned up to listen and at least 30 of these indicated they were using the technology.
After a brief rest it was then time to travel to London so Grant could talk at the Enterprise Search London Meetup on both recent developments in Lucene/Solr and what he dubbed ‘Search engine (ab)use’ – some crazy use cases of Lucene/Solr, including very fast key/value storage. He shared some great statistics, including how Twitter make new tweets searchable in around 50 microseconds using only 8-10 indexing servers.
Next it was back to Cambridge for our own Lucene/Solr hack day in a great new co-working space. Attendees ranged from those who had never used Lucene/Solr to those with significant search expertise, and some had come from as far away as Germany. After a brief introduction we split into several groups, each mentored by a member of the Flax team. Two groups (one made up entirely of those who had never used Lucene) worked on a dataset of tweets from UK members of parliament, and a healthy sense of competition developed between them. You can see some of the code they developed in our GitHub account, including an entity extractor webservice. Another group, led by Grant, created a SolrCloud cluster with around 1-2 million documents split into 2 shards, running on ten laptops over a wireless connection! Impressively, this was set up in less than ten minutes. Others worked on their own applications, including an index of proteins, and there was even some work on the Lucene/Solr code itself.
We’re hoping to put the results of some of these projects live very soon, so you can see just what can be built in a single day using this powerful open source software. Thanks to all who came, our hosts at Cambridge Business Lounge and of course Grant for his considerable energy and invaluable expertise. If nothing else, we’ve introduced a lot more people to open source search and sparked some ideas, and we rounded off the week with beer in a sunny pub garden, which is always nice!