marple – Flax http://www.flax.co.uk The Open Source Search Specialists Thu, 10 Oct 2019 09:03:26 +0000 en-GB hourly 1 https://wordpress.org/?v=4.9.8 London Lucene/Solr Meetup – Introducing Marple & Solr Classification http://www.flax.co.uk/blog/2017/03/27/london-lucenesolr-meetup-introducing-marple-solr-classification/ http://www.flax.co.uk/blog/2017/03/27/london-lucenesolr-meetup-introducing-marple-solr-classification/#respond Mon, 27 Mar 2017 13:16:36 +0000 http://www.flax.co.uk/?p=3454 A small crowd for this month’s London Lucene/Solr Meetup, kindly hosted by Barclays in their sumptuous Canary Wharf offices. I introduced the Meetup and spoke briefly on how Flax is currently looking for team members (want to work on a … More

The post London Lucene/Solr Meetup – Introducing Marple & Solr Classification appeared first on Flax.

A small crowd gathered for this month’s London Lucene/Solr Meetup, kindly hosted by Barclays in their sumptuous Canary Wharf offices. I opened the Meetup and spoke briefly about how Flax is currently looking for team members (want to work on a variety of cutting-edge open source search projects in the UK and abroad? Get in touch!) before handing over to Flax’s Alan Woodward, who introduced our new Lucene index inspection tool, Marple.

Alan told us how Marple was conceived at the Lucene4IR event in Glasgow last year and how coding started at our Lucene Hackday in London. Although the well-known tool Luke allows one to dive deep into Lucene indexes, it hasn’t kept up with recent additions to Lucene index structures. We also wanted to build a tool with a RESTful API and a separate GUI, so that it can be run easily on our clients’ indexes in a read-only mode. Alan demonstrated Marple’s features, including how it allows one to see the ‘hidden’ Lucene index fields that Elasticsearch creates. The first release of Marple is out and we’d welcome any feedback and contributions.

Next up was Alessandro Benedetti with an engaging talk about Solr’s built-in document classification features, useful for everything from spam filtering to automatic product categorisation. Unlike many classification methods, this uses the Lucene index itself as the training set, so the index must contain some documents with manually assigned classification fields. Either the K-Nearest-Neighbour or the Naive Bayes algorithm can be used to perform the classification via Solr’s UpdateRequestProcessor chain, in Solr versions after 6.1. You can read more detail on Alessandro’s excellent blog.
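
To make the idea of "the index as the training set" a little more concrete, here is a minimal sketch in plain Python. It is not Solr's implementation and was not part of the talk: the names `TRAINING_INDEX`, `cosine` and `knn_classify` are ours, the "index" is just a list of dictionaries, and similarity is a simple cosine over raw term counts. It only illustrates the K-Nearest-Neighbour approach, where an incoming document takes the majority category of its most similar already-labelled neighbours.

```python
import math
from collections import Counter

# Toy stand-in for an index: some documents carry a manually assigned
# "category" field, others (like id 5) do not and are ignored for training.
TRAINING_INDEX = [
    {"id": 1, "text": "win cash prize now", "category": "spam"},
    {"id": 2, "text": "cheap prize click now", "category": "spam"},
    {"id": 3, "text": "meeting agenda for monday", "category": "ham"},
    {"id": 4, "text": "project meeting notes attached", "category": "ham"},
    {"id": 5, "text": "claim your cash prize"},
]


def term_counts(text):
    """Very naive analysis: lower-case and split on whitespace."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(index, text, k=3):
    """Return the majority category among the k most similar labelled docs."""
    query = term_counts(text)
    labelled = [doc for doc in index if doc.get("category")]
    neighbours = sorted(
        labelled,
        key=lambda doc: cosine(query, term_counts(doc["text"])),
        reverse=True,
    )[:k]
    votes = Counter(doc["category"] for doc in neighbours)
    return votes.most_common(1)[0][0] if votes else None
```

With this toy data, `knn_classify(TRAINING_INDEX, "claim your cash prize now")` picks the category of the prize-related documents. In Solr the equivalent step would run inside the update chain as documents are indexed, with the real index supplying the labelled neighbours.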

We concluded with a brief Q&A session and then popped downstairs to a pub for some snacks and drinks. Thanks to both our speakers, our hosts and all who came – we’ll return in a couple of months with talks that will include René Kriegler on his neat Querqy query processor.

Release 1.0 of Marple, a Lucene index detective http://www.flax.co.uk/blog/2017/02/24/release-1-0-marple-lucene-index-detective/ http://www.flax.co.uk/blog/2017/02/24/release-1-0-marple-lucene-index-detective/#respond Fri, 24 Feb 2017 14:34:05 +0000 http://www.flax.co.uk/?p=3424 Back in October at our London Lucene Hackday Flax’s Alan Woodward started to write Marple, a new open source tool for inspecting Lucene indexes. Since then we have made nearly 240 commits to the Marple GitHub repository, and are now … More

The post Release 1.0 of Marple, a Lucene index detective appeared first on Flax.

Back in October at our London Lucene Hackday, Flax’s Alan Woodward started to write Marple, a new open source tool for inspecting Lucene indexes. Since then we have made nearly 240 commits to the Marple GitHub repository, and are now happy to announce its first release.

Marple was envisaged as an alternative to Luke, a GUI tool for introspecting Lucene indexes. Luke is a powerful tool but its Java GUI has not aged well, and development is not as active as it once was. Whereas Luke uses Java widgets, Marple achieves platform independence by using the browser as the UI platform. It has been developed as two loosely-coupled components: a Java and Dropwizard web service with a REST/JSON API, and a UI implemented in React.js. This approach should make development simpler and faster, especially as there are (arguably) many more React experts around these days than native Java UI developers, and will also allow Marple’s index inspection functionality to be easily added to other applications.

Marple is, of course, named in honour of the famous fictional detective created by Agatha Christie.

What is Marple for? We have two broad use cases in mind: the first is as an aid for solving problems with Lucene indexes. With Marple, you can quickly examine fields, terms, doc values, etc. and check whether the index is being created as you expect, and that your search signals are valid. The other main area of use we imagine is as an educational tool. We have made an effort to make the API and UI designs reflect the underlying Lucene APIs and data structures as far as is practical. I have certainly learned a lot more about Lucene from developing Marple, and we hope that other people will benefit similarly.
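
For readers new to Lucene, the kind of question an index inspector answers can be sketched with a toy inverted index in plain Python. This is not Marple's code or API, and it does not read a real Lucene index: `GUTENBERG_DOCS`, `build_postings`, `list_fields` and `top_terms` are hypothetical names, and the sample documents are invented. It simply shows the per-field "which terms exist, and in how many documents?" view that is useful when checking whether an index was built as expected.

```python
from collections import defaultdict

# Invented sample documents, each a mapping of field name to text.
GUTENBERG_DOCS = [
    {"title": "Pride and Prejudice", "author": "Jane Austen"},
    {"title": "Sense and Sensibility", "author": "Jane Austen"},
    {"title": "Great Expectations", "author": "Charles Dickens"},
]


def build_postings(docs):
    """Build field -> term -> set of document ids (a toy inverted index)."""
    postings = defaultdict(lambda: defaultdict(set))
    for doc_id, doc in enumerate(docs):
        for field, value in doc.items():
            for term in value.lower().split():
                postings[field][term].add(doc_id)
    return postings


def list_fields(postings):
    """Names of all fields that have at least one indexed term."""
    return sorted(postings)


def top_terms(postings, field, n=3):
    """The n terms with the highest document frequency in a field.

    Ties are broken alphabetically so the output is deterministic.
    """
    terms = postings.get(field, {})
    by_doc_freq = sorted(
        ((term, len(doc_ids)) for term, doc_ids in terms.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )
    return by_doc_freq[:n]
```

Calling `top_terms(build_postings(GUTENBERG_DOCS), "title", 2)` lists "and" first, because it occurs in two titles. Seeing an unexpected term (or a missing one) in such a listing is often the quickest way to spot an analysis or indexing problem.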

The current release of Marple is not complete. It omits points entirely, and has only a simple UI for viewing documents (stored fields). However, there is a reasonably complete handling of terms and doc values. We’ll continue to develop Marple but of course any contributions are welcome.

You can download this first release of Marple here, together with a small Lucene index of Project Gutenberg to inspect. Details of how to run Marple (you’ll need Java) are available in the README. Do let us know what you think: bug reports or feature requests can be submitted via GitHub. We’ll also be demonstrating Marple in London on March 23rd 2017 at the next London Lucene/Solr Meetup.

A tale of two cities (and two Lucene Hackdays) http://www.flax.co.uk/blog/2016/10/21/tale-two-cities-two-lucene-hackdays/ http://www.flax.co.uk/blog/2016/10/21/tale-two-cities-two-lucene-hackdays/#respond Fri, 21 Oct 2016 10:27:00 +0000 http://www.flax.co.uk/?p=3365 To mark Flax’s 15th anniversary we ran two Lucene Hackdays recently, in London and Boston. I even made some Flax cakes! The London event was attended by around 20 people from companies both large and small and kindly hosted by … More

The post A tale of two cities (and two Lucene Hackdays) appeared first on Flax.

To mark Flax’s 15th anniversary we ran two Lucene Hackdays recently, in London and Boston. I even made some Flax cakes! The London event was attended by around 20 people from companies both large and small and kindly hosted by Bloomberg (who are currently very active in the Lucene/Solr community). We split up into a number of groups to work on a range of projects. Erica Sundberg from Blackrock took a group of beginners through installing Solr and indexing their first collection, while also considering how a minimal Solr example could be built (some of the shipped examples being rather complex). Another team, led by Christine Poerschke of Bloomberg, looked at a way to avoid slightly different statistics being returned from different Solr replicas (which can cause result ordering to appear to ‘jump’), and Diego Ceccarelli looked at adding BM25F ranking to Lucene. Other groups looked at SQL streaming with Solr (committer Joel Bernstein dialled in via Skype to help), and Flax’s Alan Woodward worked on Marple, a browser-based explorer for Lucene indexes. The day finished with a curry dinner kindly sponsored by Alfresco.
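
The replica statistics problem is easier to see with numbers. The sketch below is ours, not the Hackday team's code: the statistics are invented, `MATCHING_DOCS`, `REPLICA_A` and `REPLICA_B` are hypothetical names, and the score is deliberately simplified to term frequency multiplied by the BM25 inverse document frequency (no length normalisation or term-frequency saturation). We assume the two replicas hold the same live documents but report different document counts and frequencies, for example because deleted documents have not yet been merged away on one of them.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """BM25 inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def rank(docs, stats, query_terms):
    """Order document ids by a simplified score: sum of tf * idf per term."""
    def score(doc):
        return sum(
            doc["tf"].get(term, 0)
            * bm25_idf(stats["doc_freq"][term], stats["doc_count"])
            for term in query_terms
        )
    return [doc["id"] for doc in sorted(docs, key=score, reverse=True)]


# Two documents that match different query terms equally often.
MATCHING_DOCS = [
    {"id": "X", "tf": {"solr": 2}},
    {"id": "Y", "tf": {"lucene": 2}},
]

# Invented statistics: replica B still counts four documents containing
# "solr" that replica A no longer counts.
REPLICA_A = {"doc_count": 100, "doc_freq": {"solr": 10, "lucene": 12}}
REPLICA_B = {"doc_count": 104, "doc_freq": {"solr": 14, "lucene": 12}}

order_a = rank(MATCHING_DOCS, REPLICA_A, ["solr", "lucene"])
order_b = rank(MATCHING_DOCS, REPLICA_B, ["solr", "lucene"])
```

On replica A "solr" is the rarer term, so document X ranks first; on replica B "lucene" is rarer, so Y ranks first. The same query can therefore return a different order depending on which replica answers it, which is the ‘jump’ described above.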

Several days later we ran a similar Hackday in Boston, as many Lucene people were in town for Lucene Revolution. Many more Lucene/Solr committers attended this time and enjoyed a chance to work on their own projects or to continue some of the work we’d started in London. Doug Turnbull came up with a way to do BM25F ranking with existing Lucene features, while Alexandre Rafalovitch and I had a long conversation about minimal Solr examples and improving the way beginners can start with Solr. Other projects included new field types for Lucene, improved highlighters and DocValues. BA Insight were kind enough to provide the venue and Lucidworks sponsored drinks and snacks later in the pub downstairs.

We’ve gathered notes on what we worked on with links to some of the software we developed here – please do get involved if you can! In particular the Marple project is attracting further contributions (and interest from those who developed and maintain the existing Luke Lucene index inspector).

I’d like to thank everyone who came to the Hackdays, our generous sponsors for providing venues, food and drink, and those who helped organise the events. The feedback has been excellent (and do let us know if you have any further comments) and people seem keen for this to be a regular event before the annual Lucene Revolution conference: a chance to work on Lucene-based projects outside of regular work, to meet, network and spend time with other contributors, and to enjoy being part of a great open source community. We’ll be back!
