Search Solutions is one of my favourite search events of the year - small, focused and varied, with presentations from both the largest and smallest players in the world of search, drawn from both industry and academia. This year's event started with Edgar Meij of Bloomberg, who Flax have helped in the past with their larg...
Tag Archives: elasticsearch
A search-based suggester for Elasticsearch with security filters
Both Solr and Elasticsearch include suggester components, which can be used to provide search engine users with suggested completions of queries as they type: Query autocomplete has become an expected part of the search experience. Its benefits to the user include les...
Elastic acquires Swiftype and broadens its offering to include enterprise search
The news today that Elastic (the company behind the open source Elasticsearch software) has acquired Swiftype will have surprised a few people, even though Elastic has already acquired a good number of other companies. Swiftype have a couple of products that deliver cloud-based site and enterprise search and, under the hood, all of this is built on Elasticsearch. Swiftype are part of a new ...
Elastic London Meetup: Rightmove & Signal Media and a new free security plugin for Elasticsearch
I finally made it to a London Elastic Meetup again after missing a few of the recent events: this time Rightmove were the hosts and the first speakers. They described how they had used Elasticsearch Percolator to run 3.5 million stored searches on new property listings as part of an overall migration from the Exalead search engine and Oracle database to a new stack bas...
Better performance with the Logstash DNS filter
We've been working on a project for a customer which uses Logstash to read messages from Kafka and write them to Elasticsearch. It also parses the messages into fields and, depending on the content type, does DNS lookups (both forward and reverse). While performance testing I noticed that adding caching to the Logstash DNS filter actually reduced performance, contrary to expectations. With four filter worker threads, and the following configuration:
dns { resolve => [ ...
Elasticsearch, Kibana and duplicate keys in JSON
JSON has been the lingua franca of data exchange for many years. It's human-readable, lightweight and widely supported. However, the JSON spec does not define what parsers should do when they encounter a duplicate key in an object, e.g.:
{ "foo": "spam", "foo": "eggs", ... }
Implementations are free to interpret this how they like. When different systems have different interpretations this can cause problems. We recently encounter...
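To make the ambiguity concrete, here is a minimal Python sketch (not taken from the original post). It relies on the standard library's `json` module, which by default keeps only the last value for a repeated key; the helper name `reject_duplicate_keys` is made up for illustration. Other parsers may behave differently, which is exactly the problem described above.

```python
import json


def reject_duplicate_keys(pairs):
    """Hypothetical object_pairs_hook: raise if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key in JSON object: {key!r}")
        seen[key] = value
    return seen


doc = '{ "foo": "spam", "foo": "eggs" }'

# Default behaviour of Python's json module: the last value silently wins.
last_wins = json.loads(doc)

# Stricter behaviour: surface the duplicate instead of picking a winner.
try:
    json.loads(doc, object_pairs_hook=reject_duplicate_keys)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```

Running a check like this before indexing is one way to notice such documents early, rather than discovering that two systems disagree about which value "foo" holds.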
London Lucene/Solr Meetup – Introducing Marple & Solr Classification
A small crowd for this month's London Lucene/Solr Meetup, kindly hosted by Barclays in their sumptuous Canary Wharf offices. I introduced the Meetup and spoke briefly on how Flax is currently looking for team members (want to work on a variety of cutting-edge open source search projects in the UK and abroad? Get in touch!) before introducing Flax's Alan Woodwar...
Making sense of Big Data with open source search
Not one, but three Lucene hackdays coming soon!
We're always keen to get more people involved in the Lucene search community - there's always lots to do, from deep hacking of the core code, to testing with different frameworks and clients, to creating documentation and examples. It's also just over fifteen years since Tom Mortimer and I founded Flax and we thought we should mark this birthday with some kind of event! So I'm very happy to announce we'll be involved in three Lucene hackday events over the next two months: Firstly, ...
Boosts Considered Harmful – adventures with badly configured search
During a recent client visit we encountered a common problem in search - over-application of 'boosts', which can be used to weight the influence of matches in one particular field. For example, you might sensibly use this to make results that match a query on their title field come higher in search results. However, in this case we saw huge boost values used (numbers in the hundreds) which were probably swamping everything else - and it wasn't at all clear where the values had come from, be it ex...
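As a rough illustration of why very large boosts can swamp everything else, the toy Python sketch below combines per-field scores as a simple boosted sum. This is an assumption made for illustration only, not the actual Lucene, Solr or Elasticsearch scoring formula, and the field scores and boost values are invented.

```python
def combined_score(field_scores, boosts):
    """Toy scoring: sum of per-field scores, each multiplied by its boost (default 1.0)."""
    return sum(boosts.get(field, 1.0) * score for field, score in field_scores.items())


# Invented per-field relevance scores for two documents.
doc_a = {"title": 0.2, "body": 0.9}  # weak title match, strong body match
doc_b = {"title": 0.3, "body": 0.1}  # slightly better title match, weak body match

modest_boost = {"title": 2.0}
huge_boost = {"title": 300.0}

# With a modest boost the body match still counts, so doc_a ranks first.
modest_ranking = sorted(
    ["doc_a", "doc_b"],
    key=lambda name: combined_score({"doc_a": doc_a, "doc_b": doc_b}[name], modest_boost),
    reverse=True,
)

# With a boost in the hundreds the title score alone decides the order.
huge_ranking = sorted(
    ["doc_a", "doc_b"],
    key=lambda name: combined_score({"doc_a": doc_a, "doc_b": doc_b}[name], huge_boost),
    reverse=True,
)
```

In this sketch `modest_ranking` puts `doc_a` first while `huge_ranking` puts `doc_b` first, even though the two title scores differ only slightly: once the boost is large enough, the other fields effectively stop contributing to the order.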