relevancy – Flax http://www.flax.co.uk The Open Source Search Specialists Thu, 10 Oct 2019 09:03:26 +0000 en-GB hourly 1 https://wordpress.org/?v=4.9.8 Defining relevance engineering part 2: learning the business http://www.flax.co.uk/blog/2018/06/26/defining-relevance-engineering-part-2-learning-the-business/ http://www.flax.co.uk/blog/2018/06/26/defining-relevance-engineering-part-2-learning-the-business/#respond Tue, 26 Jun 2018 11:16:57 +0000 http://www.flax.co.uk/?p=3845 Relevance Engineering is a relatively new concept but companies such as Flax and our partners Open Source Connections have been carrying out relevance engineering for many years. So what is a relevance engineer and what do they do? In this … More

The post Defining relevance engineering part 2: learning the business appeared first on Flax.

Relevance Engineering is a relatively new concept but companies such as Flax and our partners Open Source Connections have been carrying out relevance engineering for many years. So what is a relevance engineer and what do they do?

In this series of blog posts I’ll try to explain what I see as a new, emerging and important profession.

Before a relevance engineer can install or configure a search engine they need to understand the business concerned. I’ve called this ‘learning the business’ and it’s something that Flax has to do on a weekly basis. One week we may be talking to a recruitment business that thinks and operates in terms of jobs, skills, candidates and roles; the next week it could be a company that sells specialised products and is more concerned with features, prices, availability, stock levels and pack sizes. Even within a single sector, each business will work in a slightly different way, although there will be some common factors.

Example data is key to learning how a business works, but is next to useless without someone to explain it in context. In some cases the business has lost some of the internal knowledge about how its own systems work: “Jeff built that database, but he left two years ago.” What seems obvious to them may not be obvious to anyone else. Generic terms such as “products”, “location” or “keywords” can mean completely different things in each business context. If they exist, corporate glossaries, dictionaries or taxonomies are very useful, but again they may need annotating to explain what each entry means. If a glossary doesn’t exist, starting one is a good first step.

Finding the right people to talk to is also vital. Although relevance engineers are usually engaged or recruited by the IT department, this may not be the best place to learn about the business. The marketing department may have the best view of how the business interacts with its clients; the CEO or Managing Director will know the overall direction and objectives but may not have time for the detail; the content creators (which could be librarians, web editors or product information managers) will know about the items the search engine will need to find.

In many companies there are hierarchies and structures that sometimes actively prevent the sharing of information: it’s common to discover who blames whom for past bad decisions and to be used as a sounding board by those with axes to grind. At Flax we try to make sure we talk to people at all levels in the client organisation: sometimes the most junior employees – and especially those who are customer-facing – have the most useful information as they have to deal with problems on a day-to-day basis. As external consultants one of our most useful skills is being able to listen without making snap judgements or assumptions.

The end result of these many conversations is an understanding of where source data is created, gathered and stored; what a ‘search result’ is in the context of a particular business (a product on sale? A contract? A CV or resumé?) and how it might be constructed from this data; what a ‘relevant’ result is in this context (a more valuable product to sell? The most recent contract version? The best candidate for a job?) and how good/bad/nonexistent the current search solution is. This is vital information to be gathered before one even begins thinking about how to install, develop and/or configure and test a search solution.

In the next post I’ll cover how a relevance engineer might assess the technical capability of a business with respect to search. In the meantime you can read the free Search Insights 2018 report by the Search Network. Of course, feel free to contact us if you need help with relevance engineering.

Haystack, the relevance conference – birth of a new profession? http://www.flax.co.uk/blog/2018/04/16/birth-new-profession-haystack-relevance-conference/ http://www.flax.co.uk/blog/2018/04/16/birth-new-profession-haystack-relevance-conference/#respond Mon, 16 Apr 2018 15:34:13 +0000 http://www.flax.co.uk/?p=3773 I’ve just returned from Charlottesville, Virginia and the Haystack search relevance conference hosted by our partners Open Source Connections. The venues were their own office and the Random Row brewery next door – added once they realised that the event … More

The post Haystack, the relevance conference – birth of a new profession? appeared first on Flax.

I’ve just returned from Charlottesville, Virginia and the Haystack search relevance conference hosted by our partners Open Source Connections. The venues were their own office and the Random Row brewery next door – added once they realised that the event had grown from its humble beginnings as a small, informal event for maybe 50 people into a professional conference for over twice that number, with attendees from as far afield as the west coast of the US, Poland and of course the UK. I’ll be writing up each day of the event and what I learned from the talks in blogs to follow, but wanted to start with my overall impressions.

I don’t think I’ve been to any other conference with such a strong sense of community or such a high quality of presentations. It was particularly refreshing to be among a group of people with such a level of search expertise and experience that at no point did anything have to be ‘dumbed down’ or over-explained. The attendee list included open source committers from projects including Apache Lucene/Solr and Apache Tika, experts in commercial search, authors of books I’ve long regarded as essential for anyone working in this field, independent consultants and those working for huge global companies. The talks were well programmed, ran exactly to schedule and covered cutting-edge topics. Between these talks the networking was relaxed and friendly and I had a chance to get to know several people in real life that I’ve previously only connected with online.

I think this conference may also have signalled the birth of a new profession of “relevance engineer” – someone who can understand both the business and technical aspects of search relevance, work with a variety of underlying search engines and expertly use the correct tools for the job to drive a continuing process of search quality improvement. Personally, I learnt a huge amount of useful information, made connections with many others in our field and have pages of notes to follow up on.

Last but by no means least, I’d like to extend my personal thanks to all at OSC who created, planned and ran the event – as a veteran of many events in both technical and non-technical fields I understand very well how much work goes into them, especially if you’re not an event planner by profession! You opened your doors to us and made us all feel very welcome and you all worked extremely hard to make this one of the best conferences I’ve ever attended.

More to follow on day 1 and day 2 soon.

Setting up your first Quepid test case http://www.flax.co.uk/blog/2016/07/08/setting-first-quepid-test-case/ http://www.flax.co.uk/blog/2016/07/08/setting-first-quepid-test-case/#respond Fri, 08 Jul 2016 11:10:20 +0000 http://www.flax.co.uk/?p=3316 Quepid is an innovative tool from our partners Open Source Connections, which allows you to bridge the gap between content owners (who really know what’s in your search index and how people might search for it) and search developers (who … More

The post Setting up your first Quepid test case appeared first on Flax.

Quepid is an innovative tool from our partners Open Source Connections, which allows you to bridge the gap between content owners (who really know what’s in your search index and how people might search for it) and search developers (who can tweak the search engine to improve relevance, given some examples of ‘good’ and ‘bad’ results for a query). We’re increasingly using it in client projects – but how do you get started with creating test cases in Quepid? The various Quepid videos at http://quepid.com/support/ are the best place to get a sense of how Quepid works, so watching them is probably a good first step.

Now, let’s assume you have Quepid running in your browser – there’s a 30 day free trial which lets you create a single test case, which is a great way to try it out. A Case is used to illustrate a particular problem with search relevancy – say, how searching for ‘iPhone’ shows iPhone cases higher up the list than actual iPhones. Each Case contains a number of Queries. Note in this example we’re using Solr, but Quepid also works with Elasticsearch.

1. Hooking Quepid up to your search engine.

You’re going to need the help of your search developer for this one! They’ll need to tell you the URL of your Solr or Elasticsearch engine – and this will need to be accessible from the PC you’re running Quepid on. Since Quepid runs in the browser (although it stores its data in the Cloud) you shouldn’t have any trouble setting up secure access to your search engine – after all, your own PC is probably already within your corporate network. In Quepid, click ‘Relevancy cases’ and ‘Create a case’. Give the case a name, like ‘iPhone_problem_English’ or ‘Two_word_queries’.


Enter the URL provided by your developer: for Solr, it will probably look a bit like:
http://your domain/solr/name of a Solr index/select
e.g.
http://www.mycompany.com/solr/myproducts/select


Quepid will then check it can see the Solr index – if it can’t, check that the URL is correct.
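If Quepid can’t see the index, it can help to rule out a simple typo in the URL first. The snippet below is only a rough sketch of the URL pattern described above: the host and index name are the example values from this post, the function names are made up for illustration, and the match-all query with zero rows is just one cheap way to produce a link you can open in your browser. It says nothing about how Quepid itself performs its check.

```python
from urllib.parse import urlencode, urlunsplit


def build_select_url(host: str, index_name: str, scheme: str = "http") -> str:
    """Build a Solr select URL of the form scheme://host/solr/index/select."""
    return urlunsplit((scheme, host, f"/solr/{index_name}/select", "", ""))


def build_check_url(select_url: str) -> str:
    """Append a match-all query asking for zero rows, as a cheap reachability check."""
    return select_url + "?" + urlencode({"q": "*:*", "rows": 0, "wt": "json"})


# Example values taken from the post; replace them with your own.
select_url = build_select_url("www.mycompany.com", "myproducts")
check_url = build_check_url(select_url)
print(select_url)
print(check_url)
```

Pasting the second URL into the browser on the PC that runs Quepid should return a small JSON response if the index is reachable from there.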

2. Setting up the right query fields

Now you need to tell Quepid an ID field (which must be unique) and a title field for each result. If you start typing, Quepid will show some suggestions – check with your developer for which ones to use as these will be defined in the schema configuration for your search engine. You can select any other fields to be displayed for each result: let Quepid suggest some by clicking in the Additional Display Fields box. All the above can be changed later in the Settings pane of the Tune Relevance panel, so don’t worry if you don’t add everything now.


3. Adding some queries

You can now add some queries to test – ‘iPhone’, ‘iPhone case’, ‘iphone’ or whatever fits the test you’re creating. Add a few for now; you can add more later. Once you’re done click Continue, then Finish, and Quepid will try these queries out. Don’t worry if you don’t get many results for now.


4. Using the right query parameters

By default, Quepid only sends a very simple query to Solr or Elasticsearch: click on Tune Relevance and check the Tune panel, where you should see just ‘#$query##’, a token that represents the various test queries you added above. Your search application almost certainly sends something a lot more complicated! To be sure you’re testing the same configuration as your search application uses, you need to tell Quepid what query pattern is being used.


One way to start is to use Solr’s log files to see what actual queries are being run by your search application. Your search developer should be able to find a section that looks like this:

INFO - 2016-06-03 09:12:37.964; [ mydomain.com] org.apache.solr.core.SolrCore; [mydomain.com] webapp=/solr path=/select params={hl.fragsize=70&sort=+score+desc,+date_text+desc&hl.mergeContiguous=true&qf=tm_body:summary^1.0&qf=tm_body:value^1.0&qf=tm_field_product^5.0&hl.simple.pre=[HIGHLIGHT]&json.nl=map&hl.fl=spell&wt=json&hl=true&rows=8&fl=*,score&hl.snippets=3&start=0&q="iphone"&hl.simple.post=[/HIGHLIGHT]&fq=bs_status:"true"&fq=index_id:"node_index"} hits=5147 status=0 QTime=46

Stripping out the query gives us:

hl.fragsize=70&sort=+score+desc,+date_text+desc&hl.mergeContiguous=true&qf=tm_body:summary^1.0&qf=tm_body:value^1.0&qf=tm_field_product^5.0&hl.simple.pre=[HIGHLIGHT]&json.nl=map&hl.fl=spell&wt=json&hl=true&rows=8&fl=*,score&hl.snippets=3&start=0&q="iphone"&hl.simple.post=[/HIGHLIGHT]&fq=bs_status:"true"&fq=index_id:"node_index"

We need to replace the query term (the q="iphone" parameter above, as we’re searching for ‘iphone’) with a special token so Quepid can use this string to send all its test queries:

hl.fragsize=70&sort=+score+desc,+date_text+desc&hl.mergeContiguous=true&qf=tm_body:summary^1.0&qf=tm_body:value^1.0&qf=tm_field_product^5.0&hl.simple.pre=[HIGHLIGHT]&json.nl=map&hl.fl=spell&wt=json&hl=true&rows=8&fl=*,score&hl.snippets=3&start=0&q=#$query##&hl.simple.post=[/HIGHLIGHT]&fq=bs_status:"true"&fq=index_id:"node_index"

If you paste this string into Quepid’s Tune panel (click Tune Relevance to toggle this) then you know Quepid is sending the same type of queries as your search application. Click ‘Rerun my Searches’ and the results you see should be in a similar, if not identical, order to your search application.
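If you have several log lines to work through, the two manual steps above (stripping out the parameters, then swapping the query for the token) can be sketched in a few lines of Python. Treat this as an illustration only: the function names are hypothetical, the sample log line is a shortened version of the one shown earlier, and the code assumes the parameters are separated by ‘&’ with no literal ‘&’ inside any value.

```python
import re

QUEPID_TOKEN = "#$query##"  # the placeholder token quoted in this post


def params_from_log_line(log_line: str) -> str:
    """Return the text between 'params={' and the last '}' on a Solr log line."""
    match = re.search(r"params=\{(.*)\}", log_line)
    if match is None:
        raise ValueError("no params={...} section found in log line")
    return match.group(1)


def to_quepid_template(params: str) -> str:
    """Replace the value of the q parameter with the Quepid token."""
    parts = params.split("&")
    # Match 'q=' at the start of a parameter so 'qf=' and 'fq=' are left untouched.
    parts = [f"q={QUEPID_TOKEN}" if part.startswith("q=") else part for part in parts]
    return "&".join(parts)


# Shortened, illustrative version of the log line shown above.
sample_log_line = (
    'webapp=/solr path=/select params={qf=tm_field_product^5.0&rows=8&start=0'
    '&q="iphone"&fq=bs_status:"true"} hits=5147 status=0 QTime=46'
)
template = to_quepid_template(params_from_log_line(sample_log_line))
print(template)
```

It is still worth checking the result by eye before pasting it into the Tune panel, since your own logs may be formatted differently.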


5. Starting the tuning process

You should now have Quepid connected to your actual Solr index and running queries the same way that your search application does – you can now start the process of rating the results. Once you have some scores, you can ask your search developer to try changing the query in the Tune panel to see if they can improve the relevance scores. Your journey towards better relevance has begun!

Do get in touch if you’d like more information about Quepid or how Flax can help you develop a process of test-based relevancy tuning.

Developing ongoing search tuning processes http://www.flax.co.uk/blog/2016/04/13/developing-ongoing-search-tuning-processes/ http://www.flax.co.uk/blog/2016/04/13/developing-ongoing-search-tuning-processes/#respond Wed, 13 Apr 2016 09:39:33 +0000 http://www.flax.co.uk/?p=3195 A series of blogs by Karen Renshaw on improving site search: How to get started on improving Site Search Relevancy A suggested approach to running a Site Search Tuning Workshop Auditing your site search performance Developing ongoing search tuning processes … More

The post Developing ongoing search tuning processes appeared first on Flax.

A series of blogs by Karen Renshaw on improving site search:

  1. How to get started on improving Site Search Relevancy
  2. A suggested approach to running a Site Search Tuning Workshop
  3. Auditing your site search performance
  4. Developing ongoing search tuning processes
  5. Measuring search relevance scores

 


In my last blog I wrote about how to create an audit of your current site search performance. In this blog I cover how to develop search tuning processes.

Once you have started on your search tuning journey, developing ongoing processes is a must. Search tuning is an iterative process and must be treated as such. In the same way that external search traffic – PPC and SEO – is continually reviewed and optimised, so must on-site search be: otherwise you have invested a lot of time and money to get people to your site, only to leave them wandering aimlessly in the aisles wondering if you have the product or information you so successfully advertised!

There are 2 key areas to focus on when developing search processes:

  1. Ongoing review of search performance
  2. Dedicated resource

1. Ongoing review of search performance

Develop a framework for measuring relevancy scores

It’s good practice to develop a benchmark as to how search queries are performing through creating a search relevancy framework. Simply put, this is a score assigned to each search result based on how well that result answers the original query.

You can customise the scoring system you use to score your search results. Whatever you choose, the key is to ensure that your search analysts are consistent in their approach, and the best way to achieve that is by providing documented guidelines.

Understanding how query scores change with different configurations is an integral part of the search tuning process, but you should also run regular reviews of how queries are performing. This way you’ll know the impact that loading new documents and products into your site is having on overall relevancy, and you can highlight changes you need to feed into your product backlog.
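To make the idea of a relevancy framework a little more concrete, here is a minimal sketch of how per-query scores from two review rounds might be compared. Everything in it is an assumption for illustration: the 0 to 3 rating scale, the example queries and ratings, and the choice of a simple average scaled to 100. It is not the scoring method of any particular tool, and your own framework may well weight results differently.

```python
from statistics import mean

MAX_RATING = 3  # hypothetical scale: 0 = poor answer, 3 = perfect answer


def query_score(ratings: list[int]) -> float:
    """Average the ratings of the reviewed results and scale to 0-100."""
    if not ratings:
        return 0.0
    return 100 * mean(ratings) / MAX_RATING


def regressions(before: dict[str, list[int]], after: dict[str, list[int]]) -> list[str]:
    """List the queries whose score dropped between two review rounds."""
    return [
        query
        for query in before
        if query in after and query_score(after[query]) < query_score(before[query])
    ]


# Made-up ratings for the top four results of each query.
last_review = {"iphone": [3, 3, 0, 0], "iphone case": [3, 2, 2, 1]}
this_review = {"iphone": [3, 3, 3, 0], "iphone case": [2, 2, 1, 1]}

for query in last_review:
    print(query, query_score(last_review[query]), query_score(this_review[query]))
print("Needs attention:", regressions(last_review, this_review))
```

Queries flagged this way are natural candidates for the product backlog, or for the manual optimisation described next.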

Process for manually optimising important or problematic queries

Even with a search tuning test-and-learn plan in place there will be some queries that don’t do as well as expected, or for which a manually custom-built response provides a better customer experience.

Whilst manually tuning a search can sometimes be viewed in a negative light – after all search should ‘just work’ – it shouldn’t be seen as such. Manually optimising important search queries means that you can provide a tailored response for your customer. The queries you optimise will be dependent on your metrics and what you deem as being a good or bad experience.

With manual optimisation you should also build in continual reviews and take the opportunity to test different landing pages.

Competitive review

I’ve talked about this in a few of my other blogs, but if you run an eCommerce site it is especially important to understand how your competitors are answering your customers’ queries. As you create a search relevancy framework for your site it’s easy to score the same queries on your competitors’ sites to draw out any comparisons and understand opportunities for improvements.

2. Dedicated Resource

Creating and maintaining the above reviews needs resource. Ideally you would have a staff member dedicated to reviewing search and responsible for adding configuration changes to the product backlog, working alongside developers to ensure changes are tested and deployed successfully.

If you don’t have a dedicated person responsible, the right skills will undoubtedly exist within your organisation. You will have teams who understand your product / information set, and within that team you will find a sub-set of individuals who have problem solving skills combined with a passion to improve the customer experience. Once you’ve found them, providing them with some light search knowledge will be enough to get you started.

Whether it’s a full-time or part-time role, having someone focus on reviewing search queries should be part of your plan.

What’s next?

Now you have processes and a team in place it’s time to consider what to measure (and how). In my next blog I’ll cover how to measure search relevancy scores.

Karen Renshaw is an independent On Site Search consultant and an associate of Flax. Karen was previously Head of On Site Search at RS Components, the world’s largest electronic component distributor.

Flax can offer a range of consulting, training and support, provide tools for test-driven relevancy tuning and we also run Search Workshops. If you need advice or help please get in touch.
