Terminal Terminal | Web Web
Home  //  Play

Manticore Faceting

Difficulty: Beginner
Estimated Time: 10 minutes

Manticoresearch - Faceted searching

In this tutorial you will learn how to do faceted search in Manticore Search.

The dataset used in this course is https://data.world/data-society/imdb-5000-movie-dataset (the same as in the CSV import course - https://play.manticoresearch.com/csv/ ).

Manticore Faceting

Step 1 of 5

Introduction

Faceted search is an augmentation of traditional search that allows user to narrow down search results by applying multiple filters. Once a search is made, the user gets the results, but also a list of facets that allows him to narrow the results. As the user selects one (or more values) of a facet, a new filter is applied to the original query. The facets themselves are in essence aggregations on a specific attribute or expression to extract groups of values. In short, someone can run multiple times a base query to which a group is added. This, of course, may not be efficient because the total query time is multiplied by the numbers of facets.

In Manticore Search there is an optimization that retains the result set of the original query and reuses it for each facet calculation. As the aggregations are applied to already calculated subset of documents they are fast and the total execution time can be in many cases only marginally bigger than the initial query.

You can of course just send all the queries you want in a single multi-query (read more here https://docs.manticoresearch.com/latest/html/searching/multi-queries.html), but if you're looking for a more elegant solution or your client doesn't allow you to do multi-queries there's FACET clause.

This course feature a working web application in the Web panel which uses the queries presented in this course.

FACET

We're going to run several queries with facets in the terminal.

The FACET clause can be added at the very end of a SELECT statement (including after OPTION clause).

'FACET {expr_list} [BY {expr_list}] [ORDER BY {expr | FACET()} {ASC | DESC}] [LIMIT [offset,] count]'

Let's connect to Manticore and try some simple facets on the 'movies' index from the autocomplete course and perform simple facets on several attribute:

mysql -P9306 -h0

SELECT * FROM movies WHERE MATCH('robert de niro') LIMIT 10 FACET title_year FACET content_rating FACET director_name;

To see it in action in the web application enter Robert de Niro in the input box and submit.

We get back a multiple results set, where the first one is from our main query and the rest are the facets. Each facet result contains the attribute values and count of the grouping.

Sorting in FACETS

By default facets results are not sorted and are also limited to 20 rows. Each facet can have it's own limit clause.

SELECT * FROM movies WHERE MATCH('robert de niro') LIMIT 10 FACET title_year LIMIT 5 FACET content_rating LIMIT 1 FACET director_name LIMIT 100;

As we can see, results are not sorted. Each facet can have it's own sorting rule. We can sort by COUNT(*):

SELECT * FROM movies WHERE MATCH('robert de niro') LIMIT 10 FACET title_year ORDER BY COUNT(*) DESC;

or by attribute:

SELECT * FROM movies WHERE MATCH('robert de niro') LIMIT 10 FACET title_year ORDER BY title_year DESC;

We can also use the special FACET() function (which essentially is the same as GROUPBY() for main query):

SELECT * FROM movies WHERE MATCH('robert de niro') LIMIT 10 FACET title_year ORDER BY FACET() DESC;

Facet selection

In the most simple examples we use FACET attr_name and the result set will contain the attr_name and count columns.

At facet selection we can define multiple attributes to be returned. The column on which the facet is made is declared with 'BY attr_name'

SELECT * FROM movies WHERE MATCH('robert de niro') LIMIT 10 FACET director_facebook_likes,director_name BY director_name ORDER BY director_facebook_likes DESC;

Expressions in FACETs

In some situations faceting on actual values is not something we want. The most common example are product prices as we can have a wide range of values. Instead of getting facet of actual values we want instead to get a list of ranges. This can be achieved easy with the INTERVAL function. In our example we can use the imdb_score as it's a float value and we want to get the ranges between integer values of imdb_score.

SELECT * FROM movies WHERE MATCH('robert de niro') LIMIT 0 FACET title_year FACET content_rating FACET director_name FACET INTERVAL(imdb_score,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0) AS score ORDER BY score DESC;

For our movies collection we use interval to get values within integer range for the IMDB score.