
Did you mean?

Difficulty: Beginner
Estimated Time: 10 minutes
Note: this course contains examples of how to work with plain (not real-time) tables, which cannot be reproduced with a default Manticore configuration file. For more details about the operating modes and the corresponding configuration settings, see our Manual.

Manticore Search - Did You Mean?

In this course you will learn how Manticore Search can suggest corrections for mistyped words.

Introduction

Besides autocomplete, for which we covered a simple example in another course (https://play.manticoresearch.com/simpleautocomplete/), another common feature in search applications is the ability to suggest corrections for mistyped words.

Manticore Search can suggest corrections for a word from the table dictionary. This requires enabling the infixing option. Infixing not only allows wildcard searches, it also creates n-gram hashes from the indexed words. N-grams (parts of words that are N characters long) are used to find words that are close to each other as plain text, not linguistically. Combined with the Levenshtein distance between the candidate word and the original word, this lets us provide suggestions that are suitable as corrections for the mistyped word. This functionality is provided by the CALL SUGGEST and CALL QSUGGEST statements (read more in the docs: https://manual.manticoresearch.com/Searching/Spell_correction#CALL-QSUGGEST,-CALL-SUGGEST).
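To make the two ingredients concrete, here is a minimal Python sketch of trigram overlap and Levenshtein distance. It only illustrates the idea and is not Manticore's internal implementation; the function names `trigrams` and `levenshtein` are chosen for this example, and N=3 is an assumption.

```python
def trigrams(word):
    """Return the set of 3-character substrings of a word."""
    return {word[i:i + 3] for i in range(len(word) - 2)}


def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


typed, candidate = "rvenge", "revenge"
# Shared trigrams make 'revenge' a candidate for the mistyped 'rvenge'.
print(sorted(trigrams(typed) & trigrams(candidate)))  # ['eng', 'nge', 'ven']
# One inserted letter separates the two words.
print(levenshtein(typed, candidate))  # 1
```

Words that share several trigrams with the input are cheap to look up, and the edit distance then ranks how close each candidate really is.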

First, we need to enable infixing in our table:

table movies
{
    type            = plain
    path            = /var/lib/manticore/data/movies
    source          = movies
    min_infix_len   = 3
}

This course features a working web application in the Web panel which uses the queries presented in this course.

CALL SUGGEST usage

When a user performs a query that returns no results, it's possible that the user has mistyped something.

Let's connect to Manticore:

mysql -P9306 -h0

Now let's try a quick example of a word suggestion (mind the typo in 'revenge'):

CALL SUGGEST('rvenge','movies');

The output contains three columns: the suggestion, the calculated Levenshtein distance, and the doc hits of the suggestion in the table.

The first suggestion has a distance of 1 from our input and is the word we expected. This is usually the best scenario: a single suggestion at the minimal distance is most likely the one we are looking for. However, even at distance 1 there can be more than one suggestion:

CALL SUGGEST('aprentice','movies');

When suggestions share the same distance, they are further sorted by their doc hits. In this example 'apprentice' is most likely what the user wanted, as it has more hits than 'prentice'.

Of course, when the input word is actually found in our table, it will appear as the first suggestion with distance=0:
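The ordering described above can be sketched in Python as a sort by distance first and by doc hits second. The rows below are hypothetical: the distances are real edit distances from 'aprentice', but the doc counts are made up for illustration and are not output from the movies table.

```python
# Hypothetical rows shaped like CALL SUGGEST output: (suggest, distance, docs).
# The doc counts are invented for this example.
rows = [
    ("prentice", 1, 2),
    ("apprentices", 2, 12),
    ("apprentice", 1, 7),
]


def rank(rows):
    """Order by smallest distance, then by most doc hits."""
    return sorted(rows, key=lambda row: (row[1], -row[2]))


for suggest, distance, docs in rank(rows):
    print(suggest, distance, docs)
# apprentice 1 7
# prentice 1 2
# apprentices 2 12
```

Note that 'apprentices' has the most hits here but still ranks last, because distance is compared before doc hits.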

CALL SUGGEST('revenge','movies');

If we want more suggestions, we can add the limit parameter:

CALL SUGGEST('aprentice','movies', 10 as limit);

If we want to restrict the suggestions, we can lower the maximum Levenshtein distance (max_edits, default is 4) and the maximum difference in length between the input word and the suggestion (delta_len, default is 3):

CALL SUGGEST('aprentice','movies', 10 as limit, 3 as max_edits, 2 as delta_len);
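As a rough illustration of what these two limits do, the Python sketch below filters candidates by edit distance and by length difference. The helper name `passes_limits` is hypothetical, and the exact comparisons (distance at most max_edits, length difference strictly below delta_len) are an assumption about the semantics, so check the manual before relying on the boundary cases.

```python
def passes_limits(typed, candidate, distance, max_edits=4, delta_len=3):
    """Keep a candidate only if it is within both limits.

    Assumption: max_edits is an inclusive bound on the edit distance,
    delta_len is an exclusive bound on the difference in word length.
    """
    length_gap = abs(len(candidate) - len(typed))
    return distance <= max_edits and length_gap < delta_len


# (candidate, Levenshtein distance from 'aprentice')
candidates = [("apprentice", 1), ("apprenticeship", 5)]
kept = [
    word for word, distance in candidates
    if passes_limits("aprentice", word, distance, max_edits=3, delta_len=2)
]
print(kept)  # ['apprentice']
```

Tighter limits trade recall for precision: fewer distant or much longer words are offered as corrections.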

For the next step we need to exit the mysql client:

exit;

A working example

A simple working example of 'Did you mean' can be tested in the Web panel.

The PHP script provides a simple search results page.

If the input string returns no results, the script checks each word with 'CALL SUGGEST' and tries to build a new query string.

If the new query string has matches, its result set is provided.
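The flow described above can be sketched in Python as follows. This is not the course's PHP script, only an illustration of the same idea: `did_you_mean` and the `suggest` callable are hypothetical names, and the lookup table stands in for real CALL SUGGEST calls with made-up rows.

```python
def did_you_mean(query, suggest):
    """Rebuild a query by replacing each word with its top suggestion.

    `suggest` is a hypothetical callable returning rows shaped like
    (suggestion, distance, docs), best candidate first, or an empty list.
    Words without any suggestion are kept unchanged.
    """
    corrected = []
    for word in query.split():
        rows = suggest(word)
        corrected.append(rows[0][0] if rows else word)
    return " ".join(corrected)


# Stand-in for CALL SUGGEST; the rows are invented for demonstration only.
fake_suggestions = {
    "rvenge": [("revenge", 1, 3)],
    "aprentice": [("apprentice", 1, 7), ("prentice", 1, 2)],
}

new_query = did_you_mean(
    "the rvenge aprentice",
    lambda word: fake_suggestions.get(word, []),
)
print(new_query)  # the revenge apprentice
```

In a real application the rebuilt string would be run as a second search and shown to the user only if it returns matches.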

The script can be viewed with cat /html/index.php.