>>
Spell Correction with Spell Suggest
Setting Up for Spell Correction
Let's connect to Manticore and create a table for spell suggestions.
mysql -P9306 -h0
CALL SUGGEST requires infixing and keywords dictionary:
CREATE TABLE articles(title text, content text) min_infix_len='2' dict='keywords';
Insert sample content with various vocabulary:
INSERT INTO articles(title, content) VALUES
('Introduction to Programming', 'Learn programming basics with Python and JavaScript. Understanding variables, functions, and algorithms.'),
('Database Management Systems', 'Comprehensive guide to database design, SQL queries, and optimization techniques for developers.'),
('Machine Learning Fundamentals', 'Explore machine learning algorithms including neural networks, deep learning, and classification.'),
('Web Development Best Practices', 'Modern web development using React, Node.js, and responsive design principles.'),
('Cloud Computing Architecture', 'Understanding cloud services, containerization, and microservices architecture patterns.'),
('Cybersecurity Essentials', 'Protecting systems from vulnerabilities, encryption methods, and security protocols.'),
('Data Science with Python', 'Statistical analysis, data visualization, and predictive modeling techniques.'),
('Software Engineering Principles', 'Agile methodology, code review practices, and continuous integration workflows.');
Verify the data:
SELECT * FROM articles;
The dictionary now contains words from these articles.
Using CALL SUGGEST
CALL SUGGEST finds dictionary words similar to a misspelled word.
Get suggestions for a misspelled word:
CALL SUGGEST('programing', 'articles');
This returns:
- suggest: the suggested word
- distance: Levenshtein distance (number of edits)
- docs: how many documents contain this word
Try more examples:
CALL SUGGEST('databse', 'articles');
CALL SUGGEST('machne', 'articles');
CALL SUGGEST('algortihm', 'articles');
Words with distance 1 (one character difference) are closest matches:
CALL SUGGEST('learing', 'articles');
When a word is spelled correctly, it returns exact match:
CALL SUGGEST('python', 'articles');
If no similar words exist, empty result is returned:
CALL SUGGEST('xyzabc', 'articles');
Using CALL QSUGGEST
CALL QSUGGEST works on the LAST word of a phrase — useful for correcting typos in multi-word search queries.
CALL QSUGGEST('learn programing', 'articles');
Only "programing" (last word) gets corrected. First words are ignored.
Compare with CALL SUGGEST on the same phrase:
CALL SUGGEST('learn programing', 'articles');
CALL SUGGEST works on the FIRST word, so this suggests corrections for "learn".
Try correcting typos in different phrases:
CALL QSUGGEST('cloud computng', 'articles');
CALL QSUGGEST('data viualization', 'articles');
CALL QSUGGEST('database managment', 'articles');
CALL QSUGGEST('security protocl', 'articles');
QSUGGEST always targets the last word, making it convenient for correcting the word currently being typed.
Customizing Suggestions
CALL SUGGEST accepts parameters to fine-tune results.
Limit number of suggestions:
CALL SUGGEST('learing', 'articles', 3 as limit);
Note: limit sets the maximum — fewer results may be returned if there aren't enough similar words in the dictionary.
Control maximum edit distance:
CALL SUGGEST('prgraming', 'articles', 5 as limit, 3 as max_edits);
Higher max_edits finds more distant matches.
Control length difference from original:
CALL SUGGEST('prog', 'articles', 5 as limit, 2 as max_edits, 2 as delta_len);
delta_len limits how much longer/shorter suggestions can be.
Disable statistics for faster response:
CALL SUGGEST('databse', 'articles', 5 as limit, 0 as result_stats);
Get all suggestions in one row:
CALL SUGGEST('lerning', 'articles', 5 as limit, 1 as result_line);
Combine multiple parameters:
CALL QSUGGEST('machine lerning', 'articles', 3 as limit, 2 as max_edits, 1 as result_stats);
Available parameters:
- limit: max suggestions (default 5)
- max_edits: max Levenshtein distance (default 4)
- delta_len: max length difference (default 3)
- result_stats: show distance/docs (default 1)
- result_line: all in one row (default 0)
Building "Did You Mean?" Feature
Let's implement a practical spell correction workflow.
First, try a search with a misspelled term:
SELECT * FROM articles WHERE MATCH('programing');
No results due to the typo. Now get a suggestion:
CALL SUGGEST('programing', 'articles', 1 as limit);
Use the suggestion in the corrected query:
SELECT * FROM articles WHERE MATCH('programming');
Handle multi-word queries by checking each word:
CALL SUGGEST('machne', 'articles', 1 as limit);
CALL SUGGEST('lerning', 'articles', 1 as limit);
Then combine the suggestions: "Did you mean: machine learning?"
Alternative approach — use fuzzy search to tolerate typos directly:
SELECT * FROM articles WHERE MATCH('machine lerning') OPTION fuzzy=1;
Learn more about this in the Fuzzy Search and Typo Tolerance course.