Terminal Terminal | Web Web
Home  //  Play

Manticore Search - Indexing data from MySQL example

Difficulty: Beginner
Estimated Time: 15 minutes
Note: this course contains examples of how to work with plain(not real-time) tables, which cannot be reproduced with a default Manticore configuration file. For more details about the operating modes and the corresponding configuration settings, see our Manual

Manticore Search - Indexing from MySQL

In this tutorial you will learn how to create a table in Manticore Search using data from a MySQL database.

Plain tables are created by connection and grabbing the data from a source.

Manticore Search can get data from databases, CSV/TSV or XML files.

Available database connectors are for MySQL, Postgres, MSSQL as well as a generic ODBC driver, which allows getting data from any database that supports ODBC.

In this tutorial we are using a MySQL database which uses the sample Sakila database provided by MySQL (https://dev.mysql.com/doc/sakila/en/).

Manticore Search packages come with built-in support for MySQL, but require the presence of client libraries (the package has different names, depending on Linux distribution, for more information check https://manual.manticoresearch.com/Installation)

Manticore Search - Indexing data from MySQL example

Step 1 of 4

Table configuration

The most simple plain table definition contains the path pattern of the table files and reference to the source the table would use.

table sakila_film {
        path = /var/lib/manticore/testplain
        source = sakila_film
}

Let's look at the source. The first parameter defines the connector type, in our case 'mysql'. Next are the connection parameters to the database we connect.

The 'sql_query_pre' directive is set here to ensure the result set values will use UTF8 values, as the search engine uses on UTF8 character sets.

The engine doesn't necessarily have to get data from a single table; instead, a query is used, which can join data from multiple cases.

The first column should always be the document id, which should have unique values.

The rest of the source parameters define the attributes taken from the sql_query result. Anything not declared here will be considered a fulltext field.

Fulltext fields are only indexed, not stored, so we can search them, but their values will not be returned in the results.

The sql_field_string is a combo field/attribute, and its values are both indexed and stored.

source sakila_film {
        type = mysql
        sql_host = 127.0.0.1
        sql_user = user
        sql_pass = pass123
        sql_db = sakila
        sql_query_pre = SET NAMES utf8
        sql_query =  SELECT film.film_id, title, description,category_id, release_year,rental_rate from film join film_category on film.film_id=film_category.film_id
        sql_field_string = title
        sql_field_string = description
        sql_attr_uint = release_year
        sql_attr_uint = category_id
        sql_attr_float = rental_rate

}

Indexing

Our table sakila_film is created in manticore.conf but has no data. searchd (Manticore Search server) will complain about this when it starts; this is normal.

To populate the table, we use the indexer tool. When using indexer, the generated table files must be made with permissions that allow searchd to access them. In the case of Linux official packages, searchd runs under the manticore user, so indexer must also run under the manticore user:

sudo -u manticore indexer sakila_film

If you are running searchd differently, you might be able to omit the sudo -u manticore part. Just make sure that the user under which your searchd instance is running has read/write permissions to the tables you generate.

The utility will output information about the created table, like how many documents were found and how fast indexing was.

Now let's start Manticore Search:

sudo -u manticore searchd

Perform queries

Let's connect to Manticore Search

mysql -P9306 -h0

And check the indexes available

SHOW TABLES;

Perform a first query on our table:

SELECT * FROM sakila_film;

And run a simple fulltext search:

SELECT *,WEIGHT() FROM sakila_film WHERE MATCH('a secret agent and a dentist');

SELECT * FROM sakila_film WHERE MATCH('a secret agent and a dentist') ORDER BY rental_rate DESC;

Refreshing data in the table

Plain tables are immutable. This means that if we want to refresh the content in it, we need to reindex it entirely.

For attributes, it is possible to use the 'UPDATE' statement (just like in MySQL). However, for new rows, modified fulltext fields, or to make content disappear, we need to reindex.

Unlike the first time we indexed, now the searchd daemon runs and has the table loaded. This means that it puts a lock on the table.

To be able to make a new version of the table, the 'indexer' utility will create a new copy and when ready, it will inform the daemon about it.

This is done by adding the --rotate parameter.

exit

sudo -u manticore indexer sakila_film --rotate