>>
Manticore Search - Indexing data from MySQL example
Index configuration
The most simple plain index defition contains the path pattern of the index files and reference of the source the index would use.
index sakila_film { path = /var/lib/manticore/testplain source = sakila_film }
Let's look at the source. The first parameter defines the connector type, in our case 'mysql'. Next are the connection parameters to the database we connect.
The 'sql_query_pre' directive is set here to ensure the result set values will use UTF8 values, as the search engine uses on utf8 character sets.
The engine doesn't necessary have to get data from a single table, instead a query is used, which can join data from multiple cases.
The first column should be always the document id, which should have unique values.
The rest of the source parameters define the attributes taken from the sql_query result. Anything not declared here will be considered a fulltext field.
Fulltext fields are only indexed, not stored, so we can search them, but their values will not be returned in the results.
The sql_field_string is a combo field/attribute and it's values are both indexed, but also stored.
source sakila_film { type = mysql sql_host = 127.0.0.1 sql_user = user sql_pass = pass123 sql_db = sakila sql_query_pre = SET NAMES utf8 sql_query = SELECT film.film_id, title, description,category_id, release_year,rental_rate from film join film_category on film.film_id=film_category.film_id sql_field_string = title sql_field_string = description sql_attr_uint = release_year sql_attr_uint = category_id sql_attr_float = rental_rate}
Indexing
Our index 'sakila_film' is created in manticore.conf but has no data. Searchd will complain about this when starts, this is normal.
To populate the index we use the indexer tool:
indexer sakila_film
The utility will output information about the created index, like how many document were found and how fast indexing was.
Now let's start the searchd daemon:
searchd
Perform queries
Let's connect to Manticore Search
mysql -P9306 -h0
And check the indexed available
SHOW TABLES;
And perform a first query on our index:
SELECT * FROM sakila_film;
And run a simple fulltext search:
SELECT *,WEIGHT() FROM sakila_film WHERE MATCH('a secret agent and a dentist');
SELECT * FROM sakila_film WHERE MATCH('a secret agent and a dentist') ORDER BY rental_rate DESC;
Refreshing data in the index
Plain indexes are immutable, this means if we want to refresh the content in it we need to reindex it enterely.
For attributes it is possible to use 'UPDATE' statement (just like in MYSQL), however for new rows, modified fulltext fields or make content dissapear, we need to reindex.
Unlike the first time we indexes, now the searchd daemon runs and have the index loaded. This means it puts a lock on the index.
To be able to make a new version of the index, the 'indexer' utility will create a new copy and when ready it will inform the daemon about it.
This is done adding the '--rotate' parameter.
exit
indexer sakila_film --rotate