Manticore percolate queries

Difficulty: Beginner
Estimated Time: 10 minutes

Manticore Search - Introduction to percolate queries

In this tutorial we will explore percolate queries in Manticore Search.

Storing Percolate Queries

Percolate Queries are also known as Persistent Queries, Prospective Search, document routing, search in reverse or inverse search.

So let's connect to Manticore using the mysql client:

mysql -P9306 -h0

The normal way of searching is to store the documents we want to search and run queries against them. In some cases, however, we want the opposite: to store the queries and apply them to each new incoming document to find which queries match it. A typical scenario is notifying a user when a newly published article matches a search they have saved.
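To make the reversal concrete, here is a toy Python sketch of the idea. It is only a conceptual model and not how Manticore matches: the `percolate` function and its "every query word must appear in the document" rule are assumptions made for illustration.

```python
# Toy model of inverse search: queries are stored first, documents arrive later.
# Hypothetical helper, not Manticore's matching logic: a query "matches" here
# when every one of its words occurs somewhere in the document.

def percolate(stored_queries, document):
    """Return the ids of stored queries whose words all occur in the document."""
    doc_words = set(document.lower().split())
    return [
        query_id
        for query_id, query in stored_queries.items()
        if set(query.lower().split()) <= doc_words
    ]

stored_queries = {1: "catch me", 2: "catch me if", 3: "match by field"}
matches = percolate(stored_queries, "catch me if you can")
print(matches)
```

The document is the input and the list of matching query ids is the output, which is the opposite of a normal search.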

Percolate queries are stored in a special real-time index whose type is set to percolate.

index pq {
    type = percolate
    path = /var/lib/manticore/data
}

Document ids support auto-increment for percolate indexes, so you don’t need to generate a document id yourself as you do for normal RT indexes when inserting. Instead, inserting a new query requires a mandatory ‘query’ text field that contains the full-text search expression.

INSERT INTO pq(query) VALUES('catch me');

Storing Attributes

Now we can see which rules have been stored in the index:

SELECT * FROM pq;

A stored query has two optional attributes, each with a specific purpose.

The first is tags, which supports a bag of strings (think of it as a string MVA). The tags can be used to filter the PQs when doing SELECTs or DELETEs; they play no role when performing percolate searches.

insert into pq(query,tags) values('catch me if','tag1');

SELECT * FROM pq WHERE tags='tag1';

To delete a stored query, use a standard DELETE statement:

DELETE FROM pq WHERE id = 1;

Or by tag:

DELETE FROM pq WHERE tags = 'tag1';

The second is filters, which holds additional query rules: attribute filters written in SphinxQL format.
The attributes used here must be declared in the index configuration, just like attributes in an RT index.

index pq {
    type = percolate
    path = /var/lib/sphinxsearch/pq
    rt_attr_uint = catId
}

INSERT INTO pq(query,tags,filters) VALUES('catch me','tag2,tag1','catId=10');

select * from pq;
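The roles of the two attributes can be illustrated with another toy Python sketch. Everything in it is hypothetical: `parse_equality_filter` handles only a single `name=value` filter with an integer value, the word-subset rule stands in for real full-text matching, and attribute names are lowercased purely as a convenience of the sketch. Note that tags are never consulted during matching.

```python
# Toy model of a stored rule with a filter. Assumptions: one equality filter
# with an integer value, and "all query words present" as the text match.

def parse_equality_filter(expression):
    """Parse a single 'name=value' filter such as 'catId=10'."""
    name, value = expression.split("=", 1)
    return name.strip().lower(), int(value)

def rule_matches(rule, document):
    """True when the query words are all in the text and the filter, if any, holds."""
    text_words = set(document["text"].lower().split())
    if not set(rule["query"].lower().split()) <= text_words:
        return False
    if rule.get("filters"):
        name, value = parse_equality_filter(rule["filters"])
        return document["attrs"].get(name) == value
    return True

rule = {"query": "catch me", "tags": "tag2,tag1", "filters": "catId=10"}
doc_ok = {"text": "catch me if you can", "attrs": {"catid": 10}}
doc_other = {"text": "catch me if you can", "attrs": {"catid": 20}}
print(rule_matches(rule, doc_ok))
print(rule_matches(rule, doc_other))
```

Both documents contain the query words, but only the first one satisfies the filter, so only it matches the rule.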

If we want to enable per-field text searches, we need to define full-text fields in the index configuration and then reconfigure the index.

index pq {
    type = percolate
    path = /var/lib/sphinxsearch/pq
    rt_field = subject
    rt_field = content
    rt_attr_uint = catId
}

ALTER RTINDEX pq RECONFIGURE;

Now we can store queries that match on a specific field:

INSERT INTO pq (query, tags, filters) VALUES ('@subject match by field','tag2,tag3','catId=10');
INSERT INTO pq (query, tags, filters) VALUES ('@subject match by field','tag2,tag3','catId=20');

Performing Percolate Queries

Before moving on, let's recreate the row that we deleted in the previous step:

insert into pq(query,tags) values('catch me','tag1');

The CALL PQ function runs against a single percolate index (for now), named in its first parameter. The second parameter is mandatory and holds a document or a list of documents. Each document can be plain text or a JSON object. A JSON object can carry multiple text fields, which makes per-field matches possible, and attributes, which are tested against the expression defined in the ‘filters’ attribute of the percolate query. A stored query is therefore not limited to a full-text match expression: it can also carry attribute filters, which allow more complex matching (think about someone searching for specific keywords who wants to be notified only if certain types or categories of articles match).

CALL PQ('pq','catch me if you can', 0 AS docs_json, 1 AS query);

By default, CALL PQ expects documents in JSON format, so for the most basic example with plain text we need to pass the “0 AS docs_json” argument. The default result set contains only the query ID, but the “1 AS query” argument shows all the stored query information.

To perform per field matching or additional attribute filtering we need to pass documents as a JSON object:

CALL PQ('pq', '{"subject":"expect to match by field","content":"here is some content","catid":20}', 1 AS query);
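When the document comes from application code, the JSON string is usually generated, not typed by hand. The following Python sketch builds the same kind of statement. The helper names `sql_quote` and `call_pq_json` are hypothetical, and the backslash-style escaping is an assumption about how the string literal is parsed; if your client library supports parameter binding, use that in place of manual quoting.

```python
import json

def sql_quote(text):
    """Wrap text in single quotes, escaping backslashes and single quotes.
    Assumes MySQL-style backslash escaping in string literals."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

def call_pq_json(index, document):
    """Build a CALL PQ statement for one JSON document (hypothetical helper)."""
    doc_json = json.dumps(document)
    return f"CALL PQ({sql_quote(index)}, {sql_quote(doc_json)}, 1 AS query);"

doc = {
    "subject": "expect to match by field",
    "content": "here is some content",
    "catid": 20,
}
statement = call_pq_json("pq", doc)
print(statement)
```

The printed statement can be pasted into the mysql client session opened earlier.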

CALL PQ has two more runtime arguments, docs and verbose. The docs argument is useful when you pass more than one document, as it tells you which documents matched each query (document indexes start from 1 here):

CALL PQ('pq', ('catch me if can', 'catch me'), 0 AS docs_json, 1 AS docs, 1 AS verbose);
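Because the positions are 1-based, client code has to shift them by one to get back to the original documents. The Python sketch below shows that mapping. It assumes the matched positions arrive as a comma-separated string such as `1,2`; check the actual column name and format in your own result set. The `matched_documents` helper and the sample strings are hypothetical.

```python
def matched_documents(docs, positions_column):
    """Map a comma-separated string of 1-based positions to the original documents."""
    positions = [int(p) for p in positions_column.split(",") if p.strip()]
    return [docs[p - 1] for p in positions]

# The same documents, in the same order, as passed to CALL PQ.
docs = ["catch me if can", "catch me"]

print(matched_documents(docs, "1,2"))
print(matched_documents(docs, "2"))
```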

Show meta

After running CALL PQ, the SHOW META command provides information about the call that was just executed. It shows the execution time, the number of queries and documents matched, the total number of stored queries and so on. If the 1 AS verbose argument is set in the CALL PQ command, SHOW META displays extra information such as the time spent on each query.

SHOW META;