Terminal Terminal | Web Web
Home  //  Play

Secondary Indexes for Performance

Difficulty: Beginner
Estimated Time: 15 minutes

Secondary Indexes for Performance

In this course, you will learn how secondary indexes in Manticore Search speed up attribute filtering and how to monitor their usage.

Secondary Indexes for Performance

Step 1 of 5

What Are Secondary Indexes?

In Manticore Search, secondary indexes speed up filtering by attributes such as integers, floats, strings, and booleans. Unlike full-text indexes that handle text search via MATCH(), secondary indexes accelerate WHERE clause conditions on non-text columns.

Secondary indexes are created automatically for all attributes (except JSON) and use the PGM (Piecewise Geometric Model) algorithm to quickly locate matching rows without scanning the entire table.

Let's connect to Manticore and see how it works:

mysql -P9306 -h0

Create a table with various attribute types:

CREATE TABLE products(title text, price float, quantity int, category string, in_stock bool);

Insert sample data:

INSERT INTO products(title, price, quantity, category, in_stock) VALUES
('Laptop Pro', 1299.99, 50, 'Electronics', 1),
('Wireless Mouse', 29.99, 200, 'Electronics', 1),
('Office Chair', 349.99, 30, 'Furniture', 1),
('USB Cable', 9.99, 500, 'Electronics', 1),
('Standing Desk', 599.99, 15, 'Furniture', 0),
('Monitor 27inch', 399.99, 45, 'Electronics', 1),
('Keyboard', 79.99, 150, 'Electronics', 1),
('Desk Lamp', 49.99, 80, 'Furniture', 1),
('Webcam HD', 89.99, 60, 'Electronics', 0),
('Filing Cabinet', 199.99, 25, 'Furniture', 1);

For RT tables, secondary indexes only exist on disk chunks, not in the RAM segment. Let's flush data to disk:

FLUSH RAMCHUNK products;

Now view the automatically created secondary indexes:

SHOW TABLE products INDEXES;

You'll see indexes for price, quantity, category, and in_stock columns. Each row shows the index name, data type, whether it's enabled, and the percentage of disk chunks that have this index.

Queries That Benefit from Secondary Indexes

Secondary indexes help most with selective attribute filters — conditions that match a small portion of the table. Let's explore different filter types.

Equality filter

SELECT * FROM products WHERE category = 'Furniture';

Range filter

SELECT * FROM products WHERE price BETWEEN 100 AND 500;

Comparison filter

SELECT * FROM products WHERE quantity > 100;

Boolean filter

SELECT * FROM products WHERE in_stock = 1;

Combined filters

Multiple filters can work together efficiently:

SELECT * FROM products WHERE category = 'Electronics' AND price < 100;

Monitoring query execution

Enable profiling and use SHOW META to see query statistics:

SET profiling=1;

SELECT * FROM products WHERE price > 100;

SHOW META;

SHOW META displays the total number of matches, query time, and keyword statistics. With larger datasets, it also shows which secondary indexes were used by the optimizer (e.g. price:SecondaryIndex (100%)).

With a small table like ours (10 rows), the cost-based optimizer prefers a full scan since it's fast enough. Secondary indexes provide significant speedup with larger datasets where selective filters can skip most of the data.

Secondary Indexes for JSON Attributes

By default, JSON attributes do not get secondary indexes. You can enable them by adding the secondary_index option to the JSON column definition.

Create a table with JSON secondary indexes enabled:

CREATE TABLE items(name text, data json secondary_index='1');

Insert JSON documents:

INSERT INTO items(name, data) VALUES
('Product A', '{"brand": "Apple", "year": 2023, "specs": {"ram": 16, "storage": 512}}'),
('Product B', '{"brand": "Samsung", "year": 2024, "specs": {"ram": 8, "storage": 256}}'),
('Product C', '{"brand": "Apple", "year": 2024, "specs": {"ram": 32, "storage": 1024}}'),
('Product D', '{"brand": "Sony", "year": 2023, "specs": {"ram": 16, "storage": 512}}'),
('Product E', '{"brand": "LG", "year": 2022, "specs": {"ram": 8, "storage": 128}}');

Flush data to disk so secondary indexes are created:

FLUSH RAMCHUNK items;

View the indexes — JSON contents are flattened into multiple secondary indexes:

SHOW TABLE items INDEXES;

You'll see indexes for JSON paths like data['brand'], data['year'], data['specs']['ram'], data['specs']['storage'] — each with its automatically detected type.

Now query the JSON fields:

SELECT * FROM items WHERE data.brand = 'Apple';

SELECT * FROM items WHERE data.year >= 2023;

Nested JSON fields work too:

SELECT * FROM items WHERE data.specs.ram >= 16;

SELECT * FROM items WHERE data.specs.storage >= 512;

Cost-Based Optimizer and Query Hints

The cost-based optimizer (CBO) automatically decides whether to use secondary indexes or a full scan for each filter in a query. It evaluates costs based on data distribution (histograms), filter selectivity, and table size.

For highly selective queries (few matches), secondary indexes are preferred. For queries matching most rows, a full scan may be faster.

Forcing secondary index usage

You can override the optimizer using SQL hints placed at the end of the query. Use SecondaryIndex to force the optimizer to use a secondary index:

SELECT * FROM products WHERE price > 100 /*+ SecondaryIndex(price) */;

SHOW META;

Disabling secondary index usage

Use NO_SecondaryIndex to tell the optimizer to skip a secondary index:

SELECT * FROM products WHERE price > 100 /*+ NO_SecondaryIndex(price) */;

SHOW META;

Notice both SHOW META outputs are identical — with only 10 rows the optimizer handles both approaches equally fast. In production with millions of rows and selective filters, hints let you override the optimizer when you know your data better.

Viewing query plan details

When a query includes full-text search, you can use SHOW PLAN to see the execution tree:

SELECT * FROM products WHERE MATCH('laptop') AND price > 100;

SHOW PLAN;

The transformed_tree row shows how the full-text query was processed.

How the optimizer decides

The CBO considers:

  • Number of documents in the table
  • Data distribution stored in histograms (.sphi files)
  • Selectivity of filter conditions (how many rows match)
  • Cost of index lookup vs sequential scan

It evaluates up to 1024 combinations of execution paths and picks the one with the lowest estimated cost. With our small 10-row table, the optimizer likely prefers a full scan. With thousands of rows and selective filters, secondary indexes provide significant speedup.

Maintenance and Best Practices

When secondary indexes get disabled

When an attribute value is updated, its secondary index gets disabled temporarily:

UPDATE products SET price = 1500.00 WHERE MATCH('laptop');

Check the index status:

SHOW TABLE products INDEXES;

You may see the Enabled column showing 0 for the price index.

Rebuilding secondary indexes

Use ALTER TABLE ... REBUILD SECONDARY to re-enable disabled indexes:

ALTER TABLE products REBUILD SECONDARY;

Verify the indexes are enabled again:

SHOW TABLE products INDEXES;

RT table considerations

Remember that secondary indexes exist only on disk chunks. When you insert new data into an RT table, it first goes to the RAM segment where there are no secondary indexes. Data is flushed to disk automatically during normal operation, or you can force it with:

FLUSH RAMCHUNK table_name;

Best practices

  • Secondary indexes are created automatically for all non-JSON attributes — no configuration needed
  • For JSON attributes, use secondary_index='1' in the column definition to enable indexing
  • Use SHOW TABLE ... INDEXES to verify that indexes exist and are enabled
  • Use SHOW META after queries to check query performance
  • Run ALTER TABLE ... REBUILD SECONDARY after bulk attribute updates
  • The optimizer usually makes good decisions — use hints like SecondaryIndex() and NO_SecondaryIndex() at the end of the query only when you need to override it