Terminal Terminal | Web Web
Home  //  Play

Table Joins in Manticore Search (JSON API)

Difficulty: Beginner
Estimated Time: 15 minutes

Manticore Search - Table Joins (JSON API)

In this course you will learn how to combine data from multiple tables using JOIN operations in Manticore Search via JSON API.

Table Joins in Manticore Search (JSON API)

Step 1 of 4

About Table Joins in Manticore Search

Table joins in Manticore Search enable you to combine documents from two tables by matching related columns. This functionality allows for more complex queries and enhanced data retrieval across multiple tables.

Manticore Search supports two types of joins:

  • INNER JOIN: Returns only the rows where there is a match in both tables
  • LEFT JOIN: Returns all rows from the left table and the matched rows from the right table

Creating Tables for Joins

In this course, we will use two tables: customers and orders. The customers table contains customer information, while the orders table contains order details linked to customers via customer_id.

We will interact with Manticore using its JSON API via HTTP. The API is available at http://localhost:9308.

Let's create our tables using JSON API:

curl -s localhost:9308/cli -d "DROP TABLE IF EXISTS customers"

curl -s localhost:9308/cli -d "DROP TABLE IF EXISTS orders"

curl -s localhost:9308/cli -d "CREATE TABLE customers (id BIGINT, name TEXT, email TEXT, address TEXT) type='rt' min_infix_len='3'"

curl -s localhost:9308/cli -d "CREATE TABLE orders (id BIGINT, customer_id BIGINT, product TEXT, quantity INT, order_date TEXT) type='rt' min_infix_len='3'"

Inserting Sample Data

Let's insert some customers using the JSON insert endpoint:

curl -s localhost:9308/insert -d '{"index":"customers","id":1,"doc":{"name":"Alice Johnson","email":"alice@example.com","address":"123 Maple St"}}' | jq

curl -s localhost:9308/insert -d '{"index":"customers","id":2,"doc":{"name":"Bob Smith","email":"bob@example.com","address":"456 Oak St"}}' | jq

curl -s localhost:9308/insert -d '{"index":"customers","id":3,"doc":{"name":"Carol White","email":"carol@example.com","address":"789 Pine St"}}' | jq

curl -s localhost:9308/insert -d '{"index":"customers","id":4,"doc":{"name":"John Smith","email":"john@example.com","address":"15 Barclays St"}}' | jq

Now let's insert some orders:

curl -s localhost:9308/insert -d '{"index":"orders","id":1,"doc":{"customer_id":1,"product":"Laptop","quantity":1,"order_date":"2023-01-01"}}' | jq

curl -s localhost:9308/insert -d '{"index":"orders","id":2,"doc":{"customer_id":2,"product":"Phone","quantity":2,"order_date":"2023-01-02"}}' | jq

curl -s localhost:9308/insert -d '{"index":"orders","id":3,"doc":{"customer_id":1,"product":"Tablet","quantity":1,"order_date":"2023-01-03"}}' | jq

curl -s localhost:9308/insert -d '{"index":"orders","id":4,"doc":{"customer_id":3,"product":"Monitor","quantity":1,"order_date":"2023-01-04"}}' | jq

Note that customer with id=4 (John Smith) has no orders.

Using INNER JOIN

INNER JOIN returns only the rows where there is a match in both tables. Let's see it in action using JSON API.

For JOIN queries, we use the /search endpoint with JSON DSL format:

curl -s localhost:9308/search -d '
{"table":"orders",
"_source":["product","customers.name","customers.email"],
"join":[{"type":"inner","table":"customers",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This query will return 4 rows - one for each order, with customer details included.

Notice that John Smith (id=4) does not appear in the results because he has no orders. INNER JOIN only includes rows where there's a match in both tables.

Let's try a more specific query with filtering. We'll find all orders by customers whose address contains "Maple":

curl -s localhost:9308/search -d '
{"table":"orders",
"_source":["product","customers.email","customers.name","customers.address"],
"join":[{"type":"inner","table":"customers","query":{"query_string":"maple"},
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This returns only Alice Johnson's orders (Laptop and Tablet) because her address is "123 Maple St".

You can also sort the results:

curl -s localhost:9308/search -d '
{"table":"orders",
"_source":["product","customers.email","customers.name"],
"join":[{"type":"inner","table":"customers",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}],
"sort":[{"customers.email":"asc"}]}
' | jq

This will sort all orders by customer email in ascending order.

Using LEFT JOIN

LEFT JOIN returns all rows from the left table and the matched rows from the right table. If there is no match, default values are returned for the right table's columns (0 for numbers, empty strings for text).

Let's retrieve all customers along with their orders:

curl -s localhost:9308/search -d '
{"table":"customers",
"_source":["name","orders.product","orders.quantity"],
"join":[{"type":"left","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}],
"sort":[{"email":"asc"}]}
' | jq

Notice the result: John Smith appears in the results with empty string ("") for orders.product and 0 for orders.quantity because he has no orders. This is the key difference between LEFT JOIN and INNER JOIN.

In JSON API, missing values from joined tables are returned as default values instead of NULL: 0 for numbers, empty strings for text, empty arrays for MVA.

Let's filter by customer name using full-text search:

curl -s localhost:9308/search -d '
{"table":"customers",
"_source":["name","email","orders.product"],
"join":[{"type":"left","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}],
"query":{"query_string":"smith"}}
' | jq

This will return both Bob Smith (with his order "Phone") and John Smith (with empty product field), since both have "smith" in their name.

You can use aggregations to count rows per customer. Note that with LEFT JOIN, customers without orders will still appear with a count of 1 (the row with default values):

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name"],
"join":[{"type":"left","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}],
"limit":0,"aggs":{"customers_count":{"terms":{"field":"id","size":10}}}}
' | jq

This shows that Alice has 2 rows (2 orders), while Bob, Carol, and John each have 1 row.

Full-Text Search in JOINs

One of the powerful features of table joins in Manticore Search is the ability to perform full-text searches on both the left and right tables simultaneously.

You can use separate query blocks for each table in your JOIN. The query filters results based on text content in both tables.

Let's search for customers whose name contains "smith" and their orders containing "phone":

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name","orders.product"],
"query":{"query_string":"smith"},
"join":[{"type":"inner","table":"orders","query":{"query_string":"phone"},
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This returns only Bob Smith's Phone order, because:

  • Bob's name contains "smith"
  • His order contains "phone"
  • John Smith has no orders, so he doesn't appear (INNER JOIN)

You can also use OR conditions in full-text search:

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name","orders.product"],
"query":{"query_string":"alice | bob"},
"join":[{"type":"inner","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This returns all orders from either Alice or Bob.

Let's try a more complex query with LEFT JOIN and full-text search:

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name","orders.product"],
"query":{"query_string":"johnson | white"},
"join":[{"type":"left","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This will return:

  • Alice Johnson with her 2 orders (Laptop and Tablet)
  • Carol White with her 1 order (Monitor)

You can combine full-text search with attribute filtering using bool query:

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name","orders.product","orders.quantity"],
"query":{"bool":{"must":[{"query_string":"alice | bob"},
{"range":{"orders.quantity":{"gt":1}}}]}},
"join":[{"type":"inner","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This returns only Bob's Phone order with quantity=2, since Alice's orders have quantity=1.

Full-Text Search in Main Table

You can perform full-text search on the main (left) table by adding a query block at the root level:

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name","orders.product"],
"query":{"query_string":"smith"},
"join":[{"type":"inner","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This searches for "smith" in the customers table (left table). However, since we're using INNER JOIN, only Bob Smith appears (as John has no orders).

Let's try with LEFT JOIN:

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name","orders.product"],
"query":{"query_string":"smith"},
"join":[{"type":"left","table":"orders",
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This returns both Bob Smith with his Phone order and John Smith with NULL order values.

You can combine full-text search on both tables:

curl -s localhost:9308/search -d '
{"table":"customers","_source":["name","orders.product"],
"query":{"query_string":"smith"},
"join":[{"type":"inner","table":"orders","query":{"query_string":"phone"},
"on":[{"left":{"table":"orders","field":"customer_id"},
"operator":"eq","right":{"table":"customers","field":"id"}}]}]}
' | jq

This searches for "smith" in customers (left table) and "phone" in orders (right table), returning only Bob Smith's Phone order.

Remember the important rules for JOINs:

  1. When selecting fields from two tables, prefix fields from the right table (e.g., orders.product)
  2. Use query block in the join section to search in the right table
  3. Use query block at root level to search in the left (main) table
  4. INNER JOIN requires matches in both tables
  5. LEFT JOIN includes all rows from the left table, even without matches