Tokenization of Chinese texts
First, we make sure there's a running Manticore instance on the machine.
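One way to verify this is to query the daemon's status over the SQL protocol. This is a sketch assuming the default SQL port 9306; adjust the port if your configuration differs:

```shell
# Ask the running searchd for its status over the MySQL protocol.
# Port 9306 is Manticore's default SQL port — an assumption here.
mysql -P9306 -h0 -e "SHOW STATUS LIKE 'uptime';"
```

If the daemon is up, this prints an `uptime` counter; a connection error means searchd is not running or is listening on a different port.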
Everything is OK, so now we connect to the Manticore search daemon:
mysql -P 9306 -h0
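For the example below to work, the `testrt` table must exist and be configured to segment Chinese text. A minimal sketch of such a table, assuming Manticore's ICU-based Chinese morphology is available in your build (the column list mirrors the INSERT used below):

```sql
-- Hypothetical schema for testrt: an RT table with ICU-based
-- Chinese word segmentation enabled via the morphology option.
CREATE TABLE testrt (
    title   text,
    content text,
    gid     int
) morphology = 'icu_chinese';
```

With `morphology = 'icu_chinese'`, Manticore splits continuous CJK text into words at indexing and search time, which is what makes the separator-free queries below work.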
Let's insert a simple mixed Chinese-English sentence, '买新的Apple电脑' ('buy a new Apple computer'):
INSERT INTO testrt VALUES ( 1, 'first record', '买新的Apple电脑', 1 );
And now let's search against our content field with the query phrase 'Apple电脑' ('Apple computer'):
SELECT * FROM testrt WHERE MATCH ('@content Apple电脑');
As we can see, the search was executed successfully and returned the expected result, even though neither the original sentence nor the query phrase contained any separators between the words.
We can use the 'SHOW META' command to see information about the query and make sure that our sentence was properly segmented into separate words:
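SHOW META must be issued in the same session, immediately after the SELECT, since it reports on the most recent query:

```sql
-- Run right after the SELECT above, in the same connection.
-- Among other statistics, the output lists one keyword[N] row
-- per token the query was segmented into.
SHOW META;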
We see that our search phrase was indeed divided into the words 'apple' and '电脑' ('computer'), just as we expected.