This is a development contribution report about the newly developed multiple-word text search for Utopian.
I am very grateful to have been able to take part in this action made possible by @elear and the Utopian community.
When I offered to take part, Elear suggested that it would be cool if I did it together with @codingdefined. So I got in touch with @codingdefined on Discord and started the development yesterday. We each worked on our own fork of the api.utopian.io repo and stayed in contact over Discord to discuss the project.
I had some experience with MongoDB before this, but only for basic finds, edits, deletes and such, nothing fancy.
Yesterday I researched for many hours and read quite a few articles about the different types of search available for MongoDB/Mongoose. I tested some possible solutions locally and eventually concluded that text search with added indexes and ordering by relevance ($meta) score was the best option for this task.
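To make the idea concrete, here is a minimal sketch of what such a query looks like. It is not the exact code from the PR: the helper name `buildTextSearchArgs` and the model name `Post` are made up for illustration. It only builds the plain objects that a MongoDB text search takes (the `$text` filter, plus a projection and a sort on the `textScore` metadata), so it runs without a database.

```typescript
// Plain objects only; nothing here talks to a database.
type TextScoreMeta = { $meta: "textScore" };

interface TextSearchArgs {
  filter: { $text: { $search: string } };
  projection: { score: TextScoreMeta };
  sort: { score: TextScoreMeta };
}

// Hypothetical helper: builds the filter, projection and sort for a text search.
function buildTextSearchArgs(query: string): TextSearchArgs {
  const score: TextScoreMeta = { $meta: "textScore" };
  return {
    // $text uses the collection's text index to match the words in `query`.
    filter: { $text: { $search: query } },
    // Expose the relevance score on every returned document as `score`.
    projection: { score },
    // Sort by that same relevance score, best matches first.
    sort: { score },
  };
}

const searchArgs = buildTextSearchArgs("steem development");
// With Mongoose this would be used roughly as (Post is an assumed model name):
//   Post.find(searchArgs.filter, searchArgs.projection).sort(searchArgs.sort)
```

The projection is what makes the score show up in the results, and the sort is what orders them by relevance. A text index has to exist on the collection for `$text` to work.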
I studied the repo to understand how best to integrate the search and do away with the regex search, as @elear suggested.
I managed to keep most of the existing structure intact. I stayed up very late yesterday and got the search working the way I wanted, but I didn't want to make a PR so late.
Today, after I made the PR, @codingdefined and I started testing to check that it behaves as expected.
While testing, @codingdefined found an unrelated bug on the live Utopian site and reported it.
Cool, so what does the new search do?
Well, with this we can search for multiple words at a time (regex was bad at this), and the results are ordered by a $meta match score (relevance). This relevance score is also broadcast by api.utopian.io when making a search, so we could highlight the results with a really high score.
Notice the score being broadcast below:
We can set weights for the body of the text and the title. Right now the title has a weight of 5 and the body has a weight of 2. So a post that has a searched word in the title is more relevant (has more weight) than a post that has the word in the body.
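As a rough sketch of how such weights are declared, the snippet below builds the two arguments of a weighted text index. The field names `title` and `body`, the helper `buildTextIndex` and the schema name `PostSchema` are assumptions for illustration and may not match the actual code in the repo.

```typescript
interface TextIndexSpec {
  fields: Record<string, "text">;
  options: { weights: Record<string, number> };
}

// Hypothetical helper: every weighted field becomes part of one text index.
function buildTextIndex(weights: Record<string, number>): TextIndexSpec {
  const fields: Record<string, "text"> = {};
  for (const field of Object.keys(weights)) {
    fields[field] = "text";
  }
  // Copy the weights so later changes to the input do not leak into the spec.
  return { fields, options: { weights: { ...weights } } };
}

// Weights from this post: title 5, body 2.
const postIndex = buildTextIndex({ title: 5, body: 2 });
// With Mongoose this would be applied roughly as (PostSchema is an assumed name):
//   PostSchema.index(postIndex.fields, postIndex.options)
```

The weights live on the index and not on the query, so changing them means rebuilding the index.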
We can also search for an exact phrase by putting the desired text between double quotes, or exclude a word by putting a minus sign in front of it.
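Here is a small illustration of how such a search string is put together. The helper `buildSearchString` is hypothetical and not part of the API; it only shows the syntax that MongoDB's `$search` understands: plain words, phrases in double quotes, and excluded words prefixed with a minus.

```typescript
interface SearchTerms {
  words?: string[];   // any of these may match
  phrases?: string[]; // each must appear exactly as written
  exclude?: string[]; // documents containing these are dropped
}

// Hypothetical helper: joins the parts into one $search string.
function buildSearchString(terms: SearchTerms): string {
  const words = terms.words ?? [];
  const phrases = (terms.phrases ?? []).map((p) => `"${p}"`);
  const excluded = (terms.exclude ?? []).map((w) => `-${w}`);
  return [...words, ...phrases, ...excluded].join(" ");
}

const searchString = buildSearchString({
  words: ["utopian"],
  phrases: ["text search"],
  exclude: ["regex"],
});
// searchString is: utopian "text search" -regex
```

In practice the user just types the quotes and the minus directly into the search box, and the string is handed to `$search` as it is.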
Here is the pull request:
https://github.com/utopian-io/api.utopian.io/pull/18
Going forward, I am investigating how to add an aggregation pipeline text search in order to limit the broadcast results by a $meta score threshold, so that we only broadcast relevant entries and not somewhat relevant or barely relevant ones.
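One possible shape for such a pipeline is sketched below. This is only an idea under investigation, not something that is in the PR. The helper `buildThresholdPipeline`, the threshold value 1.5 and the model name `Post` are all assumptions. The snippet only builds the pipeline stages as plain objects.

```typescript
type Stage = Record<string, unknown>;

// Hypothetical helper: text search that keeps only results above a minimum score.
function buildThresholdPipeline(query: string, minScore: number): Stage[] {
  return [
    // A $match that uses $text has to be the first stage of the pipeline.
    { $match: { $text: { $search: query } } },
    // Copy the relevance metadata into a regular field so later stages can use it.
    { $addFields: { score: { $meta: "textScore" } } },
    // Drop everything below the threshold.
    { $match: { score: { $gte: minScore } } },
    // Best matches first.
    { $sort: { score: -1 } },
  ];
}

const pipeline = buildThresholdPipeline("steem development", 1.5);
// With Mongoose this would be run roughly as (Post is an assumed model name):
//   Post.aggregate(pipeline)
```

A good threshold would have to be found by testing, since the scores depend on the index weights and on the content of the posts.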
Open Source Contribution posted via Utopian.io