Search

My Search Engine


Unlike a lot of common content management systems, like Wordpress, the system which generates my website does not have a built-in search feature. I use Pelican, a static site generator written in Python, installed through the Pip package manager.

I wanted a challenge, so I decided to build my own search engine. It would be based on two components: a script, which hooks into Pelican's page generation system, and another script to provide a search API REST service.

The search indexer takes the generated HTML code, and the post metadata, and inserts it into a sqlite3 database. I used beautifulsoup4 to get plain text. Admittedly, this is not the most optimal approach, when I could've scanned the pages myself.

The "REST endpoint" for the search API was implemented with Flask, and, sqlite3. Originally I had implemented the server in C++, using Oat++, as it seemed to be good at reflection. Unfortunately, the latter module proved too flaky, so, I went with Python, which handles strings and serializing objects to JSON with ease.

I think my solution works well enough as my website has pretty low traffic volume. There are a few tweaks I could make.

From this project, I've learned that developing software doesn't require the most powerful components, but rather, those that are easy enough to maintain. Having a gradual learning curve and having affordable compromises are important as well.