- Wed 01 January 2025
- Meta
- Ben Cottrell
Unlike many common content management systems, such as WordPress, the system that generates my website does not have a built-in search feature. I use Pelican, a static site generator written in Python, installed through the pip package manager.
I wanted a challenge, so I decided to build my own search engine. It would be based on two components: an indexer script, which hooks into Pelican's page generation system, and a second script that serves a REST search API.
The search indexer takes the generated HTML and the post metadata, and inserts them into a sqlite3 database. I used beautifulsoup4 to extract the plain text. Admittedly, this is not the leanest approach, since I could have scanned the pages myself.
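As a rough illustration of the indexing step (a minimal sketch, not the actual script behind this site), the snippet below strips markup from a generated page and stores the result in sqlite3. To stay dependency-free it uses the standard library's `html.parser` as a stand-in for beautifulsoup4. The `pages` table, its columns, and the function names are all assumptions made for the example.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML document with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def index_page(conn, url, title, html):
    """Insert or update one page in the (hypothetical) `pages` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, title, body) VALUES (?, ?, ?)",
        (url, title, html_to_text(html)),
    )
    conn.commit()
```

Calling `index_page(conn, url, title, html)` once per generated page would rebuild the index on every site build. Using the URL as the primary key means re-indexing a page overwrites its old row instead of duplicating it.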
The "REST endpoint" for the search API was implemented with Flask and sqlite3. Originally I had implemented the server in C++, using Oat++, as it seemed to be good at reflection. Unfortunately, it proved too flaky, so I went with Python, which handles strings and serializing objects to JSON with ease.
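The query side can be sketched in a similar way. The example below shows only the lookup and JSON serialization that a Flask route handler could wrap; the Flask wiring itself is left out. It assumes the same hypothetical `pages` table with `url`, `title`, and `body` columns, and uses a simple `LIKE` substring match, which is an assumption rather than the matching strategy actually used on this site.

```python
import json
import sqlite3


def search_pages(conn, query, limit=10):
    """Return pages whose title or body contains `query` (case-insensitive for ASCII)."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = (
        query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    pattern = f"%{escaped}%"
    rows = conn.execute(
        "SELECT url, title FROM pages "
        "WHERE title LIKE ? ESCAPE '\\' OR body LIKE ? ESCAPE '\\' "
        "LIMIT ?",
        (pattern, pattern, limit),
    ).fetchall()
    return [{"url": url, "title": title} for url, title in rows]


def search_response(conn, query):
    """Serialize the search results as the JSON body of an API response."""
    return json.dumps({"query": query, "results": search_pages(conn, query)})
```

Passing the search term as a bound parameter, rather than formatting it into the SQL string, keeps the endpoint safe from SQL injection.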
I think my solution works well enough, as my website has fairly low traffic. There are still a few tweaks I could make.
From this project, I've learned that developing software doesn't require the most powerful components, but rather ones that are easy enough to maintain. A gradual learning curve and affordable compromises are important as well.