Lubomir Varbanov

Tech entrepreneur
CTO as a service
  • Residence:
    Europe, Bulgaria
  • Web Applications
  • Full-stack Development
  • Agile Methodologies
  • Infrastructure as Code
  • DevOps & MLOps
  • Requirement Analysis
  • Code Reviews
  • Talks and Lectures
  • Working with Startups
  • Dev Consultancy
  • Community Building
Sense of humor
Pain in the ass

ELK monitoring for e-commerce

Back-end development, ELK, JavaScript, MySQL, PHP, REST API, TypeScript

Project details

The setup

The client is a US e-commerce startup with tens of thousands of monthly orders. The site runs on platform-specific managed hosting and is actively developed and deployed through CI with GitHub Actions. The platform has a history of being slow to process certain client actions during peak hours and promotions such as Black Friday.

Project goals

Identify key business metrics to measure and set up alarms that fire when they fall below certain thresholds, as an indicator of a possible problem. Start measuring backend execution times to identify slow endpoints for future improvements.
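
The threshold idea can be illustrated with a minimal Python sketch. It is not the client's implementation: the metric names, the `Threshold` type and the `breached` helper are hypothetical, and treating a missing metric as a breach is an assumption made here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str     # hypothetical metric name, e.g. "orders_per_hour"
    minimum: float  # the alarm fires when the observed value falls below this


def breached(observed: dict[str, float], thresholds: list[Threshold]) -> list[str]:
    """Return the names of metrics whose latest value is below its minimum."""
    alarms = []
    for t in thresholds:
        value = observed.get(t.metric)
        # Assumption: a metric with no data is reported too, since silence
        # can itself indicate a problem.
        if value is None or value < t.minimum:
            alarms.append(t.metric)
    return alarms
```

Called with the latest readings and a list of thresholds, `breached` returns only the metrics that need attention, which keeps the alerting step separate from the measuring step.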

Work done

A Kibana dashboard was set up for the client as part of an ELK setup. Site metrics and execution time statistics are now collected in the client's backend and stored in separate database tables. Cron jobs sync these tables with a separate MySQL database on the remote, separately hosted instance where the ELK stack runs.
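
As a rough illustration of collecting execution times, the Python sketch below times a block of code and summarises the results per endpoint. It is an assumption-laden stand-in: the project's backend is not written in Python, `timing_rows` replaces the real statistics table with an in-memory list, and `timed` and `p95_by_endpoint` are hypothetical names.

```python
import math
import time
from contextlib import contextmanager

# In-memory stand-in for the statistics table; a real backend would insert rows.
timing_rows: list[tuple[str, float]] = []


@contextmanager
def timed(endpoint: str):
    """Record how long the wrapped block took, in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        timing_rows.append((endpoint, elapsed_ms))


def p95_by_endpoint(rows: list[tuple[str, float]]) -> dict[str, float]:
    """Nearest-rank 95th percentile of execution time per endpoint."""
    grouped: dict[str, list[float]] = {}
    for endpoint, ms in rows:
        grouped.setdefault(endpoint, []).append(ms)
    result = {}
    for endpoint, times in grouped.items():
        times.sort()
        rank = math.ceil(0.95 * len(times))  # 1-based nearest rank
        result[endpoint] = times[rank - 1]
    return result
```

A percentile is used rather than a mean because a handful of very slow requests during peak hours can hide behind a healthy-looking average.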

The MySQL data is then indexed in Elasticsearch, which powers the Kibana dashboard, and 10 separate alerts were set up. Alerts are sent to either Telegram or Slack, depending on severity level, so that the developers can react quickly.
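
Routing by severity could look roughly like the Python sketch below. The write-up only says that the channel depends on the severity level, so the two levels and the mapping used here (warnings to Slack, critical alerts to Telegram) are assumptions, as are the `Severity` and `route_alert` names. The function only builds the message and makes no network calls.

```python
from enum import Enum


class Severity(Enum):
    WARNING = "warning"
    CRITICAL = "critical"


# Assumed mapping; the actual assignment of channels is not documented here.
CHANNEL_BY_SEVERITY = {
    Severity.WARNING: "slack",
    Severity.CRITICAL: "telegram",
}


def route_alert(name: str, severity: Severity) -> dict[str, str]:
    """Pick a channel for the alert and format the message text."""
    channel = CHANNEL_BY_SEVERITY[severity]
    return {"channel": channel, "text": f"[{severity.value.upper()}] {name}"}
```

Keeping the mapping in one dictionary means a change of channel per severity is a one-line edit and does not touch the alert definitions.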

Additionally, 3 types of logs are monitored for the dev teams: hosting logs, JS error logs and the application's own caught exceptions. An additional step was added to the GitHub Actions workflow to store the timestamp and commit message/hash of every deploy. The deploy information makes it possible to analyze site behaviour and spot spikes in warnings/errors or server response times caused by a bad deploy.
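
The kind of analysis that deploy markers enable can be sketched in Python as follows. In the project this is done visually in Kibana; the `suspicious_deploys` helper, the 15-minute window and the factor of 2 are hypothetical choices made for this example.

```python
from datetime import datetime, timedelta


def suspicious_deploys(
    deploys: list[tuple[datetime, str]],
    errors: list[datetime],
    window: timedelta = timedelta(minutes=15),
    factor: float = 2.0,
) -> list[str]:
    """Return commit hashes of deploys followed by a jump in logged errors.

    deploys: (deploy timestamp, commit hash) pairs.
    errors:  timestamps of logged errors.
    """
    flagged = []
    for deployed_at, commit in deploys:
        before = sum(1 for t in errors if deployed_at - window <= t < deployed_at)
        after = sum(1 for t in errors if deployed_at <= t < deployed_at + window)
        # A quiet window before the deploy counts as one error, so a single
        # stray error afterwards does not flag the deploy.
        if after > factor * max(before, 1):
            flagged.append(commit)
    return flagged
```

Comparing equal windows before and after each deploy keeps the check relative to normal traffic, so a busy period alone does not mark a deploy as bad.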


Thanks to the monitoring system, slow endpoints were identified and subsequently optimized. An educated, data-driven decision was made to increase the server resources and to invest time in reworking a slow checkout. Both changes led to an increase of more than 1% in conversion rate.

Incidents are now spotted within minutes and resolved before they can harm business KPIs. 

  • Client:
  • Year:
  • Type:
  • Tech stack:
    ELK, MySQL
© 2023 All Rights Reserved.