🔍 Basic Principles of Software Monitoring

by Dmytro Litvinov


So you are running a SaaS business at roughly $100k MRR 🤑 (or, let's imagine, $1-5k MRR) and you are still in the active development phase. You keep adding "features, not bugs" 😄 to the project, and you notice that it is getting harder not to break the code. So what are the next steps?

First of all, we need to track issues in production 🐛, and for that there is a great tool called Sentry. It has a free tier 🌟

🐛
Track the bugs in your code before they evolve into features in your production

Next, you want to know that your website is actually up (yes, I know the owner is in the business 24/7, checking and monitoring, but there should be time for your family too). So let's delegate that check to an uptime monitoring service. My go-to tool today is UptimeRobot (I used an affiliate link). It has a free tier 🌟

So now we can lie back and chill 🏝️
You don't see any errors in Sentry, and an uptime monitoring service like UptimeRobot shows the site is "up". So what could go wrong? 🤔


But what if I told you that an uptime monitoring service does not show the full picture? Imagine a gradual increase in database latency: 20% of your users start to hit timeouts. Just 20% of your users, who could potentially account for 80% of your $1-5k MRR 😨
You won't know about it until the pinger, along with regular users, starts to time out.
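To make that blind spot concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names, the 5-second timeout, and the sample latencies are made up, and a real pinger or metrics backend will look different. It only shows that a single probe can report "up" while a fifth of real requests time out.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def uptime_check(probe_latency_s, timeout_s):
    """What a simple pinger sees: one probe, either up or down."""
    return "up" if probe_latency_s < timeout_s else "down"


def latency_report(latencies_s, timeout_s):
    """What request-level monitoring sees: the distribution, not one probe."""
    timed_out = sum(1 for s in latencies_s if s >= timeout_s)
    return {
        "p50_s": percentile(latencies_s, 50),
        "p95_s": percentile(latencies_s, 95),
        "timeout_ratio": timed_out / len(latencies_s),
    }


# Hypothetical sample: 8 fast requests and 2 that hit the slow database path.
TIMEOUT_S = 5.0
latencies = [0.2, 0.3, 0.25, 0.4, 0.3, 0.35, 0.2, 0.3, 6.0, 7.5]

# The pinger happened to land on a fast path, so it reports "up".
print(uptime_check(probe_latency_s=0.3, timeout_s=TIMEOUT_S))
# The report shows a healthy median but a p95 above the timeout.
print(latency_report(latencies, TIMEOUT_S))
```

The median here still looks fine, which is exactly why an average or a single ping hides the problem; the high percentile and the timeout ratio are what give it away.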

🙈
What's a website's favorite game? "Hide and seek" with the uptime monitor!

So we are a mature enough project with good enough MRR, and our customers need to be able to rely on our service. An uptime monitoring service alone is not enough for that, and that's where monitoring 🔍 comes into the game.

Proactive monitoring is key to flagging potential issues with your applications and infrastructure early, enabling you to respond quickly and reduce downtime.

The Four Golden Signals (latency, traffic, errors, and saturation):

Image by Denise Yu (from X: https://twitter.com/deniseyu21/status/1092615933531688961)

With these four metrics, you can say exactly which API endpoints are slow, and whether the cause is a newly released feature such as PDF rendering, which is a CPU-intensive task, or latency from a third-party application like SFDC. Every request is traceable, so we are not blind anymore: we can adapt our codebase/infrastructure and not lose money 😎
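As a rough sketch of what "per-endpoint metrics" can mean, the Python snippet below keeps latency, traffic, and errors per endpoint in memory, plus a crude saturation proxy (requests in flight). The class name `GoldenSignals`, its methods, and the endpoint paths are all hypothetical. A real setup would export these numbers to a metrics backend instead of holding them in a dict, and would measure saturation from CPU, memory, or connection pools.

```python
import math
import time
from collections import defaultdict


class GoldenSignals:
    """Per-endpoint latency, traffic, and errors, plus a process-wide saturation proxy."""

    def __init__(self):
        self.durations = defaultdict(list)  # latency samples in seconds
        self.requests = defaultdict(int)    # traffic: requests seen
        self.errors = defaultdict(int)      # errors: failed requests
        self.in_flight = 0                  # saturation proxy: concurrent requests
        self.peak_in_flight = 0

    def record(self, endpoint, duration_s, ok):
        """Store the outcome of one finished request."""
        self.durations[endpoint].append(duration_s)
        self.requests[endpoint] += 1
        if not ok:
            self.errors[endpoint] += 1

    def track(self, endpoint, handler):
        """Run handler() and record how long it took and whether it raised."""
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        started = time.perf_counter()
        ok = False
        try:
            result = handler()
            ok = True
            return result
        finally:
            self.in_flight -= 1
            self.record(endpoint, time.perf_counter() - started, ok)

    def summary(self, endpoint):
        """Return request count, error rate, and nearest-rank p95 latency."""
        samples = sorted(self.durations.get(endpoint, []))
        total = self.requests.get(endpoint, 0)
        if not samples:
            return {"requests": 0, "error_rate": 0.0, "p95_s": None}
        rank = max(1, math.ceil(0.95 * len(samples)))
        return {
            "requests": total,
            "error_rate": self.errors.get(endpoint, 0) / total,
            "p95_s": samples[rank - 1],
        }


signals = GoldenSignals()
# Hypothetical traffic: a cheap endpoint and a CPU-heavy PDF endpoint.
for duration in (0.05, 0.06, 0.04, 0.05):
    signals.record("/api/invoices", duration, ok=True)
for duration, ok in ((1.8, True), (2.4, True), (5.0, False)):
    signals.record("/api/invoices/pdf", duration, ok=ok)

for endpoint in ("/api/invoices", "/api/invoices/pdf"):
    print(endpoint, signals.summary(endpoint))
```

In a web app, something like `signals.track("/api/invoices/pdf", render_pdf)` would wrap the real handler (`render_pdf` being a placeholder for your own function), so the slow or failing endpoint stands out in the summary instead of hiding behind a green "up" badge.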

🤖
May the DevOps be with you