Monitoring and Alerting

Monitoring is a good thing (duh!).  It’s one of the core functions that an operations group provides.  As with most things, there are good ways and bad ways of doing monitoring.

When you monitor, you end up with some kind of dashboard, say with Nagios:



Which is very helpful when you’re looking at it.

Ever seen a setup like this?


I think people who build systems like this really liked the movie WarGames, but have missed one important difference – in the movie, it was someone’s job to sit and watch the screens 24/7/365.  Do you have staff for that?  Should we treat people that way?

In the real world, people have better things to do than stare at a monitor, waiting for some indicator to turn red.  Large displays like this become “monitoring theater” (see Security Theater) – basically fluff to make people think the system is being monitored. But, with those monitors sitting there, what happens when nobody is looking?


You must have alerting to make your monitoring worthwhile. Do you?
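
To make the difference concrete, here is a minimal sketch of alerting layered on top of a check. It is not how Nagios or any particular tool does it; the `Alerter` class, the `notify` hook, and the check name are all made up for illustration. The idea is just that a message gets pushed to a person when a check changes state, so nobody has to be watching a screen.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

OK, CRITICAL = "OK", "CRITICAL"


@dataclass
class Alerter:
    # notify is a hypothetical hook; in practice it might page or email someone.
    notify: Callable[[str], None]
    # Last known state per check name; unseen checks are assumed to be OK.
    last_state: Dict[str, str] = field(default_factory=dict)

    def record(self, check: str, healthy: bool) -> None:
        """Store the latest result and push a message only when the state changes."""
        state = OK if healthy else CRITICAL
        previous = self.last_state.get(check, OK)
        self.last_state[check] = state
        if state != previous:
            self.notify(f"{check}: {previous} -> {state}")


# Usage: collect messages in a list instead of paging a real person.
sent = []
alerter = Alerter(notify=sent.append)
alerter.record("disk /var", True)    # healthy: nobody is bothered
alerter.record("disk /var", False)   # goes red: one message is pushed
alerter.record("disk /var", False)   # still red: no repeat message
alerter.record("disk /var", True)    # recovery: one more message
```

Alerting only on state changes is one design choice among several; a real setup would probably also want escalation and repeat reminders for problems that stay red.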

2 responses to “Monitoring and Alerting”

  1. Rajesh Swarnkar

    Indeed ELK should combine alerting and reporting stack.

  2. Thank you for the wonderful post!
