11/27/2023

Intermapper open source

We created Cabot because we wanted a way of monitoring services, not just machines.

Cabot allows you to monitor logical services - running on one machine or 100 nodes - as logical units. Instead of (or rather, as well as) monitoring your Hadoop cluster's disk usage, you can make sure you get alerts when the response time for your ElasticSearch-based search API crosses a threshold, or when a particular database on your Redis server grows past a certain size.

Services have four possible overall statuses which feed into alert frequency:

Passing - if a service has previously been marked as Error or Critical, you will get a notification to say that the service is back to normal.
Warning - things you want to be alerted to, but probably not disturbed for.
Error - alerts possible via all channels other than phone call, and repeated every 2 hours.
Critical - this is for when the production database crashes at 2am. Phone alerts and other notifications to duty officers are repeated every ten minutes.

It's easier to understand what happens when a service changes from its Start status to End status by looking at the following table.

A Hipchat message that does not contain […] (which will cause pop-up desktop notifications, emails and/or SMS notifications to be sent).
A telephone alert is a phone call, not just an SMS.
Alert repeats every 2 hours (notification every 10 mins).

Cabot sets the status of a Service to the status of its most important (Critical > Error > Warning) failing check. (You can configure this by changing the Importance of a check.)

Cabot will ignore checks that are disabled (by unticking the Active box on the check configuration page) when calculating service status. You can disable alerts for a service by editing it and unticking Alerts enabled.

We have thought of implementing a mechanism for acknowledging alerts. Currently we respond by jumping into Hipchat and using that as a war room for coordinating response, but a more formal escalation and acknowledgement procedure may be appropriate for larger organisations. Suggestions and pull requests are welcome.

Adding a service

Adding a service is straightforward - just click New ▾ and then Service. If you have already configured Checks that you want to make part of the service, you can select them in the Status checks field. You can also choose which channels should carry notifications for this service, select the users who should receive alerts/notifications, or disable alerts entirely.

Sharing debugging information

Because one of Cabot's most important features is helping consolidate information and metrics that tell us what is broken, it's the first place that we look to work out how to fix it. Rather than build a wiki, we embed private Hackpads onto each Service overview page.
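The status rule described in this post (a service takes the status of its most important failing check, disabled checks are ignored, and the status determines how often alerts repeat) can be sketched in a few lines of Python. This is only an illustration of the rule as the text states it, not Cabot's actual code: the names `Check`, `service_status` and `repeat_interval`, and the upper-case status strings (including `"PASSING"` for the back-to-normal state), are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Severity order taken from the text: Critical > Error > Warning.
SEVERITY = {"WARNING": 1, "ERROR": 2, "CRITICAL": 3}

# Repeat intervals in minutes as stated in the text. The text gives no
# repeat interval for Warning, so it is deliberately left out.
REPEAT_MINUTES = {"ERROR": 120, "CRITICAL": 10}


@dataclass
class Check:
    """Hypothetical stand-in for a status check."""
    name: str
    importance: str      # "WARNING", "ERROR" or "CRITICAL"
    passing: bool
    active: bool = True  # disabled checks do not count towards the status


def service_status(checks: Iterable[Check]) -> str:
    """Return the importance of the most important active failing check."""
    failing = [c for c in checks if c.active and not c.passing]
    if not failing:
        return "PASSING"
    worst = max(failing, key=lambda c: SEVERITY[c.importance])
    return worst.importance


def repeat_interval(status: str) -> Optional[int]:
    """Minutes between repeated alerts, or None if the text states none."""
    return REPEAT_MINUTES.get(status)


checks = [
    Check("disk usage", "WARNING", passing=False),
    Check("search latency", "ERROR", passing=False),
    Check("database up", "CRITICAL", passing=False, active=False),
]
status = service_status(checks)
print(status, repeat_interval(status))  # ERROR 120
```

The failing Critical check does not escalate the service here because it is disabled, which mirrors the effect of unticking the Active box on a check.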