The use of resources shall be monitored and adjusted in line with current and expected capacity requirements.
We actively monitor our Linux servers and Kubernetes clusters, which lets us maintain performance and resource utilization across our infrastructure.
We use Prometheus and Grafana from the kube-prom stack to collect and visualize resource metrics. Grafana dashboards show usage patterns, and Alertmanager is configured to send real-time alerts to both our team and our customers through webhook integration, so all stakeholders stay informed.
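To illustrate the "expected capacity" part of the control, the following is a minimal sketch of a linear forecast over collected metric samples, similar in spirit to what a trend-based alert rule would evaluate. The function name `forecast_linear`, the sample values, and the unit (GiB free) are assumptions made for illustration and do not describe our actual alert rules.

```python
from typing import Sequence, Tuple


def forecast_linear(samples: Sequence[Tuple[float, float]], horizon_s: float) -> float:
    """Fit a least-squares line through (timestamp_s, value) samples and
    return the predicted value horizon_s seconds after the last sample."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("timestamps must not all be equal")
    # Slope and intercept of the ordinary least-squares fit.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return slope * (last_t + horizon_s) + intercept


# Hypothetical samples: free disk space in GiB, one reading per hour.
free_gib = [(0.0, 100.0), (3600.0, 90.0), (7200.0, 80.0)]
predicted = forecast_linear(free_gib, horizon_s=4 * 3600)
low_capacity_expected = predicted < 50.0  # hypothetical threshold
```

A forecast like this only extrapolates the recent trend, so it complements, rather than replaces, threshold alerts on current usage.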
Our in-house alert processor, Opsmondo, extends our alert management by creating Gitea issues for the alerts it receives from Alertmanager and then dispatching notifications. We use Mattermost for immediate team communication and Grafana OnCall for incident management. This ensures that no alert goes unnoticed and that we can respond swiftly to any capacity issue.
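The alert-to-issue step can be sketched as follows. This is not Opsmondo's actual code: it is a hypothetical, minimal illustration that assumes the standard Alertmanager webhook payload (a JSON object with an `alerts` list whose entries carry `status`, `labels`, `annotations`, and `startsAt`) and produces the `title` and `body` fields that Gitea's issue-creation endpoint accepts. The function name, the title format, and the sample alert are assumptions.

```python
from typing import Any, Dict, List


def alerts_to_issues(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert an Alertmanager webhook payload into issue payloads.

    Each returned dict could be sent as the JSON body of
    POST /api/v1/repos/{owner}/{repo}/issues on a Gitea instance.
    Only firing alerts produce an issue; resolved alerts are skipped here.
    """
    issues: List[Dict[str, str]] = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        name = labels.get("alertname", "UnknownAlert")
        target = labels.get("instance") or labels.get("namespace") or "unknown target"
        severity = labels.get("severity", "none")
        body_lines = [
            f"Summary: {annotations.get('summary', 'n/a')}",
            f"Description: {annotations.get('description', 'n/a')}",
            f"Started at: {alert.get('startsAt', 'n/a')}",
        ]
        issues.append({
            "title": f"[{severity}] {name} on {target}",
            "body": "\n".join(body_lines),
        })
    return issues


# Hypothetical payload with one firing and one resolved alert.
sample_payload = {
    "status": "firing",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "DiskSpaceLow", "instance": "node-1:9100", "severity": "warning"},
            "annotations": {"summary": "Disk usage above 85%"},
            "startsAt": "2024-01-01T00:00:00Z",
        },
        {"status": "resolved", "labels": {"alertname": "HighCPU"}, "annotations": {}},
    ],
}
issues = alerts_to_issues(sample_payload)
```

Keeping the conversion a pure function, separate from the HTTP calls to Gitea and the notification channels, makes the mapping easy to test in isolation.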
Because our monitoring and alerting stack is fully self-hosted, we keep complete control over our resources and can tailor our capacity management to the changing needs of our operations and those of our customers.