The 12 Days of Benchmarking - Day 8
Incident rate
So far in this series we've focused on value, experience, and delivery.
Now we turn to something more operational, but just as critical:
How stable are your digital products and services day to day?
Because stability is the foundation that everything else is built on.
What this metric is
Incident rate measures the number of incidents per user per month.
In simple terms:
How many things go wrong for your users, relative to the number of users you support?
It is calculated by dividing the number of incidents in a given month by the number of users supported by your IT services.
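As a minimal sketch of that calculation, the snippet below divides a month's incident count by the number of supported users. The function name `incident_rate` and the figures used (450 incidents, 3,000 users) are illustrative assumptions, not values from the PBM.

```python
def incident_rate(incidents_in_month: int, users_supported: int) -> float:
    """Return incidents per user per month.

    Hypothetical helper for illustration: incidents in a given month
    divided by the number of users supported by IT services.
    """
    if incidents_in_month < 0:
        raise ValueError("incidents_in_month cannot be negative")
    if users_supported <= 0:
        raise ValueError("users_supported must be greater than zero")
    return incidents_in_month / users_supported


# Illustrative figures only: 450 incidents logged, 3,000 users supported.
rate = incident_rate(450, 3000)
print(f"{rate:.2f} incidents per user per month")  # prints "0.15 incidents per user per month"
```

Tracking this figure month by month, rather than the raw incident count, keeps the trend comparable as the number of users grows or shrinks.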
Why it matters
Incident rate is a direct reflection of your operational stability.
A low incident rate indicates:
- stable platforms and infrastructure
- effective change and release practices
- and well-managed capacity and availability
A high incident rate often signals deeper issues:
- recurring problems not being resolved
- or gaps in monitoring and observability
What good looks like in practice
Organisations with a low incident rate typically have:
- Strong problem management
Recurring issues are identified and removed at the root
- Effective change enablement
Changes are well tested and introduced safely
- Proactive monitoring and observability
Issues are detected before they impact users
- Accurate configuration and asset data
Systems and dependencies are well understood
- Stable, well-engineered platforms
Infrastructure and applications are designed for resilience
Why incident rates are high
When incident rates are higher than expected, the PBM highlights some common drivers:
- Ineffective problem management
Root causes of recurring incidents are not removed
- Poor change management
Changes introduce new incidents into the environment
- Weak monitoring and observability
Issues are only detected once users are already impacted
- Outdated or inaccurate configuration data
Leading to incorrect changes or misaligned dependencies
- Capacity and availability gaps
Systems are under-provisioned or poorly managed
- Fragile infrastructure or applications
Legacy systems or technical debt create instability
- Insufficient user training
Users generate incidents because systems are not intuitive or well understood
All of these are highlighted as common contributing factors in the PBM guidance.
How to improve it
If you want to reduce your incident rate:
1. Invest in problem management
Focus on eliminating recurring incidents, not just resolving them
2. Strengthen change enablement
Improve testing, release controls, and change risk assessment
3. Improve monitoring and observability
Detect and resolve issues before users experience them
4. Maintain accurate configuration data
Ensure your CMDB and service maps are reliable and up to date
5. Address technical debt and fragility
Stabilise legacy systems and improve platform resilience
6. Improve user guidance and training
Reduce avoidable incidents caused by confusion or lack of knowledge
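For the first step, one simple way to spot recurring incidents is to count them by category and flag anything that keeps coming back. The sketch below assumes each incident record has already been reduced to a category label. The function name `recurring_categories`, the threshold of three, and the sample data are all hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter


def recurring_categories(categories, min_count=3):
    """Return (category, count) pairs seen at least min_count times.

    Results are ordered from most to least frequent, so the top entries
    are natural candidates for problem management.
    """
    counts = Counter(categories)
    return [(name, n) for name, n in counts.most_common() if n >= min_count]


# Hypothetical month of incident categories, for illustration only.
sample = [
    "vpn", "vpn", "password reset", "vpn",
    "printer", "password reset", "password reset", "password reset",
]

for name, n in recurring_categories(sample):
    print(f"{name}: {n} incidents")
# prints:
# password reset: 4 incidents
# vpn: 3 incidents
```

A list like this does not explain root causes, but it shows where a problem investigation is most likely to reduce the incident rate.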
A simple reflection
If your service desk volumes suddenly doubled next month...
Would you see that as a surprise?
Or would you already know exactly where the issues are coming from?
Take part in the benchmarking
The ITIL Performance Benchmarking Model helps you understand the operational stability of your digital services compared to your peers.
By contributing your data, you can:
- benchmark your incident rates
- identify systemic stability issues
- and prioritise improvements in resilience and reliability
Take part in the PBM survey and contribute your data - ITIL Performance Benchmarking Survey 2026
Tomorrow's focus:
On-time incident resolution - how quickly and consistently do you restore service when things go wrong?