Mean Time To Restore or MTTR
MTTR is the average time taken to restore service after a production incident, measured from the start of the incident to its resolution.
Computation of the metric
- When a production outage occurs, the time of the incident is recorded and serves as the start time.
- When the incident is resolved and service is restored, the time of the resolution is recorded and serves as the end time.
- The difference between the start time and the end time is the “time to restore” for that incident.
- The average time to restore across all incidents is the mean time to restore (MTTR).
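The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's actual implementation: the function name `mean_time_to_restore`, the (start, end) tuple layout, and the choice to skip incidents that are still open are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def mean_time_to_restore(incidents):
    """Average (end - start) over resolved incidents.

    `incidents` is a list of (start, end) datetime pairs. Incidents whose
    end is None (still open) are skipped; that is an assumption of this
    sketch. Returns None when there is nothing to average.
    """
    durations = [end - start for start, end in incidents if end is not None]
    if not durations:
        return None
    # sum() needs a timedelta start value because the default start is int 0
    return sum(durations, timedelta()) / len(durations)

# Hypothetical sample data: one 30-minute and one 90-minute outage
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 15, 30)),
]
print(mean_time_to_restore(incidents))  # 1:00:00
```

Because MTTR is a mean, a single very long incident can pull the value up sharply, so it is worth looking at the individual durations alongside the average.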
Industry benchmarks
The DORA performance grades for MTTR are:
- Elite - Less than 1 hour
- High - Between 1 hour and 1 day
- Medium - Between 1 day and 1 week
- Low - Between 1 week and 1 month
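As a rough sketch, an MTTR value could be mapped to one of these grades as follows. The function name `dora_grade` is hypothetical, and the sketch makes three assumptions: the High band ends at 1 day (where the Medium band begins), each band's lower bound is inclusive, and any value of 1 week or more is treated as Low, since the list does not name a grade beyond 1 month.

```python
from datetime import timedelta

def dora_grade(mttr):
    """Map an MTTR (a timedelta) to a performance grade.

    Band edges follow the list above; which side of each edge is
    inclusive is an assumption of this sketch.
    """
    if mttr < timedelta(hours=1):
        return "Elite"
    if mttr < timedelta(days=1):
        return "High"
    if mttr < timedelta(weeks=1):
        return "Medium"
    # Assumption: everything from 1 week upward counts as Low
    return "Low"

print(dora_grade(timedelta(minutes=45)))  # Elite
print(dora_grade(timedelta(hours=6)))     # High
print(dora_grade(timedelta(days=3)))      # Medium
print(dora_grade(timedelta(days=10)))     # Low
```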
Dashboards where this metric is used
- DORA Metrics
Use cases of this metric
- MTTR measures how quickly incidents are resolved, indicating the efficiency of incident management processes.
- Tracking MTTR trends over time helps identify areas for process optimization and efficiency gains.
- A longer MTTR means more downtime per incident and therefore higher risk, which signals a need for proactive measures to reduce incident resolution times.
- Faster incident resolution improves customer satisfaction by minimizing service disruptions and downtime.