Mean Time to Restore

Mean Time To Restore or MTTR

MTTR is the average time elapsed between the start of a production incident and the restoration of service.


Computation of the metric

  1. When a production outage occurs, the time of the incident is recorded and serves as the start time.
  2. When the incident is resolved and service is restored, the time of the resolution is recorded and serves as the end time.
  3. The difference between the start time and the end time is the “time to restore” for that incident.
  4. The average time to restore across all incidents is the mean time to restore.
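
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's actual implementation: the incident timestamps and the function name mean_time_to_restore are hypothetical, and each incident is assumed to be a (start, end) pair of datetime values.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (start time, end time) pairs.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),   # 30 minutes
    (datetime(2024, 1, 12, 14, 0), datetime(2024, 1, 12, 16, 0)),  # 2 hours
    (datetime(2024, 1, 20, 9, 0), datetime(2024, 1, 20, 9, 30)),   # 30 minutes
]


def mean_time_to_restore(incidents):
    """Average of (end - start) across all resolved incidents."""
    if not incidents:
        raise ValueError("at least one resolved incident is required")
    # Steps 1-3: per-incident duration from start time to end time.
    durations = [end - start for start, end in incidents]
    # Step 4: arithmetic mean of the durations.
    return sum(durations, timedelta()) / len(durations)


mttr = mean_time_to_restore(incidents)
print(mttr)  # 1:00:00
```

Incidents that are still open have no end time, so a sketch like this would need to exclude them before averaging.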

Industry benchmarks

The DORA performance grades for MTTR are:

  1. Elite - Less than 1 hour
  2. High - Between 1 hour and 1 day
  3. Medium - Between 1 day and 1 week
  4. Low - Between 1 week and 1 month
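
As a rough sketch, the grades above could be applied to a computed MTTR value as follows. The function name dora_grade is hypothetical, and the code makes three assumptions: the High band ends at 1 day, a value that falls exactly on a boundary belongs to the slower grade, and anything longer than 1 month is still graded Low, since the list does not define a band beyond that.

```python
from datetime import timedelta


def dora_grade(mttr):
    """Map an MTTR value to a performance grade (boundaries are assumptions)."""
    if mttr < timedelta(hours=1):
        return "Elite"
    if mttr < timedelta(days=1):   # assumed upper bound for High
        return "High"
    if mttr < timedelta(weeks=1):
        return "Medium"
    # Assumption: 1 week or longer is Low, including values beyond 1 month.
    return "Low"


print(dora_grade(timedelta(minutes=30)))  # Elite
print(dora_grade(timedelta(hours=5)))     # High
print(dora_grade(timedelta(days=3)))      # Medium
print(dora_grade(timedelta(days=10)))     # Low
```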

Dashboards where this metric is used

  1. DORA Metrics

Use cases of this metric

  1. MTTR measures how quickly incidents are resolved, indicating the efficiency of incident management processes.
  2. Tracking MTTR trends over time helps identify areas for process optimization and efficiency gains.
  3. Longer MTTR values mean prolonged system downtime and therefore higher risk, signaling the need for proactive measures to reduce incident resolution times.
  4. Faster incident resolution improves customer satisfaction by minimizing service disruptions and downtime.