Bitbucket DORA Metrics: Connecting Bitbucket Cloud to DORA Tracking
Bitbucket Cloud contains most of the data you need to track DORA metrics — but not all of it, and none of it is pre-calculated. This is a step-by-step guide to what data lives where, which gaps require external integrations, and how to wire it all together into a complete engineering dashboard.
Start with a data inventory
Before connecting any tool to Bitbucket, it helps to understand exactly what data Bitbucket Cloud exposes and what it does not. Many DORA tracking projects stall because teams assume Bitbucket can provide all four metrics natively — it cannot. MTTR in particular requires an incident management integration that no amount of Bitbucket API calls can replace.
The table below summarizes the full data inventory. Use it to audit your current setup before selecting a tracking approach.
| Data Type | In Bitbucket? | API / Source | Notes |
|---|---|---|---|
| Pull request events | Yes | GET /repositories/{workspace}/{repo_slug}/pullrequests | Creation, review, approval, merge timestamps. Full history via pagination. |
| Pipeline run events | Yes | GET /repositories/{workspace}/{repo_slug}/pipelines | Build/test/deploy runs. Status (successful, failed, in_progress, stopped). |
| Deployment environment events | Yes | GET /repositories/{workspace}/{repo_slug}/deployments | Requires named environments in pipeline config. Production events for deploy frequency. |
| Commit history | Yes | GET /repositories/{workspace}/{repo_slug}/commits | Timestamps, messages, authors, branch associations. Rate-limit aware pagination needed. |
| Revert commit detection | Yes | Inferred from commit messages | No dedicated API — requires pattern matching on commit message text. |
| Incident open/close events | No | Not available — requires PagerDuty, OpsGenie, or incident.io | Required for MTTR. Cannot be approximated from Bitbucket data alone. |
| Rollback detection (non-revert) | No | Partial — infer from deployment status + subsequent redeployment | Bitbucket Pipelines failed deployments are visible; manual rollback procedures are not. |
Step 1: Configure Bitbucket Pipelines for deployment tracking
This is the most commonly skipped step, and it undermines deploy frequency measurement entirely. Without named deployment environments in your pipeline configuration, Bitbucket cannot distinguish between a build run and a production deployment. Both appear as pipeline completions in the general Pipelines API.
To enable deployment tracking, your bitbucket-pipelines.yml must include a deployment field in the step that performs your production deployment:
```yaml
pipelines:
  branches:
    main:
      - step:
          name: Deploy to production
          deployment: production
          script:
            - ./deploy.sh
```

With this configuration, Bitbucket creates a deployment record in the Deployments API every time this step runs. The record includes the environment name, the deployment status, the associated commit SHA, and the timestamp. This is the data DORA tracking tools use to calculate deploy frequency and the deployment phase of lead time.
If you have multiple deployment targets (staging, canary, production), configure each as a separate named environment. Most DORA tools will filter to the "production" environment for deploy frequency calculation — confirm this behavior with your tool of choice.
If changing your pipeline configuration is not immediately feasible, some tools fall back to counting merges to the main branch as a proxy for deployments. This approximation is acceptable for teams that deploy on every merge with no human approval gate, but it overcounts significantly for teams with manual deployment approval or scheduled release windows.
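Once deployment records exist, deploy frequency is a small aggregation over the Deployments API response. A minimal sketch in Python — the field names (`environment`, `state`, `completed_on`) are assumptions to illustrate the shape, not the API's exact schema, so check them against a real response from your workspace:

```python
from collections import Counter
from datetime import datetime

def weekly_deploy_frequency(deployments, environment="production"):
    """Count completed deployments per ISO week for one environment.

    `deployments` is assumed to be the parsed `values` list from
    GET /repositories/{workspace}/{repo_slug}/deployments; the field
    names here are illustrative, not the exact API schema.
    """
    counts = Counter()
    for d in deployments:
        # Only count completed runs in the target environment
        if d["environment"] != environment or d["state"] != "COMPLETED":
            continue
        week = datetime.fromisoformat(d["completed_on"]).strftime("%G-W%V")
        counts[week] += 1
    return dict(counts)

sample_deployments = [
    {"environment": "production", "state": "COMPLETED", "completed_on": "2024-05-06T10:00:00"},
    {"environment": "staging", "state": "COMPLETED", "completed_on": "2024-05-06T11:00:00"},
    {"environment": "production", "state": "FAILED", "completed_on": "2024-05-07T10:00:00"},
    {"environment": "production", "state": "COMPLETED", "completed_on": "2024-05-08T10:00:00"},
]
```

Filtering on the environment name here is exactly the "filter to production" behavior most DORA tools apply by default.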
Step 2: Connect pull request data for lead time
Pull request data from Bitbucket's API is well-structured and relatively easy to work with. The key events you need for lead time calculation are:
- First commit timestamp on the branch: This is the start of lead time. It requires traversing the commit history to find the first commit that diverged from the base branch — typically the main or develop branch. The Commits API supports this via branch filtering, but computing the exact divergence point requires some graph traversal.
- PR creation timestamp: Available directly in the PR object as `created_on`. This marks the end of the coding phase and the start of the review phase.
- First review activity timestamp: The first comment, inline review, or approval on the PR. Available from the PR activity feed endpoint. This marks the start of active review.
- PR merge timestamp: Available in the PR object as `updated_on` when the state is `MERGED`. This marks the end of the review phase.
With these four timestamps, plus the deployment event from Step 1, you can calculate:
- Coding time: PR creation − first commit
- Time to first review: first review − PR creation
- Review duration: PR merge − first review
- Deployment time: Deployment event − PR merge (from Step 1 data)
- Total lead time: Deployment event − first commit (the full DORA-defined measurement)
One practical concern: Bitbucket's PR activity endpoint is paginated and can be slow for PRs with large review threads. For high-volume repositories, you will need to implement caching or use a tool that handles this pagination and rate limiting for you.
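The phase arithmetic above is simple once the timestamps are collected. A sketch, assuming all five values are ISO-8601 timestamp strings (as the Bitbucket `created_on` and `updated_on` fields are):

```python
from datetime import datetime

def lead_time_phases(first_commit, pr_created, first_review, pr_merged, deployed):
    """Break total lead time into the phases described above.

    Each argument is an ISO-8601 timestamp string; returns each
    phase as a number of hours.
    """
    fc, cr, rv, mg, dp = (
        datetime.fromisoformat(t)
        for t in (first_commit, pr_created, first_review, pr_merged, deployed)
    )

    def hours(start, end):
        return (end - start).total_seconds() / 3600

    return {
        "coding_time": hours(fc, cr),
        "time_to_first_review": hours(cr, rv),
        "review_duration": hours(rv, mg),
        "deployment_time": hours(mg, dp),
        "total_lead_time": hours(fc, dp),
    }
```

Note that the phases sum to the total only when the events are strictly ordered; out-of-order data (e.g. a review comment after merge) is a sign the activity feed needs filtering.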
Step 3: Identify failure signals for change failure rate
Change failure rate requires identifying which deployments caused problems that needed remediation. In Bitbucket, there are two reliable signals and one weaker inference:
Failed deployment pipeline steps: When a deployment step in Bitbucket Pipelines fails, the Deployments API records a deployment with a failed status. This is the clearest failure signal. A deployment that failed is definitionally a failed change. Some teams argue that a failed deployment that was immediately re-run successfully is not a "failure requiring remediation" — this is a judgment call that affects your CFR denominator. The DORA framework does not specify how to handle re-run deployments; be consistent in your definition.
Revert commits merged to main: When a commit to main is a revert of a previous commit, it signals that a previous change caused a problem. You can detect these by looking for commit messages matching the pattern `Revert "<original commit message>"` — git's default revert message format. The Bitbucket Commits API includes full commit messages, so this pattern can be applied at collection time.
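A minimal version of that pattern match, using git's default revert subject format (this only catches reverts made with `git revert` and an unedited message):

```python
import re

# git's default revert subject line is: Revert "<original subject>"
REVERT_RE = re.compile(r'^Revert "(?P<original>.+)"')

def is_revert(commit_message):
    """True if the commit message matches git's default revert format."""
    return REVERT_RE.match(commit_message) is not None
```

Reverts whose messages were rewritten by the author will slip through this check, which is one reason revert detection undercounts.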
Hotfix branches (weaker signal): Branches named with a hotfix prefix (hotfix/, hot-fix/, fix/) that are merged to main quickly after a production deployment are candidates for CFR inclusion. This inference is unreliable for teams that do not use consistent branch naming conventions.
To be explicit: Bitbucket does not natively track rollbacks, incident severity, or any concept of "this deployment caused a problem." You are inferring failure from deployment status and git history, which covers most but not all failure modes. Production incidents that were resolved by configuration changes, infrastructure restarts, or feature flag toggles will not appear in your Bitbucket-derived CFR.
Step 4: Add an incident management integration for MTTR
There is no workaround for this step. MTTR cannot be calculated from Bitbucket data alone because Bitbucket does not know when a production incident started or ended.
The three incident management tools with the most complete DORA integration are:
PagerDuty: The industry standard for on-call alerting. The PagerDuty API exposes incident created and resolved timestamps, severity, and affected services. MTTR is the average of (resolved_at − triggered_at) for incidents in the production service, filtered to your measurement period. PagerDuty's own analytics feature surfaces MTTR, but it does not connect to Bitbucket for the other three DORA metrics.
OpsGenie: Atlassian's on-call management tool. If your organization is already on the Atlassian stack (Jira, Confluence, Bitbucket), OpsGenie is the natural choice for incident management. The API is similar to PagerDuty — alert created and alert closed timestamps for severity-filtered production alerts.
incident.io: A newer incident management tool with a cleaner API and better structured data around incident severity and impact. Integrates with Slack, PagerDuty, and most monitoring tools. For teams setting up incident management for the first time alongside DORA tracking, incident.io is worth evaluating.
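Whichever tool you choose, the MTTR arithmetic itself is the same. A minimal sketch over (triggered, resolved) timestamp pairs — the exact field names vary by tool, so treat the input shape as an assumption:

```python
from datetime import datetime

def mttr_hours(incidents):
    """Mean time to restore, in hours.

    `incidents` is a list of (triggered_at, resolved_at) ISO-8601
    string pairs, e.g. extracted from the PagerDuty or OpsGenie
    incident APIs (field names vary by tool).
    """
    durations = [
        (datetime.fromisoformat(resolved) - datetime.fromisoformat(triggered)).total_seconds() / 3600
        for triggered, resolved in incidents
    ]
    return sum(durations) / len(durations)
```

This is the average of (resolved_at − triggered_at) described above; the severity filtering happens before this function is called.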
Once you have an incident management integration, MTTR connects to your other DORA data through a shared timeline. Incidents that correlate with deployments (incidents that started within hours of a deployment event) can also inform your change failure rate calculation — an incident triggered by a deployment is a stronger CFR signal than either the deployment status or the revert commit pattern alone.
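The deployment-incident correlation described above amounts to a simple time-window join. A sketch under stated assumptions — the deployment and incident shapes here are illustrative, not any tool's actual schema, and the six-hour window is an arbitrary starting point to tune:

```python
from datetime import datetime, timedelta

def correlate_incidents(deployments, incident_starts, window_hours=6):
    """Return commit SHAs of deployments followed by an incident
    within `window_hours`.

    `deployments` is a list of (commit_sha, deployed_at) pairs from
    the Bitbucket Deployments API; `incident_starts` is a list of
    incident-start timestamps from your incident tool. All
    timestamps are ISO-8601 strings; shapes are assumptions.
    """
    window = timedelta(hours=window_hours)
    suspect = []
    for sha, deployed_at in deployments:
        deployed = datetime.fromisoformat(deployed_at)
        # Flag the deployment if any incident started inside its window
        if any(
            deployed <= datetime.fromisoformat(s) <= deployed + window
            for s in incident_starts
        ):
            suspect.append(sha)
    return suspect
```

Deployments flagged this way are stronger CFR candidates than revert commits or failed pipeline steps alone, since they tie a change to an actual production incident.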
Step 5: Bring it together in a dashboard
With data from Bitbucket (PR events, deployment events, commit history) and your incident management tool, you have everything needed for all four DORA metrics. The final step is deciding where to surface the results.
There are three approaches, roughly in order of setup effort:
Purpose-built DORA tool (lowest effort): Connect Bitbucket and your incident management tool to a platform like Koalr. The tool handles API authentication, data collection, metric calculation, and dashboard rendering. You get DORA metrics on day one, with 90 days of backfilled history, without writing any code or maintaining any infrastructure. This is the right choice for most engineering teams. See how Koalr connects to Bitbucket.
Custom pipeline into a BI tool (medium effort): Build a data collection pipeline that calls the Bitbucket and incident management APIs, stores events in a database, and calculates DORA metrics on a schedule. Surface results in Grafana, Metabase, or Looker. This approach gives you maximum flexibility but requires ongoing maintenance as APIs change and your engineering organization grows. Budget 2–4 weeks of engineering time for initial implementation and ongoing maintenance time for API changes.
Spreadsheet tracking (highest effort, poorest outcomes): Some teams track DORA metrics manually in a spreadsheet — recording deployment dates, PR merge times, and incident durations by hand. This is only viable for very small teams and breaks down quickly as the team grows. It also produces unreliable data because manual entry is inconsistent and selectively biased toward recording positive outcomes.
Common configuration mistakes and how to avoid them
After connecting Bitbucket to a DORA tracking system, these are the configuration issues that most commonly produce inaccurate metric baselines:
- Missing production environment tag in pipelines: If your deployment step is not tagged with `deployment: production`, your deploy frequency will be zero in the Deployments API. Double-check your `bitbucket-pipelines.yml` before assuming the integration is broken.
- Including bot PRs in lead time: Automated dependency update PRs (from Dependabot equivalents or Renovate) inflate PR count and distort review time metrics. Filter these out by excluding PRs from bot accounts or by using a minimum PR duration threshold.
- Not filtering incidents by severity: Including all PagerDuty or OpsGenie alerts — including low-severity informational alerts — in your MTTR calculation produces a metric that measures noise, not production recovery. Filter to P1 and P2 incidents for DORA-relevant MTTR.
- Measuring a 7-day window at the start: DORA metrics are most meaningful over 30, 60, or 90-day windows. A 7-day window has high variance and can make a healthy team look bad or a troubled team look good depending on which week you sample. Start with 90-day windows and trend downward as you accumulate data.
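As one example from the list above, the bot-PR filter can be as simple as an author-name exclusion. The account patterns here are assumptions — adjust them to the actual bot accounts in your workspace:

```python
# Assumed bot-account naming patterns; a coarse substring check,
# so adapt these to your workspace's real bot accounts.
BOT_ACCOUNT_PATTERNS = ("renovate", "dependabot")

def exclude_bot_prs(pull_requests):
    """Drop PRs authored by bot accounts before computing lead time.

    `pull_requests` is assumed to be the `values` list from the
    Bitbucket PR endpoint; the `author.nickname` path is an
    illustrative assumption about the response shape.
    """
    return [
        pr for pr in pull_requests
        if not any(
            pattern in pr["author"]["nickname"].lower()
            for pattern in BOT_ACCOUNT_PATTERNS
        )
    ]
```

A minimum-PR-duration threshold, as mentioned above, is a complementary filter for bots that don't follow a naming convention.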
Skip the pipeline — connect Bitbucket to Koalr in 30 minutes
Koalr handles all the Bitbucket API integration, data joining, and DORA metric calculation automatically. Connect Bitbucket and your incident tool, and your first dashboard is ready with 90 days of backfilled history.