Implementation, Deployment & Maintenance

Go-Live & Hypercare

Every preceding phase (requirements gathering, design, build, testing, training, and cutover planning) has been building toward a single moment: go-live, the day the new system moves from staging to production and real users begin doing real work inside it. That moment is exhilarating and terrifying in equal measure. Go-live is not a finish line; it is the beginning of a brief, high-stakes window called hypercare: an intensified support period in which the project team remains on standby to stabilise the system and protect the business.

As a business analyst, your role does not end when the code is deployed. You authored the requirements the system was built against. You defined the acceptance criteria. During go-live and hypercare you are a critical interface between confused end-users, stressed developers, and anxious executives — the person who can translate between all three groups and keep the organisation functioning while the system beds in.

Launch Day: What Actually Happens

A well-run go-live is not a single event — it is a structured sequence of activities executed against a documented go-live runbook. The runbook (also called a go-live checklist or cutover plan) lists every action, the person responsible, the expected duration, and the success criterion for that step. Nothing should be improvised on go-live day.
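
To make the runbook structure concrete, here is a minimal sketch of one way to represent it. The class name, the step names, the owners, and the durations are all hypothetical; a real runbook usually lives in a spreadsheet or a project tool, and each success check would be a real test rather than a hard-coded value.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunbookStep:
    action: str                         # what is done
    owner: str                          # the person responsible
    expected_minutes: int               # the expected duration
    success_check: Callable[[], bool]   # returns True when the success criterion is met


def execute_runbook(steps):
    """Run steps in order and stop at the first one whose success criterion fails."""
    completed = []
    for step in steps:
        if not step.success_check():
            # A failed step goes to the decision-maker; nothing is improvised.
            return completed, step
        completed.append(step.action)
    return completed, None


# Hypothetical runbook; the smoke tests are made to fail for illustration.
runbook = [
    RunbookStep("Freeze release branch", "Release manager", 15, lambda: True),
    RunbookStep("Run delta data migration", "Migration lead", 90, lambda: True),
    RunbookStep("Execute smoke tests", "QA lead", 45, lambda: False),
    RunbookStep("Send go-live announcement", "Business analyst", 10, lambda: True),
]

completed, failed = execute_runbook(runbook)
print("Completed:", completed)
print("Stopped at:", failed.action if failed else "nothing, all steps passed")
```

In this sketch the announcement is never sent, because the sequence halts as soon as a success criterion is not met.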

A typical launch-day sequence for a mid-sized system looks like this:

Pre-go-live freeze — code is locked; no new features or fixes enter the release branch without change-control approval.
Final data migration run — delta data since the last rehearsal migration is loaded; checksums are verified against the source system.
Smoke tests — a short scripted set of the most critical business transactions is executed by the QA team to confirm the deployment is healthy before users are admitted.
Go / no-go decision — stakeholders confirm that smoke tests passed and the rollback window has not been exceeded; leadership signs off.
Communication sent — an announcement goes to all users explaining that the new system is live and where to get help.
Hypercare begins — enhanced support staffing activates; the war room opens.
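
The checksum verification in the final data migration step can be pictured with a small sketch. It assumes both systems can export the migrated table as rows of plain values; the function names and the sample bin-location rows are hypothetical, and joining fields with "|" is a simplification that would need proper escaping for real data.

```python
import hashlib


def table_checksum(rows):
    """Order-independent SHA-256 digest over rows (each row is a tuple of values)."""
    digest = hashlib.sha256()
    # Sorting makes the result independent of the order rows were exported in.
    for line in sorted("|".join(map(str, row)) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def verify_migration(source_rows, target_rows):
    """Compare row counts and checksums between source and target exports."""
    return {
        "row_count_match": len(source_rows) == len(target_rows),
        "checksum_match": table_checksum(source_rows) == table_checksum(target_rows),
    }


# Hypothetical exports: (id, bin location, quantity)
source = [(1, "BIN-A01", 40), (2, "BIN-A02", 12)]
target_ok = [(2, "BIN-A02", 12), (1, "BIN-A01", 40)]   # same data, different order
target_bad = [(1, "BIN-A01", 40), (2, "BIN-A02", 21)]  # one quantity corrupted

print(verify_migration(source, target_ok))
print(verify_migration(source, target_bad))
```

A matching row count with a mismatched checksum, as in the second call, points to altered values rather than missing records.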

The go/no-go gate is non-negotiable. Before any users access the new system, a named decision-maker (often the project sponsor or IT director) explicitly confirms that pre-defined go-live criteria have been met. This gate prevents teams from drifting into a broken go-live through optimism and sunk-cost pressure. Document the criteria in advance — never define them on launch day.
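
One way to keep the gate honest is to write the criteria down as a fixed list and evaluate results against it. The sketch below is illustrative only: the criterion names are hypothetical examples, and the final decision always belongs to the named decision-maker, not to a script.

```python
# Hypothetical criteria, agreed and documented before launch day.
GO_LIVE_CRITERIA = [
    "smoke_tests_passed",
    "migration_checksums_verified",
    "within_rollback_window",
    "support_staff_confirmed",
]


def go_no_go(criteria, results):
    """Return ("GO" or "NO-GO", unmet criteria).

    A criterion with no recorded result counts as unmet, so a missing
    check can never pass the gate silently.
    """
    unmet = [name for name in criteria if not results.get(name, False)]
    return ("GO" if not unmet else "NO-GO", unmet)


launch_day_results = {
    "smoke_tests_passed": True,
    "migration_checksums_verified": True,
    "within_rollback_window": True,
    # "support_staff_confirmed" was never recorded
}

decision, unmet = go_no_go(GO_LIVE_CRITERIA, launch_day_results)
print(decision, unmet)
```

Treating a missing result as a failure is a deliberate choice here: it counters the optimism and sunk-cost pressure described above.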

The War Room

On go-live day — and for the first days of hypercare — the project team operates a war room: a physical or virtual space where technical leads, the business analyst, a representative from the business, and often a senior manager are co-located (or on a persistent video call) throughout business hours. The war room exists to:

Receive and triage incoming issues in real time
Make fast decisions without bureaucracy (no ticket-queue delays for critical issues)
Communicate status to executives and the broader organisation
Coordinate rollback if a critical failure threshold is reached

In a logistics firm launching a new warehouse management system, the war room might include the IT project manager, the lead developer, the business analyst who wrote the picking-and-packing workflow requirements, the warehouse operations manager, and a helpdesk representative. When a picker reports that bin locations are not populating on the handheld device, the BA can immediately identify which data migration step or configuration rule governs that field — cutting diagnosis time from hours to minutes.

War room issue flow: end-user reports reach the helpdesk; the BA diagnoses each issue, routes it to the developer, and escalates status to the executive sponsor.

Hypercare: The Stabilisation Period

Hypercare is the formally defined period — typically 2 to 4 weeks — immediately following go-live, during which the project team provides an elevated level of support above and beyond the normal IT helpdesk. Hypercare exists because even the best-tested system will surface unexpected behaviour when it meets the full complexity and diversity of real production usage.

Hypercare is characterised by several specific practices:

Extended hours — support coverage stretches beyond normal business hours to match peak usage windows. An online store going live before the holiday season may require 07:00–22:00 coverage.
Dedicated triage queue — go-live issues are handled with higher urgency than normal service requests. A P1 (critical) issue during hypercare might have a 30-minute response SLA instead of the normal 4-hour target.
Daily stand-ups — a short synchronisation meeting (15–20 minutes) each morning where the war-room team reviews open issues, confirms the system's health metrics, and decides which fixes will be deployed that day.
Issue log — every reported problem is logged with the reporter, affected business process, severity, root cause (once known), and resolution. This log becomes the input to the post-implementation review.
Clear exit criteria — hypercare ends when the system meets pre-defined stability thresholds, such as fewer than 5 open P1/P2 issues, error rates below a defined baseline, and user-satisfaction scores above a minimum. These criteria must be agreed before go-live, not improvised when the team wants to disband.
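
The exit criteria in the last point can be expressed as a simple check. In the sketch below, only the "fewer than 5 open P1/P2 issues" threshold comes from the example above; the error-rate baseline, the satisfaction minimum, and all names are hypothetical values that a real project would agree before go-live.

```python
from dataclasses import dataclass


@dataclass
class StabilitySnapshot:
    open_p1_p2: int       # open P1 and P2 issues
    error_rate: float     # fraction of failed transactions
    satisfaction: float   # user-satisfaction survey score


def exit_criteria_met(snapshot, max_open=5, baseline_error_rate=0.01,
                      min_satisfaction=4.0):
    """Return (all criteria met?, per-criterion results)."""
    checks = {
        "open_p1_p2": snapshot.open_p1_p2 < max_open,
        "error_rate": snapshot.error_rate < baseline_error_rate,
        "satisfaction": snapshot.satisfaction > min_satisfaction,
    }
    return all(checks.values()), checks


week_two = StabilitySnapshot(open_p1_p2=6, error_rate=0.004, satisfaction=4.2)
week_three = StabilitySnapshot(open_p1_p2=3, error_rate=0.004, satisfaction=4.2)

print(exit_criteria_met(week_two))
print(exit_criteria_met(week_three))
```

Returning the per-criterion results alongside the overall answer shows stakeholders exactly which threshold is still holding hypercare open.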

BA tip — own the issue log: In practice, no one else will maintain a clean, business-context-aware issue log. Developers track bugs in a code-level ticket system; managers track headline KPIs. The BA bridges the gap by logging each issue with its business impact ("pickers cannot confirm picks — 200 orders delayed per hour") alongside the technical description. This framing drives appropriate prioritisation and gives the post-implementation review real data.
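
As a sketch of what a business-context-aware log entry might look like, the record below carries the fields listed earlier plus the business impact. The class, the field names, and the sample entries are hypothetical; many teams keep the same structure in a spreadsheet.

```python
from dataclasses import dataclass
from typing import Optional

SEVERITY_ORDER = {"P1": 0, "P2": 1, "P3": 2, "P4": 3}


@dataclass
class IssueLogEntry:
    issue_id: int
    reporter: str
    business_process: str
    severity: str                      # "P1" to "P4"
    business_impact: str               # written in business language
    technical_description: str
    root_cause: Optional[str] = None   # filled in once known
    resolution: Optional[str] = None   # filled in when resolved

    @property
    def is_open(self):
        return self.resolution is None


def open_issues_by_priority(log):
    """Open issues only, most severe first."""
    return sorted((e for e in log if e.is_open),
                  key=lambda e: SEVERITY_ORDER[e.severity])


log = [
    IssueLogEntry(1, "Helpdesk", "Reporting", "P3",
                  "Weekly report header shows the old logo",
                  "Report template not updated"),
    IssueLogEntry(2, "Warehouse picker", "Picking and packing", "P1",
                  "Pickers cannot confirm picks; 200 orders delayed per hour",
                  "Pick confirmation request fails on handheld devices"),
    IssueLogEntry(3, "Operations manager", "Goods receipt", "P2",
                  "Bulk receipt import fails; staff enter receipts manually",
                  "Import job rejects the file format",
                  resolution="Import mapping corrected"),
]

for entry in open_issues_by_priority(log):
    print(entry.severity, entry.business_process, "|", entry.business_impact)
```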

Issue Severity and the Rollback Decision

Not every go-live issue triggers the same response. Hypercare teams classify issues by severity:

P1 — Critical: The system is unable to support core business operations. Example: the clinic booking system cannot save any appointment. Rollback is considered immediately.
P2 — High: A significant business process is impaired but a workaround exists. Example: bulk appointment import fails; staff can enter appointments manually. Fix within 4 hours.
P3 — Medium: A non-critical function is broken or a cosmetic issue causes confusion. Fix within the current business day or next sprint.
P4 — Low: Minor usability issue or enhancement request. Logged for the maintenance backlog.
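
The four levels above can be read as a short decision rule. The sketch below is one possible encoding with hypothetical parameter names. It also makes one assumption the definitions do not state: an impaired process with no workaround is treated as P1.

```python
def classify_severity(core_operations_blocked, process_impaired,
                      workaround_exists, function_broken):
    """Map answers to four triage questions onto P1 to P4."""
    if core_operations_blocked:
        return "P1"
    if process_impaired:
        # Assumption: without a workaround, an impaired process is escalated to P1.
        return "P2" if workaround_exists else "P1"
    if function_broken:
        return "P3"
    return "P4"  # minor usability issue or enhancement request


# The clinic booking system cannot save any appointment.
print(classify_severity(True, False, False, False))
# Bulk appointment import fails; staff can enter appointments manually.
print(classify_severity(False, True, True, False))
```

Asking the same questions in the same order for every report keeps classification consistent when several people are triaging at once.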

The rollback decision is the most consequential judgment during hypercare. Rollback means reverting to the previous system, which causes its own disruption (data entered in the new system must be migrated back; re-training may be needed; users lose confidence). Rollback should happen when P1 issues cannot be resolved within the agreed time window AND continuing causes greater business harm than reverting. Rollback criteria — and the name of the person who can authorise it — must be documented in the go-live runbook before launch day.
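
The two-part rule above (the time window is exceeded AND continuing is more harmful than reverting) can be sketched as follows. The harm figures are hypothetical scores that the business would have to estimate; the function only illustrates the logic and does not replace the named authoriser.

```python
from datetime import datetime, timedelta


def rollback_recommended(p1_opened_at, now, resolution_window,
                         harm_of_continuing, harm_of_reverting):
    """True only when BOTH documented conditions hold."""
    window_exceeded = (now - p1_opened_at) > resolution_window
    continuing_is_worse = harm_of_continuing > harm_of_reverting
    return window_exceeded and continuing_is_worse


opened = datetime(2024, 3, 4, 9, 0)   # hypothetical P1 raised at 09:00
window = timedelta(hours=2)           # hypothetical agreed resolution window

# 10:00, still inside the window: keep working on the fix.
print(rollback_recommended(opened, datetime(2024, 3, 4, 10, 0), window, 8, 5))
# 12:00, window exceeded and continuing is more harmful: roll back.
print(rollback_recommended(opened, datetime(2024, 3, 4, 12, 0), window, 8, 5))
# 12:00, window exceeded but reverting would be more harmful: do not roll back.
print(rollback_recommended(opened, datetime(2024, 3, 4, 12, 0), window, 3, 5))
```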

The "just one more hour" trap: Under go-live pressure, teams repeatedly delay the rollback decision by one more hour, hoping the next fix will stabilise everything. Each delay increases data-migration complexity and organisational confusion. If rollback criteria are met, execute the decision without hesitation. A clean rollback now is far better than a chaotic partial rollback tomorrow.

Stabilisation: From Reactive to Proactive

As hypercare progresses, the pattern of incoming issues typically shifts. The first 48 hours bring a flood of configuration errors, data-migration anomalies, and user-error reports. By day 5–7, the volume usually drops sharply. By day 10–14, only edge-case issues remain. This pattern — sometimes called the go-live spike — should be tracked visually in the daily stand-up so the team and stakeholders can see the system stabilising.

Typical go-live spike: issue volume peaks in the first 48 hours and decays toward the exit threshold as the system stabilises.
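
A plain-text chart is often enough to show the spike in a daily stand-up. The sketch below uses invented daily counts and a hypothetical threshold of 5 new issues per day; the function names are also illustrative.

```python
def spike_chart(daily_counts, threshold):
    """One text bar per day, flagging days at or below the threshold."""
    lines = []
    for day, count in enumerate(daily_counts, start=1):
        marker = " <= threshold" if count <= threshold else ""
        lines.append(f"Day {day:>2} | {'#' * count} {count}{marker}")
    return "\n".join(lines)


def first_stable_day(daily_counts, threshold):
    """First day from which every later count stays at or below the threshold."""
    stable_from = None
    for day, count in enumerate(daily_counts, start=1):
        if count <= threshold:
            if stable_from is None:
                stable_from = day
        else:
            stable_from = None  # a later breach resets the streak
    return stable_from


# Invented numbers of new issues per day, for illustration only.
new_issues_per_day = [42, 38, 25, 17, 11, 6, 4, 5, 3, 2]

print(spike_chart(new_issues_per_day, threshold=5))
print("Stable from day:", first_stable_day(new_issues_per_day, threshold=5))
```

Resetting the streak on any breach avoids declaring stability after a single quiet day.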

Communication During Hypercare

One of the BA's most important hypercare tasks is maintaining clear, honest communication with the organisation. Two documents support this:

Daily status report — sent each afternoon to project stakeholders and management. Includes: issues opened that day, issues resolved, current open count by severity, system health indicators (error rates, response times), and a brief narrative. Written in business language, not technical jargon.
End-user FAQ / known-issues list — a living document shared on the intranet or via email. Lists confirmed bugs, workarounds, and expected fix dates. Reduces repeat calls to the helpdesk and builds user trust.
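
The numeric part of the daily status report can be derived directly from the issue log. The sketch below assumes each issue is a simple record with a severity, an opened date, and a resolved date (None while open); the field names and sample data are hypothetical, and the health indicators and narrative would still be written by hand.

```python
from collections import Counter
from datetime import date


def daily_status(issues, today):
    """Summarise opened, resolved, and still-open issues for the daily report."""
    opened = sum(1 for i in issues if i["opened"] == today)
    resolved = sum(1 for i in issues if i["resolved"] == today)
    open_by_severity = Counter(i["severity"] for i in issues if i["resolved"] is None)
    severity_text = ", ".join(
        f"{sev}: {open_by_severity.get(sev, 0)}" for sev in ("P1", "P2", "P3", "P4")
    )
    return (f"Opened today: {opened}\n"
            f"Resolved today: {resolved}\n"
            f"Open by severity: {severity_text}")


issues = [
    {"severity": "P1", "opened": date(2024, 3, 4), "resolved": date(2024, 3, 5)},
    {"severity": "P2", "opened": date(2024, 3, 5), "resolved": None},
    {"severity": "P3", "opened": date(2024, 3, 5), "resolved": None},
    {"severity": "P3", "opened": date(2024, 3, 4), "resolved": None},
]

print(daily_status(issues, today=date(2024, 3, 5)))
```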

Communicate problems before users discover them. If the migration team identifies a data quality issue at 08:00 that will affect certain users, send a targeted notification before those users hit the problem at 09:00. Proactive communication maintains trust; being caught off-guard destroys it.

Ending Hypercare and Handing Over to Operations

Hypercare ends formally — not by attrition. When the pre-agreed stability criteria are met, the project team documents the hypercare closure: a record of all issues raised, their resolutions, any open items handed to the support team, and the final system health baseline. This document is signed off by the business owner and becomes the starting point for the post-implementation review (covered in the next lesson).

The transition from hypercare to normal operations (BAU, or Business As Usual) must be communicated clearly. Users need to know that the intensified support window has closed and where to go for help going forward. The IT operations team needs a complete list of known issues, system quirks, and the BA's contact information for escalations that require business context.

Summary

Go-live day is a structured sequence — runbook, final data migration, smoke tests, go/no-go gate, user communication — not an improvised event.
The war room co-locates key roles (helpdesk, BA, developer, business representative, executive) to enable fast triage and decision-making.
Hypercare is a formally bounded, intensified support period (typically 2–4 weeks) with elevated SLAs, daily stand-ups, and a structured issue log.
Issues are classified by severity (P1–P4); rollback criteria and the decision authority must be documented before launch day.
Issue volume typically spikes in the first 48 hours and decays toward a stability threshold; track this visually to manage stakeholder expectations.
Clear daily communication — status reports and known-issues lists — is as important as technical fixes during this period.
Hypercare closes formally when stability criteria are met, with a handover document that feeds directly into the post-implementation review.