Emergency Playbook

This document provides response procedures for emergency scenarios. It is intended for guardians, governance operators, and anyone monitoring the protocol.

Trigger / response matrix

| Trigger | Severity | Immediate action | Investigation | Recovery |
| --- | --- | --- | --- | --- |
| Rate drop breaker fires | High | Protocol auto-pauses. No action needed. | Check Aztec rollup for slashing events. Run refreshAttesterState() for affected attesters. | Verify exchange rate is accurate after accounting. Guardian unpauses SafetyModule, then Core and Vault. |
| Queue ratio breaker fires | Medium | Protocol auto-pauses. | Check if a large withdrawal request caused the spike or if total assets dropped (slashing). | If queue pressure is legitimate, unpause and let rebalance unstake to cover. If caused by a bug, investigate before unpausing. |
| Accounting staleness breaker fires | Medium | Protocol auto-pauses. | Check why updateAccounting() hasn't been called. Possible causes: rebalance stuck mid-cycle, gas prices too high, no callers. | Call updateAccounting() (permissionless). If rebalance is stuck, guardian calls forceRebalanceReset() first. Then unpause. |
| Suspected exploit in progress | Critical | Guardian calls emergencyPauseAll() on OllaGovernance (pauses Core + Vault in one tx). | Assess scope: which contracts are affected, what funds are at risk, is the attacker still active. | Do not unpause until root cause is identified. Prepare upgrade if contract fix is needed. |
| Aztec rollup incident | High | Guardian pauses Core and Vault. | Monitor Aztec status channels. Check attester states via refreshAttesterState(). | Wait for rollup recovery. Verify staked balances match expectations. Unpause when rollup is stable. |
| Rebalance stuck mid-cycle | Low | Guardian calls forceRebalanceReset(). | Check which step failed and why (gas, external call revert). | Reset clears the state machine. Cooldown restarts. Next rebalance will retry. No funds are lost. |
| Key compromise (guardian) | High | Governance revokes GUARDIAN_ROLE from compromised key and grants to new address (timelocked). | Assess if the compromised guardian performed any malicious pauses or resets. | Guardian can only pause, so no fund loss is possible. But availability may be disrupted if the attacker keeps pausing. |
| Key compromise (governance) | Critical | Community alert. No on-chain mitigation if governance multisig is fully compromised. | Governance controls upgrades and fee parameters. Full compromise means full protocol risk. | This is a catastrophic scenario. Timelock delay is the only mitigation: users have a window to exit before malicious proposals execute. |

Response procedures

Circuit breaker triggered

When any circuit breaker fires, the SafetyModule emits CircuitBreakerTriggered(reason) with one of:

  • BreakerReason.RateDrop
  • BreakerReason.QueueRatio
  • BreakerReason.AccountingStale

Step 1: Identify the cause

# Check which breaker fired (look for CircuitBreakerTriggered events)
cast logs --from-block <block> --address <SafetyModule> "CircuitBreakerTriggered(uint8)"
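
The reason is logged as a uint8, so the raw log shows a number rather than a name. The sketch below maps it back, assuming the enum members are numbered from 0 in the order listed above (RateDrop, QueueRatio, AccountingStale). That numbering is an assumption and should be confirmed against the deployed SafetyModule ABI. `breaker_reason_name` is a hypothetical helper, not part of the protocol tooling.

```bash
# Hypothetical helper: translate the logged uint8 into a breaker name.
# ASSUMPTION: RateDrop=0, QueueRatio=1, AccountingStale=2 (order of the list above).
breaker_reason_name() {
  raw=$1
  case $raw in
    0x*)
      # 32-byte data word from the log: the uint8 sits in the last byte.
      last_byte=$(printf '%s' "$raw" | tail -c 2)
      code=$(printf '%d' "0x$last_byte")
      ;;
    *)
      code=$raw
      ;;
  esac
  case $code in
    0) echo RateDrop ;;
    1) echo QueueRatio ;;
    2) echo AccountingStale ;;
    *) echo "Unknown($code)" ;;
  esac
}
```

The helper accepts either the decoded decimal value or the 32-byte hex data word copied from the `cast logs` output.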

Step 2: Investigate

For rate drop: Check if attesters were slashed on the rollup. Call refreshAttesterState() with affected attester addresses to update the protocol's view of staked balances.

For queue ratio: Check WithdrawalQueue.totalPendingAssets() vs OllaCore.totalAssets(). Determine if the ratio is transient (large single withdrawal) or systemic.
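
A rough sketch of the ratio check, expressed in basis points so it can be compared with maxQueueRatioBps. It assumes both functions are views returning uint256 and that `ETH_RPC_URL` is set. The helper names and the `WITHDRAWAL_QUEUE` / `OLLA_CORE` variables are hypothetical. Token amounts can exceed 64-bit shell arithmetic, so both values are scaled down by the same power of ten first, which costs a little precision but keeps the ratio intact.

```bash
# Hypothetical helper: drop the " [1e21]" annotation that some cast versions
# append when printing large integers.
first_field() { printf '%s\n' "${1%% *}"; }

# Keep the leading digits of a decimal string after dropping $2 trailing digits.
drop_digits() {
  keep=$(( ${#1} - $2 ))
  if [ "$keep" -le 0 ]; then
    echo 0
  else
    printf '%s\n' "$1" | cut -c1-"$keep"
  fi
}

# Prints pending / total in basis points (10000 = 100%).
queue_ratio_bps() {
  pending=$1
  total=$2
  longest=${#total}
  [ "${#pending}" -gt "$longest" ] && longest=${#pending}
  # Scale both operands so that pending * 10000 fits in a signed 64-bit integer.
  if [ "$longest" -gt 14 ]; then
    drop=$(( longest - 14 ))
    pending=$(drop_digits "$pending" "$drop")
    total=$(drop_digits "$total" "$drop")
  fi
  if [ "$total" -eq 0 ]; then
    echo "total assets is zero or negligible" >&2
    return 1
  fi
  echo $(( pending * 10000 / total ))
}

# Usage (assumed getter signatures and variable names):
#   pending=$(first_field "$(cast call "$WITHDRAWAL_QUEUE" 'totalPendingAssets()(uint256)')")
#   total=$(first_field "$(cast call "$OLLA_CORE" 'totalAssets()(uint256)')")
#   queue_ratio_bps "$pending" "$total"
```

Sampling the value across several blocks helps distinguish a one-off spike from sustained pressure.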

For accounting staleness: Check lastAccountingTimestamp on the SafetyModule. If rebalance is stuck (step != Done), use forceRebalanceReset() first, then call updateAccounting().
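
A minimal sketch of the staleness check, assuming lastAccountingTimestamp is exposed as a public uint256 getter on the SafetyModule and that the operator already knows the configured maxAccountingDelay in seconds. The function names and the `SAFETY_MODULE` / `MAX_ACCOUNTING_DELAY` variables are hypothetical.

```bash
# Seconds elapsed between two Unix timestamps.
# Any " [1.7e9]" annotation printed by cast is stripped first.
accounting_age() {
  now=${1%% *}
  last=${2%% *}
  echo $(( now - last ))
}

# Prints STALE when the age exceeds max_delay (seconds), FRESH otherwise.
accounting_status() {
  age=$(accounting_age "$1" "$2")
  if [ "$age" -gt "$3" ]; then
    echo STALE
  else
    echo FRESH
  fi
}

# Usage (assumed getter signature and variable names):
#   last=$(cast call "$SAFETY_MODULE" 'lastAccountingTimestamp()(uint256)')
#   accounting_status "$(date +%s)" "$last" "$MAX_ACCOUNTING_DELAY"
# The local clock is only an approximation of the chain's block timestamp.
```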

Step 3: Verify state before unpausing

Before unpausing, confirm:

  • Exchange rate reflects current on-chain reality.
  • No ongoing exploit or attack.
  • Attester states are up to date (refreshAttesterState() has been called for all active attesters).
  • Accounting has been updated.

Step 4: Unpause

Unpause in order:

  1. SafetyModule.unpause() (guardian)
  2. OllaCore.unpause() (guardian)
  3. OllaVault.unpause() (guardian)

Alternatively, once the SafetyModule is unpaused, the governance admin can call OllaGovernance.emergencyUnpauseAll() to unpause Core and Vault in one transaction. That call does not unpause the SafetyModule, which must always be unpaused separately.
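
The three-step guardian order could be scripted roughly as follows, so that a failed transaction stops the sequence instead of leaving the contracts unpaused out of order. `unpause_in_order` is a hypothetical wrapper. The `unpause()` signature and the `--private-key` usage follow the commands shown elsewhere in this playbook.

```bash
# Hypothetical wrapper: unpause SafetyModule, then Core, then Vault.
# Stops at the first failed transaction and leaves later contracts paused.
unpause_in_order() {
  safety_module=$1
  core=$2
  vault=$3
  key=$4
  for target in "$safety_module" "$core" "$vault"; do
    echo "unpausing $target"
    cast send "$target" "unpause()" --private-key "$key" || {
      echo "unpause failed for $target; later contracts left paused" >&2
      return 1
    }
  done
}

# Usage:
#   unpause_in_order <SafetyModule> <OllaCore> <OllaVault> <guardian_key>
```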

Suspected exploit

Step 1: Pause immediately

# Governance admin (fastest, pauses Core + Vault in one tx)
cast send <OllaGovernance> "emergencyPauseAll()" --private-key <gov_key>

# Also pause SafetyModule separately
cast send <SafetyModule> "pause()" --private-key <guardian_key>

Step 2: Assess damage

  • Check token balances of all protocol contracts.
  • Check for unexpected Transfer, Approval, or role change events.
  • Verify proxy implementations haven't been changed.
  • Check if any governance proposals are pending in the timelock.
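
One way to check a proxy's implementation is to read its implementation storage slot and compare it with the address recorded at deployment. The sketch below assumes the protocol's proxies follow EIP-1967, which this playbook does not state, so verify that before relying on it. The helper names and the `PROXY` / `EXPECTED_IMPL` variables are hypothetical.

```bash
# EIP-1967 implementation slot:
# bytes32(uint256(keccak256("eip1967.proxy.implementation")) - 1)
IMPL_SLOT=0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc

# Extract the 20-byte address (last 40 hex characters) from a 32-byte storage word.
impl_from_word() {
  printf '0x%s\n' "$(printf '%s' "$1" | tail -c 40)"
}

# Succeeds when two hex addresses match, ignoring letter case.
same_address() {
  a=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  b=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  [ "$a" = "$b" ]
}

# Usage (ASSUMPTION: the proxy is EIP-1967 compliant):
#   word=$(cast storage "$PROXY" "$IMPL_SLOT")
#   same_address "$(impl_from_word "$word")" "$EXPECTED_IMPL" \
#     || echo "implementation changed for $PROXY"
```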

Step 3: Prepare response

If a contract upgrade is needed:

  1. Develop and test the fix.
  2. Deploy new implementation.
  3. Schedule upgrade through governance timelock.
  4. Wait for timelock delay.
  5. Execute upgrade.
  6. Verify fix.
  7. Unpause.

Force rebalance reset

Use when a rebalance cycle is stuck (e.g., an external call reverts repeatedly).

cast send <OllaCore> "forceRebalanceReset()" --private-key <guardian_key>

What this does:

  • Resets the rebalance state machine to Done.
  • Sets lastRebalanceTimestamp to current time (enforces cooldown before next cycle).

What is NOT lost:

  • Unharvested rewards remain on the rollup. The next cycle's harvest step will claim them.
  • Partial unstakes are tracked on-chain by the Aztec rollup. refreshAttesterState() will pick them up.
  • Queued withdrawal requests are preserved in the WithdrawalQueue.

What is discarded:

  • Any in-progress step computations (e.g., partially calculated stake/unstake amounts).

Monitoring recommendations

Events to watch

| Event | Contract | Indicates |
| --- | --- | --- |
| CircuitBreakerTriggered(reason) | SafetyModule | Automatic pause; investigate immediately |
| Paused() | OllaCore, OllaVault, SafetyModule | Manual or automatic pause |
| Unpaused() | OllaCore, OllaVault, SafetyModule | Operations resumed |
| RebalanceForceReset() | OllaCore | Guardian reset a stuck rebalance |
| AccountingUpdated(...) | OllaCore | Successful accounting cycle |
| SlashingDetected(amount) | OllaCore | Attester(s) were slashed on the rollup |
| FeesMinted(treasury, provider) | OllaVault | Protocol fees distributed |

Health checks

Run periodically to detect issues before they trigger breakers:

  1. Accounting freshness: Time since last AccountingUpdated event. Alert if approaching maxAccountingDelay.
  2. Queue pressure: WithdrawalQueue.totalPendingAssets() / OllaCore.totalAssets(). Alert if approaching maxQueueRatioBps.
  3. Buffer level: Compare bufferedAssets to targetBufferedAssets. Low buffer means instant redemptions may fail.
  4. Rebalance state: Check if rebalance is stuck mid-cycle (step != Done for extended period).
  5. Attester health: Monitor attester statuses on the Aztec rollup for unexpected exits or slashing.
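
"Approaching" a limit needs a concrete threshold before it can drive an alert. The sketch below treats a metric as approaching its limit once it reaches a chosen percentage of that limit. The 80% default is an arbitrary assumption, not a protocol parameter, and `approaching_limit` is a hypothetical helper. It is meant for small values such as seconds or basis points, not raw token amounts.

```bash
# Succeeds when value has reached warn_pct percent of limit.
# ASSUMPTION: warn_pct defaults to 80; choose a value that suits your alerting.
approaching_limit() {
  value=$1
  limit=$2
  warn_pct=${3:-80}
  [ $(( value * 100 )) -ge $(( limit * warn_pct )) ]
}

# Usage:
#   approaching_limit "$accounting_age_seconds" "$max_accounting_delay" && echo "ALERT: accounting nearly stale"
#   approaching_limit "$queue_ratio_bps" "$max_queue_ratio_bps" 90 && echo "ALERT: queue pressure high"
```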