Emergency Playbook
This document provides response procedures for emergency scenarios. It is intended for guardians, governance operators, and anyone monitoring the protocol.
Trigger / response matrix
| Trigger | Severity | Immediate action | Investigation | Recovery |
|---|---|---|---|---|
| Rate drop breaker fires | High | Protocol auto-pauses. No action needed. | Check Aztec rollup for slashing events. Run refreshAttesterState() for affected attesters. | Verify exchange rate is accurate after accounting. Guardian unpauses SafetyModule, then Core and Vault. |
| Queue ratio breaker fires | Medium | Protocol auto-pauses. | Check if a large withdrawal request caused the spike or if total assets dropped (slashing). | If queue pressure is legitimate, unpause and let rebalance unstake to cover. If caused by a bug, investigate before unpausing. |
| Accounting staleness breaker fires | Medium | Protocol auto-pauses. | Check why updateAccounting() hasn't been called. Possible causes: rebalance stuck mid-cycle, gas prices too high, no callers. | Call updateAccounting() (permissionless). If rebalance is stuck, guardian calls forceRebalanceReset() first. Then unpause. |
| Suspected exploit in progress | Critical | Guardian calls emergencyPauseAll() on OllaGovernance (pauses Core + Vault in one tx). | Assess scope: which contracts are affected, what funds are at risk, is the attacker still active. | Do not unpause until root cause is identified. Prepare upgrade if contract fix is needed. |
| Aztec rollup incident | High | Guardian pauses Core and Vault. | Monitor Aztec status channels. Check attester states via refreshAttesterState(). | Wait for rollup recovery. Verify staked balances match expectations. Unpause when rollup is stable. |
| Rebalance stuck mid-cycle | Low | Guardian calls forceRebalanceReset(). | Check which step failed and why (gas, external call revert). | Reset clears the state machine. Cooldown restarts. Next rebalance will retry. No funds are lost. |
| Key compromise (guardian) | High | Governance revokes GUARDIAN_ROLE from compromised key and grants to new address (timelocked). | Assess if the compromised guardian performed any malicious pauses or resets. | Guardian powers are limited to pausing, unpausing, and rebalance resets, so no fund loss is possible. Availability may still be disrupted if the attacker keeps pausing. |
| Key compromise (governance) | Critical | Community alert. No on-chain mitigation if governance multisig is fully compromised. | Governance controls upgrades and fee parameters. Full compromise means full protocol risk. | This is a catastrophic scenario. The timelock delay is the only mitigation: users have a window to exit before malicious proposals execute. |
Response procedures
Circuit breaker triggered
When any circuit breaker fires, the SafetyModule emits CircuitBreakerTriggered(reason) with one of:
- BreakerReason.RateDrop
- BreakerReason.QueueRatio
- BreakerReason.AccountingStale
Step 1: Identify the cause
# Check which breaker fired (look for CircuitBreakerTriggered events)
cast logs --from-block <block> --address <SafetyModule> "CircuitBreakerTriggered(uint8)"
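Once a log is found, the reason code still has to be mapped back to a breaker name. The helper below is a minimal sketch of that mapping. It assumes the enum values follow the order listed above (RateDrop = 0, QueueRatio = 1, AccountingStale = 2) and that `reason` is a non-indexed parameter, so it appears in the log's data field; neither is confirmed here, so check both against the contract source before relying on it. The function name is hypothetical.

```shell
#!/usr/bin/env bash
# Map the uint8 `reason` from a CircuitBreakerTriggered log to a name.
# ASSUMPTION: enum order is RateDrop=0, QueueRatio=1, AccountingStale=2.
breaker_reason_name() {
  reason_hex="${1#0x}"                      # strip optional 0x prefix
  reason_code=$(printf '%d' "0x${reason_hex}") || return 1
  case "$reason_code" in
    0) echo "RateDrop" ;;
    1) echo "QueueRatio" ;;
    2) echo "AccountingStale" ;;
    *) echo "Unknown(${reason_code})" ;;
  esac
}
```

Pass it the data field of the log, for example `breaker_reason_name 0x00...01`.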
Step 2: Investigate
For rate drop: Check if attesters were slashed on the rollup. Call refreshAttesterState() with affected attester addresses to update the protocol's view of staked balances.
For queue ratio: Check WithdrawalQueue.totalPendingAssets() vs OllaCore.totalAssets(). Determine if the ratio is transient (large single withdrawal) or systemic.
For accounting staleness: Check lastAccountingTimestamp on the SafetyModule. If rebalance is stuck (step != Done), use forceRebalanceReset() first, then call updateAccounting().
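For the queue ratio check, the two reads can be combined into a basis-point figure. The following is a sketch only: the helper names are hypothetical, and the `uint256` return types in the commented `cast call` lines are assumptions about the contract ABI. The arithmetic uses awk because wei-denominated amounts overflow 64-bit shell integers; double precision is adequate for an alerting ratio but not for exact accounting.

```shell
#!/usr/bin/env bash
# Queue ratio in basis points, rounded to nearest: pending / total * 10000.
queue_ratio_bps() {
  awk -v p="$1" -v t="$2" 'BEGIN {
    if (t + 0 == 0) { print "n/a"; exit }   # avoid division by zero
    printf "%d\n", int(p / t * 10000 + 0.5)
  }'
}

# Some cast versions print "123 [1.23e2]"; keep only the first field.
first_field() { awk '{ print $1 }'; }

# Example (angle-bracket values are placeholders; return types are assumed):
#   pending=$(cast call <WithdrawalQueue> "totalPendingAssets()(uint256)" | first_field)
#   total=$(cast call <OllaCore> "totalAssets()(uint256)" | first_field)
#   queue_ratio_bps "$pending" "$total"
```

Sampling the ratio a few times over several blocks helps separate a transient spike from a systemic one.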
Step 3: Verify state before unpausing
Before unpausing, confirm:
- Exchange rate reflects current on-chain reality.
- No ongoing exploit or attack.
- Attester states are up-to-date (refreshAttesterState called for all active attesters).
- Accounting has been updated.
Step 4: Unpause
Unpause in order:
1. SafetyModule.unpause() (guardian)
2. OllaCore.unpause() (guardian)
3. OllaVault.unpause() (guardian)
Or use OllaGovernance.emergencyUnpauseAll() (governance admin) to unpause Core and Vault in one transaction. SafetyModule must be unpaused separately.
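The guardian path can be scripted so that a failed transaction halts the sequence instead of leaving the contracts unpaused out of order. This is a minimal sketch: `SAFETY_MODULE`, `OLLA_CORE`, `OLLA_VAULT`, and `GUARDIAN_KEY` are hypothetical environment variable names chosen for illustration, not names defined by the protocol.

```shell
#!/usr/bin/env bash
# Unpause SafetyModule, then Core, then Vault; stop at the first failure.
# SAFETY_MODULE, OLLA_CORE, OLLA_VAULT, GUARDIAN_KEY are hypothetical env vars.
unpause_in_order() {
  for target in "$SAFETY_MODULE" "$OLLA_CORE" "$OLLA_VAULT"; do
    echo "unpausing $target"
    cast send "$target" "unpause()" --private-key "$GUARDIAN_KEY" || {
      echo "unpause failed for $target; stopping" >&2
      return 1
    }
  done
}
```

Run it only after the Step 3 checks have passed.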
Suspected exploit
Step 1: Pause immediately
# Governance admin (fastest, pauses Core + Vault in one tx)
cast send <OllaGovernance> "emergencyPauseAll()" --private-key <gov_key>
# Also pause SafetyModule separately
cast send <SafetyModule> "pause()" --private-key <guardian_key>
Step 2: Assess damage
- Check token balances of all protocol contracts.
- Check for unexpected Transfer, Approval, or role change events.
- Verify proxy implementations haven't been changed.
- Check if any governance proposals are pending in the timelock.
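The proxy implementation check can be sketched as a comparison against a known-good address. This assumes the proxies follow EIP-1967, which is the slot that `cast implementation` reads; the document does not state the proxy pattern, so confirm it first. The helper names and the expected-address argument are hypothetical.

```shell
#!/usr/bin/env bash
# Lowercase an address so checksummed and plain forms compare equal.
lower() { printf '%s' "$1" | tr 'A-F' 'a-f'; }

# check_impl <label> <proxy> <expected_impl>
# ASSUMPTION: the proxy stores its implementation in the EIP-1967 slot.
check_impl() {
  current=$(cast implementation "$2") || return 2
  if [ "$(lower "$current")" = "$(lower "$3")" ]; then
    echo "$1: ok"
  else
    echo "$1: implementation changed to $current"
    return 1
  fi
}
```

For example, `check_impl OllaCore <OllaCore> <expected_impl>`, with the expected address taken from deployment records.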
Step 3: Prepare response
If a contract upgrade is needed:
- Develop and test the fix.
- Deploy new implementation.
- Schedule upgrade through governance timelock.
- Wait for timelock delay.
- Execute upgrade.
- Verify fix.
- Unpause.
Force rebalance reset
Use when a rebalance cycle is stuck (e.g., an external call reverts repeatedly).
cast send <OllaCore> "forceRebalanceReset()" --private-key <guardian_key>
What this does:
- Resets the rebalance state machine to Done.
- Sets lastRebalanceTimestamp to current time (enforces cooldown before next cycle).
What is NOT lost:
- Unharvested rewards remain on the rollup. The next cycle's harvest step will claim them.
- Partial unstakes are tracked on-chain by the Aztec rollup. refreshAttesterState() will pick them up.
- Queued withdrawal requests are preserved in the WithdrawalQueue.
What is discarded:
- Any in-progress step computations (e.g., partially calculated stake/unstake amounts).
Monitoring recommendations
Events to watch
| Event | Contract | Indicates |
|---|---|---|
| CircuitBreakerTriggered(reason) | SafetyModule | Automatic pause; investigate immediately |
| Paused() | OllaCore, OllaVault, SafetyModule | Manual or automatic pause |
| Unpaused() | OllaCore, OllaVault, SafetyModule | Operations resumed |
| RebalanceForceReset() | OllaCore | Guardian reset a stuck rebalance |
| AccountingUpdated(...) | OllaCore | Successful accounting cycle |
| SlashingDetected(amount) | OllaCore | Attester(s) were slashed on the rollup |
| FeesMinted(treasury, provider) | OllaVault | Protocol fees distributed |
Health checks
Run periodically to detect issues before they trigger breakers:
- Accounting freshness: Time since last AccountingUpdated event. Alert if approaching maxAccountingDelay.
- Queue pressure: WithdrawalQueue.totalPendingAssets() / OllaCore.totalAssets(). Alert if approaching maxQueueRatioBps.
- Buffer level: Compare bufferedAssets to targetBufferedAssets. Low buffer means instant redemptions may fail.
- Rebalance state: Check if rebalance is stuck mid-cycle (step != Done for extended period).
- Attester health: Monitor attester statuses on the Aztec rollup for unexpected exits or slashing.
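The accounting freshness check above can be sketched as a small classifier suitable for a cron job. The function name, the 80% warning threshold, and the getter signatures in the commented lines are assumptions for illustration; the actual accessors for lastAccountingTimestamp and maxAccountingDelay may differ.

```shell
#!/usr/bin/env bash
# accounting_status <last_ts> <max_delay_s> [now] [warn_pct]
# Prints "stale" if age >= max delay, "warn" if age >= warn_pct% of it, else "ok".
# The default warn_pct of 80 is an arbitrary illustrative threshold.
accounting_status() {
  last=$1; max=$2; now=${3:-$(date +%s)}; warn_pct=${4:-80}
  age=$((now - last))
  if [ "$age" -ge "$max" ]; then
    echo "stale"
  elif [ $((age * 100)) -ge $((max * warn_pct)) ]; then
    echo "warn"
  else
    echo "ok"
  fi
}

# Example (getter signatures are assumed, angle-bracket values are placeholders):
#   last=$(cast call <SafetyModule> "lastAccountingTimestamp()(uint256)" | awk '{print $1}')
#   max=$(cast call <SafetyModule> "maxAccountingDelay()(uint256)" | awk '{print $1}')
#   accounting_status "$last" "$max"
```

A "warn" result leaves time to call updateAccounting() before the staleness breaker fires.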