Emergency Playbook

This document provides response procedures for emergency scenarios. It is intended for guardians, governance operators, and anyone monitoring the protocol.

Trigger / response matrix

Trigger	Severity	Immediate action	Investigation	Recovery
Rate drop breaker fires	High	Protocol auto-pauses. No action needed.	Check Aztec rollup for slashing events. Run `refreshAttesterState()` for affected attesters.	Verify exchange rate is accurate after accounting. Guardian unpauses SafetyModule, then Core and Vault.
Queue ratio breaker fires	Medium	Protocol auto-pauses.	Check if a large withdrawal request caused the spike or if total assets dropped (slashing).	If queue pressure is legitimate, unpause and let rebalance unstake to cover. If caused by a bug, investigate before unpausing.
Accounting staleness breaker fires	Medium	Protocol auto-pauses.	Check why `updateAccounting()` hasn't been called. Possible causes: rebalance stuck mid-cycle, gas prices too high, no callers.	Call `updateAccounting()` (permissionless). If rebalance is stuck, guardian calls `forceRebalanceReset()` first. Then unpause.
Suspected exploit in progress	Critical	Guardian calls `emergencyPauseAll()` on OllaGovernance (pauses Core + Vault in one tx).	Assess scope: which contracts are affected, what funds are at risk, is the attacker still active.	Do not unpause until root cause is identified. Prepare upgrade if contract fix is needed.
Aztec rollup incident	High	Guardian pauses Core and Vault.	Monitor Aztec status channels. Check attester states via `refreshAttesterState()`.	Wait for rollup recovery. Verify staked balances match expectations. Unpause when rollup is stable.
Rebalance stuck mid-cycle	Low	Guardian calls `forceRebalanceReset()`.	Check which step failed and why (gas, external call revert).	Reset clears the state machine. Cooldown restarts. Next rebalance will retry. No funds are lost.
Key compromise (guardian)	High	Governance revokes `GUARDIAN_ROLE` from the compromised key and grants it to a new address (timelocked).	Assess whether the compromised guardian performed any malicious pauses or resets.	Guardian powers are limited to pausing, unpausing, and resetting a rebalance, so no fund loss is possible. Availability may still be disrupted if the attacker keeps pausing.
Key compromise (governance)	Critical	Community alert. No on-chain mitigation if the governance multisig is fully compromised.	Governance controls upgrades and fee parameters. Full compromise means full protocol risk.	This is a catastrophic scenario. The timelock delay is the only mitigation: users have a window to exit before malicious proposals execute.

Response procedures

Circuit breaker triggered

When any circuit breaker fires, the SafetyModule emits CircuitBreakerTriggered(reason) with one of:

BreakerReason.RateDrop
BreakerReason.QueueRatio
BreakerReason.AccountingStale

Step 1: Identify the cause

# Check which breaker fired (look for CircuitBreakerTriggered events)
cast logs --from-block <block> --address <SafetyModule> "CircuitBreakerTriggered(uint8)"
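Once the log is in hand, the reason arrives as a 32-byte ABI word (in the data field, or in a topic if the parameter is indexed). The following is a minimal sketch for turning that word into a readable name. It assumes the enum ordinals follow the order in which the reasons are listed above (RateDrop = 0, QueueRatio = 1, AccountingStale = 2); confirm this against the SafetyModule source before relying on it. The Python names are illustrative only.

```python
from enum import IntEnum


class BreakerReason(IntEnum):
    # ASSUMPTION: ordinals mirror the order the playbook lists the reasons in.
    # Verify against the SafetyModule contract source.
    RATE_DROP = 0
    QUEUE_RATIO = 1
    ACCOUNTING_STALE = 2


def decode_breaker_reason(word_hex: str) -> BreakerReason:
    """Decode a 32-byte ABI word (log data or topic) holding a uint8 enum value."""
    value = int(word_hex, 16)  # int() accepts the "0x" prefix when base is 16
    return BreakerReason(value)  # raises ValueError for an unknown ordinal
```

An unknown ordinal raises `ValueError` rather than being mapped to a default, so a contract upgrade that adds a new reason is noticed instead of silently mislabeled.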

Step 2: Investigate

For rate drop: Check if attesters were slashed on the rollup. Call refreshAttesterState() with affected attester addresses to update the protocol's view of staked balances.

For queue ratio: Check OllaVault.pendingWithdrawalAssets() vs OllaCore.totalAssets(). Determine if the ratio is transient (large single withdrawal) or systemic.

For accounting staleness: Check lastAccountingTimestamp on the SafetyModule. If rebalance is stuck (step != Done), use forceRebalanceReset() first, then call updateAccounting().
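For the queue ratio check, the comparison can be done offline once the two values have been read from chain. This is a sketch only: it assumes the ratio is expressed in basis points against `maxQueueRatioBps`, that integer division rounds down, and that a breach means strictly greater than the limit. The contract's exact rounding and comparison may differ.

```python
def queue_ratio_bps(pending_withdrawal_assets: int, total_assets: int) -> int:
    """Return pending withdrawals as basis points of total assets (rounded down)."""
    if total_assets <= 0:
        raise ValueError("total_assets must be positive")
    return pending_withdrawal_assets * 10_000 // total_assets


def breaches_queue_ratio(
    pending_withdrawal_assets: int, total_assets: int, max_queue_ratio_bps: int
) -> bool:
    # ASSUMPTION: the breaker trips only when the ratio strictly exceeds the limit.
    return queue_ratio_bps(pending_withdrawal_assets, total_assets) > max_queue_ratio_bps
```

Python integers are arbitrary precision, so raw token amounts in base units can be passed in directly without overflow.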

Step 3: Verify state before unpausing

Before unpausing, confirm:

Exchange rate reflects current on-chain reality.
No ongoing exploit or attack.
Attester states are up-to-date (refreshAttesterState called for all active attesters).
Accounting has been updated.
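Operators who script their runbooks may want the four conditions above as an explicit gate. The sketch below is one way to do that; the field names are hypothetical, and each flag still has to be established by a human or by separate checks.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class UnpauseChecklist:
    # Hypothetical field names, one per condition in the checklist above.
    exchange_rate_verified: bool
    no_active_exploit: bool
    attesters_refreshed: bool
    accounting_updated: bool


def unmet_conditions(checklist: UnpauseChecklist) -> list[str]:
    """Return the names of conditions that are still false."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


def safe_to_unpause(checklist: UnpauseChecklist) -> bool:
    """True only when every condition has been confirmed."""
    return not unmet_conditions(checklist)
```

Returning the list of unmet conditions, rather than a bare boolean, makes it clear in an incident channel what still blocks the unpause.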

Step 4: Unpause

Unpause in order:

SafetyModule.unpause() (guardian)
OllaCore.unpause() (guardian)
OllaVault.unpause() (guardian)

Or use OllaGovernance.emergencyUnpauseAll() (governance admin) to unpause Core and Vault in one transaction. SafetyModule must be unpaused separately.

Suspected exploit

Step 1: Pause immediately

# Governance admin (fastest, pauses Core + Vault in one tx)
cast send <OllaGovernance> "emergencyPauseAll()" --private-key <gov_key>

# Also pause SafetyModule separately
cast send <SafetyModule> "pause()" --private-key <guardian_key>

Step 2: Assess damage

Check token balances of all protocol contracts.
Check for unexpected Transfer, Approval, or role change events.
Verify proxy implementations haven't been changed.
Check if any governance proposals are pending in the timelock.
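To verify that a proxy implementation has not changed, the ERC-1967 implementation slot can be read directly (for example with `cast storage <proxy> <slot>`) and compared with the address recorded at the last known-good deployment. The sketch below handles only the comparison; the expected address is something the operator must supply, and the function names are illustrative.

```python
# ERC-1967 implementation slot: keccak256("eip1967.proxy.implementation") - 1
ERC1967_IMPLEMENTATION_SLOT = (
    "0x360894a13ba1a3210667c828492db98dca3e2076cc3735a920a3ca505d382bbc"
)


def implementation_from_slot(word_hex: str) -> str:
    """Extract the 20-byte address stored in the low bytes of a 32-byte slot value."""
    raw = word_hex.lower().removeprefix("0x").zfill(64)
    if len(raw) != 64:
        raise ValueError("expected at most 32 bytes of hex")
    return "0x" + raw[-40:]


def implementation_changed(word_hex: str, expected_address: str) -> bool:
    """Compare case-insensitively so checksummed and lowercase forms both match."""
    return implementation_from_slot(word_hex) != expected_address.lower()
```

The comparison ignores checksum casing on purpose: tools differ in how they print addresses, and a false alarm during an incident is costly.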

Step 3: Prepare response

If a contract upgrade is needed:

Develop and test the fix.
Deploy new implementation.
Schedule upgrade through governance timelock.
Wait for timelock delay.
Execute upgrade.
Verify fix.
Unpause.

Force rebalance reset

Use when a rebalance cycle is stuck (e.g., an external call reverts repeatedly).

cast send <OllaCore> "forceRebalanceReset()" --private-key <guardian_key>

What this does:

Resets the rebalance state machine to Done.
Sets lastRebalanceTimestamp to current time (enforces cooldown before next cycle).

What is NOT lost:

Unharvested rewards remain on the rollup. The next cycle's harvest step will claim them.
Partial unstakes are tracked on-chain by the Aztec rollup. refreshAttesterState() will pick them up.
Queued withdrawal requests remain in OllaVault storage and are picked up by the next rebalance.

What is discarded:

Any in-progress step computations (e.g., partially calculated stake/unstake amounts).
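Because the reset sets lastRebalanceTimestamp to the current time, the next cycle cannot start until the cooldown has passed again. A rough way to estimate that moment is sketched below. It assumes the cooldown is a fixed number of seconds added to lastRebalanceTimestamp; `cooldown_seconds` is a hypothetical parameter name, so read the actual value and rule from OllaCore.

```python
def next_rebalance_allowed_at(last_rebalance_timestamp: int, cooldown_seconds: int) -> int:
    # ASSUMPTION: cooldown is measured from lastRebalanceTimestamp by simple addition.
    return last_rebalance_timestamp + cooldown_seconds


def seconds_until_rebalance(
    now: int, last_rebalance_timestamp: int, cooldown_seconds: int
) -> int:
    """Seconds remaining before a new cycle may start; 0 if the cooldown has elapsed."""
    allowed_at = next_rebalance_allowed_at(last_rebalance_timestamp, cooldown_seconds)
    return max(0, allowed_at - now)
```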

Monitoring recommendations

Events to watch

Event	Contract	Indicates
`CircuitBreakerTriggered(reason)`	SafetyModule	Automatic pause. Investigate immediately.
`Paused()` / `Unpaused()`	OllaCore, OllaVault, SafetyModule	Manual or automatic pause / resume.
`RebalanceReset()`	OllaCore	Guardian reset a stuck rebalance.
`AccountingUpdated(...)`	OllaCore	Successful accounting cycle.
`NegativeRewardsPeriod(grossRewardsSigned)`	OllaCore	Slashing exceeded rewards in a reporting window.
`WithdrawalAdjusted(id, original, adjusted)`	OllaVault	A queued redemption was paid out below `assetsExpected` because slashing reduced the rate after the request.
`FeesMinted(treasuryShares, providerShares)`	OllaVault	Protocol fees distributed.
`Upgraded(implementation)` (ERC-1967)	Any UUPS proxy	A governance upgrade landed. The Butler logs every Upgraded event in the governance log.

Health checks

Run periodically to detect issues before they trigger breakers:

Accounting freshness: Time since last AccountingUpdated event. The Butler exposes olla_butler_accounting_staleness_seconds and alerts on > 24h.
Queue pressure: OllaVault.pendingWithdrawalAssets() / OllaCore.totalAssets(). Compare against maxQueueRatioBps.
Buffer level: Track OllaVault.bufferedAssets() against OllaVault.pendingWithdrawalAssets(). The Butler exposes olla_butler_buffer_utilization_pct and warns below 20%.
Rebalance state: olla_butler_rebalance_overdue is set to 1 when the on-chain cooldown has elapsed but the state machine has stayed away from Done for 10 minutes or more (the typical signal of a stuck cycle).
Attester health: olla_butler_attester_slashing_loss, olla_butler_rollup_attester_zombie_count, olla_butler_attester_cached_vs_rollup_drift, and olla_butler_attester_refresh_needed_count cover the common failure modes. The full alerting ruleset lives in the Butler repo's monitoring/alerts.yml.
Recent activity: hit the Butler's /events (operational) and /governance (config / upgrade / pause) endpoints for a quick human-readable view of what's happened recently.
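For teams that do not run the Butler, two of the checks above can be approximated from timestamps alone. The sketch below uses the thresholds quoted in the list (more than 24 hours for accounting staleness, 10 minutes for a stuck rebalance). It is an interpretation of those descriptions, not the Butler's implementation, and the function and argument names are hypothetical.

```python
from typing import Optional

STALENESS_ALERT_SECONDS = 24 * 60 * 60  # alert threshold quoted above: > 24h
STUCK_GRACE_SECONDS = 10 * 60           # grace period quoted above: 10+ minutes


def accounting_is_stale(now: int, last_accounting_updated_at: int) -> bool:
    """True when more than 24h have passed since the last AccountingUpdated event."""
    return now - last_accounting_updated_at > STALENESS_ALERT_SECONDS


def rebalance_overdue(now: int, cooldown_elapsed: bool, not_done_since: Optional[int]) -> int:
    """Return 1 when the cooldown has elapsed and the state machine has been
    away from Done for at least the grace period, else 0.

    not_done_since is the time the state machine left Done, or None if it is at Done.
    """
    if not cooldown_elapsed or not_done_since is None:
        return 0
    return 1 if now - not_done_since >= STUCK_GRACE_SECONDS else 0
```

Both functions take `now` as an argument instead of reading the clock, which keeps them easy to test and lets the caller choose between wall-clock time and the latest block timestamp.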
