The Web3 Ops Monitoring Guide: Alerts, Workflows, and Tooling for Protocol Teams
Web3 ops teams own production around the clock — but most of them still have no formal runbook for on-chain incidents. The contracts run, events fire, and if nothing is watching, the first sign of a problem is a Slack message from a user who noticed something was wrong. This guide covers what to monitor, how to structure an alert workflow that scales with your team, and how the main tooling options actually compare.
What to Monitor and Why
The mistake most teams make when they start monitoring smart contracts is trying to watch everything. Every event, every transaction, every block. That produces so much noise that alerts stop feeling urgent — and when a real incident hits, the signal is buried.
There are four event classes that cover the overwhelming majority of on-chain incidents worth a human response:
- Large transfers: Token movements above a configurable wei threshold. The threshold is what separates routine payroll from an unauthorized drain. Getting this calibration right is the work; the event signature (Transfer(address,address,uint256)) is universal.
- OwnershipTransferred: Any change to the owner or admin of a contract. A legitimate key rotation looks identical on-chain to a compromised key being used to reassign ownership. Both need immediate human review.
- Paused / Unpaused: Many DeFi protocols implement a circuit breaker. An unexpected pause during normal market hours is a strong signal that something is wrong: either an exploit in progress or an emergency governance action that your team needs to know about.
- RoleGranted / RoleRevoked: Role-based access control changes affect who can execute treasury transactions, upgrade contracts, or modify governance parameters. Any role change outside a known governance proposal is a high-priority alert.
Everything else — gas usage, block timestamps, internal function calls — is noise unless you have a specific threat model that requires it. Start with these four and tune from there.
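As a rough illustration of that filter, here is a minimal Python sketch. It assumes logs have already been decoded into named events by whatever indexer or RPC client you use; the `DecodedEvent` shape and the `should_alert` function are hypothetical names introduced for this example, not part of any tool discussed here.

```python
from dataclasses import dataclass

# Events that alert on every occurrence; there is no numeric threshold to tune.
ALWAYS_ALERT = frozenset({
    "OwnershipTransferred",
    "Paused",
    "Unpaused",
    "RoleGranted",
    "RoleRevoked",
})


@dataclass(frozen=True)
class DecodedEvent:
    """Hypothetical shape of an already-decoded log entry."""
    name: str        # event name, e.g. "Transfer" or "Paused"
    contract: str    # address of the emitting contract
    value: int = 0   # raw amount for Transfer events, 0 otherwise


def should_alert(event: DecodedEvent, transfer_threshold: int) -> bool:
    """Return True only for the four monitored event classes."""
    if event.name == "Transfer":
        # Transfers matter only when they exceed the configured threshold.
        return event.value > transfer_threshold
    # Everything outside the four classes is treated as noise.
    return event.name in ALWAYS_ALERT


if __name__ == "__main__":
    threshold = 50_000 * 10**18  # illustrative value only
    sample = [
        DecodedEvent("Transfer", "0xabc", 10 * 10**18),
        DecodedEvent("Transfer", "0xabc", 75_000 * 10**18),
        DecodedEvent("Approval", "0xabc", 75_000 * 10**18),
        DecodedEvent("Paused", "0xdef"),
    ]
    for event in sample:
        print(event.name, should_alert(event, threshold))
```

Keeping the threshold as a parameter rather than a constant makes it easy to tune per contract later.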
Structuring Your Alert Workflow
A monitoring system without a defined response workflow is just a log. The alerts only matter if the right person sees them and knows what to do. Here's how production ops teams typically structure their alert routing:
Slack channel routing. Use two channels, not one. A #contract-alerts channel for severity-1 events (large transfers above your top threshold, ownership changes, pause events) with @here or @oncall pings. A separate #ops-feed channel for lower-threshold informational alerts that don't require immediate action. Never dump contract alerts into #general — they'll be ignored within a week.
Escalation tiers. Define two alert tiers before you configure anything. Tier 1: events that require someone to drop what they're doing and check the chain within 5 minutes. Tier 2: events that go into the ops feed for review at the next standup. The threshold that separates tier 1 from tier 2 is different for every protocol — but you need to decide it explicitly, not discover it when an incident happens.
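The two-channel, two-tier routing described above could be sketched along these lines. The channel names come from this guide; the `Tier` enum, `assign_tier`, and `build_alert` are hypothetical helpers, and the returned dictionary is only a routing record, not a specific Slack API payload. The `<!here>` token is Slack's message syntax for an @here mention. Role changes are left out of the always-tier-1 set here; add them if your threat model calls for it.

```python
from enum import Enum


class Tier(Enum):
    TIER_1 = 1  # someone checks the chain within 5 minutes
    TIER_2 = 2  # reviewed at the next standup


CHANNEL_FOR_TIER = {
    Tier.TIER_1: "#contract-alerts",
    Tier.TIER_2: "#ops-feed",
}

# Event types with no numeric threshold: every occurrence is tier 1.
TIER_1_EVENTS = frozenset({"OwnershipTransferred", "Paused", "Unpaused"})


def assign_tier(event_name: str, value: int, tier1_threshold: int) -> Tier:
    """Decide the tier explicitly, before an incident forces the decision."""
    if event_name in TIER_1_EVENTS:
        return Tier.TIER_1
    if event_name == "Transfer" and value > tier1_threshold:
        return Tier.TIER_1
    return Tier.TIER_2


def build_alert(event_name: str, tx_hash: str, tier: Tier) -> dict:
    """Build a routing record; delivery to Slack is left to the caller."""
    mention = "<!here> " if tier is Tier.TIER_1 else ""
    return {
        "channel": CHANNEL_FOR_TIER[tier],
        "text": f"{mention}[tier {tier.value}] {event_name} in tx {tx_hash}",
    }


if __name__ == "__main__":
    tier = assign_tier("Transfer", 500, tier1_threshold=100)
    print(build_alert("Transfer", "0xabc", tier))
```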
Runbook format. For every tier-1 alert type, write a short runbook. It doesn't need to be long — three things: who gets paged (role, not individual), what they check first (the on-chain transaction, the governance forum, the team's internal channel), and how to verify whether an alert is a false positive. Teams that skip the runbook end up with two people both responding to the same alert and making conflicting decisions under pressure.
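One way to keep runbooks short and checkable is to store the three fields as structured data. The sketch below is illustrative only: the `Runbook` type, the role name, and the false-positive check are made-up examples, and `missing_runbooks` is a hypothetical helper for spotting tier-1 alert types that have no runbook yet.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Runbook:
    alert_type: str
    paged_role: str                 # a role, not an individual
    first_checks: Tuple[str, ...]   # what the responder looks at first
    false_positive_check: str       # how to confirm the alert is benign


# Example entry; the contents are placeholders to adapt to your protocol.
RUNBOOKS: Dict[str, Runbook] = {
    "OwnershipTransferred": Runbook(
        alert_type="OwnershipTransferred",
        paged_role="on-call protocol engineer",
        first_checks=(
            "the on-chain transaction",
            "the governance forum",
            "the team's internal channel",
        ),
        false_positive_check="new owner matches a scheduled key rotation",
    ),
}


def missing_runbooks(tier1_alert_types: Iterable[str]) -> List[str]:
    """List tier-1 alert types that still have no runbook."""
    return sorted(t for t in tier1_alert_types if t not in RUNBOOKS)


if __name__ == "__main__":
    print(missing_runbooks(["OwnershipTransferred", "Paused", "Unpaused"]))
```

Running the check whenever a new tier-1 alert type is added keeps the runbook set from lagging behind the alert configuration.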
Tip: Dedicated alert channels are a force multiplier. The same alert in #contract-alerts gets 80% faster response than the same alert buried in a general channel — the dedicated channel trains your team to treat it as a high-priority signal. If you're cross-posting alerts from your DAO treasury monitor, see the DAO treasury threshold-tuning guide for calibration advice that applies equally here.
Tooling Comparison: Tenderly, DIY Webhooks, and Sentinel
Three options cover most of the market for Web3 ops monitoring. Here's how they compare on the dimensions that matter for a production ops team:
| Tool | Alert on Transfer / OwnershipTransferred | Multi-chain | HMAC signing | Ops-first UX | Maintenance burden |
|---|---|---|---|---|---|
| Tenderly | Yes — via alerting rules in the devtools dashboard | Yes | No | Low — built for devs, not ops teams | Low — hosted |
| DIY webhooks | Yes — you write the filter logic | Yes — one RPC connection per chain | You implement it | None — you build everything | High — infra, reliability, upgrades all on you |
| Sentinel | Yes — configurable thresholds per event type | Yes — Ethereum, Arbitrum, Base, Polygon, Optimism | Yes — all webhook deliveries | High — built for ops alerting | Low — hosted, managed |
Tenderly is excellent for debugging and simulation — if your team is already using it for contract development, you can bolt on alerts. But the alerting UX is secondary to the dev tooling, and configuring escalation tiers, threshold tuning, and Slack routing requires a workaround for every feature. It's a developer tool that can do ops monitoring, not an ops monitoring tool.
DIY webhooks give you full control: your own RPC connection, your own filter logic, your own delivery pipeline. If you have a backend engineer with spare cycles and a strong opinion about how alerts should work, this is viable. The hidden cost is maintenance — every chain upgrade, RPC provider change, or new event type requires you to update the code. Most ops teams find the maintenance burden grows faster than they expected.
Sentinel is built ops-first: configurable thresholds per contract, multi-chain from a single dashboard, HMAC-signed webhook delivery, and alert routing to Slack or email without custom code. The tradeoff is that you're working within our event model rather than writing your own filters. For teams that want to be monitoring in under an hour, it's the fastest path.
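Whichever option you choose, the receiving end of an HMAC-signed webhook should verify the signature before trusting the payload. The sketch below shows the general technique using Python's standard library. It assumes an HMAC-SHA256 signature sent as a hex string alongside the raw request body; the actual algorithm, encoding, and header name depend on the sender and are not specified in this guide, so treat them as assumptions.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check a received signature using a constant-time comparison."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), received_signature.encode())


if __name__ == "__main__":
    secret = b"shared-webhook-secret"  # placeholder; load from config in practice
    body = b'{"event":"Paused","contract":"0xabc"}'
    signature = sign_payload(secret, body)
    print(verify_signature(secret, body, signature))         # genuine delivery
    print(verify_signature(secret, body + b" ", signature))  # tampered body
```

Verify against the raw bytes of the request body, not a re-serialized copy, since any change in whitespace or key order changes the signature.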
A 5-Step Ops Runbook
Whether you're setting up monitoring for the first time or migrating from a DIY solution, this sequence gets you to production in a single sprint:
- Inventory your contracts by chain. List every contract your protocol has deployed, the chain it's on, and whether it holds funds or controls access. This is your monitoring surface. Contracts that hold funds or gate access are tier-1 monitoring candidates; everything else is tier-2 or below.
- Decide which event types need alerts. For each contract in your inventory, identify which of the four event classes apply. A simple token contract needs Transfer monitoring. A contract with a single owner or admin, including one owned by a multisig, needs OwnershipTransferred. A protocol with a pause function needs Paused/Unpaused. A contract that uses role-based access control, such as a governance contract, needs RoleGranted/RoleRevoked. Not every contract needs every event type.
- Set thresholds per contract. For transfer monitoring, pull three months of on-chain history and find the 95th-percentile transfer size. Set your tier-1 threshold just above that. For event types without a numeric threshold (ownership changes, pause events), all occurrences are tier-1 by default — there's no such thing as a routine ownership change.
- Configure Slack and email destinations. Set up your #contract-alerts and #ops-feed channels before you connect them to monitoring. Write a one-paragraph channel description so any new team member knows what goes there and what to do when an alert lands.
- Run a weekly alert audit. Schedule 20 minutes every Monday to review the previous week's alert history. Are tier-1 alerts getting responses? Are tier-2 alerts being read? Are thresholds generating false positives? Alert systems drift out of calibration as protocols evolve; the weekly audit catches it before it becomes an incident.
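The threshold step in the list above comes down to a small calculation. Here is a minimal sketch, assuming you have already exported the transfer amounts from three months of history as integers. The nearest-rank percentile method and the 10% headroom used for "just above" are choices made for this example, not values prescribed by the guide.

```python
from typing import Sequence


def percentile_nearest_rank(values: Sequence[int], pct: int) -> int:
    """Return the pct-th percentile using the nearest-rank method."""
    if not values:
        raise ValueError("need at least one historical transfer")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(values)
    # Ceiling division in integers keeps large raw amounts exact.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


def tier1_threshold(history: Sequence[int], headroom_pct: int = 10) -> int:
    """Place the tier-1 threshold slightly above the 95th percentile."""
    p95 = percentile_nearest_rank(history, 95)
    return p95 + p95 * headroom_pct // 100


if __name__ == "__main__":
    # Illustrative history: 100 transfers of 1..100 units.
    history = list(range(1, 101))
    print(percentile_nearest_rank(history, 95))
    print(tier1_threshold(history))
```

Integer arithmetic is used throughout because raw on-chain amounts routinely exceed the range where floats stay exact.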
Alert fatigue is a real failure mode. Ops teams that skip threshold tuning — or set thresholds too low "just to be safe" — end up muting their own alerts within two weeks. A muted alert channel is worse than no monitoring: it creates false confidence. If your team is receiving more than 2–3 tier-1 alerts per day, most of which don't require action, raise the threshold. The goal is a signal-to-noise ratio high enough that every alert feels urgent.
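The weekly audit can apply that rule mechanically. The sketch below assumes each alert in your history is tagged with its tier and whether it ended up requiring action; the `AlertRecord` type and `audit_week` function are hypothetical, and "most" is interpreted here as more than half of tier-1 alerts needing no action.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class AlertRecord:
    tier: int              # 1 or 2
    required_action: bool  # did a human actually need to act?


def audit_week(alerts: Sequence[AlertRecord], days: int = 7,
               max_tier1_per_day: float = 3.0) -> Dict[str, object]:
    """Summarize a week of alerts and flag thresholds that look too low."""
    tier1 = [a for a in alerts if a.tier == 1]
    per_day = len(tier1) / days
    actionable = sum(1 for a in tier1 if a.required_action)
    actionable_ratio = actionable / len(tier1) if tier1 else 1.0
    return {
        "tier1_per_day": per_day,
        "actionable_ratio": actionable_ratio,
        # Too many tier-1 alerts, and most of them needed no action.
        "raise_threshold": per_day > max_tier1_per_day and actionable_ratio < 0.5,
    }


if __name__ == "__main__":
    noisy_week = [AlertRecord(1, False)] * 24 + [AlertRecord(1, True)] * 4
    print(audit_week(noisy_week))
```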
What Good Looks Like
A well-tuned monitoring setup is mostly silent. That's the goal — not constant alerts, but high-confidence silence. When nothing fires, that means nothing above your thresholds moved. No unauthorized ownership changes. No unexpected pauses. Silence is correct behavior.
The target state for a production ops team is 1–2 tier-1 alerts per week, all of which require a human response. Not 20 alerts per day that you've trained yourself to ignore. Not zero alerts ever because your thresholds are so high that only a catastrophic event would fire them. One or two alerts per week, each requiring a human to look at a transaction and decide whether it's expected.
When you reach that state, the weekly alert audit becomes a 5-minute review rather than a triage session. Your runbook stays relevant because the team exercises it regularly. And when a real incident happens — an actual unauthorized transfer, an actual ownership change — the alert lands in a channel your team trusts, reaches the right person, and gets a response within minutes.
Get early access for your ops team
Join the waitlist to be notified when Sentinel opens to new teams. We're onboarding protocol ops teams in cohorts — drop your email and we'll reach out when your spot is ready.
Related reading: How to Monitor Your DAO Treasury — A Governance Lead's Guide for threshold calibration on Gnosis Safe and DAO treasuries, and Why Smart Contracts Fail for the threat model behind these monitoring recommendations.