rustunnel supports group-based load balancing for HTTP and TCP tunnels. Multiple clients can register against the same subdomain (HTTP) or share a TCP port, and inbound connections are dispatched at random across the healthy members of the group. Optional client-side health probes automatically remove unhealthy backends from the rotation. The design is modeled on FRP’s `loadBalancer.group` / `healthCheck` config: same shape, slightly different wire format.
Concepts
- Group: a logical pool of tunnel members sharing the same subdomain (HTTP) or TCP port. Identified by a user-supplied `group` name plus a shared `group_key`. The server stores only the SHA-256 hash of the key and uses it to authorise joins; the raw key never leaves the client.
- Member: one tunnel inside a group. Running two clients with the same `(group, group_key)` produces a 2-member pool.
- Health bit: every member has a `healthy` flag. Dispatch routes around members whose flag is `false`. Without a health check configured, members are permanently healthy (the server trusts the client’s presence).
- Dispatch: for each new public connection, the server picks one healthy member uniformly at random. There is no weighting and no sticky sessions today.
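The dispatch rule above can be sketched in a few lines. This is a minimal illustration, not the server's actual code: the `Member` and `Group` classes and their field names are hypothetical, chosen only to mirror the concepts listed here.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Member:
    tunnel_id: str          # hypothetical identifier for one tunnel
    healthy: bool = True    # no health check configured => permanently healthy


@dataclass
class Group:
    name: str
    key_hash: str           # SHA-256 hex digest of the shared key, never the raw key
    members: List[Member] = field(default_factory=list)

    def dispatch(self, rng: random.Random) -> Optional[Member]:
        """Pick one healthy member uniformly at random, or None if none are healthy."""
        healthy = [m for m in self.members if m.healthy]
        return rng.choice(healthy) if healthy else None
```

Because every connection is picked independently, two consecutive connections from the same visitor may land on different members, which is the "no sticky sessions" property in practice.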
Configuration
Server (server.toml)
`[load_balancing] enabled` is the kill switch. When `false` (the default), the server accepts the new fields on the wire but ignores them, so every registration is a solo tunnel. When `true`, members sharing `(subdomain, group_key_hash)` (HTTP) or `(group_name, group_key_hash)` (TCP) form a real pool.
The kill switch is per-region for self-hosted multi-region deployments. Flip it on one region at a time during a rollout; `false` is the safe default that preserves single-tunnel-per-key behaviour.
Client (~/.rustunnel/config.yml)
Add group, group_key, and optionally health_check to a tunnel definition:
| Field | Required | Default | Meaning |
|---|---|---|---|
| `group` | yes (for LB) | (none) | Display name of the pool. The first joiner sets `TunnelGroup.name`; later joiners are accepted regardless of what they pass. |
| `group_key` | yes (for LB) | (none) | Shared secret. SHA-256-hashed before transmission. Members of one pool MUST agree on this value; the server rejects a join with a mismatched key. |
| `health_check.type` | no | (none) | `tcp` (open a connection) or `http` (issue a GET). Omit to disable probing. |
| `health_check.path` | yes when `type: http` | (none) | Path to GET against the local service. |
| `health_check.interval_secs` | no | 10 | Probe period, in seconds. |
| `health_check.timeout_secs` | no | 3 | Per-probe deadline, in seconds. |
| `health_check.max_failed` | no | 3 | Consecutive failures before reporting `TunnelUnhealthy`. |
| `health_check.expect_2xx` | no | true | When false, any HTTP response counts as healthy. |
| `health_check.alert_webhook` | no | (none) | Per-tenant URL the server POSTs to when this group transitions to 0 healthy members. See Webhook alerts below. |
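To make the defaults and the one conditional requirement in the table concrete, here is a small sketch of the `health_check` fields as a Python dataclass. The class and its `validate` method are hypothetical and only restate the table; the real client may validate differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthCheck:
    type: str                            # "tcp" or "http"
    path: Optional[str] = None           # required when type == "http"
    interval_secs: int = 10              # probe period
    timeout_secs: int = 3                # per-probe deadline
    max_failed: int = 3                  # consecutive failures before TunnelUnhealthy
    expect_2xx: bool = True              # False: any HTTP response counts as healthy
    alert_webhook: Optional[str] = None  # per-tenant alert URL

    def validate(self) -> None:
        """Raise ValueError if the spec contradicts the table above."""
        if self.type not in ("tcp", "http"):
            raise ValueError(f"unknown health_check.type: {self.type!r}")
        if self.type == "http" and not self.path:
            raise ValueError("health_check.path is required when type is http")
```

With these defaults, a backend that stops answering is reported unhealthy after roughly three failed probes, that is, on the order of 30 seconds at the default 10-second interval.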
Behaviour rules
HTTP groups
Members must declare the same protocol (`http` vs `https`). A mismatch is rejected with a clear error. The subdomain is the routing key: every member of a group shares one subdomain.
TCP groups
The first member of a `(group, group_key)` allocates a port from the configured `tcp_port_range`. Subsequent members reuse that port; the server returns the same `assigned_port` to all joiners. Members never see a `Registered` listener event after the first, because the listener is already bound.
Solo collisions
Registering a solo (no-group) tunnel against an existing group’s subdomain is rejected with `subdomain '...' is already in use`. Registering a grouped tunnel against an existing solo tunnel is rejected with `group key does not match`. A subdomain is owned by exactly one identity at a time.
Last-leave semantics
The group entry is removed when its last member disconnects. The TCP port (if any) is returned to the pool. New registrations after that point start a fresh group with a fresh port.
Race safety
The create / join / remove paths are serialised atomically via the routing-table entry API. Two concurrent first registrations produce one group, not two.
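The TCP rules (first member allocates a port, later members reuse it, the last leave frees it, and concurrent first joins produce one group) can be modelled with a lock-protected table. This is a sketch under assumptions: `TcpGroupTable`, `JoinError`, and the free-port list are hypothetical, and a single mutex stands in for the routing-table entry API the server actually uses.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class JoinError(Exception):
    """Raised when a join cannot be satisfied (here: port range exhausted)."""


@dataclass
class TcpGroup:
    port: int
    members: List[str] = field(default_factory=list)


class TcpGroupTable:
    def __init__(self, port_range: range):
        self._lock = threading.Lock()          # serialises create / join / remove
        self._free_ports = list(port_range)
        self._groups: Dict[Tuple[str, str], TcpGroup] = {}

    def join(self, group: str, key_hash: str, tunnel_id: str) -> int:
        """Register a member and return the port shared by its group."""
        with self._lock:
            entry = self._groups.get((group, key_hash))
            if entry is None:                  # first member allocates the port
                if not self._free_ports:
                    raise JoinError("no free TCP ports")
                entry = TcpGroup(self._free_ports.pop(0))
                self._groups[(group, key_hash)] = entry
            entry.members.append(tunnel_id)
            return entry.port

    def leave(self, group: str, key_hash: str, tunnel_id: str) -> None:
        """Remove a member; the last leave drops the group and frees its port."""
        with self._lock:
            entry = self._groups[(group, key_hash)]
            entry.members.remove(tunnel_id)
            if not entry.members:
                del self._groups[(group, key_hash)]
                self._free_ports.append(entry.port)
```

Holding one lock across the lookup and the insert is what guarantees that two racing first registrations cannot both observe "no group yet".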
Health checks
Probes run on the client against `local_addr`. The server never opens a connection to the upstream itself; it just trusts the client’s `TunnelHealthy` / `TunnelUnhealthy` reports.
- TCP probe: opens a TCP connection. Success = connect within `timeout_secs`.
- HTTP probe: sends `GET <path> HTTP/1.0` and reads the status line. Success = response within `timeout_secs` and (when `expect_2xx`) status in `[200, 300)`.
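The two probe types can be approximated with the Python standard library. This is an illustrative sketch rather than the client's real implementation: the function names are hypothetical, and the socket timeout here applies per operation, which is only an approximation of a single per-probe deadline.

```python
import socket


def tcp_probe(host: str, port: int, timeout_secs: float = 3.0) -> bool:
    """Healthy if a TCP connection can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_secs):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


def status_ok(status_line: bytes, expect_2xx: bool = True) -> bool:
    """Interpret an HTTP status line such as b'HTTP/1.0 200 OK'."""
    parts = status_line.split()
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return False
    try:
        code = int(parts[1])
    except ValueError:
        return False
    # With expect_2xx disabled, any well-formed response counts as healthy.
    return 200 <= code < 300 if expect_2xx else True


def http_probe(host: str, port: int, path: str,
               timeout_secs: float = 3.0, expect_2xx: bool = True) -> bool:
    """Send 'GET <path> HTTP/1.0' and judge health from the status line."""
    request = f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii")
    try:
        with socket.create_connection((host, port), timeout=timeout_secs) as sock:
            sock.sendall(request)
            with sock.makefile("rb") as reader:
                return status_ok(reader.readline(), expect_2xx)
    except OSError:
        return False
```

Keeping the status-line check in its own function makes the `[200, 300)` rule easy to test without a live backend.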
First probe success
Emits `TunnelHealthy`, which lifts the initial `healthy=false` state for members that opted into probing.
`max_failed` consecutive failures
Emits `TunnelUnhealthy`. The server flips the healthy bit to `false` and excludes the member from dispatch.
A member without `health_check` is permanently healthy. A member with a spec starts unhealthy and only joins dispatch after the first successful probe.
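The reporting rules above amount to a small state machine on the client. The sketch below is one plausible reading, not the actual client code: `ProbeTracker` is a hypothetical name, and two behaviours are assumptions (a later success after `TunnelUnhealthy` re-emits `TunnelHealthy`, and failures before the first success emit nothing because the member is already out of dispatch).

```python
from typing import Optional


class ProbeTracker:
    def __init__(self, max_failed: int = 3):
        self.max_failed = max_failed
        self.healthy = False   # members with a health_check start unhealthy
        self.failures = 0      # consecutive failed probes

    def record(self, ok: bool) -> Optional[str]:
        """Feed one probe result; return the frame to send on a state change, else None."""
        if ok:
            self.failures = 0
            if not self.healthy:
                self.healthy = True
                return "TunnelHealthy"
            return None
        self.failures += 1
        if self.healthy and self.failures >= self.max_failed:
            self.healthy = False
            return "TunnelUnhealthy"
        return None
```

Only edges produce frames, so a steadily healthy or steadily failing backend generates no traffic beyond the probes themselves.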
Webhook alerts
When a load-balancing group transitions to 0 healthy members, public traffic to that subdomain or port starts returning 502. The server can POST a JSON alert to one or more URLs at the moment of that transition so an operator or tenant can react. There are two distinct destinations, each addressing a different audience:
Operator URL: `[load_balancing] alert_webhook_url` in `server.toml`
Set on the edge. Fires for every group on that edge that goes 0/N, regardless of which tenant owns the group. Useful for self-hosted deployments and for ops awareness on a managed multi-tenant edge.
Per-tenant URL: `health_check.alert_webhook` in the client config
Set on the client. Fires only when the group containing this tunnel goes 0/N. Each tenant points it at their Slack / PagerDuty / email gateway. The URL is sent on the wire as part of `HealthCheckSpec` and stored on the affected `GroupMember`; only the server holds it (the URL is never returned by `/api/groups`; dashboards see a presence-only flag).
Payload
The same JSON body is sent to every destination. Its `key_hash_short` field is the first 8 hex chars of the group’s SHA-256 key hash. It is stable across reconnects and useful for correlating alerts when a single team runs multiple pools with the same `group_name`.
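For correlating an alert with a known pool, the short hash can be recomputed locally from the shared key. The sketch below assumes the key is hashed as its UTF-8 bytes with no salt and rendered as lowercase hex; those encoding details are assumptions, not something this page specifies, and the helper names are hypothetical.

```python
import hashlib


def group_key_hash(group_key: str) -> str:
    """Full SHA-256 hex digest of the shared key (assumed: UTF-8, unsalted)."""
    return hashlib.sha256(group_key.encode("utf-8")).hexdigest()


def key_hash_short(group_key: str) -> str:
    """First 8 hex characters of the key hash, as carried in alerts."""
    return group_key_hash(group_key)[:8]
```

For example, under these assumptions `key_hash_short("abc")` returns `ba7816bf`.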
Debounce
The server tracks a per-group `zero_healthy_alerted` flag. Once an alert fires, subsequent `TunnelUnhealthy` frames against the same already-down group do not re-fire. The flag resets the moment any member becomes healthy again, so the next 0/N transition fires fresh.
In practice: if your pool flaps badly (down → up → down → up), each downward edge generates one alert per destination. Steady-state “everyone is still down” generates none.
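The debounce behaviour can be expressed as edge detection on "number of healthy members is zero". The following is a minimal sketch with a hypothetical `GroupAlertState` class; only the `zero_healthy_alerted` flag name comes from the description above.

```python
from typing import Dict, Iterable


class GroupAlertState:
    def __init__(self, member_ids: Iterable[str]):
        # For simplicity every member starts healthy in this sketch.
        self.health: Dict[str, bool] = {m: True for m in member_ids}
        self.zero_healthy_alerted = False

    def report(self, member_id: str, healthy: bool) -> bool:
        """Apply one health report; return True if it should fire an alert."""
        self.health[member_id] = healthy
        if any(self.health.values()):
            # At least one member is healthy again: re-arm the alert.
            self.zero_healthy_alerted = False
            return False
        if self.zero_healthy_alerted:
            return False   # group was already down, stay quiet
        self.zero_healthy_alerted = True
        return True        # downward edge: N/N or k/N -> 0/N
```

A flapping pool therefore produces one alert per downward edge, while repeated unhealthy reports from an already-down group produce none.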
Delivery
Best-effort. The server uses a 5-second per-request timeout, no retry, no queue. If your webhook receiver is down at the moment of the transition, the alert is lost. For high-stakes paging, point the URL at something durable — a queueing alertmanager, or a service like Pushover with retry — rather than relying on the rustunnel server for delivery guarantees.
Each delivery runs in its own `tokio::spawn` task, so a slow webhook receiver never blocks the server’s frame-handling hot path.
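The delivery semantics (5-second timeout, single attempt, off the hot path) look roughly like this in Python, with a background thread standing in for the server's spawned task. All names are hypothetical and the payload is passed through as an opaque dictionary, since its exact fields are not reproduced here.

```python
import json
import threading
import urllib.request
from typing import Callable, Iterable, List

TIMEOUT_SECS = 5.0  # per-request timeout described above


def post_json(url: str, body: bytes, timeout_secs: float) -> None:
    """POST a JSON body; raises on network errors and non-2xx responses."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req, timeout=timeout_secs) as resp:
        resp.read()


def deliver(url: str, payload: dict,
            post: Callable[[str, bytes, float], None] = post_json) -> bool:
    """One best-effort attempt: no retry, no queue. Returns False if the alert is lost."""
    try:
        post(url, json.dumps(payload).encode("utf-8"), TIMEOUT_SECS)
        return True
    except Exception:
        return False


def deliver_in_background(urls: Iterable[str], payload: dict,
                          post: Callable[[str, bytes, float], None] = post_json
                          ) -> List[threading.Thread]:
    """Fire one independent delivery per destination without blocking the caller."""
    threads = [
        threading.Thread(target=deliver, args=(u, payload, post), daemon=True)
        for u in urls
    ]
    for t in threads:
        t.start()
    return threads
```

The injectable `post` argument exists only so the failure path can be exercised without a real receiver; a durable receiver on the other end is still what provides retries.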
Testing the feature locally
Quick end-to-end smoke test against a self-hosted edge with `[load_balancing] enabled = true`. Spin up two clients with the same `(group, group_key)`, point them at separate local backends, and hammer the public URL; both backends should serve.
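One way to "hammer the public URL" and confirm that both backends serve is to tally response bodies. The sketch below assumes each local backend returns a distinct body (for example its own port number); that convention, the helper names, and whatever URL you pass in are all assumptions to adapt to your setup.

```python
import urllib.request
from collections import Counter
from typing import Callable


def fetch_body(url: str) -> str:
    """GET the URL and return its body as text."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8", errors="replace").strip()


def tally(url: str, requests: int = 50,
          fetch: Callable[[str], str] = fetch_body) -> Counter:
    """Issue `requests` GETs and count how often each distinct body came back."""
    return Counter(fetch(url) for _ in range(requests))
```

With uniform random dispatch across two healthy members, the chance that one of them never appears in 50 requests is about 2 × (1/2)^50, so a tally showing a single body almost certainly means the second member is not in rotation.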
Start client B with `local_port: 3001`
Either edit `/tmp/lb-test.yml` and run a second `rustunnel start`, or use a second config file with the same `group` / `group_key` and `local_port: 3001`.
Observability
When `[load_balancing] enabled = true`, the Prometheus exporter on `:9090` emits three additional series:
| Metric | Type | Labels | What it measures |
|---|---|---|---|
| `rustunnel_group_members` | gauge | `group`, `region`, `healthy` | Count of registered members partitioned by their health bit. |
| `rustunnel_group_dispatches_total` | counter | `group`, `region` | Total dispatched connections, summed across the group’s members. Per-group rather than per-member to keep label cardinality bounded. |
| `rustunnel_group_health_failures_total` | counter | `group`, `region`, `kind` | Total `TunnelUnhealthy` frames received across the group’s members. `kind` is `tcp` / `http` / `none` based on the probe type. |
The existing `rustunnel_active_tunnels_*` and `rustunnel_requests_total` gauges/counters keep counting members (not groups) so historical dashboards stay accurate.
Per-tunnel timeline + live event stream
Two REST surfaces let dashboards reconstruct recent health behaviour without polling all of `/api/tunnels`:
- `GET /api/tunnels/:id/health-events`: last 50 health-state transitions for that tunnel (`{ at, healthy, reason }[]`, oldest first). Records edges only; steady-state probe reports are not stored. Use this to render a per-tunnel timeline panel.
- `GET /api/groups/:protocol/:label/events`: Server-Sent Events stream emitting one `group_event` per member health-bit transition affecting the named group. 30s keep-alive ping. Use this for live dashboards that want push instead of polling. A `lagged` SSE event means the consumer fell behind; resync via `/api/groups`.
Both endpoints use the same authentication as `/api/tunnels` (admin token or DB token), and they apply the same per-tenant scope: a user-scoped DB token sees only groups containing at least one of its own members; aggregate counters reflect just the visible members; groups the caller can’t see return 404 rather than 403. Admin tokens see everything.
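A dashboard consuming the group event stream needs to split the Server-Sent Events text into events and notice when it has fallen behind. The sketch below is a minimal parser for the standard SSE line format; it treats each event's data as an opaque string because the `group_event` payload fields are not listed here, and it assumes keep-alive pings arrive as comment lines and that a `lagged` event carries a data line.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from raw SSE lines."""
    event, data = "message", []   # "message" is the SSE default event name
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates an event; events without data are dropped.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue              # comment line, e.g. a keep-alive ping
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def needs_resync(events: List[Tuple[str, str]]) -> bool:
    """True if the stream reported `lagged`, meaning state should be refetched."""
    return any(name == "lagged" for name, _ in events)
```

When `needs_resync` returns true, the consumer would discard its incremental state and reload from `/api/groups` before applying further events.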
Limitations & non-goals
rustunnel’s load balancing is intentionally minimal. If you need any of the features below, layer them at the application or DNS level.
- No weighted dispatch — random uniform only.
- No sticky sessions — every new connection is dispatched independently. Long-lived WebSocket connections that need affinity must handle reconnects at the application layer.
- No active session draining on member removal — in-flight connections finish naturally; new connections route elsewhere.
- No UDP groups — UDP is connectionless; there’s no obvious unit to dispatch.
- No P2P groups — P2P publishers are 1-to-many by design.
- No cross-region pools — members must be on the same edge server. Layer DNS-based routing on top for global LB.
- No `groupKey` rotation: once a group exists, rotating its key requires dropping all members.
See also
Client guide
CLI flags, config file, and the full set of tunnel modes.
Architecture
How HTTP / TCP / UDP / P2P tunnels flow through the system.

