LD Relay
The LaunchDarkly Relay Proxy (LD Relay) can be deployed in your infrastructure to provide a local endpoint for SDKs, reducing outbound connections and potentially improving initialization availability. However, LD Relay introduces operational complexity and new failure modes that must be carefully managed.
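To make "a local endpoint for SDKs" concrete, the sketch below points a server-side SDK at a relay instance instead of LaunchDarkly's default endpoints. It assumes the Python server-side SDK and its base_uri/stream_uri/events_uri configuration options; the relay hostname is a placeholder and 8030 is LD Relay's default port.

```python
import ldclient
from ldclient.config import Config

# Placeholder address for the relay (or the load balancer in front of it);
# 8030 is LD Relay's default listening port.
RELAY_URI = "http://ld-relay.internal:8030"

# Point polling, streaming, and event delivery at the relay instead of
# LaunchDarkly's default endpoints; the relay forwards events upstream.
ldclient.set_config(Config(
    sdk_key="your-server-side-sdk-key",
    base_uri=RELAY_URI,
    stream_uri=RELAY_URI,
    events_uri=RELAY_URI,
))

client = ldclient.get()
print("initialized via LD Relay:", client.is_initialized())
```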
Risks and Operational Burden
Using LD Relay introduces:
- Additional infrastructure: More services to deploy, monitor, scale, and secure
- Resource constraints: Insufficiently provisioned relay instances can become bottlenecks or points of failure
- Maintenance overhead: Your team takes on operational responsibilities (uptime, upgrades, capacity planning) that LaunchDarkly's platform otherwise manages
Operationalizing LD Relay Properly
Deploy Highly Available Infrastructure
Load balancer:
- Implement a highly available internal load balancer as the entry point for all flag delivery traffic
- If the load balancer is not highly available, it becomes a single point of failure
- Support routing to LaunchDarkly's primary streaming network and LD Relay instances
Relay instances:
- Deploy multiple Relay Proxy instances across different availability zones
- Ensure each instance is properly sized and monitored
- Implement health checks and automatic failover (a minimal health-check sketch follows this list)
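One minimal way to wire health checks is to probe LD Relay's /status endpoint. The sketch below (Python standard library only) treats an HTTP 200 response as healthy and exits non-zero otherwise, so a load balancer or orchestrator can run it directly; the hostname and timeout are placeholder values.

```python
import sys
import urllib.request

# Placeholder address of a single relay instance to probe.
RELAY_STATUS_URL = "http://ld-relay.internal:8030/status"


def relay_is_healthy(url: str = RELAY_STATUS_URL, timeout: float = 2.0) -> bool:
    """Return True if the relay's /status endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection errors and timeouts both count as unhealthy.
        return False


if __name__ == "__main__":
    # A non-zero exit code lets load balancers or orchestrators use this
    # script directly as a health-check command.
    sys.exit(0 if relay_is_healthy() else 1)
```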
Persistent Storage
LD Relay can operate with or without persistent storage. Each approach has different tradeoffs:
With persistent storage (such as Redis):
- Benefits:
  - Enables scaling LD Relay instances during outages
  - Allows restarting LD Relay instances without losing flag data
  - Provides a durable cache that survives Relay Proxy restarts
- Tradeoffs:
  - Increases operational complexity: another service to deploy, monitor, and secure
  - Requires configuring an infinite cache TTL; with a finite TTL, flag data that expires during an outage can reduce availability and produce incorrect evaluations (see the configuration sketch at the end of this section)
  - Prevents using AutoConfig, which requires in-memory-only operation
  - Adds monitoring and alerting requirements for cache health
Without persistent storage (in-memory only):
- Benefits:
  - Simpler architecture with fewer components to manage
  - Supports AutoConfig for dynamic environment configuration
  - Lower operational overhead
- Tradeoffs:
  - Relies on the LD Relay cluster being able to serve production traffic during an outage without restarting or adding instances
  - Cache is lost on Relay Proxy restart and must be re-initialized from LaunchDarkly's service
  - Requires sufficient capacity and redundancy up front to ride out outages without scaling
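To make the Redis option concrete, the following sketch shows the kind of environment a relay container might be launched with. The variable names (LD_ENV_*, USE_REDIS, REDIS_URL, CACHE_TTL) and the use of a negative TTL to mean "never expire" are assumptions to verify against the LD Relay chapter and your relay version; the values themselves are placeholders.

```python
# Illustrative environment for launching an LD Relay container with Redis as
# the persistent store. Variable names and the negative-TTL convention are
# assumptions to confirm against your LD Relay version's documentation.
relay_env = {
    # One LD_ENV_<name> variable per LaunchDarkly environment the relay serves.
    "LD_ENV_production": "your-server-side-sdk-key",
    "USE_REDIS": "true",
    "REDIS_URL": "redis://redis.internal:6379",
    # Assumed "infinite" cache TTL: keep serving the last-known flag data from
    # Redis during an outage instead of letting it expire.
    "CACHE_TTL": "-1s",
}

# Render as docker-run flags purely for illustration; the same variables could
# be set on a Kubernetes pod spec or a systemd unit instead.
flags = " ".join(f"-e {name}={value}" for name, value in relay_env.items())
print(f"docker run {flags} launchdarkly/ld-relay")
```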
Monitor and Alert
Key metrics to monitor (a status-polling sketch follows this list):
- Initialization latency and errors
- CPU/Memory utilization
- Network utilization
- Persistent store availability
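Beyond host-level metrics, the relay's own /status endpoint can drive alerting on flag-delivery health. The sketch below polls it periodically and emits an alert line when the response is missing or not healthy; the JSON field names ("status", "healthy") are assumptions about the response shape, and the print-based alerts stand in for your real alerting integration.

```python
import json
import time
import urllib.request

# Placeholder relay address and polling interval.
RELAY_STATUS_URL = "http://ld-relay.internal:8030/status"
POLL_INTERVAL_SECONDS = 30


def poll_once() -> None:
    """Fetch /status once and emit an alert line if the relay looks unhealthy."""
    try:
        with urllib.request.urlopen(RELAY_STATUS_URL, timeout=2.0) as resp:
            body = json.load(resp)
    except (OSError, ValueError) as exc:
        print(f"ALERT: LD Relay status endpoint unreachable: {exc}")
        return
    # The field names here are assumptions about the /status response shape;
    # replace the print with a call into your metrics/alerting system.
    if body.get("status") != "healthy":
        print(f"ALERT: LD Relay reports status {body.get('status')!r}")


if __name__ == "__main__":
    while True:
        poll_once()
        time.sleep(POLL_INTERVAL_SECONDS)
```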
When to Use LD Relay for Improved Initialization Availability
LD Relay can improve initialization availability in these specific scenarios:
Frequent Deployments or Restarts
Use LD Relay when: You deploy or restart services frequently, at least once per day.
Why: Frequent restarts mean frequent SDK initializations. LD Relay reduces initialization latency and provides cached flag data even if LaunchDarkly's service is temporarily unavailable during a restart window. A startup sketch follows the example scenarios below.
Example scenarios:
- Kubernetes deployments with rolling restarts
- Serverless functions with frequent cold starts
- Containers that restart frequently for configuration updates
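A startup sketch for a frequently restarted service, assuming the Python server-side SDK's LDClient constructor with a start_wait parameter and placeholder relay/key values: the process waits a bounded time for flag data from the relay and logs whether initialization succeeded, rather than blocking startup indefinitely.

```python
import time

from ldclient.client import LDClient
from ldclient.config import Config

# Placeholder relay address and SDK key.
RELAY_URI = "http://ld-relay.internal:8030"
config = Config(
    sdk_key="your-server-side-sdk-key",
    base_uri=RELAY_URI,
    stream_uri=RELAY_URI,
    events_uri=RELAY_URI,
)

started = time.monotonic()
# Wait at most 5 seconds for flag data before letting startup continue.
client = LDClient(config=config, start_wait=5)
elapsed = time.monotonic() - started

if client.is_initialized():
    print(f"flag data ready via LD Relay in {elapsed:.2f}s")
else:
    # Evaluations still work, but return the fallback values passed to
    # variation() until flag data arrives from the relay.
    print(f"not initialized after {elapsed:.2f}s; serving fallback values")
```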
Critical Consistency Requirements
Use LD Relay when: Multiple services or instances must evaluate flags consistently, even during short outages that affect SDK initialization.
Why: LD Relay provides a shared cache that multiple SDK instances can use, ensuring consistent flag evaluations across services even when LaunchDarkly's service is temporarily unavailable.
Example scenarios:
- Microservices that must all evaluate the same flag consistently
- Multi-region deployments requiring consistent feature rollouts
- Applications where inconsistent flag evaluations cause data corruption or business logic errors
High Impact of Fallback Values
Use LD Relay when: Serving fallback values would cause significant business impact, such as lost revenue, not just degraded UX.
Why: When fallbacks would cause payment processing failures, data loss, or compliance violations, LD Relay provides cached flag data so SDKs can keep evaluating real flag values instead of serving fallbacks during an outage.
Example scenarios:
- Payment processing systems where fallback values cause transaction failures
- Compliance-critical features where fallback values violate regulations
- Safety-critical systems where degraded functionality is unacceptable
Additional information
For detailed information on LD Relay configuration, scaling, and performance guidelines, refer to the LD Relay chapter.