Scaling and Performance

Overview

This topic explains scaling and performance considerations for the Relay Proxy.

The computational requirements for LD Relay are fairly minimal when serving server-side SDKs or when used to populate a persistent store. In these configurations, the biggest scaling bottleneck is network bandwidth and throughput. Provision LD Relay as you would an HTTPS proxy, and tune for at least twice the number of concurrent connections you expect to see.

Use monitoring and alerting to confirm that the LD Relay cluster has the capacity to handle your workload, and scale it as needed.
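
For example, you can poll the Relay Proxy's status resource as the basis for health checks and alerts. This is a minimal sketch assuming an instance listening on the default port 8030 at a placeholder hostname; the exact response fields vary by Relay version:

    # Query the status resource, which returns JSON describing the
    # connection state of each configured environment
    curl -s http://ld-relay.internal:8030/status

    # Fail (non-zero exit) when the reported top-level status is not
    # "healthy", for use in a liveness probe or alert script
    curl -s http://ld-relay.internal:8030/status | jq -e '.status == "healthy"'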

Out of the box, LD Relay is fairly lightweight. At a minimum, you can expect:

  • 1 long-lived HTTPS SSE connection to LaunchDarkly's streaming endpoint per configured environment
  • 1 long-lived HTTPS SSE connection to the AutoConfiguration endpoint when automatic configuration is enabled

Memory usage increases with the number of configured environments, the payload size of the flags and segments, and the number of connected SDKs. Serving client-side SDKs has higher computation requirements because flag evaluation occurs in LD Relay itself.
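
To illustrate, the following minimal configuration file sketch defines two environments, so this instance would hold two long-lived streaming connections to LaunchDarkly. The environment names and SDK keys are placeholders:

    [Main]
    streamUri = "https://stream.launchdarkly.com"

    # Each Environment block opens one streaming connection
    [Environment "production"]
    sdkKey = "sdk-production-key-placeholder"

    [Environment "staging"]
    sdkKey = "sdk-staging-key-placeholder"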

Event forwarding

LD Relay handles the following event forwarding patterns:

  • Approximately 1 incoming HTTPS request every 2 seconds per connected SDK. This may vary based on flush interval and event capacity settings in the SDK.
  • Approximately 1 outgoing HTTPS request every 2 seconds per configured environment. This may vary based on LD Relay's configured flush interval and event capacity.

Memory usage increases with event capacity and the number of connected SDKs.
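
Both the flush interval and the event buffer capacity are tunable in LD Relay's configuration. A hedged sketch, assuming the Events section option names of recent Relay versions (check the configuration reference for your version):

    [Events]
    # Forward events received from connected SDKs to LaunchDarkly
    sendEvents = true
    # How often buffered events are flushed upstream
    flushInterval = 5s
    # Maximum number of events buffered between flushes; larger
    # values increase memory usage
    capacity = 1000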

Scaling strategies

Each LD Relay instance maintains connections and manages configuration for the environments you assign to it. The number of environments a single instance can handle depends on your memory, CPU, and network resources. Monitor resource usage to determine when to scale.

When your environment count or size exceeds the limits of a single Relay instance, use one of these scaling approaches:

  • Horizontal scaling: Add more Relay instances to share the load across your environments. This approach provides greater resilience and easier dynamic scaling. See the sketch after this list.
  • Vertical scaling: Increase the memory and CPU resources allocated to each Relay instance.
  • Environment sharding: Distribute environments across multiple Relay instances so each Relay manages a subset of environments rather than all of them.
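
For horizontal scaling, each Relay instance connects to LaunchDarkly independently, so you can run several identical instances behind a load balancer. A minimal sketch using the official Docker image; the container names, ports, and SDK key are placeholders:

    # Start two identical Relay instances; a load balancer (not shown)
    # distributes SDK connections across them
    docker run -d --name ld-relay-1 -p 8031:8030 \
        -e LD_ENV_production="sdk-production-key-placeholder" \
        launchdarkly/ld-relay
    docker run -d --name ld-relay-2 -p 8032:8030 \
        -e LD_ENV_production="sdk-production-key-placeholder" \
        launchdarkly/ld-relay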

Environment sharding

Sharding distributes environment configurations across multiple Relay instances. Each Relay manages a subset of environments rather than all of them.

Use sharding in the following situations:

  • Your environment count or size exceeds the limits of a single Relay instance.
  • You need to separate instances by failure or compliance domains.
  • You need to simplify health checks for load balancers and container orchestrators.

Sharding provides the following advantages:

  • Reduces memory and CPU load per Relay instance.
  • Limits the impact of failures or configuration errors.
  • Improves cache efficiency and stability by isolating workloads.
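
One way to shard is to give each instance only a subset of environments through its configuration. In this sketch, each instance is configured with environment variables and serves a different group of environments; the environment names and SDK keys are placeholders:

    # Instance A serves only the payments environments
    docker run -d --name ld-relay-payments \
        -e LD_ENV_payments_production="sdk-key-placeholder-a" \
        -e LD_ENV_payments_staging="sdk-key-placeholder-b" \
        launchdarkly/ld-relay

    # Instance B serves only the checkout environment
    docker run -d --name ld-relay-checkout \
        -e LD_ENV_checkout_production="sdk-key-placeholder-c" \
        launchdarkly/ld-relay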

Separation of concerns

LD Relay can perform several functions, such as providing flag rules to server-side SDKs, evaluating flags for client-side SDKs, forwarding events, and populating persistent stores. You can configure LD Relay to perform one or more of these functions.

Consider using separate LD Relay instances for different functions based on their scaling characteristics and criticality. For example, you might have separate clusters for:

  • Server-side SDKs
  • Client-side SDKs (Evaluation)
  • Event forwarding
  • Populating persistent stores for daemon mode or syncing Big Segments

This approach provides the following advantages:

  • Makes it easier to scale components individually and to predict resource utilization
  • Separates concerns and increases the reliability of critical components (for example, serving rules to server-side SDKs is more critical than event forwarding)
  • Prevents client-side workloads from impacting server-side SDKs

This approach generally increases the total cost of ownership of the deployment, because you must deploy and manage multiple clusters. It is most applicable to large deployments with a mix of use cases.
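
For example, a cluster dedicated to populating a persistent store for daemon mode might use a configuration like this sketch. It assumes a Redis store; the hostname, environment name, and SDK key are placeholders:

    [Main]
    streamUri = "https://stream.launchdarkly.com"

    # Persistent store that SDKs in daemon mode read from directly
    [Redis]
    host = "redis.internal"
    port = 6379

    [Environment "production"]
    sdkKey = "sdk-production-key-placeholder"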