HAProxy as a Solana RPC Node Load Balancer and Reverse Proxy

How to configure HAProxy to handle Solana RPC traffic

Jul 21, 2023

The Solana RPC API allows for communication between client applications and the Solana cluster. With its wide range of functionality, the API is a vital tool for operations like fetching cluster information, account details, transaction details, and more. However, if all of that traffic goes through a single Solana RPC node, that node can easily become a bottleneck. That's where HAProxy comes in.

HAProxy, which stands for High Availability Proxy, is a popular open-source load balancer and reverse proxy. It provides high availability, load balancing, and proxying for TCP and HTTP-based applications, and it handles WebSocket connections well. It is known for its high performance, reliability, security, and ease of use. In this article, we're going to walk through how you can use HAProxy as a load balancer and reverse proxy for your Solana RPC interface.

Let's break down the HAProxy configuration section by section to understand how it works. The complete file is listed at the end of the article.

Global Settings

The "global" section contains process-wide settings, many of them tied to the operating system. They are generally set once and rarely need to change afterwards.

Here is the full global configuration for reference:

global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    maxconn 1000

log /dev/log local0 and log /dev/log local1 notice
- These two lines configure where HAProxy sends its log messages.
- /dev/log is the path to the system's syslog socket, to which HAProxy sends log data.
- local0 and local1 are syslog facilities (categories) for these logs. The first line sends messages of every severity to local0; the second sends only messages of severity "notice" and higher to local1.
chroot /var/lib/haproxy
- The chroot command changes the root directory for the HAProxy process to /var/lib/haproxy.
- This is a security measure that isolates the HAProxy process, limiting its access to the filesystem.
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
- This line creates a Unix socket at /run/haproxy/admin.sock, which HAProxy uses to provide statistics and manage runtime settings. mode 660 sets the permissions of the socket file (owner and group can read and write, others have no permissions).
- level admin allows all operations on the socket. expose-fd listeners makes HAProxy expose the file descriptors of its listening sockets via this socket, so that a new process can take them over during a reload without dropping connections.
stats timeout 30s
- This command sets the timeout for the stats socket to 30 seconds. If a connected user remains inactive for 30 seconds, the connection is closed.
user haproxy and group haproxy
- These lines set the user and group that the HAProxy process will run as. Running the process as a non-root user is a good security practice as it restricts what the process can do on the system.
daemon
- This setting makes HAProxy run as a background process, not attached to any particular terminal session.
maxconn 1000
- This parameter sets the maximum number of concurrent connections that the HAProxy process will accept. This limit helps protect the system from getting overwhelmed by too many simultaneous connections. Tune it for your workload: every open WebSocket subscription holds a connection for as long as it lives, so a limit of 1000 can be reached quickly.

Understanding and properly configuring the global section of your HAProxy configuration file is critical for the overall operation and security of your load balancer. It is the bedrock upon which the rest of your configuration is built.

Defaults Configuration

defaults
    log global
    mode http
    option httplog
    option dontlognull
    option http-server-close
    option http-buffer-request
    option forwardfor

In the defaults section, you define parameters that every frontend and backend section after it inherits unless it overrides them.

The log global directive makes each proxy use the log targets declared in the global section.
The mode http directive indicates that the service should operate in HTTP mode, which lets HAProxy inspect request headers and route on them.
The option httplog directive enables the detailed HTTP log format, and option dontlognull skips log entries for connections on which no data was exchanged.
The option http-server-close directive makes HAProxy close the server-side connection after the end of each response, while the client-side connection stays open for keep-alive.
The option http-buffer-request directive makes HAProxy wait for the request body (up to the size of its buffer) before it processes the request and sends it to the server.
The option forwardfor directive ensures that the client's IP address is forwarded to the backend servers in the X-Forwarded-For header.

Timeouts

timeout connect 20s
timeout client 60s
timeout server 2m
timeout tunnel 5m
timeout http-keep-alive 5s
timeout http-request 60s
timeout client-fin 30s

These timeout directives, which also belong in the defaults section, set how long HAProxy waits before aborting specific types of connections or requests:

timeout connect 20s
- This directive sets the maximum time to wait for a connection attempt to a server to succeed.
- If a backend server doesn't accept the connection within this time, the attempt is abandoned. HAProxy then retries according to its retries setting, and it can move the request to another server only when option redispatch is enabled. If every attempt fails, the client receives an error.
timeout client 60s
- This directive specifies the inactivity timeout on the client side: the maximum time HAProxy will wait for the client to send or acknowledge data before the connection is closed.
timeout server 2m
- This directive sets the maximum inactivity time on the server side: how long HAProxy waits for the server to send or acknowledge data before it terminates the connection. It has to cover your slowest RPC call, because the server sends nothing while it computes the response.
timeout tunnel 5m
- The tunnel timeout applies once a connection has become a bidirectional tunnel, for example after a WebSocket upgrade. From that point on it supersedes timeout client and timeout server.
- This timeout value should generally be quite high to allow these long-lasting connections to function correctly. With 5m, a subscription that stays silent in both directions for five minutes is closed, so clients should send periodic pings or be prepared to reconnect.
timeout http-keep-alive 5s
- This directive sets the maximum time HAProxy will keep an idle keep-alive connection open while it waits for the client's next HTTP request.
- Keep-alive is vital for performance, allowing multiple HTTP requests to run over the same TCP connection. Our configuration sets this timeout to 5 seconds, so idle keep-alive connections are recycled quickly.
timeout http-request 60s
- This directive sets the maximum allowed time to receive a complete HTTP request from the client. By default it covers only the request headers; because option http-buffer-request is enabled here, it also covers the body.
- If the request isn't received within this timeframe, the client is sent a 408 Request Timeout error.
timeout client-fin 30s
- This directive sets the inactivity timeout for client connections that are half-closed, that is, already shut down in one direction. It keeps such connections from lingering when a client disappears without completing the TCP teardown.

Frontend

frontend http_in
    bind *:80
    bind *:443 ssl crt /etc/haproxy/certs/domain.com.pem

The frontend section declares a set of IP addresses and ports that accept client connections. Here, we're listening on ports 80 (HTTP) and 443 (HTTPS) on all interfaces. The crt parameter points to our pre-generated TLS certificate; HAProxy expects this PEM file to contain both the certificate chain and the private key. If you do not have a TLS certificate, look into Certbot - an amazing tool to provision free TLS/SSL certificates for your domain.

SSL Redirect

http-request set-header X-Forwarded-Proto https if { ssl_fc } 
http-request redirect scheme https code 301 unless { ssl_fc }

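TLS termination itself happens on the bind *:443 ssl line. These two rules add a header for the backends and redirect plain HTTP to HTTPS.

In the first line, { ssl_fc } is an anonymous ACL built on the ssl_fc sample fetch, which is true when the client connection arrived over TLS. In that case the X-Forwarded-Proto header is set to https, so the backend can tell which scheme the client used.

In the second line, if ssl_fc is false, this is a regular HTTP connection that must be upgraded to HTTPS, and HAProxy answers with a 301 Moved Permanently redirect to the https URL. Keep in mind that many HTTP clients do not resend a POST body after a 301, so RPC clients are best pointed at the https URL directly.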

Websocket Routing

By examining the headers of incoming HTTP requests, HAProxy can determine whether a client is attempting to establish a WebSocket connection and route those requests accordingly. Routing WebSocket connections separately from ordinary HTTP requests matters here because a Solana RPC node serves WebSocket subscriptions on a different port from its HTTP interface, and Solana clients use those subscriptions heavily.

acl host_ws hdr_beg(Host) -i ws 
use_backend websocket_backend if host_ws 
acl is_connection_upgrade hdr(Connection) -i upgrade 
acl is_websocket hdr(Upgrade) -i websocket 
use_backend websocket_backend if is_websocket is_connection_upgrade

acl host_ws hdr_beg(Host) -i ws
- This defines an Access Control List (ACL) named host_ws.
- The hdr_beg(Host) fetch method checks whether the Host header in the HTTP request begins with the pattern specified, in this case, ws.
- The -i flag makes the match case insensitive. So, any incoming request whose Host header starts with "ws" (ignoring case) will match this ACL. That includes ws.example.com, but also any other hostname that merely begins with those two letters; use the pattern ws. if you want to match only the subdomain.
use_backend websocket_backend if host_ws
- If a request matches the host_ws ACL, then this directive instructs HAProxy to use the websocket_backend to process the request. In other words, the request is routed to the WebSocket servers.
acl is_connection_upgrade hdr(Connection) -i upgrade
- This line defines another ACL named is_connection_upgrade.
- The hdr(Connection) fetch method compares each comma-separated value of the Connection header with the string "upgrade".
- If one of the values equals "upgrade" (case insensitive due to -i), the request matches this ACL. A header such as "Connection: keep-alive, Upgrade" therefore matches.
acl is_websocket hdr(Upgrade) -i websocket
- This line defines an ACL named is_websocket.
- The hdr(Upgrade) fetch method compares the Upgrade header with the string "websocket". If the value equals "websocket" (case insensitive due to -i), the request matches this ACL.
use_backend websocket_backend if is_websocket is_connection_upgrade
- This line tells HAProxy to use the websocket_backend for any requests that match both the is_websocket and is_connection_upgrade ACLs. In essence, if a client sends an HTTP request with Connection and Upgrade headers indicating a WebSocket upgrade, the request is routed to the WebSocket servers.
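
To make the routing rules above easier to reason about, here is a minimal Python sketch that mimics the same decisions for a dictionary of request headers. It is only a model of the configuration, not how HAProxy is implemented; the function names are hypothetical, and it assumes header names are passed with the exact capitalization shown.

```python
def header_values(headers, name):
    """Return the comma-separated values of a header, stripped and lowercased."""
    raw = headers.get(name, "")
    return [value.strip().lower() for value in raw.split(",") if value.strip()]


def choose_backend(headers):
    """Model of the frontend's use_backend rules (hypothetical helper)."""
    # acl host_ws hdr_beg(Host) -i ws
    if headers.get("Host", "").lower().startswith("ws"):
        return "websocket_backend"

    # acl is_connection_upgrade hdr(Connection) -i upgrade
    is_connection_upgrade = "upgrade" in header_values(headers, "Connection")
    # acl is_websocket hdr(Upgrade) -i websocket
    is_websocket = "websocket" in header_values(headers, "Upgrade")

    # use_backend websocket_backend if is_websocket is_connection_upgrade
    if is_websocket and is_connection_upgrade:
        return "websocket_backend"

    # Everything else falls through to the default backend.
    return "http_backend"


if __name__ == "__main__":
    print(choose_backend({"Host": "ws.example.com"}))
    print(choose_backend({"Host": "rpc.example.com"}))
```

Running a few header combinations through a model like this is a cheap way to spot surprises, such as a hostname that begins with "ws" by accident being sent to the WebSocket backend.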

Backend Servers

Backend sections are where you define the groups of servers that HAProxy can pass requests on to. Each backend is a group of servers that should be treated as interchangeable parts of a whole. Let's go through the configuration:

The HTTP backend block routes HTTP requests to the Solana RPC node interfaces. The frontend's default_backend http_backend line, visible in the final configuration, sends every request that no use_backend rule matched to this backend.

backend http_backend
- This begins the definition of a backend named http_backend.
- This backend is where standard HTTP requests are forwarded.
balance roundrobin
- This line sets the load balancing algorithm to be used for this backend. In this case, the roundrobin algorithm is used, which means that incoming requests are distributed to the backend servers in a circular order.
server backend_http1 10.20.30.40:8899 check
and
server backend_http2 10.20.30.41:8899 check
- These lines define the server IP addresses and ports in the http_backend group.
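- The check keyword enables health checking on each server. Without further options this is a TCP connection check: HAProxy periodically opens a connection to the server's port and marks the server as down when that fails. It does not verify that the node is caught up with the cluster; an HTTP-level check configured with option httpchk would be needed for that.
- Port 8899 is the default Solana JSON RPC port.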
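
As an illustration of what roundrobin combined with health checks means for request distribution, here is a small Python sketch. It assumes equal server weights and ignores HAProxy's weighting and connection limits, so treat it as a simplified model; the class and method names are hypothetical.

```python
from itertools import cycle


class RoundRobinPool:
    """Simplified model of 'balance roundrobin' over health-checked servers."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.up = {server: True for server in self.servers}
        self._ring = cycle(self.servers)

    def mark(self, server, healthy):
        """Record the outcome of health checking for one server."""
        self.up[server] = healthy

    def pick(self):
        """Return the next healthy server in circular order, or None if all are down."""
        for _ in range(len(self.servers)):
            server = next(self._ring)
            if self.up[server]:
                return server
        return None


if __name__ == "__main__":
    pool = RoundRobinPool(["10.20.30.40:8899", "10.20.30.41:8899"])
    print([pool.pick() for _ in range(4)])
    pool.mark("10.20.30.41:8899", False)
    print([pool.pick() for _ in range(2)])
```

With both servers up, requests alternate between them; once one is marked down, every request goes to the remaining server until the failed one recovers.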

The WebSocket backend block routes WebSocket requests to the Solana RPC node interfaces on port 8900, which is intended for WebSocket connections.

backend websocket_backend
- This line starts the configuration of a backend named websocket_backend.
balance roundrobin
- Just like in the previous backend, the roundrobin load balancing algorithm is used.
server backend_ws1 10.20.30.40:8900 check
and
server backend_ws2 10.20.30.41:8900 check
- These lines define the servers in this backend group. Note that these servers are listening on port 8900 instead of port 8899. A Solana node serves its WebSocket interface on the RPC port plus one, which is 8900 by default. The check option is used to enable health checks for these servers.
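
Once both backends are in place, a quick way to confirm that HTTP traffic reaches a node through the proxy is a getHealth call. The sketch below uses only the Python standard library. The URL https://rpc.example.com is a placeholder for your own proxy hostname, and the function names are hypothetical.

```python
import json
import urllib.request


def build_health_request(url, request_id=1):
    """Build a JSON-RPC getHealth POST request for the given proxy URL."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": "getHealth"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_health(url, timeout=10):
    """Send getHealth through the proxy and report whether the node answered "ok"."""
    request = build_health_request(url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body.get("result") == "ok"


if __name__ == "__main__":
    # Placeholder URL: replace it with the hostname that points at your HAProxy.
    print(check_health("https://rpc.example.com"))
```

Calling check_health a few times in a row and watching the HAProxy logs should show the requests alternating between the two servers of http_backend.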

Conclusion

HAProxy is a great tool that can be leveraged as both a load balancer and a reverse proxy for your Solana RPC nodes. It distributes traffic efficiently across several servers, ensuring high availability and reliability.

Let's recap the vital parts of our configuration:

Defaults and Global Settings: These settings serve as a foundation for the remaining configuration. They set default parameters that are then inherited by all frontends and backends unless explicitly overridden.
Timeouts: These configurations manage the maximum duration for specific connections and requests. They are crucial for maintaining the balance between keeping connections alive for legitimate users and freeing up resources from slow or unresponsive connections. These may need to be tuned for your individual use case.
Frontend: This section manages the incoming connections to HAProxy. It includes settings related to TLS and handles the initial routing based on various conditions, like whether the request is an HTTP or WebSocket request.
SSL Redirect: This part ensures that all communication is encrypted by redirecting HTTP traffic to HTTPS.
WebSocket Routing: Rules were defined to determine whether a request should be routed to the WebSocket backend. The conditions are based on the Host header and the presence of specific headers that signify a WebSocket upgrade request.
Load Balancing: The given HAProxy configuration will load balance between two Solana RPC nodes through two backends - one for HTTP and one for WebSocket connections. The 'roundrobin' algorithm is used for load balancing, distributing incoming requests in a circular, sequential order to the backend servers. By implementing health checks ('check') on each server, HAProxy ensures that it only routes traffic to operational servers.
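
The final configuration also sets default-server inter 5s rise 3 fall 3, which controls how quickly a server changes state: it takes three consecutive failed checks to mark a server down and three consecutive successful ones to bring it back. The following Python sketch models that counting logic under those assumptions; the class name and its fields are hypothetical and HAProxy's real implementation has more states.

```python
class HealthState:
    """Simplified model of the rise/fall counters applied to one server."""

    def __init__(self, rise=3, fall=3, up=True):
        self.rise = rise
        self.fall = fall
        self.up = up
        # Number of consecutive check results that contradict the current state.
        self._streak = 0

    def record(self, check_passed):
        """Feed one health check result and return whether the server is up."""
        if check_passed == self.up:
            # The result agrees with the current state, so the streak is reset.
            self._streak = 0
            return self.up
        self._streak += 1
        threshold = self.fall if self.up else self.rise
        if self._streak >= threshold:
            self.up = not self.up
            self._streak = 0
        return self.up


if __name__ == "__main__":
    state = HealthState()
    for result in [False, False, False, True, True, True]:
        print(result, state.record(result))
```

With a 5 second interval, this means a failed node keeps receiving traffic for roughly the time it takes three checks to fail, and a recovered node waits about as long before it is used again.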

And let's look at our final configuration file:

global
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
    stats timeout 30s
    user haproxy
    group haproxy
    daemon
    maxconn 1000

defaults
    log global
    mode http
    option httplog
    option dontlognull

    # Close http connection on server but not client
    option http-server-close

    # Buffer request bodies before transmitting
    option http-buffer-request

    # Set X-Forwarded-For header to incoming request
    option forwardfor

    # max time to establish a TCP connection to a server
    timeout connect 20s
    # client-side inactivity timeout
    timeout client 60s
    # server-side inactivity timeout
    timeout server 2m
    # inactivity timeout for tunnels such as upgraded websocket connections
    timeout tunnel 5m
    # max time to wait for the next request on an idle keep-alive connection
    timeout http-keep-alive 5s
    # max time to receive the complete request (headers, plus body with http-buffer-request)
    timeout http-request 60s
    # inactivity timeout for half-closed client connections
    timeout client-fin 30s

    # 5s healthcheck interval, 3 fails to bring down, 3 to bring up
    default-server inter 5s rise 3 fall 3

frontend http_in
    bind *:80
    bind *:443 ssl crt /etc/haproxy/certs/domain.com.pem

    # tell backends the original scheme and redirect plain http to https
    # (http-request rules are evaluated before use_backend rules, so they are listed first)
    http-request set-header X-Forwarded-Proto https if { ssl_fc }
    http-request redirect scheme https code 301 unless { ssl_fc }

    # routing based on Host header
    acl host_ws hdr_beg(Host) -i ws
    use_backend websocket_backend if host_ws

    # check if this is a websocket connection
    acl is_connection_upgrade hdr(Connection) -i upgrade
    acl is_websocket hdr(Upgrade) -i websocket

    # route to websocket_backend if websocket
    use_backend websocket_backend if is_websocket is_connection_upgrade

    # Route all remaining connections to http_backend
    default_backend http_backend

backend http_backend
    # Route requests to http endpoints using roundrobin strategy
    balance roundrobin
    server backend_http1 10.20.30.40:8899 check
    server backend_http2 10.20.30.41:8899 check

backend websocket_backend
    # Route requests to websocket endpoint using roundrobin
    balance roundrobin
    server backend_ws1 10.20.30.40:8900 check
    server backend_ws2 10.20.30.41:8900 check

In essence, the provided HAProxy configuration works as a strong, effective load balancer and reverse proxy for your Solana RPC nodes. But don't just settle for the default settings. It's worth diving in and customizing each component to perfectly align with your application's requirements. Remember, the HAProxy configuration has a significant impact on your Solana RPC nodes' performance and reliability. Therefore, investing some time to fine-tune the configuration is never a bad idea. So, why wait? Start exploring and optimizing today!
