PLT-2568 use all hosts available for es for connection pooling, node status visibility and retries #3747
Currently, requests to the ES nodes are distributed round robin by the K8s Service. Instead, we want to use the ES client's own load balancing. This change requires an accompanying Atlas config update.
Failure Detection and Retry Mechanisms
• Kubernetes Service: Kubernetes automatically detects when a pod goes down and removes it from the Service endpoint list until it is healthy again, so traffic is routed only to healthy pods. However, depending on the configuration, Kubernetes may not handle transient connection errors as well as the Elasticsearch client, which has built-in retry mechanisms.
• Elasticsearch Client: The Elasticsearch RestClient can detect that a node has gone down and retry the failed request on another node, thanks to its retry logic and round-robin mechanism. It lets you control how retries, failovers, and timeouts are handled specifically within the context of Elasticsearch.
Direct Node Awareness
• Kubernetes Service: The Service abstracts the nodes, and your application does not directly know about individual Elasticsearch nodes or their health status. This setup keeps your application configuration simpler but limits finer-grained control over node selection.
• Elasticsearch Client with Multiple Nodes: When multiple nodes are specified directly, the RestClient is aware of each node’s status. This visibility can provide better resiliency and failover, since the client stops routing traffic to a node as soon as it knows that node has failed.
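The node-aware behaviour described above can be illustrated with a small, self-contained sketch. This is not the RestClient implementation: `NodeAwareSelector`, `retryAfterMillis`, and the host names are hypothetical, and the sketch simplifies what a real client does (for example, it gives up when every host is marked dead). It only shows the idea of round-robin selection that skips hosts known to have failed and retries the request on the next host.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of client-side node awareness; not the RestClient code.
class NodeAwareSelector {
    private final List<String> hosts;
    // host -> timestamp (millis) until which the host is considered dead
    private final Map<String, Long> deadUntil = new HashMap<>();
    private final long retryAfterMillis;
    private int next = 0;

    NodeAwareSelector(List<String> hosts, long retryAfterMillis) {
        this.hosts = new ArrayList<>(hosts);
        this.retryAfterMillis = retryAfterMillis;
    }

    // A host is alive if it never failed or its dead period has elapsed.
    boolean isAlive(String host, long now) {
        Long until = deadUntil.get(host);
        return until == null || now >= until;
    }

    // Tries each host at most once in round-robin order, skipping dead hosts.
    // A failing host is marked dead and the request is retried on the next one.
    <T> T execute(Function<String, T> request, long now) {
        RuntimeException last = null;
        for (int i = 0; i < hosts.size(); i++) {
            String host = hosts.get(next);
            next = (next + 1) % hosts.size();
            if (!isAlive(host, now)) {
                continue;
            }
            try {
                T result = request.apply(host);
                deadUntil.remove(host); // success clears any earlier failure
                return result;
            } catch (RuntimeException e) {
                deadUntil.put(host, now + retryAfterMillis);
                last = e;
            }
        }
        throw new IllegalStateException("no healthy host available", last);
    }
}
```

With a single Service endpoint, none of this per-host state is visible to the application; with a list of hosts, the client can keep it and act on it.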
Connection Pooling and Efficiency
• Kubernetes Service: With a single Service endpoint, the client pools all of its connections against that one endpoint. While Kubernetes will distribute requests among pods, it may not balance connections perfectly because of the intermediate load-balancing layer.
• Elasticsearch Client: When the client is configured with multiple HttpHost nodes, the RestClient manages the connection pool for each node directly. This can lead to more efficient use of connections and potentially lower latency, because the client balances requests across nodes itself instead of going through an intermediate layer.
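As a rough sketch of the multi-host setup described above, the snippet below builds a low-level RestClient from a comma-separated host list and logs node failures. It assumes the Elasticsearch low-level REST client 7.x (and its Apache HttpCore dependency) is on the classpath; `EsClientFactory`, `parseHosts`, and the host-list format are hypothetical and are not taken from this PR.

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.Node;
import org.elasticsearch.client.RestClient;

import java.util.Arrays;

// Hypothetical factory; names and the host-list format are assumptions.
class EsClientFactory {

    // Turns e.g. "es-0:9200,es-1:9200" into HttpHost entries.
    // HttpHost.create accepts "host:port" or "scheme://host:port".
    static HttpHost[] parseHosts(String commaSeparatedHosts) {
        return Arrays.stream(commaSeparatedHosts.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(HttpHost::create)
                .toArray(HttpHost[]::new);
    }

    // Builds a client that knows every host, so it can balance requests,
    // track failed nodes, and retry on the remaining ones.
    static RestClient build(String commaSeparatedHosts) {
        return RestClient.builder(parseHosts(commaSeparatedHosts))
                .setFailureListener(new RestClient.FailureListener() {
                    @Override
                    public void onFailure(Node node) {
                        // Called each time a node fails; gives node status visibility.
                        System.err.println("ES node failed: " + node.getHost());
                    }
                })
                .build();
    }
}
```

The failure listener is optional; it is included here only because it is the simplest hook for surfacing node status in logs or metrics.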
Choosing Between Kubernetes Service and Multiple Hosts in Code:
Type of change
Related issues
Checklists
Development
Security
Code review