In this post I'm picking a Java library more or less at random, Elasticsearch, and one of its classes, RestClient. As an exercise, we are going to refactor this class and improve its code.
This is the best definition of the "Elasticsearch" library that I could find:
Elasticsearch is a highly scalable open-source full-text search and
analytics engine. It allows you to store, search, and analyze big
volumes of data quickly and in near real time. It is generally used as
the underlying engine/technology that powers applications that have
complex search features and requirements.
This is the class' javadoc comment, which should help us understand it:
Client that connects to an Elasticsearch cluster through HTTP.
Must be created using {@link RestClientBuilder}, which allows to set all the different options or just rely on defaults.
The hosts that are part of the cluster need to be provided at creation time, but can also be replaced later by calling {@link #setHosts(HttpHost...)}.
The method {@link #performRequest(String, String, Map, HttpEntity, Header...)} allows to send a request to the cluster. When sending a request, a host gets selected out of the provided ones in a round-robin fashion. Failing hosts are marked dead and retried after a certain amount of time (minimum 1 minute, maximum 30 minutes), depending on how many times they previously failed (the more failures, the later they will be retried). In case of failures all of the alive nodes (or dead nodes that deserve a retry) are retried until one responds or none of them does, in which case an {@link IOException} will be thrown.
Requests can be either synchronous or asynchronous. The asynchronous variants all end with {@code Async}.
Requests can be traced by enabling trace logging for "tracer". The trace logger outputs requests and responses in curl format.
And this is the definition of cluster in elasticsearch:
A cluster is a collection of one or more nodes (servers) that
together holds your entire data and provides federated indexing and
search capabilities across all nodes. A cluster is identified by a
unique name which by default is "elasticsearch". This name is important
because a node can only be part of a cluster if the node is set up to
join the cluster by its name.
Make sure that you don’t reuse the same cluster names in different
environments, otherwise you might end up with nodes joining the wrong cluster.
For instance you could use logging-dev, logging-stage, and logging-prod
for the development, staging, and production clusters.
Note that it is valid and perfectly fine to have a cluster with only a single node
in it. Furthermore, you may also have multiple independent clusters
each with its own unique cluster name.
This class' public interface is mostly a set of methods to perform a request asynchronously or to perform a request and wait for the response.
There are HTML tags in the javadoc comments. It can be argued that this is not a good idea, as it reduces legibility.
Arrays are used in the public interface, for example in the constructor. There's rarely a need to expose arrays in a Java API when lists can be used instead: lists are more flexible and play better with the rest of the collections framework.
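For illustration, here is a sketch of a list-based variant of setHosts (the List-based signature is an assumption for the sketch; the actual method takes HttpHost... varargs):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import org.apache.http.HttpHost;

// Sketch: expose a List instead of an array/varargs in the public API.
public synchronized void setHosts(List<HttpHost> hosts) {
    if (hosts == null || hosts.isEmpty()) {
        throw new IllegalArgumentException("hosts must not be null nor empty");
    }
    // defensive, unmodifiable copy: later changes by the caller can't leak in
    this.hosts = Collections.unmodifiableList(new ArrayList<>(hosts));
}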
The "method" parameter of type String is an HTTP method. There seems to be no good reason not to use an enum instead. We could use this enum from the Spring framework, for example. If we dig into the implementation we can see that the method String is being used to compare it with constants from the httpclient Apache library. It looks like the implementation has creeped up to the interface for a matter of convenience.
Some of the methods in the interface have a large number of parameters. It might be wiser to wrap those parameters in an object.
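A sketch of such a parameter object (hypothetical; not part of the actual client):

import java.util.Collections;
import java.util.Map;
import java.util.Objects;
import org.apache.http.Header;
import org.apache.http.HttpEntity;

// Hypothetical wrapper for the parameters of performRequest/performRequestAsync.
public final class ClientRequest {
    private final String method;
    private final String endpoint;
    private final Map<String, String> params;
    private final HttpEntity entity;
    private final Header[] headers;

    public ClientRequest(String method, String endpoint, Map<String, String> params,
                         HttpEntity entity, Header... headers) {
        this.method = Objects.requireNonNull(method, "method must not be null");
        this.endpoint = Objects.requireNonNull(endpoint, "endpoint must not be null");
        this.params = Collections.unmodifiableMap(params);
        this.entity = entity;
        this.headers = headers;
    }

    public String method() { return method; }
    public String endpoint() { return endpoint; }
    public Map<String, String> params() { return params; }
    public HttpEntity entity() { return entity; }
    public Header[] headers() { return headers; }
}

A call would then read performRequestAsync(request, consumerFactory, listener), and new options could be added later without growing the method signature.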
In the method
void performRequestAsync(String method, String endpoint, Map<String, String> params,
                         HttpEntity entity,
                         HttpAsyncResponseConsumerFactory httpAsyncResponseConsumerFactory,
                         ResponseListener responseListener, Header... headers)
everything happens inside a big try-catch block.
The first statement in it checks that one of the parameters is not null. Semantically, that kind of error is different from the errors that cause the method to report a failure in the catch clause: it is a violation of the method's contract, and it should simply raise an exception.
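A fail-fast version could look like this (assuming, for the sketch, that the checked parameter is params):

import java.util.Objects;

// Contract validation happens before the try-catch block; a violation raises
// an exception instead of notifying the response listener of a failure.
Objects.requireNonNull(params, "params must not be null");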
The next few statements don't look like they could throw any checked exception, so it seems mere carelessness that they are inside the try-catch block. Incidentally, the catch block catches the generic Exception, so it doesn't help in knowing where exceptions are expected.
The hardcoded string "ignore" should be a constant. Also, the variable "requestParams" isn't used until later in the method, yet its initialization tries to be clever by doing several things at once.
After doing the previous refactorings and some other line reordering to reduce local variables' scopes we end up with this:
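In outline, and only as an approximation of that intermediate state (the constant name is mine, not necessarily the original), the beginning of the method might now read:

private static final String IGNORE_PARAMETER = "ignore";

// at the top of performRequestAsync: validate first, then build the
// request parameters, keeping each variable close to its first use
Objects.requireNonNull(params, "params must not be null");

Map<String, String> requestParams = new HashMap<>(params);
String ignoreString = requestParams.remove(IGNORE_PARAMETER);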
Deep inside the method, strings are converted to integers, and an IllegalArgumentException is thrown if a string is not a number. Doing this parameter validation deep in the method body reduces legibility. Since the method is part of a public interface, there is an obligation to validate its parameters, but that should be done at the beginning of the method; that way we also ensure the parameter is validated in all cases. Iterating the structure twice is not a problem: legibility comes first.
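A sketch of that up-front validation, extracted into a helper (the method name is an assumption):

import java.util.HashSet;
import java.util.Set;

// Parses the comma-separated "ignore" values; invalid input fails fast with
// an IllegalArgumentException before any other work is done.
private static Set<Integer> parseIgnoreErrorCodes(String ignoreString) {
    Set<Integer> ignoreErrorCodes = new HashSet<>();
    if (ignoreString != null) {
        for (String code : ignoreString.split(",")) {
            try {
                ignoreErrorCodes.add(Integer.valueOf(code.trim()));
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                        "ignore value should be a number, found [" + code + "] instead", e);
            }
        }
    }
    return ignoreErrorCodes;
}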
After some more refactoring, this is what we end up with:
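In outline, the method now reads as a short sequence of steps at the same level of abstraction (the helper names below approximate that structure; they are not the exact code):

public void performRequestAsync(String method, String endpoint, Map<String, String> params,
                                HttpEntity entity,
                                HttpAsyncResponseConsumerFactory consumerFactory,
                                ResponseListener listener, Header... headers) {
    // validate the contract up front
    Objects.requireNonNull(params, "params must not be null");
    Set<Integer> ignoreErrorCodes = ignoreErrorCodes(method, params);
    // build what the request needs
    Map<String, String> requestParams = requestParams(params);
    // do the actual work; each step lives in a small named method defined below
    performRequestAsync(method, endpoint, requestParams, entity,
            ignoreErrorCodes, consumerFactory, listener, headers);
}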
Each method extracted from a chunk of the code I refactored tells a story. Each of the new statements sits at a similar level of abstraction. The new methods are located below the original method, so the code can be read top to bottom.
Now it's easy to see that we first perform some parameter validation and then build the set of error codes to ignore and the parameters for the request. Furthermore, since there is no possibility of exceptions (barring some improbable limit condition or hardware error), we have moved these statements out of the try-catch block.
There's still the try-catch block to refactor. It's not obvious why all those statements ought to be inside the block, since none of them throws any checked exception. Since there doesn't seem to be any good reason for it, and all of the statements are setup code except for the call that performs the actual request, we will be lenient and leave only the request call inside the try block.
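A sketch of the narrowed block (buildRequest and sendRequestAsync are placeholder names for the setup and request code):

// setup code stays outside the block: it throws no checked exceptions
HttpRequestBase request = buildRequest(method, endpoint, requestParams, entity, headers);
try {
    // the only statement that actually performs the request
    sendRequestAsync(request, consumerFactory, listener);
} catch (Exception e) {
    listener.onFailure(e);
}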
I feel uneasy about the comments
// 404 never causes error if returned for a HEAD request
// ignore is a special parameter supported by the clients, shouldn't be sent to es
There's probably a better way to express that in code alone, such as a constant, or a configuration entry, in which the list of special parameters used only by clients is properly named as such.
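A sketch of that idea (the constant and the helper are hypothetical):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// parameters understood by the client itself and never sent to Elasticsearch
private static final Set<String> CLIENT_ONLY_PARAMETERS = Collections.singleton("ignore");

private static Map<String, String> parametersToSend(Map<String, String> params) {
    Map<String, String> toSend = new HashMap<>(params);
    toSend.keySet().removeAll(CLIENT_ONLY_PARAMETERS);
    return toSend;
}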
There are some problems with the method's javadoc:
* Sends a request to the Elasticsearch cluster that the client points to.
* The request is executed asynchronously and the provided
* {@link ResponseListener} gets notified upon request completion or
* failure. Selects a host out of the provided ones in a round-robin
* fashion. Failing hosts are marked dead and retried after a certain amount
* of time (minimum 1 minute, maximum 30 minutes), depending on how many
* times they previously failed (the more failures, the later they will be
* retried). In case of failures all of the alive nodes (or dead nodes that
* deserve a retry) are retried until one responds or none of them does, in
* which case an {@link IOException} will be thrown.
The IOException mentioned is an implementation detail that bears no relation to the method's contract. There should probably be an explanation of the algorithm, but the more concrete details, such as the retry times, are best left in the code or in the project's documentation. If the user of the class needs to know these times, getter methods will suffice.
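For instance, the timings could be exposed along these lines (the values mirror the "minimum 1 minute, maximum 30 minutes" in the javadoc; the field and getter names are assumptions):

// retry timings, taken from the javadoc's "minimum 1 minute, maximum 30 minutes"
private static final long MIN_RETRY_TIMEOUT_MILLIS = 60_000;
private static final long MAX_RETRY_TIMEOUT_MILLIS = 1_800_000;

public long getMinRetryTimeoutMillis() {
    return MIN_RETRY_TIMEOUT_MILLIS;
}

public long getMaxRetryTimeoutMillis() {
    return MAX_RETRY_TIMEOUT_MILLIS;
}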
There's much more work to do in this class. The main problem with it is that the methods that do any significant work are too large and should be split into smaller methods that would make the code more transparent.