Retry Pattern With Spring Boot

Overview

In this tutorial, I demonstrate the Retry Pattern, one of the Microservice Design Patterns for building highly resilient microservices, using the resilience4j library along with Spring Boot.

Need For Resiliency

Microservices are distributed by nature. When you work with distributed systems, always remember the number one rule: anything can happen. We might be dealing with network issues, service unavailability, application slowness, and so on. An issue in one system can affect the behavior or performance of another, and such unexpected failures can be difficult to handle.

The ability of a system to recover from such failures and remain functional is what makes it resilient. It also keeps a failure in one service from cascading to the downstream services.

Retry Pattern

In a microservice architecture with multiple services (A, B, C & D), one service (A) might depend on another service (B), which in turn might depend on C, and so on. Sometimes, due to some issue, Service D might not respond as expected. It might have thrown an exception such as an OutOfMemoryError, or returned an Internal Server Error. Such failures cascade to the downstream services (C, B, and finally A), which can result in a poor user experience.

When google.com does not load for us, we do not give up. We simply refresh the page, assuming it will work the next time, and most of the time it does. Intermittent network issues are very common. In the microservices world, we might be running multiple instances of the same Service D for high availability and load balancing. If one of the instances has an issue and does not respond properly to our request, a retry gives the load balancer a chance to send the request to a healthy node, which can then respond properly. So with a retry, we have a better chance of getting a proper response.
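The idea can be sketched in plain Java before bringing in any library. This is only a minimal illustration, not how resilience4j is implemented; the names SimpleRetry and callWithRetry and the flaky call are made up for this example.

```java
import java.util.function.Supplier;

// A minimal sketch of the retry idea: call the supplier again when it throws,
// up to maxAttempts calls in total (the first call counts as attempt 1).
class SimpleRetry {

    static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // hypothetical flaky call: fails twice, then succeeds
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("instance unavailable");
            }
            return "response from a healthy instance";
        }, 3);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A real implementation would usually also wait between attempts and retry only on selected exceptions, which is what the resilience4j configuration later in this article takes care of.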

Sample Application

Let's consider this simple application to explain the retry pattern.

  • We have multiple microservices.
  • Product service acts as the product catalog and is responsible for providing product information.
  • Product service depends on the rating service.
  • Rating service maintains product reviews and ratings. It is notorious for throwing random errors.
  • Whenever we look at the product details, product service sends a request to the rating service to get the reviews for the product.
  • We have other services like account-service, order-service, and payment-service, which are not relevant to this article.
  • Product service is a core service without which the user cannot start the order workflow.

Project Set Up

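Let's first create a Spring Boot project. In addition to the Spring Boot dependencies, we need the resilience4j starter shown here. Its annotations are applied through Spring AOP, so spring-boot-starter-aop needs to be on the classpath as well.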

<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId>
    <version>...</version>
</dependency>

This will be a multi-module Maven project.

If the user tries to see a product, let’s say product id 1, then the product-service is expected to respond like this by fetching the ratings as well.

{
    "productId": 1,
    "description": "Blood On The Dance Floor",
    "price": 12.45,
    "productRating": {
        "avgRating": 4.5,
        "reviews": [
            {
                "userFirstname": "vins",
                "userLastname": "guru",
                "productId": 1,
                "rating": 5,
                "comment": "excellent"
            },
            {
                "userFirstname": "marshall",
                "userLastname": "mathers",
                "productId": 1,
                "rating": 4,
                "comment": "decent"
            }
        ]
    }
}

Common-DTO

As we have a couple of services that share the DTOs, let's keep them in a separate module. This module contains the classes below.

  • Review
@Data
@NoArgsConstructor
@AllArgsConstructor(staticName = "of")
public class ReviewDto {

    private String userFirstname;
    private String userLastname;
    private int productId;
    private int rating;
    private String comment;

}
  • Product Rating
@Data
@NoArgsConstructor
@AllArgsConstructor(staticName = "of")
public class ProductRatingDto {

    private double avgRating;
    private List<ReviewDto> reviews;

}
  • Product
@Data
@NoArgsConstructor
@AllArgsConstructor(staticName = "of")
public class ProductDto {

    private int productId;
    private String description;
    private double price;
    private ProductRatingDto productRating;

}

Rating Service

This service is responsible for maintaining all the product reviews. To keep things simple, I am going to use a simple Map as the database here.

  • Service class
@Service
public class RatingService {

    private Map<Integer, ProductRatingDto> map;

    @PostConstruct
    private void init(){

        // product 1
        ProductRatingDto ratingDto1 = ProductRatingDto.of(4.5,
                List.of(
                        ReviewDto.of("vins", "guru", 1, 5, "excellent"),
                        ReviewDto.of("marshall", "mathers", 1, 4, "decent")
                )
        );

        // product 2
        ProductRatingDto ratingDto2 = ProductRatingDto.of(4,
                List.of(
                        ReviewDto.of("slim", "shady", 2, 5, "best"),
                        ReviewDto.of("fifty", "cent", 2, 3, "")
                )
        );

        // map as db
        this.map = Map.of(
                1, ratingDto1,
                2, ratingDto2
        );

    }

    public ProductRatingDto getRatingForProduct(int productId) {
        return this.map.getOrDefault(productId, new ProductRatingDto());
    }

}
  • Controller
    • If you take a look at our controller, it fails at random.
    • It produces a valid response in some cases.
    • It responds with an internal server error in some cases.
    • It also returns a 400 error, as it would for a bad request.
@RestController
@AllArgsConstructor
@RequestMapping("ratings")
public class RatingController {

    private final RatingService ratingService;

    @GetMapping("{prodId}")
    public ResponseEntity<ProductRatingDto> getRating(@PathVariable int prodId) {
        var productRatingDto = this.ratingService.getRatingForProduct(prodId);
        return this.failAtRandom(productRatingDto);
    }

    private ResponseEntity<ProductRatingDto> failAtRandom(ProductRatingDto productRatingDto){
        var random = ThreadLocalRandom.current().nextInt(1, 5);
        return switch (random){
            case 1,2 -> ResponseEntity.status(500).build();
            case 3 -> ResponseEntity.badRequest().build();
            default -> ResponseEntity.ok(productRatingDto);
        };
    }

}
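To see how often each outcome occurs, note that ThreadLocalRandom.nextInt(1, 5) returns 1, 2, 3 or 4, because the upper bound is exclusive. The following sketch enumerates those values using the same mapping as the switch above. The class and method names are made up for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch: enumerate every value nextInt(1, 5) can return (1..4)
// and count which HTTP status the controller's switch picks for each.
class FailAtRandomOdds {

    // same mapping as the switch in failAtRandom, expressed on status codes only
    static int statusFor(int random) {
        return switch (random) {
            case 1, 2 -> 500;
            case 3 -> 400;
            default -> 200;
        };
    }

    static Map<Integer, Integer> outcomeCounts() {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int value = 1; value < 5; value++) {
            counts.merge(statusFor(value), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(outcomeCounts()); // {200=1, 400=1, 500=2}
    }
}
```

So, on average, about half of the calls end in a 500, a quarter in a 400, and only a quarter succeed.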

Product Service

Product service is responsible for providing the list of products based on the user's search criteria. It is one of the core services, and it should stay up and responsive even under heavy load. If it is down, revenue takes a severe hit. Since this service depends on rating-service, we do not want network issues or rating-service unavailability to affect product-service. This is where the resilience4j library comes into the picture.

  • Configuration
    • I first create a configuration for resilience4j in the YAML file.
    • We can configure multiple service instances side by side.
    • For ratingService, we make at most 3 attempts (the initial call plus up to 2 retries) with a 1 second wait between attempts.
    • retryExceptions: these are the exceptions for which we retry. You can configure multiple exceptions.
    • ignoreExceptions: these are the exceptions for which we do not want to retry. For example, a bad request is a bad request. There is no point in retrying it, so we ignore it.
resilience4j.retry:
  instances:
    ratingService:
      maxAttempts: 3   # total attempts, including the first call
      waitDuration: 1s
      retryExceptions:
        - org.springframework.web.client.HttpServerErrorException
      ignoreExceptions:
        - org.springframework.web.client.HttpClientErrorException
    someOtherService:
      maxAttempts: 3
      waitDuration: 10s
      retryExceptions:
        - org.springframework.web.client.HttpServerErrorException
        - java.io.IOException
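The way the two exception lists interact can be sketched in plain Java. This is an illustration under stated assumptions, not resilience4j code: RetryRule is a hypothetical type, an ignored exception is assumed to always win, and an empty retry list is assumed to mean that every other exception is retried. Since the Spring exception classes are not available in a standalone snippet, IllegalStateException stands in for the 5xx error and IllegalArgumentException for the 4xx error.

```java
import java.io.IOException;
import java.util.List;

// Sketch of how retryExceptions / ignoreExceptions could be evaluated.
// RetryRule is a hypothetical type for illustration, not a resilience4j class.
class RetryRuleSketch {

    record RetryRule(List<Class<? extends Throwable>> retryExceptions,
                     List<Class<? extends Throwable>> ignoreExceptions) {

        boolean shouldRetry(Throwable error) {
            // assumption: an ignored type always wins over a retryable one
            if (matches(ignoreExceptions, error)) {
                return false;
            }
            // assumption: an empty retry list means every other exception is retried
            return retryExceptions.isEmpty() || matches(retryExceptions, error);
        }

        private static boolean matches(List<Class<? extends Throwable>> types, Throwable error) {
            return types.stream().anyMatch(type -> type.isInstance(error));
        }
    }

    // stand-ins: IllegalStateException plays the server error,
    // IllegalArgumentException plays the client error
    static RetryRule sampleRule() {
        return new RetryRule(
                List.of(IllegalStateException.class, IOException.class),
                List.of(IllegalArgumentException.class));
    }

    public static void main(String[] args) {
        RetryRule rule = sampleRule();
        System.out.println(rule.shouldRetry(new IllegalStateException("500")));    // true
        System.out.println(rule.shouldRetry(new IllegalArgumentException("400"))); // false
        System.out.println(rule.shouldRetry(new UnsupportedOperationException())); // false, not listed
    }
}
```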
  • Product entity
@Data
@AllArgsConstructor(staticName = "of")
public class Product {

    private int productId;
    private String description;
    private double price;

}
  • This product-service acts as a client for the rating-service.
    • @Retry tells resilience4j to apply retry logic to this method.
    • name = "ratingService" tells resilience4j to use the ratingService configuration from the YAML.
    • fallbackMethod is invoked when the call still fails after the attempts are exhausted, or fails with an exception that is not retried.
    • Check the project source code on GitHub for all the dependencies.
@Service
@AllArgsConstructor
public class RatingServiceClient {

    private static final Logger log = LoggerFactory.getLogger(RatingServiceClient.class);
    private final RestClient client;
    private final ExecutorService executorService;

    @Retry(name = "ratingService", fallbackMethod = "onError")
    public CompletionStage<ProductRatingDto> getProductRatingDto(int productId) {
        return CompletableFuture.supplyAsync(() -> this.getRating(productId), executorService);
    }

    private ProductRatingDto getRating(int productId){
        return this.client.get()
                          .uri("{productId}", productId)
                          .retrieve()
                          .body(ProductRatingDto.class);
    }

    private CompletionStage<ProductRatingDto> onError(int productId, Throwable throwable) {
        log.error("error", throwable);
        return CompletableFuture.completedStage(ProductRatingDto.of(0, Collections.emptyList()));
    }

}
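What retrying an asynchronous call with a fallback amounts to can be sketched with the JDK alone. This is a simplified illustration, not the resilience4j implementation: the names AsyncRetrySketch, retryAsync and demo are hypothetical, every exception is treated as retryable, and the wait between attempts is fixed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

class AsyncRetrySketch {

    // Runs the call; on failure schedules another attempt after waitMillis.
    // Once maxAttempts calls have failed, completes with the fallback value.
    static <T> CompletableFuture<T> retryAsync(Supplier<CompletableFuture<T>> call, int maxAttempts,
                                               long waitMillis, Function<Throwable, T> fallback,
                                               ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(call, 1, maxAttempts, waitMillis, fallback, scheduler, result);
        return result;
    }

    private static <T> void attempt(Supplier<CompletableFuture<T>> call, int attemptNo, int maxAttempts,
                                    long waitMillis, Function<Throwable, T> fallback,
                                    ScheduledExecutorService scheduler, CompletableFuture<T> result) {
        call.get().whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (attemptNo < maxAttempts) {
                scheduler.schedule(
                        () -> attempt(call, attemptNo + 1, maxAttempts, waitMillis, fallback, scheduler, result),
                        waitMillis, TimeUnit.MILLISECONDS);
            } else {
                // failures of async stages usually arrive wrapped in CompletionException
                Throwable cause = error instanceof CompletionException && error.getCause() != null
                        ? error.getCause() : error;
                result.complete(fallback.apply(cause));
            }
        });
    }

    // Hypothetical flaky remote call: fails the first N times, then succeeds.
    static String demo(int failuresBeforeSuccess) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger calls = new AtomicInteger();
        Supplier<CompletableFuture<String>> flakyCall = () -> CompletableFuture.supplyAsync(() -> {
            if (calls.incrementAndGet() <= failuresBeforeSuccess) {
                throw new IllegalStateException("500 from rating-service");
            }
            return "rating";
        });
        try {
            CompletableFuture<String> future = retryAsync(flakyCall, 3, 10, error -> "fallback", scheduler);
            return future.join();
        } finally {
            scheduler.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(2)); // rating (third attempt succeeds)
        System.out.println(demo(3)); // fallback (all three attempts fail)
    }
}
```

The caller always receives a completed value, either the real one or the fallback, which is why the code that consumes the rating does not need its own error handling.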
  • Product service
@Service
@RequiredArgsConstructor // constructor-inject only the final fields; db is filled in init()
public class ProductService {

    private final RatingServiceClient ratingServiceClient;
    private final ExecutorService executorService;

    // assume this would be DB in real life
    private Map<Integer, Product> db;

    @PostConstruct
    private void init(){
        this.db = Map.of(
                1, Product.of(1, "Blood On The Dance Floor", 12.45),
                2, Product.of(2, "The Eminem Show", 12.12)
        );
    }

    public CompletableFuture<ProductDto> getProductDto(int productId){
        // assuming this is a DB call
        var product = CompletableFuture.supplyAsync(() -> this.db.get(productId), executorService);
        var rating = this.ratingServiceClient.getProductRatingDto(productId);
        return product.thenCombine(rating, (p, r) -> ProductDto.of(productId, p.getDescription(), p.getPrice(), r));
    }

}
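The combination step in getProductDto can be tried out in isolation. The sketch below uses hypothetical record types (Rating, ProductView) instead of the DTOs, and it uses exceptionally as a stand-in for the fallback so that the snippet runs without Spring or resilience4j.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

class CombineSketch {

    record Rating(double avgRating, List<String> reviews) {}
    record ProductView(int productId, String description, Rating rating) {}

    // Loads the description and the rating concurrently and merges them.
    // A failed rating lookup is replaced by an empty rating, so the product still renders.
    static ProductView load(int productId, boolean ratingFails) {
        CompletableFuture<String> description =
                CompletableFuture.supplyAsync(() -> "Blood On The Dance Floor");

        CompletableFuture<Rating> rating = CompletableFuture
                .supplyAsync(() -> {
                    if (ratingFails) {
                        throw new IllegalStateException("rating-service unavailable");
                    }
                    return new Rating(4.5, List.of("excellent", "decent"));
                })
                .exceptionally(error -> new Rating(0.0, List.of()));

        // thenCombine waits for both stages before building the result
        return description.thenCombine(rating, (d, r) -> new ProductView(productId, d, r)).join();
    }

    public static void main(String[] args) {
        System.out.println(load(1, false).rating().avgRating()); // 4.5
        System.out.println(load(1, true).rating().avgRating());  // 0.0
    }
}
```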
  • Controller
@RestController
@RequestMapping("product")
public class ProductController {

    @Autowired
    private ProductService productService;

    @GetMapping("{productId}")
    public CompletionStage<ProductDto> getProduct(@PathVariable int productId){
        return this.productService.getProductDto(productId);
    }

}

Demo

All the services are ready. Start both product-service and rating-service. Let's access the endpoint below.

http://localhost:8080/product/1
  • Case 1: When the rating-service works perfectly fine.
{
    "productId": 1,
    "description": "Blood On The Dance Floor",
    "price": 12.45,
    "productRating": {
        "avgRating": 4.5,
        "reviews": [
            {
                "userFirstname": "vins",
                "userLastname": "guru",
                "productId": 1,
                "rating": 5,
                "comment": "excellent"
            },
            {
                "userFirstname": "marshall",
                "userLastname": "mathers",
                "productId": 1,
                "rating": 4,
                "comment": "decent"
            }
        ]
    }
}
  • Case 2: When the rating-service responds with a bad request. This exception is ignored for retry, so the fallback response is returned right away.
{
    "productId": 1,
    "description": "Blood On The Dance Floor",
    "price": 12.45,
    "productRating": {
        "avgRating": 0.0,
        "reviews": []
    }
}
  • Case 3: When the rating-service throws an internal server error
    • Now the request takes noticeably longer, because we have configured a wait of 1 second between attempts. At most 3 attempts are made. If one of them gets a proper response, it is used. Otherwise the fallback method is called.
    • So we might get either the proper response or the fallback response.
{
    "productId": 1,
    "description": "Blood On The Dance Floor",
    "price": 12.45,
    "productRating": {
        "avgRating": 0.0,
        "reviews": []
    }
}

Here the rating and reviews are empty. That is acceptable because they are not critical. If the product itself were not available, the user experience would be very bad and revenue could be affected.
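The extra delay seen in Case 3 can be estimated with a little arithmetic. The sketch below assumes a fixed wait with no backoff and that waits happen only between attempts, so 3 attempts involve at most 2 waits. The class name, the method names and the 200 ms per-call time are made up for illustration.

```java
import java.time.Duration;

class RetryDelaySketch {

    // waits happen only between attempts, so there are (maxAttempts - 1) of them
    static Duration worstCaseWait(int maxAttempts, Duration waitDuration) {
        return waitDuration.multipliedBy(Math.max(0, maxAttempts - 1));
    }

    // total time when every attempt fails: all the waits plus the time spent in each call
    static Duration worstCaseTotal(int maxAttempts, Duration waitDuration, Duration perCall) {
        return worstCaseWait(maxAttempts, waitDuration).plus(perCall.multipliedBy(maxAttempts));
    }

    public static void main(String[] args) {
        Duration wait = Duration.ofSeconds(1);
        System.out.println(worstCaseWait(3, wait).toMillis());                         // 2000
        System.out.println(worstCaseTotal(3, wait, Duration.ofMillis(200)).toMillis()); // 2600
    }
}
```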

Advantages

  • Core services keep working even when the services they depend on are unavailable
  • Upstream failures are not propagated downstream
  • Intermittent network issues are absorbed instead of surfacing to the user

Disadvantages

  • Retries increase the overall response time
  • Retries add unnecessary load on the failing server when the problem is an application issue rather than a transient one

Summary

The Retry Pattern is one of the simplest Microservice Design Patterns for designing resilient microservices. Introducing retries helps with the occasional network-related issues we might have.

Read more about other Resilient Microservice Design Patterns.

The source code is available here.

Happy learning 🙂

 
