In distributed systems, operations sometimes fail due to transient issues like network glitches or temporary unavailability of resources. Instead of giving up immediately, it can be beneficial to retry the operation after a delay. A delayed retry framework can help manage such scenarios effectively. In this blog post, I’ll walk you through the core concepts of implementing a delayed retry mechanism in Java.

Pseudo Code:

The first step is to create an interface that defines the contract for any operation that will be retried.

IDelayedRetryProcessor<Request, Response>
    Response process(Request, SomeExtraInfoLikeAttemptNumber)
    void onSuccess(Request, Response)
    void onFailure(Request, Exception)
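
As a rough Java sketch of this contract, the interface could look like the following. The type parameter names, the `int attempt` argument (standing in for "SomeExtraInfoLikeAttemptNumber"), and the `UpperCaseProcessor` example class are my own assumptions for illustration, not a fixed API.

```java
import java.util.ArrayList;
import java.util.List;

/** Contract for an operation that the retry framework may invoke more than once. */
interface IDelayedRetryProcessor<Req, Res> {
    /** Runs the operation; attempt is assumed to be 0 for the first call and 1, 2, ... for retries. */
    Res process(Req request, int attempt) throws Exception;

    /** Called after an attempt succeeds. */
    void onSuccess(Req request, Res response);

    /** Called when the operation fails for good (non-retryable error or retries exhausted). */
    void onFailure(Req request, Exception cause);
}

/** Hypothetical processor used only for illustration: upper-cases a string. */
class UpperCaseProcessor implements IDelayedRetryProcessor<String, String> {
    // Records callback invocations so the behaviour is easy to inspect.
    final List<String> events = new ArrayList<>();

    @Override
    public String process(String request, int attempt) {
        return request.toUpperCase();
    }

    @Override
    public void onSuccess(String request, String response) {
        events.add("success:" + response);
    }

    @Override
    public void onFailure(String request, Exception cause) {
        events.add("failure:" + cause.getMessage());
    }
}
```

Keeping `onSuccess` and `onFailure` on the processor itself means the retry framework never needs to know what "done" means for a particular operation; it only decides when to call which callback.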

Main Logic for Retrying Operations

The core logic of the delayed retry framework is encapsulated in the DelayedRetryable class, which handles the submission and processing of retryable operations.

DelayedRetryable
    void submitOperation(Request, OperationName)
        processorBean = load bean from application context with name (operation name bean from config)
        try:
            response = processorBean.process(request, attempt = 0)
            processorBean.onSuccess(request, response)
        catch:
            if retryable exception
                pushToDelayQueue(request, delay, 1, operationName)
            else
                processorBean.onFailure(request, exception)


    void messageConsumer(Message<Request> message)
        request = body of message
        operationName = message.getHeader("X-operationName")
        attempt = message.getHeader("X-Attempt")
        processorBean = load bean from application context with name (operation name bean from config)
        try:
            response = processorBean.process(request, attempt)
            processorBean.onSuccess(request, response)
        catch:
            if retryable exception
                if attempt >= maxAttempt
                    processorBean.onFailure(request, MaxAttemptsBreachedException)
                else
                    pushToDelayQueue(request, delay, attempt + 1, operationName)
            else
                processorBean.onFailure(request, exception)


    pushToDelayQueue(Request, Delay, Attempt, operationName)
        - load config for operation name
        - delay calculation logic
        - push message to queue, Message = Body(Request) + Headers(operation name, delay, attempt, and any others that need to be carried along)
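
To make the flow above concrete, here is a minimal in-process Java sketch. It makes several assumptions that are not part of the pseudocode: a `ScheduledExecutorService` stands in for the delay queue, the bean lookup by operation name is replaced by a single `RetryableOperation` passed to the constructor, callbacks are plain `Consumer`s, the delay uses exponential backoff capped at 30 seconds, and `maxAttempts` counts retries (the initial call is attempt 0). `RetryableException`, `RetryableOperation` and `DelayedRetrySketch` are hypothetical names.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Marks a failure as transient, i.e. worth retrying (hypothetical type). */
class RetryableException extends RuntimeException {
    RetryableException(String message) { super(message); }
}

/** Passed to onFailure when the retry budget is used up (name taken from the pseudocode). */
class MaxAttemptsBreachedException extends RuntimeException {
    MaxAttemptsBreachedException(int attempts) { super("gave up after " + attempts + " retries"); }
}

/** Simplified operation: returns a response or throws. */
interface RetryableOperation<Req, Res> {
    Res process(Req request, int attempt);
}

class DelayedRetrySketch<Req, Res> {
    // Stand-in for a real delayed message queue; it only schedules redelivery in this JVM.
    private final ScheduledExecutorService delayQueue = Executors.newSingleThreadScheduledExecutor();
    private final RetryableOperation<Req, Res> operation;
    private final Consumer<Res> onSuccess;
    private final Consumer<Exception> onFailure;
    private final int maxAttempts;
    private final long baseDelayMillis;

    DelayedRetrySketch(RetryableOperation<Req, Res> operation, Consumer<Res> onSuccess,
                       Consumer<Exception> onFailure, int maxAttempts, long baseDelayMillis) {
        this.operation = operation;
        this.onSuccess = onSuccess;
        this.onFailure = onFailure;
        this.maxAttempts = maxAttempts;
        this.baseDelayMillis = baseDelayMillis;
    }

    /** Exponential backoff for retry n (n >= 1): base, 2*base, 4*base, ... capped at 30 seconds. */
    static long delayMillis(long baseDelayMillis, int attempt) {
        long factor = 1L << Math.min(attempt - 1, 20);
        return Math.min(baseDelayMillis * factor, 30_000L);
    }

    /** Entry point: attempt 0 is the initial, non-delayed call. */
    void submitOperation(Req request) {
        runAttempt(request, 0);
    }

    /** Shared by the initial call and by every delayed redelivery. */
    private void runAttempt(Req request, int attempt) {
        Res response;
        try {
            response = operation.process(request, attempt);
        } catch (RetryableException e) {
            if (attempt >= maxAttempts) {
                onFailure.accept(new MaxAttemptsBreachedException(attempt));
            } else {
                int next = attempt + 1;
                delayQueue.schedule(() -> runAttempt(request, next),
                        delayMillis(baseDelayMillis, next), TimeUnit.MILLISECONDS);
            }
            return;
        } catch (Exception e) {
            onFailure.accept(e); // non-retryable: fail immediately
            return;
        }
        onSuccess.accept(response); // outside the try so callback errors are not retried
    }

    void shutdown() {
        delayQueue.shutdown();
    }
}
```

In a real deployment the `schedule` call would be replaced by publishing a message with the attempt number in a header, so that retries survive a restart of the process. The in-memory executor here is only meant to show the control flow.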

Options for Implementing the Delayed Messaging Queue

To implement delayed retries, you need a messaging queue that can hold a message back for a given delay before delivering it. The options considered here are RabbitMQ, Redis, and AWS SQS.
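
To illustrate what "delaying a message" means before picking a broker, here is a small in-memory sketch built on the JDK's `java.util.concurrent.DelayQueue`. It is not a substitute for a real broker (nothing is persisted or shared between processes), and `DelayedMessage` and `InMemoryDelayQueueDemo` are hypothetical names; only the `X-Attempt` header comes from the pseudocode above.

```java
import java.util.Map;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

/** A message that only becomes consumable once its delay has elapsed. */
class DelayedMessage<T> implements Delayed {
    final T body;
    final Map<String, String> headers;
    private final long readyAtNanos;

    DelayedMessage(T body, Map<String, String> headers, long delayMillis) {
        this.body = body;
        this.headers = headers;
        this.readyAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMillis);
    }

    /** Remaining delay; zero or negative means the message may be delivered. */
    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(readyAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    /** Orders messages so the one that is due soonest sits at the head of the queue. */
    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}

class InMemoryDelayQueueDemo {
    /** Enqueues two messages and returns the body of the one that becomes ready first. */
    static String firstReady() {
        DelayQueue<DelayedMessage<String>> queue = new DelayQueue<>();
        queue.put(new DelayedMessage<>("slow", Map.of("X-Attempt", "2"), 200));
        queue.put(new DelayedMessage<>("fast", Map.of("X-Attempt", "1"), 50));
        try {
            return queue.take().body; // blocks until the earliest delay expires
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `InMemoryDelayQueueDemo.firstReady()` returns `"fast"`, even though that message was enqueued second, because delivery order follows the remaining delay rather than insertion order. A broker-backed delay queue gives the consumer the same property.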

Summary

A delayed retry framework is a practical way to handle transient failures in distributed systems. By retrying with a configurable delay and a cap on the number of attempts, you can improve the resilience of your applications without retrying forever. Depending on your specific requirements, you can choose from various messaging queue options, each with its own pros and cons. Whether you go with RabbitMQ, Redis, or AWS SQS, the framework lets you handle failures gracefully instead of giving up on the first error.