My AWS SQS Requests Skyrocketed to 1 Million at Month's Start. This is How I Implemented a Cost-Effective Solution.

We started using Amazon SQS for push notifications, triggering emails, and similar tasks, but the way it handles polling led to a big increase in request usage. To address this and save on cloud costs, we switched to Redis. Here’s our story.

At Hexmos, we are building innovative products like Feedback By Hexmos and FeedZap.

To keep our systems efficient and adaptable, we embraced a microservices architecture. This means we built separate, smaller services that work together seamlessly.

One of these services is a unified Identity Service, which manages user accounts and payments across all our products.

But behind the scenes, coordinating communication between these services, especially for time-sensitive tasks like sending emails, requires a robust and reliable solution.

This article explores how we overcame the limitations of our initial approach and discovered a powerful solution with Redis Streams.

We'll delve into our challenges, why traditional message queuing systems weren't the perfect fit, and how Redis Streams helped us build a resilient notification system that keeps users informed and engaged.

Our Existing Microservices Architecture

Building Scalable Systems with Microservices

Our current architecture involves two services:

  1. Leave Request API:
    • Processes leave requests.
    • Creates a message containing relevant data.
    • Pushes the message to the message queue.
  2. Email API:
    • Processes received messages by sending appropriate emails.

Direct communication between these services can cause problems:

  • Tight Coupling: Changes in one service impact the other, hindering independent development.
  • Data Loss: If the Email API is down during signup, the email notification might be lost. Customers might not receive verification emails or leave notifications.

The Identity Service: A Foundation for User Experience

Why Two Services?

We split the backend into two dedicated services:

  • Backend dedicated to product: Handles core functionalities of the product.
  • Backend dedicated to emails and notifications: Manages all email and notification-related tasks.

This separation allows each service to be developed, deployed, and scaled independently, increasing the system's resilience and maintainability.

AWS SQS: A Step Towards Decoupling

We integrated AWS SQS as a connection between these services:

  1. Leave Request API:

    • Processes leave requests.
    • Creates a message (e.g., JSON object) containing relevant data (user details, leave type, duration, etc.).
    • Pushes the message to the message queue.
  2. Message Queue (SQS):

    • Stores the message until it's processed.
    • Provides reliability and durability guarantees.
    • Can handle varying message rates, ensuring system scalability.
  3. Email API:

    • Continuously polls the message queue for new messages.
    • Processes received messages by sending appropriate emails.

If the Consumer BE (Email API) is down, messages are queued and not lost. Once the Consumer BE is back online, it processes the queued messages and sends email notifications using data from SQS.
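
To make this concrete, here is a minimal sketch of how the two services might talk through SQS with Python and boto3. The queue URL, payload fields, and the send_email helper are illustrative assumptions, not our exact production code:

import json

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/email-queue"  # hypothetical queue

# Leave Request API: publish a message describing the leave request
def publish_leave_request(user_email, leave_type, duration):
    payload = {"user_email": user_email, "leave_type": leave_type, "duration": duration}
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(payload))

# Email API: poll the queue, send the email, then delete the message
def poll_and_send():
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10)
    for msg in resp.get("Messages", []):
        data = json.loads(msg["Body"])
        send_email(data)  # hypothetical email-sending helper
        # Delete only after successful processing, so failed messages are retried
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])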

Benefits of Using a Message Queue

  • Decoupling: Services become independent, improving maintainability and scalability.
  • Reliability: Messages are persisted in the queue, ensuring delivery even if the Email API is temporarily unavailable.
  • Performance: The Leave Request API can process requests faster without waiting for email delivery.
  • Scalability: Each service can be scaled independently to handle increasing load.
  • Error Handling: Implement retry mechanisms and dead-letter queues to handle failed message processing.

AWS SQS Challenges and Finding the Perfect Alternative

As we integrated SQS into 7 different services, we started depleting our usage limits due to free tier limitations.

AWS SQS Free Tier Limitations

While AWS SQS offers a convenient solution for message queuing, it presents some limitations for our specific needs:

Free Tier Limitations: The SQS free tier is capped at 1 million requests per month, restricting the scalability of our growing application.


Hidden Costs: Exceeding the free tier results in significant cost increases, potentially impacting our budget.


Finding a Temporary Fix

Amazon SQS offers Short and Long polling options for receiving messages from a queue.

Short polling (default) – The ReceiveMessage request queries a subset of servers (based on a weighted random distribution) to find available messages and sends an immediate response, even if no messages are found.

Long polling – ReceiveMessage queries all servers for messages, sending a response once at least one message is available, up to the specified maximum. An empty response is sent only if the polling wait time expires. This option can reduce the number of empty responses and potentially lower costs.
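
Enabling long polling is a small change. With boto3, you can either set it once on the queue or request it per call (queue URL and client as in the earlier sketch):

# Option 1: enable long polling for every consumer of the queue
sqs.set_queue_attributes(
    QueueUrl=QUEUE_URL,
    Attributes={"ReceiveMessageWaitTimeSeconds": "20"},
)

# Option 2: request long polling on a single ReceiveMessage call
resp = sqs.receive_message(
    QueueUrl=QUEUE_URL,
    MaxNumberOfMessages=10,
    WaitTimeSeconds=20,  # wait up to 20 seconds instead of returning immediately
)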

As a temporary fix, we chose long polling. But this didn't solve our problem: even with fewer empty responses, seven services polling around the clock still generated a huge number of requests.


Why Pub/Sub Wasn't the Perfect Fit

We explored alternative message queuing solutions, including Pub/Sub. However, Pub/Sub wasn't a suitable choice due to specific requirements:

Pros of Pub/Sub

Scalability: Pub/Sub systems are designed to handle a large number of publishers and subscribers efficiently.

Decoupling: It promotes loose coupling between systems, as publishers and subscribers don't need to know about each other.

Flexibility: Pub/Sub can be used for various messaging patterns, including publish-subscribe, fan-out, and request-reply.

Cons of Pub/Sub

Message Loss: Pub/Sub systems typically don't guarantee message delivery, especially if subscribers are offline or unable to process messages.

Message Ordering: Message order is not guaranteed in most Pub/Sub systems, which can be a limitation for certain applications.

Complexity: Managing subscribers and handling message delivery can be complex, especially for large-scale systems.

Latency: While Pub/Sub is generally fast, it might not be suitable for applications with strict real-time requirements.

The search for a viable alternative led us to Redis Streams, a powerful data structure within the Redis database.

Redis Streams: A Game-Changer

We already leverage several open-source products like Meilisearch, GitLab, Listmonk, and Ghost, so for our notification system we likewise wanted a robust, high-performance open-source solution.
After careful consideration, Redis Streams emerged as the ideal choice for our specific needs.

Redis Streams: A High-Performance Message Queue

According to its GitHub repository, Redis (Remote Dictionary Server) is an in-memory data structure store. It is a disk-persistent key-value database with support for multiple data structures or data types.

This means that while Redis supports mapped key-value-based strings to store and retrieve data (analogous to the data model in traditional databases), it also supports more complex data structures like lists, sets, and streams. As we proceed, we will look at the data structures supported by Redis and its unique features.

Redis is an open-source, highly replicated, performant, non-relational database and caching server. It works by mapping keys to values with a predefined data model. Its benefits include:

  • Mapped key-value-based caching system, comparable to Memcached
  • No strict rules for defining schemas or tables (schemaless)
  • Support for multiple data models or types
  • More advanced features than many other database systems
  • Ability to withstand many concurrent write requests or transactions per second, via a technique known as sharding
  • Can be used alongside other databases to reduce load and improve performance, or as a primary database, depending on individual needs and use cases
  • Comes in handy for quick data ingestion with data integrity, where high efficiency and replication are paramount
  • Open-source and free to use

Note: Redis has a variety of use cases in large enterprise applications. Apart from acting as a caching server, it can also act as a message broker or be used in publisher/subscriber kind of systems. For detailed information about other use cases, we can check this section of the documentation.

Performance: Redis Streams are renowned for their high throughput and low latency, ensuring efficient message delivery for our real-time notifications.

Flexibility: Redis Streams provide various features like message IDs and stream groups, catering to our evolving needs and complex scenarios.

Efficiently Managing Data Streams with Redis

Using Redis Streams within our microservices architecture streamlines message processing:

Leave Request API triggers an event: When a leave request is submitted successfully, the Leave Request API publishes a message to a Redis Stream.

Notification API listens and reacts: The Notification API continuously listens for new messages on the stream.

Real-time Processing: New messages are processed immediately by the Notification API, triggering actions like sending an email notification.

This approach ensures real-time communication and timely notification delivery, enhancing the user experience.
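
Here is a minimal sketch of that flow with the redis-py client; the stream name, payload fields, and send_notification helper are illustrative assumptions, and the individual commands are unpacked in the sections below:

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Leave Request API: publish an event to the "email" stream
def publish_leave_event(user_email, leave_type):
    r.xadd("email", {"user_email": user_email, "leave_type": leave_type})

# Notification API: read what is currently in the stream (one-shot read;
# a blocking loop for live consumption is shown later in this article)
def read_pending_events():
    for _stream, entries in r.xread({"email": "0-0"}, count=10):
        for entry_id, fields in entries:
            send_notification(fields)  # hypothetical notification helper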


Leveraging Redis Streams for Robust Message Handling

Publishing to a Stream

A publisher in Redis Streams is the agent responsible for adding new elements to the stream. In this role, the publisher is central to real-time data processing. By publishing data to a stream, other parts of your application (the consumers) can react to that data almost instantly.

To add new elements to the stream, we can use the XADD command. It appends an element consisting of key-value pairs to the specified stream.

If the stream doesn’t exist, Redis will create it automatically.

The command also generates a unique identifier for the element, which is based on the current timestamp and a sequence number by default.

For example:

XADD mystream * temperature 22 city Lisbon

Breaking down this command, we have:

  • Key of the stream: The name of the stream you want to add the element to.
  • ID: The identifier for the element. You can use the special character * to let Redis automatically generate an ID based on the current timestamp and a sequence number.
  • Element in the Stream: The key-value pairs that make up the element.

As a publisher, you can add elements to the stream continuously. For instance, in the context of a concert, the ticket scanner could be the publisher. As it scans each ticket, it could publish an element to the stream, including information like the ticket ID, seat number, and scan time. Other parts of the system could consume these messages in real time, for instance, to keep track of the occupied and available seats.

Here is an example of adding a new element to the ‘email’ stream:

XADD email * id 102

By entering this command, Redis will respond with the generated ID for the new element, which can be used later to read or reference the element in the stream.

"1678900012656-0"

The above element can then be consumed by a consumer or consumer group for further processing.

Publishers play a crucial role in Redis Streams, facilitating real-time data processing by pushing new elements to the stream. They are not constrained by the number of consumers; they can continue publishing irrespective of the number of consumers or the speed at which consumers are processing elements.
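
In application code, the same publish is a one-liner with redis-py, and the returned ID can be kept for later reference (same client as in the earlier sketch):

entry_id = r.xadd("email", {"id": "102"})
print(entry_id)  # e.g. "1678900012656-0", the auto-generated ID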


Consuming from a Stream

In a real-time processing scenario, it’s not just about adding elements to the stream but also reading them, analyzing the data, and possibly taking some action based on that data. That’s where stream consumers come into play.

A consumer in Redis Streams reads the data from the stream and processes it. There can be multiple consumers reading from the same stream simultaneously, which allows for parallel processing of the data.

To read elements from a stream, we use the XREAD command.

XREAD COUNT 10 BLOCK 3000 STREAMS tickets 0-0

Breaking down this command, we have:

  • COUNT count: Optional; limits the number of messages returned by the command.
  • BLOCK milliseconds: Optional; sets a timeout in milliseconds for how long the command should wait if no data is available. If the timeout expires and no data is received, the command returns an empty response.
  • STREAMS stream_name: The name of the stream (or streams) you want to read elements from.
  • ID: The ID from which to start reading elements. To read all elements from the beginning, use "0-0" as the ID.

This command will return all elements in the "tickets" stream. The reason we use 0-0 to read from the beginning of the stream is that the IDs are incremental by default. They are a combination of a timestamp and a sequence number (1678900012656-0), so 0-0 is the lowest possible ID.

But if you want to listen only to new messages, you can use the special $ ID with the BLOCK option. The $ ID will only return messages that have been added to the stream after the XREAD command was sent.

XREAD COUNT 10 BLOCK 3000 STREAMS tickets $

So, it is important to understand that you should use the $ ID only for the first call to XREAD. For later calls, the ID should be that of the last element read from the stream. Otherwise, you could miss entries added in between.
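
A sketch of that pattern with redis-py: start with $ so only new entries are delivered, then advance to the ID of the last entry seen so nothing added in between is missed (same client as in the earlier sketch, with a hypothetical process helper):

last_id = "$"  # first call: only entries added after we start listening
while True:
    resp = r.xread({"email": last_id}, count=10, block=3000)
    for _stream, entries in resp or []:
        for entry_id, fields in entries:
            process(fields)        # hypothetical processing step
            last_id = entry_id     # subsequent calls resume from here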

Managing Message Lifecycle with Redis Streams

Redis Streams, while powerful, can become a bottleneck if not managed correctly. Unprocessed messages can accumulate, leading to performance issues, consumer crashes, and potential data loss.

The Role of XDEL

The XDEL command is a fundamental tool for managing message lifecycle in Redis Streams: it deletes one or more messages from a stream using their IDs. However, it's crucial to use XDEL strategically to avoid unintended consequences.

XDEL stream_name id [id ...]

For example, to delete the message with the ID 1685734200000-0 from the "email" stream, you would use:

XDEL email 1685734200000-0 
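
In our consumer, the equivalent redis-py call can run once an entry has been handled, so only fully processed messages are removed (names as in the earlier sketches):

for entry_id, fields in entries:
    send_notification(fields)   # hypothetical notification helper
    r.xdel("email", entry_id)   # delete the entry once it is fully processed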

To understand more about Redis Streams, you can check this article.

Conclusion

By transitioning from traditional message queues to Redis Streams, we significantly enhanced our notification system's efficiency, reliability, and scalability. This shift has been instrumental in delivering real-time updates to users, improving user experience, and optimizing the performance of our microservices architecture. The ability to handle high throughput, low latency, and flexible message management positions us for continued growth and innovation. Redis Streams has become a cornerstone of our infrastructure, ensuring seamless communication and timely notifications for users.

This approach demonstrates our commitment to building robust and responsive applications that meet the demands of today's fast-paced digital landscape.

FeedZap: Read 2X Books This Year

FeedZap helps you consume your books through a healthy, snackable feed, so that you can read more with less time, effort and energy.