Tautik Agrahari

Delegation | Task Offloading

Let's add basic analytics to our blog (just taking an example; assume something like Medium or Hashnode). The whole idea: the crux of designing a good system at scale is minimizing the number of synchronous calls you make.

The mantra for performance: What does not need to be done in real time should not be done in real time. If you can somehow delegate and do async processing, do it async.

We write to the log and forget about it. That's how Postgres handles commits (via its write-ahead log), and that's what gives you consistent commit times. It does put additional load on the background processing, which is a different volume altogether, but try to minimize what you do synchronously so that you wrap up requests as fast as possible.
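To make this concrete, here is a minimal sketch of the write-and-forget pattern in Python. The in-memory queue is just a stand-in for a durable log or broker, and `handle_request`/`process` are illustrative names, not anything from a real framework.

```python
import queue
import threading

# "Write to the log and forget": the request path only appends to a
# buffer; a background thread does the real work later. (For Postgres
# the durable buffer is the write-ahead log; here an in-memory queue
# stands in purely for illustration.)
task_log: "queue.Queue[dict]" = queue.Queue()

def handle_request(payload: dict) -> str:
    task_log.put(payload)        # fast, constant-time "commit"
    return "accepted"            # respond without doing the heavy work

def process(payload: dict) -> None:
    ...                          # the expensive part (analytics, etc.)

def background_worker() -> None:
    while True:
        process(task_log.get())  # drain at its own pace

threading.Thread(target=background_worker, daemon=True).start()
```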


Now, a classic example of this that you all know is the creation of an EC2 instance, or the transcoding of video files. These take a long time, and you cannot keep an HTTP request open for that long. For example, say you made a POST request to create an EC2 instance and it takes five minutes (it doesn't take that long, but assume it does). Your HTTP request cannot wait that long; it would start timing out depending on your web server and load balancer configuration. Plus, why keep your request waiting?

This is where you typically see the flow for a long-running task: you put the task in the broker, let async workers pick it up, and have them update the database with the status. This is what delegation is all about.
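A rough sketch of that flow; `broker`, `tasks_db`, and `provision` are stand-ins for your real queue, database, and provisioning logic:

```python
import uuid

tasks_db: dict[str, str] = {}   # task_id -> status, polled by the client
broker: list[dict] = []         # stand-in for SQS/RabbitMQ

def create_instance_handler(request_body: dict) -> dict:
    task_id = str(uuid.uuid4())
    tasks_db[task_id] = "PENDING"                              # record status
    broker.append({"task_id": task_id, "spec": request_body})  # delegate
    # Respond immediately (HTTP 202 Accepted) instead of holding the
    # request open for the minutes the actual provisioning takes.
    return {"status": 202, "task_id": task_id}

def worker() -> None:
    while broker:
        task = broker.pop(0)
        tasks_db[task["task_id"]] = "RUNNING"
        provision(task["spec"])              # the slow part
        tasks_db[task["task_id"]] = "DONE"   # client sees this on its next poll

def provision(spec: dict) -> None:
    ...  # create the EC2 instance, transcode the video, etc.
```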

The core idea: Delegate & Respond

Tasks that are Delegated

Tasks that should outlive your request timeout are the ones to delegate, e.g. EC2 instance creation and video transcoding.

Brokers

The key infra component that enables this is a broker. "Broker" is a generic term: it's a buffer that holds tasks and messages.


This broker can be of two types:

Message Queues

Examples: SQS, RabbitMQ

The key concept here is the visibility timeout: once a consumer has received a message, it has until this timeout to process and delete it. If it doesn't, the queue resurfaces the message at the head of the queue so that other consumers can pick it up (see the sketch below).


Homogeneous consumers: Any consumer can handle any message
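Here's roughly what that contract looks like with SQS via boto3; the queue URL is a placeholder and `handle` is your own processing function:

```python
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/blog-events"

def handle(body: str) -> None:
    ...  # your processing; must finish within the visibility timeout

while True:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20,    # long polling
        VisibilityTimeout=60,  # message hidden from other consumers for 60s
    )
    for msg in resp.get("Messages", []):
        handle(msg["Body"])
        # Deleting is the acknowledgement. If we crash before this line,
        # the message resurfaces after the visibility timeout and any
        # other (homogeneous) consumer can pick it up.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```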

Message Streams

Examples: Kafka, Kinesis

The key concept here is consumer groups: different types of consumers can consume the same message for different purposes (see the sketch below).


Heterogeneous consumers: Same message consumed by different consumer groups
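A minimal sketch with the kafka-python client, assuming a local broker and a `blog-published` topic (both names are illustrative). Two different `group_id`s mean the same message is delivered to both consumers, each tracking its own offset:

```python
from kafka import KafkaConsumer

# Consumer group 1: maintains the blog count.
counter = KafkaConsumer(
    "blog-published",
    group_id="blog-counter",
    bootstrap_servers="localhost:9092",
)

# Consumer group 2: indexes blogs into the search engine.
indexer = KafkaConsumer(
    "blog-published",
    group_id="search-indexer",
    bootstrap_servers="localhost:9092",
)

# Each loop (normally running in its own process) sees every message
# on the topic, independently of the other group.
for message in counter:
    print("count++ for", message.value)
```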

Real Example: Blog Analytics

Let's take a concrete example. For a blog platform (like Medium), you want to maintain a count of the total number of blogs published by each user.

You have two ways to do it: synchronously or asynchronously. (Related: https://tautik.me/scaling-db-proxy-pattern/)

https://bear-images.sfo2.cdn.digitaloceanspaces.com/tautik/03pm.webp

The Async Way

It's a crude, very primitive example, but it will help you build an understanding. In this case, you want to do it in an async fashion. What you can do is:

Whenever a blog is published → You put a message in SQS → A worker picks that up → That worker does total_blogs++ in the database.

UPDATE users SET total_blogs = total_blogs + 1 WHERE id = user_id;

Storing it pre-computed to avoid joins & runtime computation.
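A hedged sketch of that worker's handler using psycopg2; the DSN and the event shape (`user_id` in a JSON body) are assumptions:

```python
import json
import psycopg2

conn = psycopg2.connect("dbname=blog")  # placeholder DSN

def on_blog_published(message_body: str) -> None:
    event = json.loads(message_body)
    # `with conn` commits the transaction on successful exit.
    with conn, conn.cursor() as cur:
        cur.execute(
            "UPDATE users SET total_blogs = total_blogs + 1 WHERE id = %s",
            (event["user_id"],),
        )
```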

Note: accept a lag in processing rather than doing it synchronously.

So this way you accept a lag in processing, and in when the data gets reflected, rather than doing the work synchronously. That's the trade-off: sync vs async.

The Problem with Single Worker

Now imagine you have this working, but a new requirement comes in: "Hey, I need to index a blog in my search engine when it's published." Let's say you use Elasticsearch for that. So now you not only have to do total_blogs++, you also have to index the same blog in Elasticsearch. It's two things.

So this worker is now doing two things. This is not a good design, because:

Problems:

  1. Inconsistent data - If your code breaks after incrementing the count, your worker never deletes the message. After the visibility timeout, the message resurfaces, some other worker picks it up and executes the same thing again, so it does count++ again. Although the user has 5 blogs, the system will now say 6, because the increment ran twice (sketched after this list).
  2. Team dependencies - This codebase, which does both the count increment and the ES indexing, is now owned by two teams: the backend team and the search team. The backend team wants to push something, but the search team says "hey, we have a broken change in master, please don't push it." It's difficult to work with people; it's easier to work with machines.
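To make problem (1) concrete, here is a sketch of the coupled worker; the three helpers are hypothetical stubs, not real APIs:

```python
def increment_total_blogs(user_id: int) -> None: ...  # hypothetical stub
def index_in_elasticsearch(blog_id: int) -> None: ... # hypothetical stub
def delete_message(event: dict) -> None: ...          # hypothetical stub

def handle_blog_published(event: dict) -> None:
    increment_total_blogs(event["user_id"])   # step 1: total_blogs++
    # If the worker crashes here, the count is already bumped but the
    # message was never deleted. After the visibility timeout another
    # worker re-runs BOTH steps, so the count goes up twice.
    index_in_elasticsearch(event["blog_id"])  # step 2: owned by the search team
    delete_message(event)                     # ack only after both steps succeed
```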

Hence, we use Kafka

Same event consumed by two types of consumers.

The whole idea of a message stream is simple: you have a Kafka topic and a heterogeneous set of consumers, meaning the same message needs to be consumed by two different types of consumers (we call them consumer groups).

https://bear-images.sfo2.cdn.digitaloceanspaces.com/tautik/07pm.webp

What you need to do is: whenever this event (a blog is published) happens, you create that event, and two consumer groups consume it - one increments the count, the other indexes the blog in Elasticsearch.

The best way to visualize Kafka: all the messages are enqueued in the queue (which is a Kafka topic), and the consumers iterate over the messages at their own sweet pace. It's possible that one iterates faster than the other; it doesn't matter. The system becomes eventually consistent - it's not strongly consistent, where the search indexing and the count update happen at the same moment, but eventually everything is consistent.
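The producer side might look like this with kafka-python; the topic name and payload are assumptions carried over from the sketches above:

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

# Fired once when the blog is published. Both the counter group and the
# search group will each see this same event, at their own pace.
producer.send("blog-published", {"user_id": 42, "blog_id": 1001})
producer.flush()
```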

That's a consumer group - and this is the fundamental difference between message queues and message streams.

Read about Kafka Essentials: https://tautik.me/kafka/