Tautik Agrahari


the tiny number on your messaging icon: designing the unread sender count

a system that looks deceptively simple on the surface but hides a bunch of interesting engineering decisions underneath. you know that little number on your messaging icon — the one that tells you how many new people messaged you. let's walk through how to actually build it.

what you'll take away

quick pointers so you know what to look for as you read:

  1. how one COUNT(DISTINCT ...) query answers the whole problem at small to medium scale
  2. why separate indexes on to and timestamp fall apart, and how a composite (to, timestamp, from) index fixes it
  3. how a redis set per user pre-computes the count so reads become O(1)
  4. the punching bag pattern: absorbing redundant writes before they reach your primary

the requirement

you know that little number on the messaging icon on linkedin, twitter, or instagram? that's not the total number of unread messages. it's the number of unique people who sent you new messages since you last tapped the icon.

so if your friend sends you 100 messages, the indicator shows 1, not 100. if three different people each send you messages, it shows 3. you tap the icon, the counter resets. next time someone messages you, it starts counting again.

sounds simple. let's see.

approach 1: count on the fly

your schema in mysql (or any relational db) has three tables — users, messages, and user_activity. user_activity.last_read_at resets to current time whenever the user taps the message icon. the query to get the unread count is literally:

SELECT COUNT(DISTINCT `from`)
FROM messages
WHERE `to` = ?
  AND `timestamp` > ?  -- bind the user's last_read_at from user_activity

that's it. what looks like a super complicated system is literally this one query. at small to medium scale, this works just fine.

[diagram: messages schema (id, from, to, msg, ts), the user_activity.last_read_at watermark, and the COUNT(DISTINCT from) query]
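
to make this concrete, here's a minimal sketch of the read path and the reset-on-tap path in python, assuming mysql-connector-python (connection details and names are illustrative; the schema follows the tables above):

import mysql.connector

# illustrative connection; in reality this comes from a pool
conn = mysql.connector.connect(host="localhost", user="app", database="chat")

def unread_sender_count(user_id: int) -> int:
    cur = conn.cursor()
    # `from` and `to` are reserved words in mysql, hence the backticks
    cur.execute(
        """
        SELECT COUNT(DISTINCT m.`from`)
        FROM messages m
        JOIN user_activity ua ON ua.user_id = %s
        WHERE m.`to` = %s
          AND m.`timestamp` > ua.last_read_at
        """,
        (user_id, user_id),
    )
    (count,) = cur.fetchone()
    cur.close()
    return count

def mark_read(user_id: int) -> None:
    # called when the user taps the message icon: bump the watermark,
    # nothing is deleted
    cur = conn.cursor()
    cur.execute(
        "UPDATE user_activity SET last_read_at = NOW() WHERE user_id = %s",
        (user_id,),
    )
    conn.commit()
    cur.close()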

but will this scale? that depends entirely on your indexing strategy. and this is where most people mess up.

the indexing deep dive

let's say you blindly create individual indexes on to and timestamp. seems reasonable, right? here's how the query actually gets evaluated.

take user C who was last online at timestamp 11:11. the query fires WHERE to = C AND timestamp > 11:11.

the to index (a B+ tree, ordered by to then id) gives you all messages ever sent to C. could be millions of rows across 10 years. the timestamp index gives you all messages after 11:11 — potentially the entire table if the timestamp is old enough. then the database computes a set intersection of the two result sets to find the rows matching both conditions (mysql calls this an index merge intersection).

imagine millions of entries from the to index intersected with millions from the timestamp index. catastrophically slow.

composite index to the rescue

the fix is a composite index on (to, timestamp, from). now when you fire WHERE to = C AND timestamp > 11:11:

  1. O(log n) — binary search to find where to = C starts
  2. skip — within C's entries, jump to where timestamp > 11:11
  3. k sequential reads — scan only the matching entries

total: O(log n + k) where k is the number of matching rows. no set intersection. no point lookups back into the main table for the from column (it's baked into the index, which makes this a covering index). done.

now compare:

[diagram: (to, ts, from) composite index layout: binary search to find to=C, skip to where ts > 11:11, then sequential scan]

the overhead? just 8 extra bytes per index entry for storing from. even with 10 million messages, that's 80 MB. peanuts compared to the compute savings.
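
as a sketch, here's the one-time migration plus a sanity check, reusing the conn from the earlier snippet (the index name idx_unread is made up):

cur = conn.cursor()
# one-time migration: the covering composite index (name is made up)
cur.execute("CREATE INDEX idx_unread ON messages (`to`, `timestamp`, `from`)")
# sanity check: EXPLAIN should show key=idx_unread and 'Using index' in Extra,
# meaning the query is answered from the index alone, no main-table lookups
cur.execute(
    "EXPLAIN SELECT COUNT(DISTINCT `from`) FROM messages "
    "WHERE `to` = %s AND `timestamp` > %s",
    (42, "2024-01-01 11:11:00"),  # illustrative user id and watermark
)
print(cur.fetchall())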

and yes, this index is hyper-optimized for this one query. that's perfectly fine. it's not an anti-pattern to optimize for your most critical query. as long as it works for your product and business, do it.

also — the order of columns in the composite index matters. if you did (timestamp, to, from) instead, the first column would match a huge range of timestamps, then you'd have to linearly scan through all of them to filter by to. completely defeats the purpose. the index should be ordered to match your query's selectivity — to first (narrows to one user), then timestamp (narrows to recent messages), then from (avoids main table lookup).
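
to convince yourself, a hypothetical experiment along the same lines: build both orderings, force each index, and compare the rows estimate that EXPLAIN reports (again reusing conn; index names are made up):

cur = conn.cursor()
cur.execute(
    "CREATE INDEX idx_wrong_order ON messages (`timestamp`, `to`, `from`)"
)
for idx in ("idx_unread", "idx_wrong_order"):
    cur.execute(
        f"EXPLAIN SELECT COUNT(DISTINCT `from`) FROM messages "
        f"FORCE INDEX ({idx}) WHERE `to` = %s AND `timestamp` > %s",
        (42, "2024-01-01 11:11:00"),
    )
    # expect idx_wrong_order to show a far larger 'rows' estimate, since the
    # timestamp prefix matches a huge range before `to` filters anything
    print(idx, cur.fetchall())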

approach 2: pre-computed with redis

the on-the-fly approach works at decent scale. but if you want to pre-compute, here's how it shapes up.

you need a way to know whether a message was actually delivered or not. websockets help — if the user is connected, the message is delivered live and doesn't count. if the user is offline, the message is undelivered and that's when the counter should bump. so you need an online/offline service that knows the websocket state of every user.

whenever a message can't be delivered in real-time (user is offline or inactive), an ON_MSG_UNSENT event hits kafka partitioned by receiver ID. workers consume it and fire SADD on redis — adding the sender to the receiver's set.

so if D sends A 10 messages, the worker fires SADD A D ten times. but since it's a set, D only appears once. the count? just SCARD A. it returns 3 if A has messages from B, C, and D.

when A taps the message icon: DEL A. counter reset. done.

the read path is a single SCARD call. the clear path is a single DEL. both O(1). the write path is SADD calls from the workers.
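
here's a minimal sketch of all three paths with redis-py (key naming is illustrative; the kafka consumer loop that calls on_msg_unsent is elided):

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def on_msg_unsent(receiver_id: str, sender_id: str) -> None:
    # write path: the worker adds the sender to the receiver's set;
    # duplicates are absorbed by set semantics
    r.sadd(f"unread:{receiver_id}", sender_id)

def unread_sender_count(receiver_id: str) -> int:
    # read path: O(1) set cardinality
    return r.scard(f"unread:{receiver_id}")

def on_icon_tap(receiver_id: str) -> None:
    # clear path: drop the whole set when the user taps the icon
    r.delete(f"unread:{receiver_id}")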

the punching bag pattern

now here's the micro-optimization. if D sends A 100 messages, the worker fires SADD A D a hundred times. but after the first one, D is already in the set. the remaining 99 are redundant operations — they don't change the data but they still consume redis resources.

at peak load, imagine 90% of your commands are redundant. your redis cluster is both read-heavy (polling for status) and write-heavy (ingesting events). every unnecessary command adds up.

the punching bag pattern is about protecting your critical component. you add an auxiliary redis replica in front. before firing SADD on the primary, you check the replica with SISMEMBER: is this member already in the set? if yes, skip. if no, write to the primary.

[diagram: punching bag pattern: ON_MSG_UNSENT → status update workers → check auxiliary redis (replica) before SADD on primary cluster]

this isn't your day-zero solution. it's a day-n optimization when you observe that a huge percentage of writes are redundant. the pattern shows up everywhere — rate limiters are essentially punching bags that absorb load before it hits your core system.
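
a sketch of the check with redis-py, assuming a primary and a read replica (hostnames are made up):

import redis

primary = redis.Redis(host="redis-primary", port=6379)
replica = redis.Redis(host="redis-replica", port=6379)

def absorb_then_write(receiver_id: str, sender_id: str) -> None:
    key = f"unread:{receiver_id}"
    # the replica soaks up the redundant membership checks
    if replica.sismember(key, sender_id):
        return  # already counted; the primary never sees this command
    primary.sadd(key, sender_id)  # only genuinely new senders get through

note that replication lag can make the replica answer "not present" for a sender it hasn't caught up on yet. that just means an occasional redundant SADD still reaches the primary, which is harmless because SADD is idempotent.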

two flavors of the punching bag:

  1. the dedup flavor: an auxiliary replica (like the one above) that answers "is this write redundant?" so duplicates never reach the primary
  2. the shield flavor: a rate limiter or buffer that absorbs excess load before it hits the core system

key takeaways

  1. the indicator counts unique senders since last_read_at, not total unread messages
  2. a composite (to, timestamp, from) index makes the on-the-fly query O(log n + k) and fully covering; column order matters
  3. pre-computing with a redis set per user makes reads (SCARD) and resets (DEL) O(1)
  4. the punching bag is a day-n optimization: add it only when you observe that a big share of writes is redundant

and the bigger picture — identify the read path and write path, optimize them independently. for reads: indexes, replicas, caches, pre-computed counts. for writes: batching, buffering, punching bags to absorb redundancy.

nothing is best. everything depends on the use case. and the answer almost always lies in the gray area — not purely in engineering, not purely in product, but in that sweet spot where both work together.