Updated Architecture

Architecture after Caching deep dive

A profile cache is added as a dedicated Redis node. The app server follows cache-aside for all profile reads. Cache invalidation is synchronous on profile update.


What changed from base architecture

The base architecture had no caching layer for user data: every profile read hit DynamoDB directly. After this deep dive, profile reads are served from Redis on a cache hit, with DynamoDB as the fallback on a miss.


Changes

1. Profile cache added — dedicated Redis node

A new Redis instance (separate from the connection registry and inbox sorted sets) stores user profiles:

Profile cache:
  key:   user:<user_id>
  value: { name, avatar_s3_url, status }
  TTL:   3600s + jitter(0, 600)

Estimated size: 500M users × 250 bytes ≈ 125 GB of raw profile data, before Redis per-key overhead. One medium Redis cluster.
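
The sizing arithmetic can be checked with a short sketch. The user count and profile size come from the estimate above; the `overhead_bytes` parameter is a hypothetical knob for per-key overhead, which should be measured rather than guessed.

```python
def cache_size_gb(users: int, bytes_per_profile: int, overhead_bytes: int = 0) -> float:
    """Return the estimated cache size in decimal gigabytes (1 GB = 1e9 bytes).

    overhead_bytes is an ASSUMED per-key overhead; it defaults to 0 so the
    result matches the raw-payload estimate in the text.
    """
    return users * (bytes_per_profile + overhead_bytes) / 1e9


USERS = 500_000_000        # from the estimate above
BYTES_PER_PROFILE = 250    # from the estimate above

if __name__ == "__main__":
    print(f"raw payload: {cache_size_gb(USERS, BYTES_PER_PROFILE):.0f} GB")
```

Passing a non-zero `overhead_bytes` gives a rough upper bound for capacity planning.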

2. Cache-aside read path

Every profile read now goes through the cache:

Alice opens inbox — needs Bob's profile:
→ App server: GET user:bob from Redis
→ Hit:  return cached profile
→ Miss: fetch from DynamoDB → SET user:bob in Redis → return profile
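
The read path above can be sketched as follows. This is a minimal illustration, not the production handler: `InMemoryCache` is a hypothetical stand-in for the Redis client (it only mirrors `get` and `set` with an `ex` TTL argument), a plain dict stands in for the DynamoDB users table, and `get_profile` is an assumed function name.

```python
import json
import random


class InMemoryCache:
    """Hypothetical stand-in for a Redis client. TTLs are recorded, not enforced."""

    def __init__(self):
        self.data = {}
        self.ttls = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ex=None):
        self.data[key] = value
        self.ttls[key] = ex


def get_profile(cache, users_table, user_id):
    """Cache-aside read: try the cache first, fall back to the table on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)              # hit: return cached profile

    profile = users_table.get(user_id)         # miss: fetch from the database
    if profile is None:
        return None                            # unknown user, nothing to cache

    ttl = 3600 + random.randint(0, 600)        # TTL with jitter, as specified above
    cache.set(key, json.dumps(profile), ex=ttl)
    return profile


if __name__ == "__main__":
    cache = InMemoryCache()
    # Placeholder row; field names follow the cache value shape above.
    table = {"bob": {"name": "Bob", "avatar_s3_url": "s3://example-bucket/bob.png", "status": "away"}}
    print(get_profile(cache, table, "bob"))    # miss, populates the cache
    print(get_profile(cache, table, "bob"))    # hit, served from the cache
```

The application owns both the lookup and the population step; the cache itself never talks to the database.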

3. Synchronous invalidation on profile update

Bob updates profile:
→ App server writes to DynamoDB
→ App server DEL user:bob from Redis   (~1ms, same request)
→ Next read re-populates cache from DB

No Kafka. No outbox. One DEL in the same request handler.
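
A minimal sketch of the update path, under the same assumptions as the read path: `InMemoryCache` is a hypothetical stand-in for the Redis client, a dict stands in for the DynamoDB users table, and `update_profile` is an assumed handler name. The order matters: the database write happens first, then the cache entry is deleted.

```python
class InMemoryCache:
    """Hypothetical stand-in for a Redis client (get / set / delete only)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value, ex=None):
        self.data[key] = value   # TTL ignored in this stand-in

    def delete(self, key):
        # Returns the number of keys removed, like Redis DEL.
        return 1 if self.data.pop(key, None) is not None else 0


def update_profile(cache, users_table, user_id, changes):
    """Write the new profile to the table, then invalidate the cached copy."""
    current = users_table.get(user_id, {})
    users_table[user_id] = {**current, **changes}   # 1. write to the database
    cache.delete(f"user:{user_id}")                 # 2. DEL in the same request
    return users_table[user_id]


if __name__ == "__main__":
    cache = InMemoryCache()
    table = {"bob": {"name": "Bob", "status": "away"}}
    cache.set("user:bob", '{"name": "Bob", "status": "away"}')
    update_profile(cache, table, "bob", {"status": "online"})
    print(cache.get("user:bob"))   # entry is gone; the next read repopulates it
```

Deleting rather than overwriting the entry keeps the handler simple: the next cache-aside read rebuilds the value from the database.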

4. TTL jitter

The TTL is randomised between 3600s and 4200s to prevent synchronised mass expiry. Without jitter, every entry populated during a cold start would expire together, causing a thundering herd on DynamoDB exactly 1 hour later.
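
The jitter calculation can be sketched as below. The constants come from the TTL spec above; `jittered_ttl` is an assumed helper name, and uniform integer jitter is one reasonable reading of `jitter(0, 600)`.

```python
import random

BASE_TTL_S = 3600     # base TTL from the spec above
MAX_JITTER_S = 600    # jitter upper bound from the spec above


def jittered_ttl(rng=random):
    """Return a TTL in seconds, uniformly spread over [3600, 4200]."""
    return BASE_TTL_S + rng.randint(0, MAX_JITTER_S)


if __name__ == "__main__":
    # Entries written at the same moment get different expiry times.
    print([jittered_ttl() for _ in range(5)])
```

Spreading expiries over a 10-minute window means entries loaded together during a cold start fall out of the cache gradually instead of all at once.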


Updated architecture diagram

flowchart TD
    A[Client A] -- WebSocket --> APIGW[API Gateway]
    B[Client B] -- WebSocket --> APIGW
    APIGW --> LB[Load Balancer]
    LB --> WS1[Connection Server 1]
    LB --> WS2[Connection Server 2]
    LB --> WSN[Connection Server N]
    WS1 --> AS[App Server]
    WS2 --> AS
    WSN --> AS
    AS --> SEQ[Sequence Service]
    SEQ --> SEQREDIS[(Redis - seq counters)]
    AS --> DDB[(DynamoDB - messages)]
    AS --> PENDING[(DynamoDB - pending_deliveries)]
    AS --> STATUS[(DynamoDB - message_status)]
    AS --> CONVOS[(DynamoDB - conversations)]
    AS --> USERS[(DynamoDB - users)]
    AS --> REGISTRY[(Redis - Connection Registry + last_seen)]
    AS --> INBOX[(Redis - inbox sorted sets)]
    AS --> PROFILES[(Redis - profile cache)]
    AS --> PUSH[Push Notification Service]
    PUSH --> APNS[APNs / FCM]
    DDB -- cold after 30d --> S3[(S3 - Cold Tier)]
    REGISTRY -- lookup --> AS
    INBOX -- ZREVRANGE top K --> AS
    PROFILES -- cache-aside --> AS
    AS -- route to online user --> WS2
    WS2 -- delivered ack / read receipt --> AS
    AS -- tick push --> WS1