From Todo Lists to Distributed Systems: Understanding CDC, CQRS, Event Sourcing, and Conflict Resolution

October 8, 2025 • 15 min read

#distributed-systems #architecture

Let’s talk about how simple applications evolve into complex distributed systems, and the patterns we use to tame that complexity. We’ll start with something everyone understands: a todo list app.

The Simple Beginning

Imagine you’ve built a todo list app. It’s straightforward:

Users add tasks
Users mark tasks as complete
Users can edit or delete tasks

Your database has a simple tasks table:

CREATE TABLE tasks (
  id INT PRIMARY KEY,
  user_id INT,
  title VARCHAR(255),
  completed BOOLEAN,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
);

Life is good. Your app works perfectly for the first 100 users.

When Things Get Complicated

But then your app becomes popular. Suddenly you’re facing real-world problems:

Problem 1: The Search Disaster

Your users love your app, but they’re complaining: “Search is too slow!”

You have 10 million tasks in your database. When someone searches for “buy milk”, your database has to scan through millions of records. Even with indexes, complex searches (like “find all my incomplete tasks from last month that contain ‘meeting’”) are killing your response times.

The naive solution: Just add more database resources!

The problem: Your database is now doing two very different jobs:

Handling writes (creating, updating, deleting tasks)
Handling complex searches and analytics

These have completely different optimization needs. Writes need consistency and durability. Searches need speed and flexibility.

Problem 2: The Analytics Request

Your business team wants a dashboard showing:

How many tasks are created per hour
Which users are most productive
What times of day see the most activity
Completion rate trends over time

Running these queries on your main database would slow down everyone’s todo lists. Users would start seeing “Loading…” spinners everywhere.

Problem 3: The Mobile App Sync Nightmare

You’ve built a mobile app. Users want to use it offline on the subway, then sync when they get signal back.

Here’s what happens:

Alice edits task #42 on her phone (offline): “Buy milk” → “Buy almond milk”
Meanwhile, on her laptop, she edits the same task: “Buy milk” → “Buy oat milk”
Both devices come online at the same time
Which change wins?

If you just use “last write wins”, Alice might lose data she cared about. But how do you know which change she actually wanted to keep?

Problem 4: The Audit Trail Mystery

Six months later, a task mysteriously disappeared from your database. The user insists they never deleted it. Your customer support team asks you: “Can you tell us what happened?”

You look at your database:

SELECT * FROM tasks WHERE id = 12345;
-- 0 rows returned

That’s all you know. The task is gone. You have no idea:

When it was deleted
Who deleted it
What it said before deletion
Whether it was deleted or just never existed

Enter: The Patterns

Now that we understand the problems, let’s see how four key patterns solve them.

Pattern 1: CQRS (Command Query Responsibility Segregation)

The Core Idea: Split your system into two sides:

Command side: Handles all writes (Create, Update, Delete)
Query side: Handles all reads (Search, List, Analytics)

How It Works

Instead of one database doing everything, you have:

User writes "Create task" 
    ↓
Command Handler (validates, saves to write database)
    ↓
Write Database (optimized for consistency)
    ↓
[synchronization happens]
    ↓
Read Database (optimized for queries)
    ↑
Query Handler (searches, filters, aggregates)
    ↑
User requests "Search for tasks"

Scenario: Solving the Search Problem

With CQRS, your architecture now looks like:

Command Side (PostgreSQL):

-- Simple, normalized structure
CREATE TABLE tasks (
  id INT PRIMARY KEY,
  user_id INT,
  title VARCHAR(255),
  completed BOOLEAN
);

Query Side (Elasticsearch):

{
  "task_id": 12345,
  "user_id": 789,
  "title": "Buy milk",
  "completed": false,
  "tags": ["shopping", "groceries"],
  "priority": "high",
  "created_by": "Alice",
  "created_at": "2025-10-01T10:30:00Z"
}

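Now when users search, you hit Elasticsearch, which is built for full-text queries and can hold a denormalized document per task (note the extra tags, priority, and creator fields). When they create or update tasks, you hit PostgreSQL, which gives you transactional consistency.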
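As a rough sketch of that split, assuming in-memory stand-ins for both stores: `writeDb`, `readIndex`, and the three function names below are hypothetical and are not PostgreSQL or Elasticsearch client calls.

```javascript
// Hypothetical in-memory stand-ins for the write and read stores.
const writeDb = new Map();   // command side: source of truth
const readIndex = new Map(); // query side: denormalized, search-friendly copies

// Command handler: validate, then persist to the write store only.
function handleCreateTask({ id, userId, title }) {
  if (!title || title.trim() === "") {
    throw new Error("title is required");
  }
  const task = { id, userId, title: title.trim(), completed: false };
  writeDb.set(id, task);
  return task;
}

// Synchronization step: project a write-side row into the read model.
function projectTask(task) {
  readIndex.set(task.id, {
    taskId: task.id,
    userId: task.userId,
    title: task.title,
    titleLower: task.title.toLowerCase(), // precomputed for case-insensitive search
    completed: task.completed,
  });
}

// Query handler: reads touch only the read model.
function searchTasks(userId, text) {
  const needle = text.toLowerCase();
  return [...readIndex.values()]
    .filter((doc) => doc.userId === userId && doc.titleLower.includes(needle))
    .map((doc) => ({ taskId: doc.taskId, title: doc.title }));
}

const created = handleCreateTask({ id: 1, userId: 789, title: "Buy milk" });
const beforeSync = searchTasks(789, "milk"); // read model not updated yet
projectTask(created);
const afterSync = searchTasks(789, "MILK");
```

Here `projectTask` is called by hand, so the read model is empty until that call happens. In a real system that step runs asynchronously, which is why the read side can briefly lag behind the write side.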

The Synchronization Challenge

But wait - how do changes in PostgreSQL get to Elasticsearch? That’s where our next pattern comes in…

Pattern 2: Change Data Capture (CDC)

The Core Idea: Watch your database for changes and broadcast them to interested parties.

How It Works

CDC tools (like Debezium) tap into your database’s transaction log (in PostgreSQL, the write-ahead log) instead of querying your tables:

PostgreSQL transaction log:
  [10:30:15] INSERT INTO tasks (id, title) VALUES (42, 'Buy milk')
  [10:31:22] UPDATE tasks SET completed = true WHERE id = 42
  [10:32:45] DELETE FROM tasks WHERE id = 42

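The CDC tool captures these and publishes one event per change (simplified here; Debezium’s actual records are row-level, with the row’s state before and after the change):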
  Event: TaskCreated { id: 42, title: "Buy milk" }
  Event: TaskCompleted { id: 42 }
  Event: TaskDeleted { id: 42 }

Scenario: Keeping Everything in Sync

Here’s the flow:

User creates a task
Your app saves it to PostgreSQL
CDC tool notices the change in the transaction log
CDC publishes a “TaskCreated” event to a message queue (Kafka)
Multiple consumers listen:
- Elasticsearch consumer: Indexes the task for search
- Analytics consumer: Updates the “tasks per hour” metric
- Email consumer: Sends a reminder email if it’s due today
- Mobile sync consumer: Pushes update to user’s devices
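The fan-out in the last step can be sketched as follows. This is a minimal in-memory illustration, not Kafka or Debezium code: the `ChangeStream` class and the `{ op, key, after, ts }` event shape are assumptions made for the example.

```javascript
// Hypothetical in-memory stand-in for a change stream (e.g. a Kafka topic).
class ChangeStream {
  constructor() {
    this.consumers = [];
  }

  subscribe(name, handler) {
    this.consumers.push({ name, handler });
  }

  // Deliver one change event to every consumer. A failing consumer is
  // recorded but does not block delivery to the others.
  publish(event) {
    const failures = [];
    for (const { name, handler } of this.consumers) {
      try {
        handler(event);
      } catch (err) {
        failures.push({ name, message: err.message });
      }
    }
    return failures;
  }
}

const searchIndex = new Map();  // stands in for the search consumer's index
const tasksPerHour = new Map(); // stands in for the analytics consumer's metric

const stream = new ChangeStream();

stream.subscribe("search", (e) => {
  if (e.op === "delete") searchIndex.delete(e.key);
  else searchIndex.set(e.key, e.after);
});

stream.subscribe("analytics", (e) => {
  if (e.op !== "insert") return;
  const hour = e.ts.slice(0, 13); // e.g. "2025-10-07T10"
  tasksPerHour.set(hour, (tasksPerHour.get(hour) || 0) + 1);
});

// Row-level change events in the assumed shape.
stream.publish({
  op: "insert", key: 42, ts: "2025-10-07T10:30:15Z",
  after: { id: 42, title: "Buy milk", completed: false },
});
stream.publish({
  op: "update", key: 42, ts: "2025-10-07T10:31:22Z",
  after: { id: 42, title: "Buy milk", completed: true },
});
```

Each consumer keeps its own state and sees the same ordered stream, so adding a fifth consumer does not require touching the application that writes to PostgreSQL.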

Real-World Example: Uber’s Architecture

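Uber has publicly described using CDC in its data platform. As a simplified illustration of the idea (not a description of Uber’s actual internals), consider what could happen when you request a ride: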

The ride request goes to the main database
CDC captures the change
Multiple systems react:
- Driver matching service finds nearby drivers
- Pricing service calculates surge pricing
- Analytics tracks demand patterns
- Your app gets real-time updates

Each system gets the data it needs without overloading the main database.

Pattern 3: Event Sourcing

The Core Idea: Instead of storing current state, store the sequence of events that led to that state.

Traditional Storage vs Event Sourcing

Traditional approach:

-- You only see the current state
tasks table:
  id: 42, title: "Buy oat milk", completed: true

If you want to know “What was the original title?” or “When was it completed?” - too bad, that information is gone.

Event Sourcing approach:

// You store every event that happened
events = [
  { type: "TaskCreated", id: 42, title: "Buy milk", timestamp: "10:30" },
  { type: "TaskTitleChanged", id: 42, oldTitle: "Buy milk", 
    newTitle: "Buy almond milk", timestamp: "10:45" },
  { type: "TaskTitleChanged", id: 42, oldTitle: "Buy almond milk", 
    newTitle: "Buy oat milk", timestamp: "11:20" },
  { type: "TaskCompleted", id: 42, timestamp: "14:30" }
]

To get the current state, you “replay” all events:

function getCurrentState(events) {
  let task = {};
  events.forEach(event => {
    switch(event.type) {
      case "TaskCreated":
        task = { id: event.id, title: event.title, completed: false };
        break;
      case "TaskTitleChanged":
        task.title = event.newTitle;
        break;
      case "TaskCompleted":
        task.completed = true;
        break;
    }
  });
  return task;
}

Scenario: The Audit Trail

Remember our mystery deletion? With event sourcing:

// Query: What happened to task 12345?
events_for_task_12345 = [
  { type: "TaskCreated", id: 12345, title: "Submit expense report", 
    user: "alice@example.com", timestamp: "2025-09-15T09:00:00Z" },
  { type: "TaskCompleted", id: 12345, 
    user: "alice@example.com", timestamp: "2025-09-16T14:30:00Z" },
  { type: "TaskDeleted", id: 12345, 
    user: "bob@example.com", timestamp: "2025-09-20T16:45:00Z" }
]

Now you can tell the user: “Bob deleted this task on September 20th at 4:45 PM UTC. Here’s what it looked like before deletion.”
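One way to produce that answer from the log is sketched below. The `explainDeletion` helper is hypothetical; it only assumes the event shapes shown above.

```javascript
const auditLog = [
  { type: "TaskCreated", id: 12345, title: "Submit expense report",
    user: "alice@example.com", timestamp: "2025-09-15T09:00:00Z" },
  { type: "TaskCompleted", id: 12345,
    user: "alice@example.com", timestamp: "2025-09-16T14:30:00Z" },
  { type: "TaskDeleted", id: 12345,
    user: "bob@example.com", timestamp: "2025-09-20T16:45:00Z" },
];

// Find who deleted a task and rebuild what it looked like just before.
function explainDeletion(events, taskId) {
  const history = events.filter((e) => e.id === taskId);
  const deletion = history.find((e) => e.type === "TaskDeleted");
  if (!deletion) return null; // never deleted, or never existed

  // ISO 8601 UTC timestamps compare correctly as plain strings.
  const before = history
    .filter((e) => e.timestamp < deletion.timestamp)
    .reduce((task, e) => {
      if (e.type === "TaskCreated") return { title: e.title, completed: false };
      if (e.type === "TaskCompleted") return { ...task, completed: true };
      return task;
    }, null);

  return { deletedBy: deletion.user, deletedAt: deletion.timestamp, before };
}

const report = explainDeletion(auditLog, 12345);
```

A `null` result with an empty history also answers the earlier question of whether the task was deleted or simply never existed.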

Scenario: Time Travel Debugging

A user reports: “My task was marked complete, but I never did that!”

With event sourcing, you can replay history:

// What did the task look like at 2 PM?
getStateAt({ taskId: 42, timestamp: "14:00" })
// Returns: { title: "Buy milk", completed: false }

// What about at 3 PM?
getStateAt({ taskId: 42, timestamp: "15:00" })
// Returns: { title: "Buy milk", completed: true }

// Who completed it?
events.find(e => e.type === "TaskCompleted" && 
                 e.timestamp > "14:00" && 
                 e.timestamp < "15:00")
// Returns: { user: "mobile_app_sync", reason: "offline_sync" }

Aha! It was marked complete by the mobile sync service, probably from an offline action.
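A point-in-time lookup like this is just a replay that stops early. The sketch below assumes an in-memory event array and full ISO 8601 timestamps; the `getStateAt` signature here is illustrative.

```javascript
// Assumed event log for task 42. ISO 8601 UTC strings sort chronologically,
// so they can be compared as plain strings.
const taskEvents = [
  { type: "TaskCreated", id: 42, title: "Buy milk",
    user: "alice", timestamp: "2025-10-07T10:30:00Z" },
  { type: "TaskCompleted", id: 42,
    user: "mobile_app_sync", timestamp: "2025-10-07T14:30:00Z" },
];

// Fold one event into the task state.
function applyEvent(task, event) {
  switch (event.type) {
    case "TaskCreated":
      return { id: event.id, title: event.title, completed: false };
    case "TaskTitleChanged":
      return { ...task, title: event.newTitle };
    case "TaskCompleted":
      return { ...task, completed: true };
    default:
      return task; // unknown event types are ignored
  }
}

// Replay only the events recorded at or before `timestamp`.
function getStateAt(log, { taskId, timestamp }) {
  return log
    .filter((e) => e.id === taskId && e.timestamp <= timestamp)
    .reduce((task, e) => applyEvent(task, e), null);
}

const at2pm = getStateAt(taskEvents, { taskId: 42, timestamp: "2025-10-07T14:00:00Z" });
const at3pm = getStateAt(taskEvents, { taskId: 42, timestamp: "2025-10-07T15:00:00Z" });
```

Because `applyEvent` returns a new object instead of mutating its input, the same reducer can be reused for full replays and for partial ones.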

Scenario: Business Analytics Gold Mine

Your business team asks: “How long does it typically take users to complete tasks after creating them?”

With traditional storage, you can only measure tasks that still exist, and only if you happened to record a completion timestamp. With event sourcing:

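// Analyze all tasks ever created, including ones that were later deleted
const durations = events
  .filter(e => e.type === "TaskCreated")
  .map(created => {
    const completed = events.find(e => 
      e.type === "TaskCompleted" && e.id === created.id
    );
    // Timestamps are strings, so parse them before subtracting
    return completed 
      ? new Date(completed.timestamp) - new Date(created.timestamp)
      : null;
  })
  .filter(duration => duration !== null);

// Average time to completion in milliseconds (null if nothing was completed)
const averageMs = durations.length
  ? durations.reduce((sum, duration) => sum + duration, 0) / durations.length
  : null;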

You can answer questions about deleted tasks, abandoned tasks, and historical patterns that would be impossible with traditional storage.

Pattern 4: Conflict Resolution

The Core Idea: When multiple changes happen to the same data simultaneously, have a strategy to merge them intelligently.

The Offline Sync Problem Revisited

Remember Alice editing her task on two devices? Let’s see different conflict resolution strategies:

Strategy 1: Last Write Wins (LWW)

Simple but dangerous:

// Phone edit at 10:30: "Buy milk" → "Buy almond milk"
// Laptop edit at 10:31: "Buy milk" → "Buy oat milk"

// Result: "Buy oat milk" (10:31 is later)
// Problem: Alice's phone edit is completely lost!
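A minimal sketch of the rule, assuming each edit carries an ISO 8601 timestamp and a device ID (both field names are illustrative):

```javascript
// Last write wins: keep whichever edit carries the later timestamp.
// Ties are broken by device ID so every replica picks the same winner.
function lastWriteWins(a, b) {
  if (a.timestamp !== b.timestamp) {
    return a.timestamp > b.timestamp ? a : b;
  }
  return a.deviceId > b.deviceId ? a : b;
}

const phoneEdit = {
  title: "Buy almond milk", deviceId: "phone", timestamp: "2025-10-07T10:30:00Z",
};
const laptopEdit = {
  title: "Buy oat milk", deviceId: "laptop", timestamp: "2025-10-07T10:31:00Z",
};

const winner = lastWriteWins(phoneEdit, laptopEdit);
console.log(winner.title); // "Buy oat milk"
```

Note that the outcome depends entirely on the devices’ clocks. If the laptop’s clock runs a few minutes fast, its edits win even when they were actually made first.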

Strategy 2: Version Vectors

Track changes from each device:

// Initial state
{ 
  title: "Buy milk", 
  version: { phone: 0, laptop: 0 } 
}

// Phone edit
{ 
  title: "Buy almond milk", 
  version: { phone: 1, laptop: 0 } 
}

// Laptop edit (doesn't know about phone edit yet)
{ 
  title: "Buy oat milk", 
  version: { phone: 0, laptop: 1 } 
}

// Sync: System detects conflict!
// phone: 1, laptop: 0 vs phone: 0, laptop: 1
// Neither version is strictly "newer"

Now you can present both options to Alice: “You changed this to ‘almond milk’ on your phone and ‘oat milk’ on your laptop. Which do you want to keep?”
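Detecting that situation comes down to comparing the two vectors entry by entry. The sketch below is one way to write it; `compareVersions` and `mergeVersions` are illustrative names, and a device missing from a vector is assumed to count as 0.

```javascript
// Compare two version vectors (plain objects mapping device -> counter).
// Returns "equal", "before" (a happened before b), "after", or "concurrent".
function compareVersions(a, b) {
  const devices = new Set([...Object.keys(a), ...Object.keys(b)]);
  let aAhead = false;
  let bAhead = false;
  for (const device of devices) {
    const av = a[device] || 0; // a missing entry counts as 0
    const bv = b[device] || 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // a real conflict
  if (aAhead) return "after";
  if (bAhead) return "before";
  return "equal";
}

// Once a conflict is resolved, the merged vector takes the per-device maximum.
function mergeVersions(a, b) {
  const merged = { ...a };
  for (const [device, counter] of Object.entries(b)) {
    merged[device] = Math.max(merged[device] || 0, counter);
  }
  return merged;
}

const initialVersion = { phone: 0, laptop: 0 };
const phoneVersion = { phone: 1, laptop: 0 };
const laptopVersion = { phone: 0, laptop: 1 };

console.log(compareVersions(initialVersion, phoneVersion)); // "before"
console.log(compareVersions(phoneVersion, laptopVersion));  // "concurrent"
console.log(mergeVersions(phoneVersion, laptopVersion));    // { phone: 1, laptop: 1 }
```

Only the "concurrent" result needs user input. When one vector is strictly "before" the other, the newer version can replace the older one safely.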

Strategy 3: CRDTs (Conflict-free Replicated Data Types)

For certain data types, you can merge changes automatically:

Scenario: Collaborative Task Lists

Alice and Bob are both adding items to a shared grocery list offline:

// Initial: ["Milk", "Bread"]

// Alice adds: "Eggs"
// Alice's list: ["Milk", "Bread", "Eggs"]

// Bob adds: "Butter"
// Bob's list: ["Milk", "Bread", "Butter"]

// When they sync, CRDT merges:
// Final list: ["Milk", "Bread", "Eggs", "Butter"]

CRDTs guarantee that once Alice and Bob have exchanged their updates, both end up with the same list, without losing either addition.
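The simplest CRDT that behaves this way is a grow-only set, sketched below. It is an assumption that a set is good enough here: it supports adds only and does not preserve insertion order, so the example sorts items to give every replica the same display order. Removals or ordered lists need richer CRDTs.

```javascript
// Grow-only set (G-Set): the only operation is "add", and merge is set union.
class GSet {
  constructor(items = []) {
    this.items = new Set(items);
  }

  add(item) {
    this.items.add(item);
  }

  // Union is commutative, associative, and idempotent, so replicas converge
  // no matter in which order (or how often) they exchange state.
  merge(other) {
    return new GSet([...this.items, ...other.items]);
  }

  // Sorted so that every replica displays the same order.
  values() {
    return [...this.items].sort();
  }
}

const aliceList = new GSet(["Milk", "Bread"]);
const bobList = new GSet(["Milk", "Bread"]);

aliceList.add("Eggs");   // Alice, offline
bobList.add("Butter");   // Bob, offline

const aliceView = aliceList.merge(bobList).values();
const bobView = bobList.merge(aliceList).values();
```

Both views contain all four items, and merging the same state twice changes nothing, which is what lets devices sync repeatedly without coordination.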

Strategy 4: Operational Transformation

Used in collaborative editing (like Google Docs):

// Initial text: "Buy milk"

// Alice at position 4: Insert "almond "
// Alice's intent: "Buy almond milk"

// Bob at position 8: Insert " today"
// Bob's intent: "Buy milk today"

// System transforms Bob's operation:
// "Position 8" becomes "position 15" (shifted by the 7 characters of "almond ")
// Final: "Buy almond milk today"

Both edits are preserved intelligently.
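The position shift can be sketched for the insert-versus-insert case. This is a deliberately small illustration, not a full OT implementation: `applyInsert` and `transformInsert` are hypothetical helpers, and deletes and same-position ties are left out.

```javascript
// Apply an insert operation to a string.
function applyInsert(text, op) {
  return text.slice(0, op.position) + op.text + text.slice(op.position);
}

// Transform `op` so it can be applied after `applied`, a concurrent insert.
// If `applied` landed at or before op's position, op shifts right by the
// inserted length. Two inserts at the same position need an extra
// tie-breaking rule (for example, ordering by user ID), omitted here.
function transformInsert(op, applied) {
  if (applied.position <= op.position) {
    return { ...op, position: op.position + applied.text.length };
  }
  return op;
}

const base = "Buy milk";
const aliceOp = { position: 4, text: "almond " };
const bobOp = { position: 8, text: " today" };

// Order 1: apply Alice's insert, then Bob's transformed insert.
const afterAlice = applyInsert(base, aliceOp);
const bobTransformed = transformInsert(bobOp, aliceOp);
const aliceThenBob = applyInsert(afterAlice, bobTransformed);

// Order 2: apply Bob's insert, then Alice's transformed insert.
const afterBob = applyInsert(base, bobOp);
const aliceTransformed = transformInsert(aliceOp, bobOp);
const bobThenAlice = applyInsert(afterBob, aliceTransformed);
```

Both orders produce the same text, which is the property OT is designed to provide: each replica can apply operations in the order it receives them and still converge.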

Putting It All Together

Let’s see how all four patterns work together in a real system:

Scenario: A Distributed Todo App

Architecture:

User creates a task on their phone:

Command: CreateTask({ title: "Buy milk" })

Command handler (CQRS) validates and saves:

// Event Sourcing: Store the event
eventStore.append({
  type: "TaskCreated",
  id: generateId(),
  title: "Buy milk",
  userId: "alice",
  deviceId: "alice-phone",
  timestamp: "2025-10-07T10:30:00Z",
  version: { "alice-phone": 1 }
});

CDC captures the new event:

// Debezium notices the new row in the event_store table
cdcStream.publish(TaskCreatedEvent);

Multiple consumers react:

// Query side (CQRS): Update search index
elasticsearch.index({
  id: task.id,
  title: task.title,
  userId: task.userId,
  searchableText: "buy milk"
});

// Sync service: Push to Alice's other devices
pushService.notifyDevices(alice.devices, TaskCreatedEvent);

// Analytics: Update metrics
metrics.increment("tasks_created_today");

Alice edits on laptop while offline:

// Event Sourcing: Local event stored
localEventStore.append({
  type: "TaskTitleChanged",
  id: task.id,
  newTitle: "Buy oat milk",
  deviceId: "alice-laptop",
  timestamp: "2025-10-07T10:45:00Z",
  version: { "alice-laptop": 1, "alice-phone": 1 }
});

Alice also edits on phone:

// Concurrent edit: the phone has not seen the laptop's change yet
localEventStore.append({
  type: "TaskTitleChanged",
  id: task.id,
  newTitle: "Buy almond milk",
  deviceId: "alice-phone",
  timestamp: "2025-10-07T10:46:00Z",
  version: { "alice-phone": 2 }
});

Sync happens - Conflict Resolution kicks in:

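// System compares the two version vectors:
//   Laptop: { "alice-phone": 1, "alice-laptop": 1 }
//   Phone:  { "alice-phone": 2 }  (no "alice-laptop" entry, so it counts as 0)
// The laptop is ahead on "alice-laptop", the phone on "alice-phone"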

// Neither is strictly newer - show both to user
conflictUI.show({
  option1: { title: "Buy oat milk", device: "laptop", time: "10:45" },
  option2: { title: "Buy almond milk", device: "phone", time: "10:46" }
});

Alice chooses “almond milk”:

// Event Sourcing: Record the resolution
eventStore.append({
  type: "ConflictResolved",
  id: task.id,
  chosenVersion: "alice-phone-v2",
  timestamp: "2025-10-07T11:00:00Z",
  version: { "alice-phone": 3, "alice-laptop": 2 }
});

// CDC picks this up and syncs everywhere
cdcStream.publish(ConflictResolvedEvent);

Real-World Applications

Banking: Account Balance

Traditional approach: Store current balance

Problem: Can’t explain how you got to this balance
Problem: Difficult to audit for errors

With Event Sourcing:

events = [
  { type: "AccountOpened", balance: 0 },
  { type: "Deposited", amount: 1000 },
  { type: "Withdrew", amount: 50 },
  { type: "Deposited", amount: 200 },
  { type: "Withdrew", amount: 100 }
]

currentBalance = events.reduce((bal, e) => {
  if (e.type === "Deposited") return bal + e.amount;
  if (e.type === "Withdrew") return bal - e.amount;
  return bal;
}, 0); // Result: 1050

If there’s ever a dispute, you have a complete audit trail.

E-commerce: Shopping Cart

CQRS + Event Sourcing:

Command side: Handle “AddToCart”, “RemoveFromCart”, “Checkout”
Query side: Show cart with product images, prices, availability
Event sourcing: Track every cart modification for analytics

Conflict Resolution:

User adds item on phone and laptop simultaneously
Both additions succeed (no conflict - adding is commutative)

Healthcare: Patient Records

Event Sourcing:

Critical for medical history
Never lose information about previous diagnoses, medications, or treatments
Complete audit trail for legal compliance

CQRS:

Write side: Doctors update records (strict validation)
Read side: Various views (labs dashboard, medication list, treatment timeline)

CDC:

Sync to billing system, insurance, pharmacies
Alert system triggers on critical events

Trade-offs and Considerations

When to Use CQRS

Good fit:

Read and write patterns are very different
Need to optimize reads and writes independently
Multiple read representations of the same data

Not worth it:

Simple CRUD apps
Reads and writes are similar in complexity
Small scale where one database works fine

When to Use Event Sourcing

Good fit:

Need complete audit trail
Temporal queries (“what did it look like last week?”)
Domain events are first-class citizens in your business

Challenges:

More storage (you’re keeping everything)
Complexity (replaying events, snapshots for performance)
Schema evolution (old events need to work with new code)
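The snapshot idea mentioned above can be sketched with the bank account events from earlier. The `takeSnapshot` and `restore` helpers are illustrative; real event stores usually persist snapshots alongside a stream position.

```javascript
// Reducer for the account events used in the banking example.
function applyEvent(balance, event) {
  switch (event.type) {
    case "Deposited":
      return balance + event.amount;
    case "Withdrew":
      return balance - event.amount;
    default:
      return balance; // e.g. AccountOpened
  }
}

// A snapshot stores the state plus how many events it already includes.
function takeSnapshot(events) {
  return {
    state: events.reduce((balance, e) => applyEvent(balance, e), 0),
    eventCount: events.length,
  };
}

// Rebuild the current state by replaying only the events after the snapshot.
function restore(snapshot, events) {
  return events
    .slice(snapshot.eventCount)
    .reduce((balance, e) => applyEvent(balance, e), snapshot.state);
}

const accountEvents = [
  { type: "AccountOpened", balance: 0 },
  { type: "Deposited", amount: 1000 },
  { type: "Withdrew", amount: 50 },
];

const snapshot = takeSnapshot(accountEvents); // taken after three events

// Two more events arrive after the snapshot.
accountEvents.push({ type: "Deposited", amount: 200 });
accountEvents.push({ type: "Withdrew", amount: 100 });

const fromSnapshot = restore(snapshot, accountEvents);
const fromFullReplay = takeSnapshot(accountEvents).state;
```

The snapshot is purely an optimization: it must always give the same answer as a full replay, so it can be discarded and rebuilt whenever the reducer changes.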

When to Use CDC

Good fit:

Need to react to database changes
Integrating multiple systems
Don’t want to modify existing application code

Alternatives:

Application-level events (publish events from your code instead)
Simpler when you control the application

Conflict Resolution Strategies

Choose based on your domain:

LWW: Use it when occasional data loss is acceptable (simple cache invalidation)
Manual resolution: Use it when only the user can decide (document editing)
CRDTs: Use them when changes can be merged automatically (collaborative lists, counters)
OT: Use it when the order and position of edits matter (text editing)

Common Pitfalls

1. Over-Engineering

Don’t start with all these patterns! Begin simple:

Start with a regular database
Add read replicas if reads are slow
Add CQRS if read/write patterns diverge significantly
Add event sourcing if you need the audit trail

2. Eventual Consistency Confusion

With CQRS, your read side might lag behind writes:

// User creates task
POST /tasks { title: "Buy milk" }
// Returns: 201 Created

// User immediately searches
GET /tasks?search=milk
// Returns: [] (search index hasn't updated yet!)

Solutions:

Show optimistic UI updates
Add version numbers to track sync state
Consider “read your own writes” consistency for critical paths
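One way to get “read your own writes” is to hand the client a version token on each write and let reads fall back to the write store until the read side has caught up. The sketch below assumes in-memory stores and a single global version counter; all names are hypothetical.

```javascript
// Hypothetical in-memory stores; `version` is bumped on every write.
const writeStore = { version: 0, tasks: [] };
const readIndex = { version: 0, tasks: [] };

function createTask(title) {
  writeStore.version += 1;
  writeStore.tasks.push({ title, version: writeStore.version });
  return { status: 201, version: writeStore.version }; // client keeps this token
}

// Simulates the asynchronous projection catching up.
function syncReadIndex() {
  readIndex.tasks = writeStore.tasks.map((t) => ({ ...t }));
  readIndex.version = writeStore.version;
}

// If the read side has not reached the client's last write, answer from
// the write store instead of returning stale results.
function searchTasks(text, minVersion = 0) {
  const source = readIndex.version >= minVersion ? readIndex : writeStore;
  const needle = text.toLowerCase();
  return {
    servedBy: source === readIndex ? "read-index" : "write-store",
    results: source.tasks
      .filter((t) => t.title.toLowerCase().includes(needle))
      .map((t) => t.title),
  };
}

const { version } = createTask("Buy milk");
const stale = searchTasks("milk");          // no token: misses the new task
const fresh = searchTasks("milk", version); // token: served by the write store
syncReadIndex();
const later = searchTasks("milk", version); // read index has caught up
```

The fallback puts some read load back on the write store, so it is best reserved for the critical paths where a user expects to see their own change immediately.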

3. Event Schema Evolution

// Version 1: Simple event
{ type: "TaskCreated", title: "Buy milk" }

// Version 2: Added priority
{ type: "TaskCreated", title: "Buy milk", priority: "high" }

// Problem: Old code reading new events?
// Problem: New code reading old events?

Solutions:

Always include version in events
Use upcasting (transform old events to new format when reading)
Make changes backward compatible
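Upcasting can be sketched as a chain of small functions, each lifting an event by one version. The `version` field, the `upcasters` table, and the "normal" default priority are assumptions made for this example.

```javascript
const CURRENT_VERSION = 2;

// One upcaster per old version; each lifts an event by exactly one version.
const upcasters = {
  // v1 events had no priority; "normal" is an assumed default.
  1: (event) => ({ ...event, version: 2, priority: "normal" }),
};

// Bring any stored event up to the current schema before handing it to
// application code. Events without a version field are treated as v1.
function upcast(event) {
  let current = { version: 1, ...event };
  while (current.version < CURRENT_VERSION) {
    const step = upcasters[current.version];
    if (!step) {
      throw new Error(`no upcaster for version ${current.version}`);
    }
    current = step(current);
  }
  return current;
}

const oldEvent = { type: "TaskCreated", title: "Buy milk" };
const newEvent = { type: "TaskCreated", title: "Buy milk", priority: "high", version: 2 };

const upcastedOld = upcast(oldEvent);
const upcastedNew = upcast(newEvent);
```

Stored events are never rewritten; the transformation happens on read, so the rest of the code only ever deals with the latest event shape.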

Conclusion

These four patterns - CQRS, CDC, Event Sourcing, and Conflict Resolution - are powerful tools for building robust distributed systems. But remember:

Start simple: Don’t use these patterns unless you have the problems they solve
Understand trade-offs: Each pattern adds complexity
Mix and match: You don’t need all four - use what fits your needs
Iterate: Add patterns as your system grows and requirements evolve

The key is recognizing when your simple todo list has outgrown its architecture, and knowing which tools can help you scale intelligently.
