From Todo Lists to Distributed Systems: Understanding CDC, CQRS, Event Sourcing, and Conflict Resolution
Let’s talk about how simple applications evolve into complex distributed systems, and the patterns we use to tame that complexity. We’ll start with something everyone understands: a todo list app.
The Simple Beginning
Imagine you’ve built a todo list app. It’s straightforward:
- Users add tasks
- Users mark tasks as complete
- Users can edit or delete tasks
Your database has a simple tasks table:
CREATE TABLE tasks (
  id INT PRIMARY KEY,
  user_id INT,
  title VARCHAR(255),
  completed BOOLEAN,
  created_at TIMESTAMP,
  updated_at TIMESTAMP
);
Life is good. Your app works perfectly for the first 100 users.
When Things Get Complicated
But then your app becomes popular. Suddenly you’re facing real-world problems:
Problem 1: The Search Disaster
Your users love your app, but they’re complaining: “Search is too slow!”
You have 10 million tasks in your database. When someone searches for “buy milk”, your database has to scan through millions of records. Even with indexes, complex searches (like “find all my incomplete tasks from last month that contain ‘meeting’”) are killing your response times.
The naive solution: Just add more database resources!
The problem: Your database is now doing two very different jobs:
- Handling writes (creating, updating, deleting tasks)
- Handling complex searches and analytics
These have completely different optimization needs. Writes need consistency and durability. Searches need speed and flexibility.
Problem 2: The Analytics Request
Your business team wants a dashboard showing:
- How many tasks are created per hour
- Which users are most productive
- What times of day see the most activity
- Completion rate trends over time
Running these queries on your main database would slow down everyone’s todo lists. Users would start seeing “Loading…” spinners everywhere.
Problem 3: The Mobile App Sync Nightmare
You’ve built a mobile app. Users want to use it offline on the subway, then sync when they get signal back.
Here’s what happens:
- Alice edits task #42 on her phone (offline): “Buy milk” → “Buy almond milk”
- Meanwhile, on her laptop, she edits the same task: “Buy milk” → “Buy oat milk”
- Both devices come online at the same time
- Which change wins?
If you just use “last write wins”, Alice might lose data she cared about. But how do you know which change she actually wanted to keep?
Problem 4: The Audit Trail Mystery
Six months later, a task mysteriously disappeared from your database. The user insists they never deleted it. Your customer support team asks you: “Can you tell us what happened?”
You look at your database:
SELECT * FROM tasks WHERE id = 12345;
-- 0 rows returned
That’s all you know. The task is gone. You have no idea:
- When it was deleted
- Who deleted it
- What it said before deletion
- Whether it was deleted or just never existed
Enter: The Patterns
Now that we understand the problems, let’s see how four key patterns solve them.
Pattern 1: CQRS (Command Query Responsibility Segregation)
The Core Idea: Split your system into two sides:
- Command side: Handles all writes (Create, Update, Delete)
- Query side: Handles all reads (Search, List, Analytics)
How It Works
Instead of one database doing everything, you have:
User writes "Create task"
↓
Command Handler (validates, saves to write database)
↓
Write Database (optimized for consistency)
↓
[synchronization happens]
↓
Read Database (optimized for queries)
↑
Query Handler (searches, filters, aggregates)
↑
User requests "Search for tasks"
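The flow above can be sketched in a few lines of JavaScript. This is a minimal in-memory illustration rather than a production design: `writeDb`, `readDb`, `pendingChanges`, `handleCreateTask`, `synchronize`, and `handleSearch` are hypothetical names, and a `Map` and an array stand in for the two databases.

```javascript
// Minimal CQRS sketch with in-memory stand-ins (all names are hypothetical).
const writeDb = new Map();   // command side: source of truth, keyed by task id
const readDb = [];           // query side: denormalized documents for searching
const pendingChanges = [];   // changes not yet copied to the read side

// Command handler: validate, then save to the write database.
function handleCreateTask({ id, userId, title }) {
  if (!title || title.trim() === "") {
    throw new Error("title is required");
  }
  const task = { id, userId, title, completed: false };
  writeDb.set(id, task);
  pendingChanges.push(task);
  return task;
}

// Synchronization step: copy pending changes into the read model.
// In a real system this runs asynchronously, so reads can lag behind writes.
function synchronize() {
  while (pendingChanges.length > 0) {
    const task = pendingChanges.shift();
    readDb.push({ ...task, searchableText: task.title.toLowerCase() });
  }
}

// Query handler: only ever touches the read database.
function handleSearch(userId, term) {
  const needle = term.toLowerCase();
  return readDb.filter(
    (doc) => doc.userId === userId && doc.searchableText.includes(needle)
  );
}

handleCreateTask({ id: 1, userId: 789, title: "Buy milk" });
const beforeSync = handleSearch(789, "milk"); // read side has not caught up yet
synchronize();
const afterSync = handleSearch(789, "milk");
console.log(beforeSync.length, afterSync.length);
```

The search before `synchronize()` finds nothing, which is the read-side lag that comes with splitting the two models.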
Scenario: Solving the Search Problem
With CQRS, your architecture now looks like:
Command Side (PostgreSQL):
-- Simple, normalized structure
CREATE TABLE tasks (
  id INT PRIMARY KEY,
  user_id INT,
  title VARCHAR(255),
  completed BOOLEAN
);
Query Side (Elasticsearch):
{
  "task_id": 12345,
  "user_id": 789,
  "title": "Buy milk",
  "completed": false,
  "tags": ["shopping", "groceries"],
  "priority": "high",
  "created_by": "Alice",
  "created_at": "2025-10-01T10:30:00Z"
}
Now when users search, you hit Elasticsearch, which is built for fast full-text queries. When they create or update tasks, you hit PostgreSQL, which gives you transactional consistency.
The Synchronization Challenge
But wait - how do changes in PostgreSQL get to Elasticsearch? That’s where our next pattern comes in…
Pattern 2: Change Data Capture (CDC)
The Core Idea: Watch your database for changes and broadcast them to interested parties.
How It Works
CDC tools (like Debezium) tap into your database’s transaction log:
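PostgreSQL transaction log (shown here as SQL for readability; the log itself records row-level changes):
[10:30:15] INSERT INTO tasks (id, title) VALUES (42, 'Buy milk')
[10:31:22] UPDATE tasks SET completed = true WHERE id = 42
[10:32:45] DELETE FROM tasks WHERE id = 42
CDC tool captures these as row-level change events, which map onto domain events like: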
Event: TaskCreated { id: 42, title: "Buy milk" }
Event: TaskCompleted { id: 42 }
Event: TaskDeleted { id: 42 }
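A CDC tool typically emits row-level changes rather than named domain events, so some code has to do the translation. The sketch below assumes a change envelope with `op`, `before`, and `after` fields, modeled on the shape of Debezium's change events; `toDomainEvent` and the domain event names are hypothetical. It also assumes the `before` image is populated for updates and deletes, which depends on how the database is configured.

```javascript
// Translate a row-level change event into a domain event.
// The envelope shape (op / before / after) is modeled on Debezium's change
// events; toDomainEvent and the domain event names are hypothetical.
function toDomainEvent(change) {
  const { op, before, after } = change;
  switch (op) {
    case "c": // row inserted
      return { type: "TaskCreated", id: after.id, title: after.title };
    case "u": // row updated: compare before and after to see what changed
      if (!before.completed && after.completed) {
        return { type: "TaskCompleted", id: after.id };
      }
      if (before.title !== after.title) {
        return {
          type: "TaskTitleChanged",
          id: after.id,
          oldTitle: before.title,
          newTitle: after.title,
        };
      }
      return { type: "TaskUpdated", id: after.id };
    case "d": // row deleted: only the "before" image is available
      return { type: "TaskDeleted", id: before.id };
    default:
      throw new Error(`unknown operation: ${op}`);
  }
}

const task = { id: 42, title: "Buy milk", completed: false };
const changes = [
  { op: "c", before: null, after: task },
  { op: "u", before: task, after: { ...task, completed: true } },
  { op: "d", before: { ...task, completed: true }, after: null },
];
const domainEvents = changes.map(toDomainEvent);
console.log(domainEvents.map((e) => e.type));
```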
Scenario: Keeping Everything in Sync
Here’s the flow:
- User creates a task
- Your app saves it to PostgreSQL
- CDC tool notices the change in the transaction log
- CDC publishes a “TaskCreated” event to a message queue (Kafka)
- Multiple consumers listen:
- Elasticsearch consumer: Indexes the task for search
- Analytics consumer: Updates the “tasks per hour” metric
- Email consumer: Sends a reminder email if it’s due today
- Mobile sync consumer: Pushes update to user’s devices
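The fan-out in the last step can be sketched with an in-memory topic. This is only an illustration of the shape: a real deployment would use a broker such as Kafka, and `Topic`, `searchIndex`, `metrics`, and `pushQueue` are hypothetical stand-ins for the real consumers.

```javascript
// In-memory stand-in for a topic with several independent consumers.
// A real deployment would use a broker such as Kafka; names are hypothetical.
class Topic {
  constructor() {
    this.consumers = [];
  }
  subscribe(name, handler) {
    this.consumers.push({ name, handler });
  }
  publish(event) {
    // Each consumer gets every event; one failing must not block the others.
    for (const { name, handler } of this.consumers) {
      try {
        handler(event);
      } catch (err) {
        console.error(`${name} failed: ${err.message}`);
      }
    }
  }
}

const searchIndex = new Map();
const metrics = { tasksCreated: 0 };
const pushQueue = [];

const taskEvents = new Topic();
taskEvents.subscribe("search", (e) => {
  if (e.type === "TaskCreated") searchIndex.set(e.id, e.title.toLowerCase());
});
taskEvents.subscribe("analytics", (e) => {
  if (e.type === "TaskCreated") metrics.tasksCreated += 1;
});
taskEvents.subscribe("mobile-sync", (e) => {
  pushQueue.push({ userId: e.userId, event: e });
});

taskEvents.publish({ type: "TaskCreated", id: 42, userId: "alice", title: "Buy milk" });
console.log(searchIndex.get(42), metrics.tasksCreated, pushQueue.length);
```

The producer never knows who is listening, so adding a new consumer later does not require touching the code that writes tasks.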
Real-World Example: Uber’s Architecture
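Ride-hailing platforms such as Uber have written about building CDC pipelines for their data infrastructure. As a simplified illustration (not a description of Uber's actual internals), when you request a ride: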
- The ride request goes to the main database
- CDC captures the change
- Multiple systems react:
- Driver matching service finds nearby drivers
- Pricing service calculates surge pricing
- Analytics tracks demand patterns
- Your app gets real-time updates
Each system gets the data it needs without overloading the main database.
Pattern 3: Event Sourcing
The Core Idea: Instead of storing current state, store the sequence of events that led to that state.
Traditional Storage vs Event Sourcing
Traditional approach:
-- You only see the current state
tasks table:
id: 42, title: "Buy oat milk", completed: true
If you want to know “What was the original title?” or “When was it completed?” - too bad, that information is gone.
Event Sourcing approach:
// You store every event that happened
events = [
  { type: "TaskCreated", id: 42, title: "Buy milk", timestamp: "10:30" },
  { type: "TaskTitleChanged", id: 42, oldTitle: "Buy milk",
    newTitle: "Buy almond milk", timestamp: "10:45" },
  { type: "TaskTitleChanged", id: 42, oldTitle: "Buy almond milk",
    newTitle: "Buy oat milk", timestamp: "11:20" },
  { type: "TaskCompleted", id: 42, timestamp: "14:30" }
]
To get the current state, you “replay” all events:
function getCurrentState(events) {
  let task = {};
  events.forEach(event => {
    switch (event.type) {
      case "TaskCreated":
        task = { id: event.id, title: event.title, completed: false };
        break;
      case "TaskTitleChanged":
        task.title = event.newTitle;
        break;
      case "TaskCompleted":
        task.completed = true;
        break;
    }
  });
  return task;
}
Scenario: The Audit Trail
Remember our mystery deletion? With event sourcing:
// Query: What happened to task 12345?
events_for_task_12345 = [
{ type: "TaskCreated", id: 12345, title: "Submit expense report",
user: "alice@example.com", timestamp: "2025-09-15T09:00:00Z" },
{ type: "TaskCompleted", id: 12345,
user: "alice@example.com", timestamp: "2025-09-16T14:30:00Z" },
{ type: "TaskDeleted", id: 12345,
user: "bob@example.com", timestamp: "2025-09-20T16:45:00Z" }
]
Now you can tell the user: “Bob deleted this task on September 20th at 4:45 PM. Here’s what it looked like before deletion.”
Scenario: Time Travel Debugging
A user reports: “My task was marked complete, but I never did that!”
With event sourcing, you can replay history:
// What did the task look like at 2 PM?
getStateAt(events, { taskId: 42, timestamp: "14:00" })
// Returns: { title: "Buy milk", completed: false }
// What about at 3 PM?
getStateAt(events, { taskId: 42, timestamp: "15:00" })
// Returns: { title: "Buy milk", completed: true }
// Who completed it?
events.find(e => e.type === "TaskCompleted" &&
  e.timestamp > "14:00" &&
  e.timestamp < "15:00")
// Returns the TaskCompleted event, including: user: "mobile_app_sync", reason: "offline_sync"
Aha! It was marked complete by the mobile sync service, probably from an offline action.
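A helper like `getStateAt` is not part of any library; it is something you would write yourself. One possible shape, assuming events carry ISO 8601 timestamps and the event types used earlier, is to replay only the events recorded up to the requested moment. The signature and the `taskHistory` sample data are assumptions made for illustration.

```javascript
// Rebuild a task's state as of a given moment by replaying only the events
// recorded up to that moment. getStateAt is a hypothetical helper.
function getStateAt(events, { taskId, timestamp }) {
  const cutoff = Date.parse(timestamp);
  return events
    .filter((e) => e.id === taskId && Date.parse(e.timestamp) <= cutoff)
    .sort((a, b) => Date.parse(a.timestamp) - Date.parse(b.timestamp))
    .reduce((task, e) => {
      switch (e.type) {
        case "TaskCreated":
          return { id: e.id, title: e.title, completed: false };
        case "TaskTitleChanged":
          return { ...task, title: e.newTitle };
        case "TaskCompleted":
          return { ...task, completed: true };
        case "TaskDeleted":
          return null;
        default:
          return task; // ignore event types this projection does not know
      }
    }, null);
}

const taskHistory = [
  { type: "TaskCreated", id: 42, title: "Buy milk",
    timestamp: "2025-10-07T10:30:00Z" },
  { type: "TaskCompleted", id: 42, user: "mobile_app_sync",
    timestamp: "2025-10-07T14:30:00Z" },
];

const at2pm = getStateAt(taskHistory, { taskId: 42, timestamp: "2025-10-07T14:00:00Z" });
const at3pm = getStateAt(taskHistory, { taskId: 42, timestamp: "2025-10-07T15:00:00Z" });
console.log(at2pm.completed, at3pm.completed);
```

A query for a moment before the task was created returns `null`, since no events have been replayed yet.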
Scenario: Business Analytics Gold Mine
Your business team asks: “How long does it typically take users to complete tasks after creating them?”
With traditional storage, you’d only have creation and completion timestamps for tasks that still exist. With event sourcing:
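// Average time from creation to completion, across all tasks ever created
const durations = events
  .filter(e => e.type === "TaskCreated")
  .map(created => {
    const completed = events.find(e =>
      e.type === "TaskCompleted" && e.id === created.id
    );
    // Timestamps are strings, so parse them before subtracting
    return completed
      ? Date.parse(completed.timestamp) - Date.parse(created.timestamp)
      : null;
  })
  .filter(duration => duration !== null);
// Average in milliseconds (NaN if no task has been completed yet)
const averageMs = durations.reduce((sum, d) => sum + d, 0) / durations.length;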
You can answer questions about deleted tasks, abandoned tasks, and historical patterns that would be impossible with traditional storage.
Pattern 4: Conflict Resolution
The Core Idea: When multiple changes happen to the same data simultaneously, have a strategy to merge them intelligently.
The Offline Sync Problem Revisited
Remember Alice editing her task on two devices? Let’s see different conflict resolution strategies:
Strategy 1: Last Write Wins (LWW)
Simple but dangerous:
// Phone edit at 10:30: "Buy milk" → "Buy almond milk"
// Laptop edit at 10:31: "Buy milk" → "Buy oat milk"
// Result: "Buy oat milk" (10:31 is later)
// Problem: Alice's phone edit is completely lost!
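In code, last write wins is only a timestamp comparison, which is why it is so tempting. The sketch below assumes each edit carries an `updatedAt` timestamp and a `deviceId` (both hypothetical field names). Note that it trusts device clocks, so a phone with a wrong clock can silently win.

```javascript
// Last write wins: keep whichever version carries the later timestamp.
// Field names (updatedAt, deviceId) are assumptions for this sketch.
function lastWriteWins(a, b) {
  const ta = Date.parse(a.updatedAt);
  const tb = Date.parse(b.updatedAt);
  if (ta !== tb) return ta > tb ? a : b;
  // Tie-break on device id so every replica picks the same winner.
  return a.deviceId > b.deviceId ? a : b;
}

const phoneEdit = {
  title: "Buy almond milk",
  deviceId: "phone",
  updatedAt: "2025-10-07T10:30:00Z",
};
const laptopEdit = {
  title: "Buy oat milk",
  deviceId: "laptop",
  updatedAt: "2025-10-07T10:31:00Z",
};

const winner = lastWriteWins(phoneEdit, laptopEdit);
console.log(winner.title); // the phone edit is discarded without any warning
```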
Strategy 2: Version Vectors
Track changes from each device:
// Initial state
{
title: "Buy milk",
version: { phone: 0, laptop: 0 }
}
// Phone edit
{
title: "Buy almond milk",
version: { phone: 1, laptop: 0 }
}
// Laptop edit (doesn't know about phone edit yet)
{
title: "Buy oat milk",
version: { phone: 0, laptop: 1 }
}
// Sync: System detects conflict!
// phone: 1, laptop: 0 vs phone: 0, laptop: 1
// Neither version is strictly "newer"
Now you can present both options to Alice: “You changed this to ‘almond milk’ on your phone and ‘oat milk’ on your laptop. Which do you want to keep?”
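Detecting that conflict comes down to comparing the two vectors entry by entry. Here is a minimal sketch, assuming vectors are plain objects mapping a device name to a counter and that a missing entry counts as 0; `compareVersions` is a hypothetical helper name.

```javascript
// Compare two version vectors (plain objects mapping device -> counter).
// A missing entry counts as 0. compareVersions is a hypothetical helper.
function compareVersions(a, b) {
  const devices = new Set([...Object.keys(a), ...Object.keys(b)]);
  let aAhead = false;
  let bAhead = false;
  for (const device of devices) {
    const av = a[device] ?? 0;
    const bv = b[device] ?? 0;
    if (av > bv) aAhead = true;
    if (bv > av) bAhead = true;
  }
  if (aAhead && bAhead) return "concurrent"; // conflict: neither saw the other
  if (aAhead) return "after";   // a already includes everything in b
  if (bAhead) return "before";  // b already includes everything in a
  return "equal";
}

const initialVersion = { phone: 0, laptop: 0 };
const phoneVersion = { phone: 1, laptop: 0 };
const laptopVersion = { phone: 0, laptop: 1 };

console.log(compareVersions(phoneVersion, initialVersion));
console.log(compareVersions(phoneVersion, laptopVersion));
```

Only the `"concurrent"` result needs to be shown to the user; in the other cases the newer version can safely replace the older one.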
Strategy 3: CRDTs (Conflict-free Replicated Data Types)
For certain data types, you can merge changes automatically:
Scenario: Collaborative Task Lists
Alice and Bob are both adding items to a shared grocery list offline:
// Initial: ["Milk", "Bread"]
// Alice adds: "Eggs"
// Alice's list: ["Milk", "Bread", "Eggs"]
// Bob adds: "Butter"
// Bob's list: ["Milk", "Bread", "Butter"]
// When they sync, CRDT merges:
// Final list: ["Milk", "Bread", "Eggs", "Butter"]
CRDTs guarantee that both Alice and Bob end up with the same list, without losing either addition.
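The simplest CRDT that behaves this way is a grow-only set (G-Set), where merging is just set union. The sketch below uses one as a stand-in for the grocery list; `mergeGSet` and `toSortedList` are hypothetical names. It is a simplification: a G-Set cannot remove items and does not preserve insertion order, so a real shared list would need a richer CRDT.

```javascript
// Grow-only set (G-Set): the simplest CRDT. Merge is set union, which is
// commutative, associative, and idempotent, so replicas converge regardless
// of the order in which they sync. Names are hypothetical.
function mergeGSet(a, b) {
  return new Set([...a, ...b]);
}

// Render in a deterministic order so every replica displays the same list.
function toSortedList(set) {
  return [...set].sort();
}

const base = ["Milk", "Bread"];
const aliceSet = new Set([...base, "Eggs"]);   // Alice adds "Eggs" offline
const bobSet = new Set([...base, "Butter"]);   // Bob adds "Butter" offline

const aliceView = toSortedList(mergeGSet(aliceSet, bobSet));
const bobView = toSortedList(mergeGSet(bobSet, aliceSet));
console.log(aliceView);
console.log(JSON.stringify(aliceView) === JSON.stringify(bobView));
```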
Strategy 4: Operational Transformation
Used in collaborative editing (like Google Docs):
// Initial text: "Buy milk"
// Alice at position 4: Insert "almond "
// Alice's intent: "Buy almond milk"
// Bob at position 8: Insert " today"
// Bob's intent: "Buy milk today"
// System transforms Bob's operation:
// "Position 8" becomes "position 15" (Alice's insert added 7 characters before it)
// Final: "Buy almond milk today"
Both edits are preserved intelligently.
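The position shift can be expressed as a small transform function. This sketch handles only the insert-versus-insert case; real OT systems also cover deletes and many edge cases. `transformInsert`, `applyInsert`, and the `site` tie-break field are hypothetical. Since "almond " is seven characters long, Bob's position 8 moves to 15.

```javascript
// Transform one insert against a concurrent insert so that both replicas
// converge. Handles only insert-vs-insert; transformInsert is hypothetical.
function transformInsert(op, against) {
  const shifts =
    against.pos < op.pos ||
    (against.pos === op.pos && against.site < op.site); // tie-break by site id
  return shifts ? { ...op, pos: op.pos + against.text.length } : op;
}

function applyInsert(text, op) {
  return text.slice(0, op.pos) + op.text + text.slice(op.pos);
}

const initialText = "Buy milk";
const aliceOp = { pos: 4, text: "almond ", site: "alice" };
const bobOp = { pos: 8, text: " today", site: "bob" };

// Alice's replica: apply her op, then Bob's op transformed against hers.
const bobTransformed = transformInsert(bobOp, aliceOp);
const onAlice = applyInsert(applyInsert(initialText, aliceOp), bobTransformed);

// Bob's replica: apply his op, then Alice's op transformed against his.
const aliceTransformed = transformInsert(aliceOp, bobOp);
const onBob = applyInsert(applyInsert(initialText, bobOp), aliceTransformed);

console.log(bobTransformed.pos, onAlice, onAlice === onBob);
```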
Putting It All Together
Let’s see how all four patterns work together in a real system:
Scenario: A Distributed Todo App
Architecture:
- User creates a task on their phone:
Command: CreateTask({ title: "Buy milk" })
- Command handler (CQRS) validates and saves:
// Event Sourcing: Store the event
eventStore.append({
  type: "TaskCreated",
  id: generateId(),
  title: "Buy milk",
  userId: "alice",
  deviceId: "alice-phone",
  timestamp: "2025-10-07T10:30:00Z",
  version: { "alice-phone": 1 }
});
- CDC captures the new event:
// Debezium notices the new row in event_store table
cdcStream.publish(TaskCreatedEvent);
- Multiple consumers react:
// Query side (CQRS): Update search index
elasticsearch.index({
  id: task.id,
  title: task.title,
  userId: task.userId,
  searchableText: "buy milk"
});
// Sync service: Push to Alice's other devices
pushService.notifyDevices(alice.devices, TaskCreatedEvent);
// Analytics: Update metrics
metrics.increment("tasks_created_today");
- Alice edits on laptop while offline:
// Event Sourcing: Local event stored
localEventStore.append({
  type: "TaskTitleChanged",
  id: task.id,
  newTitle: "Buy oat milk",
  deviceId: "alice-laptop",
  timestamp: "2025-10-07T10:45:00Z",
  version: { "alice-laptop": 1, "alice-phone": 1 }
});
- Alice also edits on phone:
// Conflict! The phone has not seen the laptop's edit, so the two versions are concurrent
localEventStore.append({
  type: "TaskTitleChanged",
  id: task.id,
  newTitle: "Buy almond milk",
  deviceId: "alice-phone",
  timestamp: "2025-10-07T10:46:00Z",
  version: { "alice-phone": 2 }
});
- Sync happens - Conflict Resolution kicks in:
// System detects conflicting versions
// Version vectors:
//   Laptop: { "alice-phone": 1, "alice-laptop": 1 }
//   Phone:  { "alice-phone": 2 }
// Neither is strictly newer - show both to user
conflictUI.show({
  option1: { title: "Buy oat milk", device: "laptop", time: "10:45" },
  option2: { title: "Buy almond milk", device: "phone", time: "10:46" }
});
- Alice chooses “almond milk”:
// Event Sourcing: Record the resolution
eventStore.append({
  type: "ConflictResolved",
  id: task.id,
  chosenVersion: "alice-phone-v2",
  timestamp: "2025-10-07T11:00:00Z",
  version: { "alice-phone": 3, "alice-laptop": 2 }
});
// CDC picks this up and syncs everywhere
cdcStream.publish(ConflictResolvedEvent);
Real-World Applications
Banking: Account Balance
Traditional approach: Store current balance
- Problem: Can’t explain how you got to this balance
- Problem: Difficult to audit for errors
With Event Sourcing:
events = [
{ type: "AccountOpened", balance: 0 },
{ type: "Deposited", amount: 1000 },
{ type: "Withdrew", amount: 50 },
{ type: "Deposited", amount: 200 },
{ type: "Withdrew", amount: 100 }
]
currentBalance = events.reduce((bal, e) => {
  if (e.type === "Deposited") return bal + e.amount;
  if (e.type === "Withdrew") return bal - e.amount;
  return bal;
}, 0); // Result: 1050
If there’s ever a dispute, you have a complete audit trail.
E-commerce: Shopping Cart
CQRS + Event Sourcing:
- Command side: Handle “AddToCart”, “RemoveFromCart”, “Checkout”
- Query side: Show cart with product images, prices, availability
- Event sourcing: Track every cart modification for analytics
Conflict Resolution:
- User adds item on phone and laptop simultaneously
- Both additions succeed (no conflict - adding is commutative)
Healthcare: Patient Records
Event Sourcing:
- Critical for medical history
- Never lose information about previous diagnoses, medications, or treatments
- Complete audit trail for legal compliance
CQRS:
- Write side: Doctors update records (strict validation)
- Read side: Various views (labs dashboard, medication list, treatment timeline)
CDC:
- Sync to billing system, insurance, pharmacies
- Alert system triggers on critical events
Trade-offs and Considerations
When to Use CQRS
Good fit:
- Read and write patterns are very different
- Need to optimize reads and writes independently
- Multiple read representations of the same data
Not worth it:
- Simple CRUD apps
- Reads and writes are similar in complexity
- Small scale where one database works fine
When to Use Event Sourcing
Good fit:
- Need complete audit trail
- Temporal queries (“what did it look like last week?”)
- Domain events are first-class citizens in your business
Challenges:
- More storage (you’re keeping everything)
- Complexity (replaying events, snapshots for performance)
- Schema evolution (old events need to work with new code)
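The snapshot idea mentioned above can be sketched briefly: store the folded state together with the number of events it covers, then replay only what came afterwards. `applyEvent`, `takeSnapshot`, and `loadFromSnapshot` are hypothetical names, and a plain array stands in for the event store.

```javascript
// Snapshot sketch: store the folded state plus the number of events it
// covers, then replay only the events recorded after it. Names hypothetical.
function applyEvent(task, event) {
  switch (event.type) {
    case "TaskCreated":
      return { id: event.id, title: event.title, completed: false };
    case "TaskTitleChanged":
      return { ...task, title: event.newTitle };
    case "TaskCompleted":
      return { ...task, completed: true };
    default:
      return task;
  }
}

function takeSnapshot(events) {
  return { state: events.reduce(applyEvent, null), eventCount: events.length };
}

function loadFromSnapshot(snapshot, events) {
  // Skip the events the snapshot already covers and fold in the rest.
  return events.slice(snapshot.eventCount).reduce(applyEvent, snapshot.state);
}

const eventLog = [
  { type: "TaskCreated", id: 42, title: "Buy milk" },
  { type: "TaskTitleChanged", id: 42, newTitle: "Buy oat milk" },
];
const snapshot = takeSnapshot(eventLog);

eventLog.push({ type: "TaskCompleted", id: 42 }); // arrives after the snapshot
const currentTask = loadFromSnapshot(snapshot, eventLog);
console.log(currentTask);
```

The result should always match a full replay from the first event; the snapshot only saves work, it never replaces the event log as the source of truth.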
When to Use CDC
Good fit:
- Need to react to database changes
- Integrating multiple systems
- Don’t want to modify existing application code
Alternatives:
- Application-level events (publish events from your own code instead of reading the database log)
- This is often simpler when you control the application
Conflict Resolution Strategies
Choose based on your domain:
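- LWW: Use when occasional data loss is acceptable (e.g., simple cache invalidation)
- Manual resolution: Use when only the user can decide which change is right (e.g., document editing)
- CRDTs: Use when changes can be merged mathematically (e.g., collaborative lists, counters)
- OT: Use when the order and position of operations matter (e.g., text editing)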
Common Pitfalls
1. Over-Engineering
Don’t start with all these patterns! Begin simple:
- Start with a regular database
- Add read replicas if reads are slow
- Add CQRS if read/write patterns diverge significantly
- Add event sourcing if you need the audit trail
2. Eventual Consistency Confusion
With CQRS, your read side might lag behind writes:
// User creates task
POST /tasks { title: "Buy milk" }
// Returns: 201 Created
// User immediately searches
GET /tasks?search=milk
// Returns: [] (search index hasn't updated yet!)
Solutions:
- Show optimistic UI updates
- Add version numbers to track sync state
- Consider “read your own writes” consistency for critical paths
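One way to get "read your own writes" is to hand the client a version number on every write and let reads carry it back. The sketch below is an assumption-heavy illustration: two `Map`s stand in for the databases, `catchUp` stands in for the asynchronous sync, and all names are hypothetical.

```javascript
// Read-your-own-writes sketch: each write returns a version, and a read that
// carries that version falls back to the write store if the read side lags.
// All names are hypothetical; Maps stand in for the two databases.
const writeStore = new Map();
const readStore = new Map();
let writeVersion = 0;     // version of the latest committed write
let readSideVersion = 0;  // version the read side has caught up to

function createTask(task) {
  writeVersion += 1;
  writeStore.set(task.id, { ...task, version: writeVersion });
  return writeVersion; // the client keeps this as its "last seen" version
}

// Stands in for the asynchronous synchronization process.
function catchUp() {
  for (const [id, task] of writeStore) readStore.set(id, task);
  readSideVersion = writeVersion;
}

function getTask(id, minVersion = 0) {
  if (readSideVersion >= minVersion) {
    return { source: "read", task: readStore.get(id) ?? null };
  }
  // The read side is behind what this client has written: use the write store.
  return { source: "write", task: writeStore.get(id) ?? null };
}

const lastSeen = createTask({ id: 1, title: "Buy milk" });
const stale = getTask(1);            // no version hint: may miss the new task
const fresh = getTask(1, lastSeen);  // version hint: served from the write store
catchUp();
const later = getTask(1, lastSeen);  // read side has caught up
console.log(stale.task, fresh.source, later.source);
```

The fallback sends some reads to the write database, so it is best reserved for the few paths where a stale result would confuse the user.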
3. Event Schema Evolution
// Version 1: Simple event
{ type: "TaskCreated", title: "Buy milk" }
// Version 2: Added priority
{ type: "TaskCreated", title: "Buy milk", priority: "high" }
// Problem: Old code reading new events?
// Problem: New code reading old events?
Solutions:
- Always include version in events
- Use upcasting (transform old events to new format when reading)
- Make changes backward compatible
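Upcasting can be sketched as a chain of small functions, each upgrading an event by one version at read time. The `schemaVersion` field, the `upcasters` table, and the `"normal"` default priority are assumptions made for this example, not a standard.

```javascript
// Upcasting sketch: upgrade old events to the newest shape at read time.
// schemaVersion, the upcasters table, and the default priority are assumptions.
const LATEST_VERSION = 2;
const upcasters = {
  // v1 -> v2: priority was added; assume "normal" for events recorded earlier.
  1: (event) => ({ ...event, schemaVersion: 2, priority: "normal" }),
};

function upcast(event) {
  // Events stored without a version are treated as v1.
  let current = { schemaVersion: 1, ...event };
  while (current.schemaVersion < LATEST_VERSION) {
    const step = upcasters[current.schemaVersion];
    if (!step) throw new Error(`no upcaster from v${current.schemaVersion}`);
    current = step(current);
  }
  return current;
}

const oldEvent = { type: "TaskCreated", title: "Buy milk" };
const newEvent = {
  type: "TaskCreated",
  title: "Buy milk",
  priority: "high",
  schemaVersion: 2,
};

console.log(upcast(oldEvent).priority, upcast(newEvent).priority);
```

The stored events are never rewritten; the rest of the code only ever sees the latest shape.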
Conclusion
These four patterns - CQRS, CDC, Event Sourcing, and Conflict Resolution - are powerful tools for building robust distributed systems. But remember:
- Start simple: Don’t use these patterns unless you have the problems they solve
- Understand trade-offs: Each pattern adds complexity
- Mix and match: You don’t need all four - use what fits your needs
- Iterate: Add patterns as your system grows and requirements evolve
The key is recognizing when your simple todo list has outgrown its architecture, and knowing which tools can help you scale intelligently.
Further Reading
- Martin Fowler’s articles on CQRS and Event Sourcing
- Designing Data-Intensive Applications by Martin Kleppmann
- Building Microservices by Sam Newman
- Apache Kafka documentation for CDC and event streaming
- CRDTs research papers for conflict resolution
What patterns are you using in your systems? Share your experiences in the comments!