What's covered in each deep-dive?

Each deep-dive follows the same structure: problem statement (what was being solved and under what constraints), architecture overview with a diagram, stack rationale with alternatives considered, key code patterns, the specific bugs that wasted time, observability snapshot, and a 'what I'd change' retrospective. The format is deliberate — it forces honesty and makes posts easy to compare across projects.

How long are these posts?

Typically 2000–4000 words per deep-dive: long enough to transmit real engineering knowledge, short enough that readers can skim a post on mobile before opening it on desktop. Each architecture diagram is designed to be readable on a phone. The prose elaborates on the diagram rather than restating it, so a reader can pick their depth: scan the diagrams and headings for the gist, or read everything for full context.

Can I request a deep-dive on a specific kind of project?

Yes. Reader-suggested topics make for some of the strongest posts in this category. If there's a specific architecture, stack, or migration you'd like to see written up as a real-world example, use the contact form on the homepage. Suggestions go directly into the queue, with the most-requested types of post getting priority.

Are these real projects you actually worked on?

Yes. Each deep-dive is based on a project I worked on personally — usually as the engineer or lead engineer. Some client or company names are redacted for confidentiality, but the architecture, stack choices, bugs, and outcomes are described accurately. The point of writing publicly is transmitting real engineering knowledge — if the details were invented, the posts would have no value over generic 'how to architect X' tutorials.

How are the architecture diagrams in these posts made?

Excalidraw for early drafts, Figma for final published versions, with a shared component library for recurring shapes (services, databases, queues). The visual language is consistent across posts — sources on the left, data stores on the right, queues called out explicitly, failure points marked with red dots. The constraint that drives most decisions: diagrams must be readable on a phone. That discipline tends to make them clearer for desktop readers too.

Why this format instead of typical 'How we built X' engineering blogs?

Typical 'How we built X' posts are sanitised — written by marketing teams to make the engineering org look more methodical than it actually was, with the wrong stack choices and embarrassing bugs quietly omitted. The deep-dives here include those parts. The retrospective section explicitly asks 'what would you change?' — which is the most useful part for another engineer reading the post and facing a similar decision. Honest > polished, in this format.

Project Deep-Dives — How I Shipped Real Products

The full story behind shipped products.

Marketing posts hide the messy bits. The "how we built X" articles you see from larger companies are almost always sanitised: written by marketing teams to make the engineering org look more methodical than it actually was, with the bugs and the wrong stack choices quietly omitted. This category doesn't do that. Each deep-dive walks through a real product I've shipped, including the architecture that worked, the architecture that didn't, the production incidents, the migration scars, and the things I'd do differently if I were starting over today.

The goal isn't entertainment — it's transmission of real engineering knowledge. The kind of knowledge that's normally locked in the heads of senior engineers and rarely makes it onto the public internet because it doesn't fit the format of "10 React tips you need to know." A good engineering deep-dive should give a reader enough specific signal that they can avoid the same mistakes on their own project. That's what these posts aim for.

Why this category exists

Engineering content online is heavily skewed toward two extremes: short tutorials that teach a specific technique in isolation, and abstract architecture posts that talk about principles without specifics. There's an enormous gap in the middle — the level of detail an engineer actually needs to make decisions on their own project. Why did this team pick Postgres over Mongo? Why did they move from EC2 to ECS three months later? What was the specific bug that caused the migration? How much did it cost in engineer-hours and how much did it save in cloud spend?

This category fills that gap. Each post walks through one project in enough detail that another engineer, looking at a similar problem, can borrow the parts that apply and adapt the parts that don't. The format is deliberate: each piece starts with the problem statement (what business problem was being solved, under what constraints), moves through the architecture and stack decisions with rationale, covers the specific things that went wrong, and ends with a retrospective on what would change.

The pieces aren't anonymous — they're about projects I worked on personally, with the technical details accurate to what was actually built. Some client names are redacted for confidentiality, but the engineering decisions and outcomes are described faithfully.

What you'll find here

The posts in this category cover five broad areas: the format itself (so you know what to expect), the architectural patterns that come up repeatedly, stack decision rationale in hindsight, the specific bugs and incidents that taught me the most, and the migration stories that are usually the longest sections of each piece.

Format and structure — each deep-dive follows the same shape: problem statement, constraints, architecture overview, stack rationale, what went wrong, what got migrated, and the retrospective
Architecture patterns — multi-tenant SaaS shapes, event-driven backends, RAG-augmented apps, B2B platforms with admin panels, and the shapes I've converged on after multiple iterations
Stack decisions in hindsight — Next.js vs Express, Mongo vs Postgres, monolith vs microservices, the choices that worked and the ones that didn't
Bugs that wasted weeks — race conditions, cache invalidation, third-party SDK quirks, the specific failures that took longest to diagnose
Migration stories — moving stacks, splitting services, joining services back together, the patterns for migrating without breaking everything in production

The format: what each deep-dive will look like

Every post in this category follows the same structure, because consistency makes the pieces easier to read and easier to compare. Each deep-dive runs 2000–4000 words and includes a problem statement, architecture diagram, stack rationale, key code patterns, observability snapshot, and a "what I'd change" retrospective. The structure forces honesty — it's harder to skip the part where you admit a stack choice didn't work when the format includes a dedicated retrospective section.

The problem statement comes first. What was the actual business problem? Who was the user? What were the constraints — budget, timeline, team size, existing systems to integrate with? Without this context, architecture decisions look arbitrary. With it, they make sense (or don't, which is also useful).

Architecture overview comes next, with an actual diagram. Not a generic "frontend → backend → database" diagram, but the specific systems involved, the data flows, and the points of failure. Where applicable, alternative architectures considered get a brief mention with the reason they weren't picked. The diagrams are designed to be readable on a phone, because most readers will skim on mobile before opening on desktop.

Stack rationale gets its own section. Why this database? Why this hosting provider? Why this auth library? The rationale matters because the reader's project will have different constraints and they need to evaluate whether the same decision applies. Posts here always include the alternative that almost got picked and what tipped the decision.

The bugs that wasted weeks

Every project has the "remember when we lost three weeks to that one bug" stories. They're the most useful part of engineering knowledge because the bugs follow patterns. The bug you wasted weeks on isn't unique — somebody else is about to waste weeks on the same one, and a clearly written post-mortem can save them that time.

The patterns I see most often: race conditions in async code (especially in Node.js, where single-threaded execution makes it easy to forget that shared state can change across any await), cache invalidation logic that's correct in isolation but wrong in combination with the rest of the system, third-party SDK quirks that aren't documented (Stripe webhook ordering, OAuth flows that work in dev but fail in production due to redirect URI mismatches), and the specific category of bug that only shows up under sustained load.

The deep-dives in this category include detailed post-mortems for the specific bugs that came up in each project. Not "we had a race condition and we fixed it" — but "here's the specific code shape that caused the race condition, here's the production trace that finally led us to find it, and here's the pattern we now use to avoid the whole category." That's the level of specificity that transmits engineering knowledge.
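As an illustration of what "the specific code shape" can look like (a minimal sketch, not code taken from any of the projects), here is the classic check-then-act race across an await, followed by one common in-process fix. The inventory map and the names `reserveRacy`, `withKeyLock`, and `reserveSafe` are hypothetical, and `io()` is a stand-in for a real query or API call.

```typescript
// Hypothetical in-memory inventory standing in for a database row.
const stock = new Map<string, number>([["sku-1", 1]]);

// Stand-in for real I/O (a query, a payment call): awaiting it lets other
// pending callers run before this one continues.
const io = (): Promise<void> => Promise.resolve();

// Racy shape: check, await, then act on the value read before the await.
async function reserveRacy(sku: string): Promise<boolean> {
  const available = stock.get(sku) ?? 0;
  if (available < 1) return false;
  await io();
  stock.set(sku, available - 1); // stale: another caller may have written already
  return true;
}

// Per-key promise chain: calls for the same key run strictly one after another.
const tails = new Map<string, Promise<unknown>>();

function withKeyLock<T>(key: string, task: () => Promise<T>): Promise<T> {
  const previous = tails.get(key) ?? Promise.resolve();
  const run = previous.then(task);
  // Store a tail that never rejects, so one failed task does not block the queue.
  tails.set(key, run.catch(() => undefined));
  return run;
}

function reserveSafe(sku: string): Promise<boolean> {
  return withKeyLock(sku, () => reserveRacy(sku));
}
```

Two concurrent `reserveRacy("sku-1")` calls both succeed against a stock of one, because each reads the count before the other writes. Routed through `reserveSafe`, the second call only starts once the first has finished. This sketch only serialises work inside a single Node.js process; with several processes or servers, the guard would have to live in the data store instead (for example, an atomic conditional update).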

The retrospective section in each deep-dive explicitly covers what the team would change. Sometimes it's a stack choice. Sometimes it's a process: "we should have done load testing before launch." Sometimes it's an architectural decision that was correct at the time but wouldn't be made the same way now. The honesty in this section is what makes the deep-dives useful, and it's the part most published "how we built X" articles leave out.

Migrations: where the longest sections live

Most non-trivial projects accumulate at least one significant migration: a database change, a stack swap, a service split, a hosting move. These migrations are where the most engineering knowledge is generated and where the deep-dive format pays off the most. They're also where most published engineering content falls short — migrations get summarised in a paragraph that doesn't transmit any of the actual difficulty.

The migrations covered in this category get the level of detail they actually deserve. The specific shape of "expand and contract" schema changes for MongoDB and Postgres. The blue-green deployment patterns for VM-based hosting. The dual-write phase for switching between two databases without downtime. The feature flag patterns for migrating user populations gradually. Each gets a worked example from a real project, with the specific bugs that came up.
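To make two of these patterns concrete, the following is a minimal sketch of a dual-write phase combined with a percentage-based feature flag for moving reads over gradually. Everything in it is an assumption for illustration: the `Profile` type, the two in-memory maps standing in for the old and new databases, and the helper names `bucketOf`, `usesNewStore`, `saveProfile`, and `loadProfile`.

```typescript
type Profile = { id: string; email: string };

// In-memory stand-ins for the two databases (hypothetical).
const oldStore = new Map<string, Profile>();
const newStore = new Map<string, Profile>();

// FNV-1a-style 32-bit hash over UTF-16 code units. It is deterministic,
// so a given user always lands in the same bucket from 0 to 99.
function bucketOf(userId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < userId.length; i++) {
    hash ^= userId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash % 100;
}

// A user is on the new store when their bucket falls below the rollout percentage.
function usesNewStore(userId: string, rolloutPercent: number): boolean {
  return bucketOf(userId) < rolloutPercent;
}

// Dual-write phase: every write goes to both stores, so either can serve reads.
function saveProfile(profile: Profile): void {
  oldStore.set(profile.id, profile);
  newStore.set(profile.id, profile);
}

// Reads move over gradually as rolloutPercent is raised from 0 to 100.
function loadProfile(userId: string, rolloutPercent: number): Profile | undefined {
  const store = usesNewStore(userId, rolloutPercent) ? newStore : oldStore;
  return store.get(userId);
}
```

Because the bucket is derived from the user ID rather than chosen at random per request, raising the percentage only ever adds users to the new store; nobody flips back and forth between the two. A real dual-write also has to decide what happens when one of the two writes fails, which this sketch does not model.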

The cost of migrations is the part most discussions skip. Engineer hours spent, calendar weeks elapsed, cloud spend during the dual-running phase, the productivity cost of the team being distracted from feature work. The deep-dives include these numbers when they're available and discuss them honestly. Migrations are expensive; pretending otherwise leads other teams to underestimate their own.

The decision framework for "should we migrate at all" deserves a dedicated section in each migration deep-dive. The migrations that worked were the ones where the team did the cost-benefit analysis honestly, with realistic estimates for the migration cost and the ongoing cost of not migrating. The ones that didn't tended to be either "we migrated for resume-driven reasons" (rarely worth it) or "we kept the old thing too long" (always more expensive than the migration would have been).
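One way to put numbers on that cost-benefit analysis is a simple break-even calculation. The sketch below is an assumption about how such a calculation could be set up, not a framework taken from any of the projects; the `MigrationEstimate` fields and the figures in the example are hypothetical, and softer costs such as team distraction are left out.

```typescript
interface MigrationEstimate {
  engineerHours: number;           // hours spent on the migration itself
  hourlyRate: number;              // loaded cost per engineer hour
  dualRunMonths: number;           // months both systems run side by side
  dualRunMonthlyCloudCost: number; // extra cloud spend per month while dual-running
  monthlyCostOfStaying: number;    // ongoing monthly cost of not migrating
  monthlyCostAfter: number;        // expected monthly cost once migrated
}

// One-off cost: engineering time plus the extra cloud spend of dual-running.
function oneOffCost(e: MigrationEstimate): number {
  return e.engineerHours * e.hourlyRate + e.dualRunMonths * e.dualRunMonthlyCloudCost;
}

// Months until the monthly saving pays back the one-off cost.
// Returns null when the migration never pays for itself.
function breakEvenMonths(e: MigrationEstimate): number | null {
  const monthlySaving = e.monthlyCostOfStaying - e.monthlyCostAfter;
  if (monthlySaving <= 0) return null;
  return oneOffCost(e) / monthlySaving;
}

// Hypothetical figures, for illustration only.
const example: MigrationEstimate = {
  engineerHours: 400,
  hourlyRate: 100,
  dualRunMonths: 2,
  dualRunMonthlyCloudCost: 1500,
  monthlyCostOfStaying: 6000,
  monthlyCostAfter: 2000,
};
```

With these made-up inputs the one-off cost is 43,000 and the monthly saving is 4,000, so `breakEvenMonths(example)` comes out at 10.75 months. The value of writing it down is less the number itself than being forced to estimate each input explicitly.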

The retrospective: what I'd change

The retrospective section at the end of each deep-dive is the most honest part. With the benefit of hindsight, what would the team do differently? Sometimes the answer is "nothing material — the decisions were correct given what we knew." More often it's "we'd pick a different database" or "we'd skip the microservices stage entirely" or "we'd invest in load testing earlier." Each of these answers transmits real engineering knowledge.

The retrospectives also call out the decisions that were criticised at the time but turned out to be correct. Engineering is full of contrarian decisions that look bad in the moment but are vindicated later. Being honest about both directions (wrong decisions that looked right, right decisions that looked wrong) is the most useful thing a deep-dive can do.

The format of the retrospective: three bullet points for "what I'd change," three for "what I'd keep," three for "what I'd add (if budget allowed)." The constraint forces clarity. The bullets are then expanded into paragraphs in the body of the section. Readers can skim the bullets or read the prose, depending on how deep they want to go.

How the architecture diagrams in these posts actually get made

A meta point that comes up often: how do the architecture diagrams in these deep-dives get drawn, and why don't they look like generic boxes-and-arrows? The answer is a deliberate choice. The diagrams in this category use a consistent visual language across posts, designed to make scanning fast and to surface the right information.

The tooling: Excalidraw for early drafts, Figma for the final published versions, with a shared component library for the recurring shapes (services, databases, queues, edge functions). The visual hierarchy follows traffic flow: sources on the left, data stores on the right, and queues and other async work explicitly called out with their own shapes. Failure points are marked with red dots so readers can spot them at a glance.

The constraint that matters most: each diagram must be readable at phone size. Most readers will skim deep-dives on mobile before opening them on desktop. Diagrams that need a 4K screen to parse don't work. The discipline of designing for the small screen first usually improves clarity for desktop readers too — there's less room for incidental detail.

The captioning style is also consistent across posts. Each diagram has a one-sentence caption explaining what it shows, plus inline annotations for the parts that need more context. Posts here avoid the "here's a diagram, scroll down for the prose explanation" pattern in favour of diagrams that are mostly self-contained. The prose elaborates rather than restates.

The history of how this style evolved is its own post. The early deep-dives had diagrams that were too dense and required too much surrounding context. Reader feedback pushed me toward the current style. Posts here include the specific rules I follow, in case you want to adopt them for your own technical writing.

What's coming in this category

The first deep-dives on the docket:

The architecture and migration story of a SaaS platform I migrated from a monolithic Express app to a Next.js + queue-based architecture
The build-out of an AI-powered tool that used RAG over a domain-specific corpus, with the specific failures that forced rewrites
A B2B platform with multi-tenant data isolation and the specific Postgres row-level security pattern that made it manageable
A mobile app that had to integrate with a legacy SOAP API (yes, in 2024)
A marketplace that had to handle peak traffic 50x the baseline during launch periods

Each deep-dive will take time to write properly. The goal is one per quarter, not one per week — depth over cadence. The longer cadence is deliberate: writing a 4000-word post about a real project takes a few weeks of evenings, and rushing the work degrades the signal. If there's a specific kind of project you'd like to see a deep-dive on, the contact form on the homepage works, and project suggestions go directly into the queue.
