Case study 01 · 2025 · Solo full-stack build

CaseAtlas. 30,000+ ways to practice strategy.

A free, AI-powered case study simulator. Built solo because good preparation for strategy and consulting interviews should not cost $500 a year. Adopted by a group of consultants and MBA students who used it, gave direct feedback, and rewrote my roadmap for v2.

Role: Founder, PM, full-stack builder
Stack: Next.js 15, Supabase, Mistral, Vercel
Status: ● Live · v2
Live at: caseatlas.org

01 · The problem

The existing tools were expensive, rigid, and fundamentally static.

Practicing business cases is how you get into consulting. The problem is that the tools available to practice with are subscription-based, typically $30 to $60 a month. What you get is a library of pre-written cases and, in better implementations, a chat interface that simulates a live interview.

The cases are fixed. The difficulty is fixed. The industry, function, and business tension are whatever the case writer chose. If you want to practice a cost-optimization case in healthcare, you take whatever the platform has in its library. If you want to go deep on a specific tension type, you either find it or you don't practice it.

The deeper problem is that the content gets stale fast. Once you've seen a case, you've seen it. For someone doing serious prep over weeks or months, the effective size of a fixed library shrinks quickly.

Anyone who has been through a consulting prep cycle knows these limitations and pays the subscription anyway, because there has been nothing better.

02 · Research & observations

A consistent pattern across seven to ten early users.

My early users were business consultants and MBA students from my immediate network, actively preparing for interviews or staying sharp. I had direct access to their frustrations because they were the same as mine.

I also looked at what the paid tools were genuinely good at: clean layouts, a sense of progression, real cases sourced from credible frameworks. Those were worth keeping as a benchmark. The weakness was configurability. The user had no control over what kind of case they got.

The insight: AI changes this constraint entirely. If the case is generated fresh for every session, the library is infinite. The only question becomes whether the generation quality is good enough to be worth practicing with.

03 · Product vision

An infinite library, calibrated on demand.

"You tell the app what you want to practice. It builds the case, right now, calibrated to your level."
Product north star, written before line 1 of code

Four configuration axes: business function, industry, business tension type, and region. Three difficulty levels. The user assembles parameters and gets a unique, structured case they have never seen before, because it did not exist until they asked for it.

The math behind 30,000+ combinations is real: it is the number of options on each of the four axes multiplied together, times the three difficulty levels. But the point is not the number. The point is that the user is never stuck repeating a case, and the configuration options cover the actual dimensions that matter in a real interview.
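
As a rough illustration of that arithmetic, the sketch below multiplies option counts across the four axes and the three difficulty levels. The per-axis counts are hypothetical placeholders chosen only to show how a total above 30,000 can arise; the actual option lists are not given here.

```typescript
// Hypothetical option counts per axis. These numbers are illustrative
// assumptions, not CaseAtlas's actual option lists.
const axisOptionCounts: Record<string, number> = {
  businessFunction: 12,
  industry: 18,
  tensionType: 10,
  region: 5,
};

const difficultyLevels = 3; // three difficulty levels, as stated in the text

// Total configurations = product of all axis sizes times difficulty levels.
function countCombinations(axes: Record<string, number>, levels: number): number {
  return Object.values(axes).reduce((product, n) => product * n, levels);
}

const totalCombinations = countCombinations(axisOptionCounts, difficultyLevels);
console.log(totalCombinations); // 12 * 18 * 10 * 5 * 3 = 32400
```

Because the total is a product, adding a single option to any one axis grows the space by the product of all the others.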

04 · What shipped

v1 validated the premise. v2 shipped what users asked for by name.

v1 · the narrow validation

The first version was narrow by design. A configurator across the four axes plus difficulty. An LLM pipeline that generated a structured case with company background, strategic challenge, stakeholder perspectives, supporting data and exhibits, and three to five discussion questions. A clean reading layout that felt like a Harvard-style case, not a chatbot response.

The goal was to validate one thing: can AI generate a case that is actually worth practicing with? The answer had to be yes before anything else mattered.

v2 · what real users surfaced

v2 · 01

Accounts & case history

Google OAuth and saved case history with a full lifecycle: active, archived, and soft-deleted with a 90-day grace period. v1 had no persistence. Users wanted to find their cases again.
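
One way to picture that lifecycle is a status field plus a deletion timestamp. This is a minimal sketch, assuming the grace period means a soft-deleted case can be restored for 90 days; the type and field names (SavedCase, deletedAt) are hypothetical and not taken from the real schema.

```typescript
// Hypothetical shape for a saved case; field names are assumptions.
type CaseStatus = "active" | "archived" | "deleted";

interface SavedCase {
  id: string;
  status: CaseStatus;
  deletedAt: Date | null; // set when the case is soft-deleted
}

const GRACE_PERIOD_DAYS = 90; // grace period stated in the text
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Soft-delete marks the case as deleted instead of removing the record.
function softDelete(c: SavedCase, now: Date): SavedCase {
  return { ...c, status: "deleted", deletedAt: now };
}

// A soft-deleted case can be restored while the grace period is running.
function canRestore(c: SavedCase, now: Date): boolean {
  if (c.status !== "deleted" || c.deletedAt === null) return false;
  const elapsedDays = (now.getTime() - c.deletedAt.getTime()) / MS_PER_DAY;
  return elapsedDays <= GRACE_PERIOD_DAYS;
}
```

Keeping the timestamp on the record means a cleanup job can purge expired cases later without any extra bookkeeping.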

v2 · 02

Case versioning

Practice the same scenario at different difficulty levels. Start at Foundational to learn the structure, return at Advanced to stress-test the analysis. Variants stay linked to the original.

v2 · 03

AI answer evaluation

Step-by-step answer submission, scored across six dimensions: structured thinking, quantitative reasoning, business judgment, communication clarity, creativity, and synthesis. Per-question feedback on strengths and gaps.

v2 · 04

Progress tracking

Score history across attempts, a progression chart, and a skill radar. Turns CaseAtlas from a one-off generator into a tool with a real feedback loop.
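
The progression chart and skill radar both reduce to simple aggregates over stored attempts. The sketch below assumes each attempt stores one numeric score per dimension; the key names and the 0-10 scale are assumptions made for illustration.

```typescript
// The six dimensions named above; the key names are assumptions.
const DIMENSIONS = [
  "structuredThinking",
  "quantitativeReasoning",
  "businessJudgment",
  "communicationClarity",
  "creativity",
  "synthesis",
] as const;

type Dimension = (typeof DIMENSIONS)[number];
type DimensionScores = Record<Dimension, number>; // assumed 0-10 scale

// Radar data: average each dimension across all attempts.
function radarAverages(attempts: DimensionScores[]): DimensionScores {
  const result = {} as DimensionScores;
  for (const dim of DIMENSIONS) {
    const total = attempts.reduce((sum, a) => sum + a[dim], 0);
    result[dim] = attempts.length > 0 ? total / attempts.length : 0;
  }
  return result;
}

// Progression data: one overall score per attempt, in attempt order.
function progression(attempts: DimensionScores[]): number[] {
  return attempts.map(
    (a) => DIMENSIONS.reduce((sum, dim) => sum + a[dim], 0) / DIMENSIONS.length
  );
}
```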

v2 · 05

Framework panel

Strategic frameworks (Porter's Five Forces, SWOT, the BCG matrix, others) surfaced contextually on the case page. Lightweight reference. Quietly the most-loved addition.

v2 · 06

PDF export

Cases downloadable for offline practice or sharing with a peer. A small feature, but one users kept requesting.

05 · The hard problem

Building the app was easy. Getting the model to write good cases consistently was not.

The failure mode was prompt drift. A well-tuned prompt that produced excellent cases for technology strategy scenarios would produce thin, generic output for niche combinations: a logistics cost-optimization case in Central Europe at Advanced difficulty, for example. The model would satisfy the format while missing the substance, producing cases that looked structured but had no real analytical tension.

Three things solved it.

● Fix · S1

Structured output with schema validation

The generation pipeline uses a Zod schema as a contract between the prompt and the application. The model returns JSON matching a defined shape, section by section, field by field. If the output does not validate, it is rejected. This pushes the model toward specific output, because vague responses tend to leave required fields thin or missing and fail the schema checks.
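
The contract idea can be sketched without any library. The real pipeline uses Zod; the hand-rolled validator below is a stand-in so the example runs with no dependencies. The field names and rules are assumptions loosely based on the case sections described earlier, not the actual schema.

```typescript
// Hypothetical case shape; field names and rules are assumptions.
interface GeneratedCase {
  companyBackground: string;
  strategicChallenge: string;
  stakeholderPerspectives: string[];
  discussionQuestions: string[];
}

type ValidationResult =
  | { ok: true; value: GeneratedCase }
  | { ok: false; errors: string[] };

const isNonEmptyString = (v: unknown): v is string =>
  typeof v === "string" && v.trim().length > 0;

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((item) => isNonEmptyString(item));

// Parse raw model output and check it against the expected shape.
function validateCase(raw: string): ValidationResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["output is not valid JSON"] };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, errors: ["output is not a JSON object"] };
  }

  const fields = data as Record<string, unknown>;
  const background = fields["companyBackground"];
  const challenge = fields["strategicChallenge"];
  const perspectives = fields["stakeholderPerspectives"];
  const questions = fields["discussionQuestions"];

  const errors: string[] = [];
  if (!isNonEmptyString(background)) errors.push("companyBackground missing or empty");
  if (!isNonEmptyString(challenge)) errors.push("strategicChallenge missing or empty");
  if (!isStringArray(perspectives) || perspectives.length === 0) {
    errors.push("stakeholderPerspectives must be a non-empty list");
  }
  // Three to five discussion questions, matching the case format in the text.
  if (!isStringArray(questions) || questions.length < 3 || questions.length > 5) {
    errors.push("discussionQuestions must contain 3 to 5 items");
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: data as GeneratedCase };
}
```

Returning the list of errors rather than a bare boolean makes rejected outputs easier to log and diagnose.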

● Fix · S2

Retry logic on parse failures

Mistral is non-deterministic. Even with a well-formed prompt, occasional responses come back malformed. The pipeline retries up to two times before surfacing an error. In practice, the retry resolves the problem almost every time, and the user never sees it.
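
A retry wrapper along these lines is one way to express that behavior. This is a sketch, not the production code: `generate` and `parse` are hypothetical stand-ins for the model call and the schema check, and only parse failures trigger a retry here.

```typescript
// Retry a generation step when its output fails to parse.
// `generate` and `parse` are hypothetical stand-ins, not real API calls.
async function generateWithRetry<T>(
  generate: () => Promise<string>,
  parse: (raw: string) => T, // expected to throw on malformed output
  maxRetries = 2 // up to two retries, so three attempts in total
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    // Errors from the model call itself are not caught and propagate as-is.
    const raw = await generate();
    try {
      return parse(raw);
    } catch (err) {
      lastError = err; // malformed output: loop around and try again
    }
  }
  throw new Error(
    `generation failed after ${maxRetries + 1} attempts: ${String(lastError)}`
  );
}
```

Capping the retries keeps worst-case latency bounded, and the final error carries the last parse failure for debugging.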

● Fix · S3

Difficulty-calibrated prompting

The prompt does not just pass the difficulty as a label. Each level has its own instruction set. Foundational cases establish context clearly and ask direct analytical questions. Advanced cases embed ambiguity, trade-offs with no clean answer, and qualitative data requiring interpretation. Difficulty is baked into the prompt logic, not left for the model to infer.
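
In code, that can look like a lookup from difficulty level to an instruction set that is spliced into the prompt. The sketch below paraphrases the Foundational and Advanced instructions from the description above. "Intermediate" is an assumed name for the middle level and its instruction is a placeholder, and `buildPrompt` and `CaseConfig` are hypothetical.

```typescript
// "Foundational" and "Advanced" are named in the text. "Intermediate" is an
// assumed name for the middle level, and its instruction is a placeholder.
type Difficulty = "Foundational" | "Intermediate" | "Advanced";

const DIFFICULTY_INSTRUCTIONS: Record<Difficulty, string[]> = {
  Foundational: [
    "Establish the business context clearly.",
    "Ask direct analytical questions.",
  ],
  Intermediate: [
    "Placeholder: mix clear context with some open-ended questions.",
  ],
  Advanced: [
    "Embed ambiguity in the scenario.",
    "Include trade-offs with no clean answer.",
    "Provide qualitative data that requires interpretation.",
  ],
};

// Hypothetical configuration object covering the four axes plus difficulty.
interface CaseConfig {
  businessFunction: string;
  industry: string;
  tensionType: string;
  region: string;
  difficulty: Difficulty;
}

// Build the prompt with the level's own instructions, not just its label.
function buildPrompt(config: CaseConfig): string {
  const rules = DIFFICULTY_INSTRUCTIONS[config.difficulty]
    .map((rule) => `- ${rule}`)
    .join("\n");
  return [
    `Write a ${config.businessFunction} case in the ${config.industry} industry,`,
    `set in ${config.region}, centered on ${config.tensionType}.`,
    "Follow these difficulty-specific instructions:",
    rules,
  ].join("\n");
}
```

Typing the lookup as `Record<Difficulty, string[]>` means adding a new level without an instruction set fails at compile time.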

The broader lesson is how to think about AI in a product. The model is one stage in a pipeline, not the whole product. Reliability comes from building structure around the model, not from hoping the model behaves consistently on its own.

06 · Product decisions

What I cut, and why.

● Cut · A1

Leaderboards and peer score comparison

An obvious social feature. Would have added stickiness. I cut it because it changes the incentive in a way that undermines the core purpose. The product is a practice tool. Optimizing for a visible score relative to other people is not the same as optimizing for getting better at case interviews. The two can look similar and diverge sharply. I wanted CaseAtlas to stay in the service of improvement, not performance.

● Cut · A2

A pre-built case library

The alternative to on-demand generation is curating a library of good cases. I considered it and dropped it immediately. A library turns CaseAtlas into the same kind of product it was built to replace: finite, static, stale. Configurability is the differentiator. As soon as you have a fixed library, you're competing on catalog size against platforms that have been building theirs for years.

● Cut · A3

A paid tier from day one

At several points the obvious move was to gate something behind a subscription: evaluations, advanced cases, account creation. I did not, and I would make the same call again. The product was built because expensive tools were the problem. Introducing a paywall before establishing that the product is genuinely better would have been inconsistent with the reason it exists. Free at the point of use is not a business-model decision at this stage. It is a product principle.

What surprised me

The framework panel turned out to matter more than I expected. I added it as a lightweight reference, something users could glance at without leaving the page. What I found was that it changed how people engaged with the case. With frameworks visible, reading became active rather than passive. Users started mapping case content against frameworks rather than just reading through. A small addition that changed the quality of the practice session.

07 · Distribution & adoption

Land on the page, click Sign in with Google, generate a case.

CaseAtlas is a web app at caseatlas.org. No install step, no platform fee, no native app. Users go to the site, create an account with email or Google, and start generating cases.

That simplicity was deliberate. The people I built this for are comfortable opening a browser. A web app with Google Sign-In removes every barrier between having the idea to practice and actually practicing. The onboarding funnel is three steps.

The early adopters came from my immediate network: consultants and MBA students actively prepping or staying sharp. They didn't need convincing. They had experienced the same frustrations, and a free, configurable alternative was an easy switch.

08 · Iteration from real feedback

v2 was written by early users.

Each feature below traces back to a moment where someone hit the edge of v1 and told me what came next.

Observation · returning users · v1 had no persistence
"Where are my old cases?"

Accounts & case history

Nobody asked for "user accounts" as a feature. They asked to find their cases again. Accounts were the implementation, not the request. Same with archive and soft-delete: people wanted a sane way to manage a growing list without losing anything by mistake.

Observation · users reading cases without a feedback loop
"I answered the questions. Were the answers any good?"

Answer evaluation, six-dimension scoring

The discussion questions were useful. The feedback loop was missing. Evaluation closed the gap. The six-dimension scoring came from working backwards from what a real case interviewer actually evaluates — not what's easy to score.

Observation · users with multiple evaluation sessions
"Am I actually getting better?"

Score progression & radar chart

A single score tells you how you did. A chart across sessions tells you whether the practice is working. The radar surfaces where you're improving and where you've plateaued.

09 · Product in action

Configurator, case, evaluation.


The configurator: pick business function, industry, tension type, region, and difficulty. The case is generated on the spot.
The case: company background, strategic challenge, stakeholder perspectives, supporting data, and three to five discussion questions. Frameworks panel on the side.
The evaluation: six-dimension scoring, per-question feedback, and a radar across structured thinking, quant, judgment, clarity, creativity, and synthesis.

10 · Outcome

Built solo. Live. Used.

~3 wks · v1 to working product
30k+ · Case combinations
6 · Evaluation dimensions
$0 · Paywall, by design

v2 features were scoped and shipped based on direct feedback from the first group of users. The pipeline architecture — schema validation, retry logic, calibrated prompting — generalizes to any AI feature that needs to be reliable rather than impressive. Built and shipped solo across product decisions, infrastructure, AI pipeline, frontend, auth, admin tooling, and analytics.

11 · Lessons learned

What this taught me about shipping AI products.

Free is a product decision, not just a pricing decision

Choosing not to charge is a statement about what the product is for. Here, expensive tools were the problem being solved. The product had to be free to be coherent. That constraint shaped every other decision.

Treat AI as a pipeline, not a feature

One-shot prompts to an LLM are a prototype. A production AI feature is a system: inputs validated, outputs structured, failures handled, quality enforced. The difference between a demo and a product is what happens when the model returns something unexpected.

Let real usage rewrite the roadmap

I had a backlog. My early users had a different one. The features that made the biggest difference in v2 were not on my original list. They emerged from watching people use the product and listening to what they said when it ran out of road.

Scope discipline is how you ship

CaseAtlas doesn't have a leaderboard, a case library, or a monetization layer. Those were all reasonable ideas. Cutting them is what made it possible to ship a product that does its one job well instead of a half-built product that does four things poorly.

12 · PM skills demonstrated

A solo product. Full-stack muscles.

Problem identification: Recognized a structural gap in existing tools. Not just personal frustration, but shared pain across a specific user segment.
User research: Observed real prep behavior in a known user group; mapped workflows and failure points before writing code.
Scope discipline: Cut leaderboard, case library, and paid-tier gating to stay focused on the core use case.
AI product thinking: Treated the LLM as one stage in a reliability pipeline, not the whole product.
Prioritization: Shipped a narrow v1 to validate core value; deferred everything else until real usage informed what came next.
Feedback loops: v2 feature set came directly from early sessions. Accounts, evaluation, and progress tracking were all user-surfaced.
Rapid execution: Solo full-stack build: product, infrastructure, AI pipeline, auth, admin tooling.
Technical judgment: Schema validation, retry logic, and calibrated prompting as the answer to AI non-determinism.
Product coherence: Every decision (free pricing, no paywall, no social layer) serves the same principle: practice should be accessible and honest.
Distribution thinking: Web-first with Google OAuth as the simplest path from intention to first case.

Want to try it? It's free.

Live at caseatlas.org. No install. Sign in with Google, configure a case, practice.

Built with Next.js 15, Supabase, Mistral, and Vercel. Designed around one principle: preparation for high-stakes work should not be gated by the price of a subscription.