Raj Kumar Mandal — Product Designer

Deep Dive — The Full Story

Section 1: Overview

Craft is an AI-powered presentation engine that automates the creation of brand-compliant, personalized sales decks for banks and insurance companies. I was the sole designer — involved in every decision from project scoping and entity modeling to wireframing and detailed UI. The core design challenge was decomposing a deeply unstructured, expert-driven content creation process into a programmable system without losing the nuance that makes content good.

The domain was unforgiving: financial services content carries regulatory weight, brand sensitivity, and audience specificity all at once. There was no clean playbook for turning that into a repeatable AI workflow. That meant starting from first principles — mapping the expert process, identifying what could be structured, and designing the system architecture before touching a single screen.

Section 2: The Linear Scaling Trap

SharpSell serves nearly 100% of Indian banks and insurance companies. The platform's flagship value wasn't just software — it was a service: custom, brand-compliant sales presentations, built to spec for each client's products and audience. Every deck was handcrafted by our internal design team of 13 people. That was the business model, and for a while, it worked.

Then it stopped working.

The math that didn't work

The demand side was growing steadily: 40–50 new presentation requests per month, driven by existing banks launching new products and a continuous flow of newly onboarded clients. On the supply side, only 3–4 designers out of 13 could be allocated to deck work at any given time — the rest handled other product and marketing tasks. Each designer could manage roughly 6–7 decks simultaneously, factoring in the approval cycles and back-and-forth with account managers. That gave us a real capacity of around 20–25 decks per month — against a demand of 40–50.

The result: a 30-day backlog for personalized decks. Deals were going cold before pitches could even happen. And the bottleneck wasn't slow designers. It was the manual communication loop itself. Every review cycle added ~48 hours. Multiply that by four or five rounds per deck, and you have a structural problem that no amount of hiring could sustainably fix.
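The arithmetic behind these figures can be checked in a few lines. This is a back-of-the-envelope sketch using only the numbers quoted above; treating "decks simultaneously" as roughly "decks per month" is an assumption made for the estimate.

```python
# Figures quoted in the text; variable names are illustrative.
designers = (3, 4)      # designers available for deck work (low, high)
decks_each = (6, 7)     # decks one designer can carry at a time
demand = (40, 50)       # new presentation requests per month

# Raw capacity range: lowest and highest product of the two ranges.
capacity = (designers[0] * decks_each[0], designers[1] * decks_each[1])

# Monthly shortfall: best case (low demand, high capacity) to worst case.
shortfall = (demand[0] - capacity[1], demand[1] - capacity[0])

review_hours = 48       # approximate delay added by each review cycle
rounds = (4, 5)         # review rounds per deck
review_days = tuple(r * review_hours / 24 for r in rounds)

print(capacity)      # (18, 28)
print(shortfall)     # (12, 32)
print(review_days)   # (8.0, 10.0)
```

The raw product of 18 to 28 decks brackets the quoted capacity of 20 to 25, and the review loop alone accounts for roughly 8 to 10 days of waiting per deck.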

What Craft was NOT designed to do

Before going further, three explicit non-goals:

Not a visual design tool. Craft doesn't generate graphics, illustrations, or layout compositions from scratch. It works within existing templates.
Not a replacement for the design team's template work. Designers still create and own the master templates. Craft automates what happens after the template exists.
Not fully autonomous. Human review gates are built into the workflow by design — not as a concession, but as a deliberate product decision.

The strategic reframe

"We weren't building a presentation tool. We were building a Design Automation Protocol. The goal wasn't to help designers draw faster — it was to remove the need for designers to 'draw' at all for 80% of the work, moving them into the role of Template Architects."

That reframe changed everything about how we scoped the system.

The Scalability Paradox

The hardest constraint wasn't technical — it was conceptual. The system had to be strict enough to be trusted by compliance and brand teams: HDFC Blue is not Federal Bank Blue, and a wrong legal disclaimer doesn't just look bad — it can trigger regulatory fines. At the same time, the system had to be flexible enough to handle wildly varying content lengths, product categories, industries, and presentation purposes.

Strict enough to be safe. Flexible enough to be useful. These two demands pull in opposite directions, and every architectural decision in Craft — the entity model, the gate system, the edge case handling — was shaped by this tension.

Stakeholder tensions

Stakeholder	Pain	Need
Design team (13 people)	Burnout from repetitive pixel-pushing	An assistant, not a replacement
Product/Eng head	High headcount cost, technical debt	Deterministic AI — same input, same output
Bank sales reps	Losing deals waiting 30 days	Speed — pitch-ready deck in 60 seconds
Compliance officers	Wrong font/logo/disclaimer = fines	A system that cannot break brand rules
IT/Security heads	Data privacy concerns	Secure handling, human-in-the-loop

Section 3: Solo Designer, Every Decision

As the sole product designer on Craft, I was embedded in every layer of the work — from scoping what the system should and shouldn't do, to modeling entity relationships, to designing the detailed UI that content writers and admins would actually use. There was no handoff between "strategy designer" and "UI designer." No research team feeding me insights and no systems designer handing me an architecture. I was in the room for all of it, working directly with the PM.

What I owned

Product scoping — defining what Craft solves, what it doesn't, and where the system boundaries sit
Stakeholder and user research — understanding the content writers filling decks, the admins managing configurations, and the graphic design team whose manual work we were systematizing
Entity modeling — defining the object-oriented data architecture: Presentation Types, Templates, Layouts, Content Structures, Elements, and their interdependencies
System architecture — the 3-layer system (Governance, Structure, Matching) and the 7-step state machine workflow that drives every job from brief to output
Content co-pilot design — the AI-assisted experience for content writers: the step-by-step flow, the prompts, the guidance, the inline review checkpoints
Job flow design — every step, gate, edge case, and human-in-the-loop checkpoint across the full job lifecycle
Admin experience — how admins configure brand DNA, upload and manage templates, and set up Presentation Type configs
Detailed UI — high-fidelity screens across all of the above

How I worked

The PM and I operated as a tight loop. We'd align on a problem, I'd go deep on the system design — entity relationships, decision logic, state transitions — and bring back a structured proposal. We'd stress-test it together, poke at edge cases, challenge the model. Then I'd move into detailed design. This rhythm repeated throughout the project. There was no "handoff" phase; the design and the thinking evolved in parallel.

OOUX as a practice

Object-oriented UX is part of SharpSell's design DNA — it's how we approach complex product problems. Craft was the most demanding system I'd applied it to. The sheer number of entities, the layered interdependencies between them, and the overarching constraint that the system had to be simultaneously strict (brand compliance, regulatory accuracy) and flexible (varying content types, industries, presentation lengths) pushed my OOUX practice further than any previous project. Defining the objects before the screens wasn't just a method — it was the only way the system made sense.

Section 4: Immersing in the Content Machine

Before I touched any interface design, I needed to understand how the content and design team actually worked. Not at the process level — I already had a rough map of the steps — but at the decision level. What were the dozens of micro-judgments being made at each stage that never showed up in any brief or handoff doc?

The critical insight came early: the product team understood the process the content team followed, but we didn't understand the decisions embedded inside it. Which content to prioritize when the raw material is too long. When to expand a sparse brief vs. flag it back to the account manager. How to structure sections for a private bank launching a savings product vs. an insurance company selling term life. When the content already follows an established pattern vs. when the presentation calls for a different structure. When to push back on a client brief and when to work with it.

This gap — between knowing the process and knowing the decisions within the process — would become the central design challenge. And ultimately, it's the reason we paused (more on that later).

The content creation process I mapped

1. Get the client brief: Is the brief specific or generic? Does the client have strong opinions about structure or tone? Is this a first-time client or a repeat with established preferences?
2. Get raw content: Is there raw content at all, or does the team need to generate from scratch? Is there too much, or too little? Can sections be expanded, or must the team stay close to source material? Which features get prioritized when there's a conflict?
3. Structure the content: How technical is the product? What structure works best for this industry and audience? When should the team follow a well-worn pattern, and when does the content demand something different?
4. Write the copy: Section-specific tone and language guidelines, deduplication rules, reading level calibration, message hierarchy within each slide.
5. Review cycles: How detailed is the feedback? Is it additive (build on what's there) or an overwrite? At what point does the team push back on a direction rather than execute it?

The key realization

Each of these steps contained multiple hidden decision paths that branched based on context — the client, the content type, the industry, the purpose of the presentation. This wasn't a linear pipeline. It was a decision tree with years of domain expertise baked into every branch.

The design challenge became precise: how do you encode tacit expert knowledge into a programmable system without flattening it into rules so rigid they break on contact with real content?

Section 5: Thinking in Objects — The Entity Architecture

Before I opened Figma, I opened a blank Notion table.

That's not a designer thing to say. But it's the truest description of how Craft got built. Because the most important design work I did on this project wasn't visual — it was conceptual. It was figuring out what things exist in this system, what they're made of, and how they relate to each other. Get that wrong, and no amount of polished UI saves you. Get it right, and the screens almost design themselves.

This is the core of object-oriented UX: define the objects before defining the experience. For Craft, the entity model wasn't a deliverable at the end of research. It was the foundation everything else was built on.

Why the entity model had to come first

Most presentation tools treat a slide as a single thing. You've got a canvas, you put stuff on it, done. That works when a human designer makes every judgment call in real time. It completely falls apart when you're trying to make a machine do it deterministically — across hundreds of clients, dozens of product types, and strict brand guardrails.

For Craft to work as an automated system, I needed to decompose "a slide" into its constituent parts: what's the structural logic here? What's the visual logic? What's the content logic? These three things look like one thing when a human designer does them simultaneously. They are three completely different concerns when a system has to handle them independently.

That decomposition led me to five core entities.

The five objects

1. Presentation Type (P-Type) — the DNA

The P-Type is the brain of the system. It doesn't contain content. It doesn't contain design. It contains instructions — a specification for what kind of presentation this is and how it should be built. A "Product Brochure for Savings Accounts" is a different P-Type than "Investor Update" or "Learning Flashcard Deck." Each has different rules for how a brief should be structured, what sections are expected, and how copy should be written for each section.

A P-Type carries: the brief property schema (the questions to ask about this specific presentation type), the outline structure (which sections are mandatory, which are optional), section-level content guidelines (voice, density, what to emphasize), and content handling heuristics (how to deal with sparse or overly long source material).

Think of the P-Type as the DNA. It doesn't create the presentation — it tells the system how to create it. Every job that runs through Craft begins with a P-Type, and nothing that happens downstream is possible without that specification already being in place.

2. Content Structure — the skeleton

If the P-Type is the DNA, the Content Structure is the skeleton of a single slide. It defines what can exist on that page — which elements are present, how many of each, and which ones are required. No styling, no color, no layout choices. Pure structure.

A Content Structure belongs to exactly one P-Type. And a P-Type has a list of Content Structures — one for each slide type the presentation needs. Think of it like the HTML of a website: the bones of the page, stripped of everything visual. A Content Structure might say: "this slide has one headline, two to four bullet points, and an optional image." That's the contract. Everything downstream has to honor it.

3. Template — the visual identity container

The Template is where brand lives. It's a coherent collection of layouts sharing a single design system — colors, typography, spacing, component styling. A Template for HDFC Bank looks entirely different from one for SBI Life, but both are Templates in the same sense: a set of visual decisions that have been made, packaged, and locked in.

The Template's relationship to P-Types is many-to-many. One Template can be applied to multiple P-Types (the same brand's design can support both a product brochure and a training deck). One P-Type can have multiple Templates applied to it (the same presentation type can be styled for different clients). This flexibility is deliberate — it's what makes the system scalable across clients without rebuilding from scratch for each one.

4. Layout — the visual realization

The Layout is a single, fully designed slide. It's the HTML and CSS: the Content Structure made visual. Every Layout belongs to exactly one Template, and it's built to realize one specific Content Structure — taking that structural contract and applying design on top of it.

But Layouts carry an additional attribute that becomes critically important later: capacity. A Layout knows what it can hold. "This layout fits one headline, three bullets, and one image." That number is real and binding. It's not a soft guideline — the system uses it to validate whether the content actually fits before ever generating an output.

5. Element — the atomic unit

Elements are the smallest thing that exists in the system. A heading. A bullet point. A subtitle. An image placeholder. A data label. Elements live inside Content Structures (as structural definitions) and inside Layouts (as designed components). They're the currency that flows between structure and visual — the thing that makes the structural contract and the visual realization talk to each other.

How the objects relate

P-Type <----many-to-many----> Template
  |                              |
  | has many                     | has many
  v                              v
Content Structure              Layout
  |                              |
  | has many                     | has many
  v                              v
Element                        Element

The relationship map looks simple. The implications aren't. The most important constraint in the entire system lives at the intersection of Content Structure and Layout: when a Template is mapped to a P-Type, every Content Structure in that P-Type must have at least one matching Layout in the Template. No Content Structure can be left without a visual counterpart. No structural promise can go unfulfilled. This validation rule is what guarantees that a presentation can actually be generated — that the system will never reach a slide and find nothing to put it in.

Some Layouts can exist in a Template without mapping to any Content Structure. That's fine — they're available but unused. But the reverse is never allowed. An orphaned Content Structure is a broken promise the system cannot keep.
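The mapping rule above can be sketched as a small check. This is a minimal illustration under stated assumptions, not Craft's actual data model: the class names mirror the entities described here, but the field names (`realizes`, `content_structures`) and the function `orphaned_structures` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContentStructure:
    name: str  # e.g. "feature_list"


@dataclass
class Layout:
    name: str
    realizes: Optional[str] = None  # Content Structure it realizes, if any


@dataclass
class Template:
    name: str
    layouts: List[Layout] = field(default_factory=list)


@dataclass
class PType:
    name: str
    content_structures: List[ContentStructure] = field(default_factory=list)


def orphaned_structures(p_type: PType, template: Template) -> List[str]:
    """Return Content Structures with no matching Layout in the template."""
    realized = {layout.realizes for layout in template.layouts if layout.realizes}
    return [cs.name for cs in p_type.content_structures if cs.name not in realized]


brochure = PType("Product Brochure",
                 [ContentStructure("cover"), ContentStructure("feature_list")])
bank_template = Template("Bank A",
                         [Layout("cover_v1", "cover"), Layout("spare_divider")])

# The unused "spare_divider" layout is allowed; the orphaned structure is not.
print(orphaned_structures(brochure, bank_template))  # ['feature_list']
```

A mapping would only be accepted when this list is empty, which is the asymmetry described above: unused Layouts pass, orphaned Content Structures do not.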

The edge case that tested the model

Here's where the architecture faced its first real stress test. What happens when a Content Structure specifies up to five bullet points — and the best available Layout in the matched Template only has capacity for four?

The naive answer is: scale it down. Trim the fifth bullet, pick the closest layout, and ship it. That's what a generic AI presentation tool would do. But in a brand-compliance context, that answer is wrong. Silently hiding content, or quietly choosing a layout that doesn't match the structural spec, is worse than surfacing the problem. A compliance officer who discovers their approved content was truncated by the system isn't going to trust that system again.

So we chose explicit error reporting over silent adaptation. The system surfaces all unmapped or capacity-mismatched items at the end of the generation process. The designer downloads a CSV that details the exact issues: which Content Structures have no matching Layout, which ones exceeded capacity, and what the gap is. They fix the mismatches — either adjusting the content spec, building a new Layout with the right capacity, or updating the mapping — and re-upload. The next run is clean.
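A rough sketch of what such a report could look like follows. The column names, issue labels, and input shapes are assumptions for illustration; the source only states that the CSV lists unmapped structures, capacity overruns, and the size of the gap.

```python
import csv
import io


def capacity_issues(structures, layouts):
    """Compare each structure's maximum element counts with its layout's capacity."""
    rows = []
    for name, needs in structures.items():
        layout = layouts.get(name)
        if layout is None:
            rows.append({"structure": name, "issue": "no_matching_layout",
                         "element": "", "required": "", "capacity": "", "gap": ""})
            continue
        for element, required in needs.items():
            capacity = layout.get(element, 0)
            if required > capacity:
                rows.append({"structure": name, "issue": "capacity_exceeded",
                             "element": element, "required": required,
                             "capacity": capacity, "gap": required - capacity})
    return rows


def to_csv(rows):
    """Serialize the issue rows as CSV text, reporting every issue."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["structure", "issue", "element", "required", "capacity", "gap"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical inputs: the five-bullet structure meets a four-bullet layout.
structures = {"feature_list": {"headline": 1, "bullet": 5}, "testimonial": {"quote": 1}}
layouts = {"feature_list": {"headline": 1, "bullet": 4}}

issues = capacity_issues(structures, layouts)
print(to_csv(issues))
```

Nothing is trimmed or substituted here: the fifth bullet shows up as a row with a gap of 1, and the structure without any layout is reported rather than skipped.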

This decision is a design philosophy, not just an edge case handler. Determinism requires honesty. A system that makes invisible decisions erodes trust faster than one that asks you to fix something.

Why this architecture is what makes Craft a product

A generic AI presentation builder takes content and throws it at slides. It makes guesses. It adapts on the fly. Sometimes the output is fine; sometimes it's subtly wrong in ways that only an expert would catch — and in regulated industries, "subtly wrong" can mean a legal problem.

Craft's entity architecture removes that category of error entirely. Structure is separated from style. Capacity is explicit. Validation happens before generation. The same P-Type, the same Template, the same content — you get the same output. Every time. No structural hallucinations. No layout guesses. No silent omissions.

This is the difference between a tool that assists and a system that can be trusted. The entity model is where that trust is built — not in the UI, not in the prompts, not in the AI model. In the objects, their properties, and the contracts between them.

Defining these five entities — and the rules that govern their relationships — was the most consequential design decision on the project. Everything else followed from it.

Section 6: Three Layers — Governance, Structure, Matching

The entity model defines what exists. The system architecture defines how it all works together. Once I had the five objects and their relationships mapped, the next question was: how does this actually run? What keeps brand DNA from bleeding into content logic? What stops a template change from unraveling a presentation type that was working?

The answer was deliberate separation. I designed Craft as three distinct layers, each with a clear and bounded responsibility. This separation wasn't just conceptual cleanliness — it had a practical payoff. Each layer could evolve independently. You could change how brand rules are managed without touching how content gets matched to layouts. You could update a template's visual style without rewriting a P-Type's section logic. Decoupling at the architecture level is what makes the system maintainable as it grows.

Layer 1: Governance — The Source of Truth

Governance is where brand DNA lives. It's the admin layer — the place where the rules that define a client's identity are captured in a form the system can actually consume and enforce downstream.

A brand admin configures:

HEX codes, font files, and logo safe zones
Company dos and don'ts — text tone, visual style rules, what to avoid
Image style guidelines — when to use icons vs. illustrations vs. photos
Legal disclaimer requirements specific to each client or product category

The design challenge here was fundamental: how do you create an interface for admins to "dump" brand data in a way the system can actually reason about? Unstructured brand guidelines — the kind that live in PDF brand books — had to become structured, queryable properties. I designed the Governance layer as a property-based system: every brand rule is a discrete, typed input, not a free-text document the AI tries to interpret at generation time. The difference matters enormously. An AI that interprets a loose brand doc mid-run is a liability. A system that checks a brand rule against a typed property is deterministic.
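To make the contrast concrete, here is a minimal sketch of brand rules as typed properties checked by plain code. The `BrandRules` fields, the slide dictionary shape, and all sample values are hypothetical, chosen only to illustrate the idea of a deterministic check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrandRules:
    allowed_colors: frozenset      # HEX strings
    allowed_fonts: frozenset       # font family names
    required_disclaimer: str       # text that must appear in the footer


def brand_violations(slide: dict, rules: BrandRules) -> list:
    """Check a slide against typed rules; no interpretation, only comparison."""
    problems = []
    for color in sorted(set(slide["colors"]) - rules.allowed_colors):
        problems.append(f"color not in palette: {color}")
    for font in sorted(set(slide["fonts"]) - rules.allowed_fonts):
        problems.append(f"font not allowed: {font}")
    if rules.required_disclaimer not in slide["footer"]:
        problems.append("required disclaimer missing")
    return problems


# Illustrative values only, not any real client's brand data.
rules = BrandRules(
    allowed_colors=frozenset({"#003366", "#FFFFFF"}),
    allowed_fonts=frozenset({"Inter"}),
    required_disclaimer="Terms and conditions apply.",
)
slide = {"colors": ["#003366", "#FF0000"], "fonts": ["Inter"],
         "footer": "Learn more at your branch."}

print(brand_violations(slide, rules))
# ['color not in palette: #FF0000', 'required disclaimer missing']
```

Because every rule is a typed value rather than a paragraph in a brand book, the same slide always yields the same list of violations.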

Layer 2: Structure — The Ingestion Engine

Structure is where presentations get deconstructed into programmable parts. This layer is where P-Types, Content Structures, Templates, and Layouts get defined and configured — the domain of the template architect, not the content writer.

Three things happen at this layer:

P-Types define the skeleton — what sections are expected, what brief properties apply, what rules govern each section
Content Structures define what goes on each slide semantically — "this isn't just text, it's a Headline Type A with three supporting bullets and an optional image"
Templates and Layouts define the visual containers with explicit capacity attributes

The design challenge was a mental model shift. Designers coming into this layer had to stop thinking "I'm designing a slide" and start thinking "I'm designing a reusable container with defined capacities." That's a different cognitive frame entirely. The upload and configuration flow had to be intuitive enough that designers didn't need to think in code — but structured enough that the system could reason about what fits where. The two demands are in tension, and almost every UI decision in the Structure layer was shaped by that tension.

Layer 3: Matching — The Logic Gate

Matching is the engine. It's where content meets structure meets design — and where the actual generation happens.

Three things make up the Matching layer:

A rule engine that runs 1:1 validation: every page in a P-Type must find a matching Layout in the Template before generation proceeds
Brand rules applied as constraints, not suggestions — the Governance layer's properties surface here as hard checks
Conditional logic: IF page elements == template placeholders AND brand rules == applied THEN generate slide

The design challenge was failure handling. What happens when matching fails? I designed failure states to be actionable, not just error messages. If nine out of ten pages map successfully, the system doesn't halt entirely — it flags the one that failed and surfaces exactly why. The designer fixes only that exception. This is where the 70% time saving actually lives. The system handles the repetitive successful work; humans handle the edge cases.
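The conditional rule and the "flag the one that failed" behavior can be sketched together. This is an assumption-laden illustration: the page dictionary shape, the `brand_ok` flag, and the failure messages are invented for the example.

```python
def match_pages(pages, placeholders):
    """Return (generated page ids, failures) without halting on the first failure."""
    generated, failures = [], []
    for page in pages:
        slots = placeholders.get(page["structure"])
        if slots is None:
            failures.append((page["id"], "no layout for this structure"))
        elif sorted(page["elements"]) != sorted(slots):
            failures.append((page["id"], "elements do not match placeholders"))
        elif not page["brand_ok"]:
            failures.append((page["id"], "brand rules not applied"))
        else:
            # IF page elements == template placeholders AND brand rules applied
            generated.append(page["id"])
    return generated, failures


# Nine pages that fit the layout, and one with an extra bullet.
pages = [
    {"id": i, "structure": "bullets",
     "elements": ["headline", "bullet", "bullet"], "brand_ok": True}
    for i in range(1, 10)
]
pages.append({"id": 10, "structure": "bullets",
              "elements": ["headline", "bullet", "bullet", "bullet"], "brand_ok": True})
placeholders = {"bullets": ["headline", "bullet", "bullet"]}

generated, failures = match_pages(pages, placeholders)
print(len(generated), failures)
# 9 [(10, 'elements do not match placeholders')]
```

Each failure carries its page and its reason, so the designer's work is scoped to the exception rather than the whole deck.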

The data journey across all three layers

Ingestion — Admin uploads a presentation type; the system decomposes it into atomic components: heading, pointer, image, disclaimer
Standardization — Designer uploads templates; the system defines capacity and spacing rules for every layout
Matching — Engine runs conditional checks across all three layers simultaneously
Human loop — Designer reviews the result, intervenes only where matching failed

The layers don't just separate concerns; they separate accountability. Governance issues belong to brand admins. Structure issues belong to template designers. Matching failures belong to whoever configured the P-Type. When something breaks, the system knows where to look.

Figure: 3-layer system architecture

Section 7: Designing the Pipeline — A State Machine with Human Gates

The three layers define the system's architecture — what exists and how it's organized. The pipeline defines how work moves through that architecture. And the most important design decision in the pipeline wasn't about AI capability or interface layout. It was about the fundamental shape of the workflow itself.

A wizard assumes linear progression. Step 1, then Step 2, then Step 3. That model breaks immediately when applied to content creation, because content creation is inherently iterative. Writers change their minds about sectioning after they see the generated copy. They add a slide mid-flow. They revise a brief property after reading the coverage analysis. A wizard can't accommodate any of that without forcing a full restart.

So I designed the pipeline as a state machine.

The state object

Every job in Craft carries a single Presentation State object. It accumulates data as the job moves through the pipeline — brief properties, raw content metadata, sectioning decisions, generated copy, lint flags. Each step reads from the state object, writes to it, and sets its own completion status. Any step can be re-run independently without invalidating the steps that came before or after. This is what allows a writer to revise sectioning, regenerate a single slide's copy, and continue — without restarting from the beginning.

The state machine gives us two things that a simple wizard cannot: predictability (same inputs, same outputs, deterministic behavior) and control (humans stay in the loop at exactly the moments that matter).
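A minimal sketch of such a state object is shown below. The step identifiers and the `run` method are assumptions; the source specifies only that each step reads the state, writes to it, sets its own completion status, and can be re-run on its own.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    DONE = "done"


# Hypothetical identifiers for the seven pipeline steps.
STEPS = ["p_type_selection", "brief_extraction", "coverage_analysis",
         "sectioning", "ai_review", "copy_generation", "copy_editing"]


@dataclass
class PresentationState:
    data: dict = field(default_factory=dict)
    status: dict = field(default_factory=lambda: {s: Status.PENDING for s in STEPS})

    def run(self, step, handler):
        """Run one step: it reads the state, writes its result, marks itself done."""
        if step not in self.status:
            raise KeyError(f"unknown step: {step}")
        self.data[step] = handler(self.data)
        self.status[step] = Status.DONE


state = PresentationState()
state.run("sectioning", lambda data: ["intro", "features"])
state.run("copy_generation",
          lambda data: {s: f"copy for {s}" for s in data["sectioning"]})

# Revising sectioning touches only that step's own status and data.
state.run("sectioning", lambda data: ["intro", "features", "pricing"])
print(state.status["copy_generation"])  # Status.DONE
```

Unlike a wizard, nothing here encodes an order of screens: the job is the accumulated state, and any step can be revisited against it.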

The 7 steps

Step 1: P-Type Selection (Manual)

The writer selects which presentation type this job is. No AI involvement — this is a deliberate choice, not an inference. The selection loads the entire knowledge base configuration for that P-Type: brief schema, section outline, content guidelines, handling heuristics. Everything downstream depends on this selection being correct, which is why it stays in human hands.

Step 2: Brief Extraction (AI + Human)

The system takes the master brief and any raw content provided, then extracts structured brief properties: audience, key message, tone, reading level, distribution channel. The LLM infers what it can from the available inputs; the writer fills gaps and overrides anything that landed wrong.

Design decision: some brief properties are deterministic from explicit user input — page count, font preferences, content handling mode — and bypass the LLM entirely. If a writer sets something in advanced settings, the system locks it. AI cannot override explicit human choices. This is a principle, not just a rule.
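The locking principle can be illustrated in a few lines. The function names and the record shape (`value`, `source`, `locked`) are hypothetical; the point is only that an explicit writer value is never weighed against an AI inference.

```python
def resolve_brief(inferred: dict, explicit: dict) -> dict:
    """Merge brief properties; explicit writer choices always win and are locked."""
    resolved = {
        key: {"value": value, "source": "ai", "locked": False}
        for key, value in inferred.items()
    }
    for key, value in explicit.items():
        resolved[key] = {"value": value, "source": "writer", "locked": True}
    return resolved


def apply_ai_suggestion(resolved: dict, key: str, value) -> bool:
    """Apply an AI suggestion unless the writer has locked the property."""
    if resolved.get(key, {}).get("locked"):
        return False
    resolved[key] = {"value": value, "source": "ai", "locked": False}
    return True


resolved = resolve_brief(
    inferred={"audience": "retail customers", "page_count": 12},
    explicit={"page_count": 8},   # set by the writer in advanced settings
)
accepted = apply_ai_suggestion(resolved, "page_count", 12)
print(accepted, resolved["page_count"]["value"])  # False 8
```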

Step 3: Raw Content Meta + Coverage Analysis (AI)

Before organizing anything, the system analyzes what's actually in the raw content. It extracts metadata — topics covered, key features, product specs, claims — and maps it against the P-Type outline to produce a coverage report: which sections are well-covered, which are partial, and which are missing entirely. No writing happens at this step. It's purely analytical — telling the writer what they have to work with before anyone starts organizing it.

Edge case: what if the raw content is too sparse? Too dense? The system flags both conditions explicitly and lets the writer decide how to proceed, rather than making silent assumptions about how to compensate.

Step 4: Raw Content Sectioning (AI + Human)

This is where actual organization happens. The system takes the raw text and groups it into sections — no rewriting, no inference about meaning, just physical sorting. Step 3 revealed the coverage landscape; Step 4 does the arranging.

The UI is a tree view where writers can drag content between sections, add or delete lines, and rename subsections. The AI does the first pass; the human refines it. The interaction is deliberately collaborative rather than autonomous — the AI's sectioning is a strong first draft, not a final answer.

Step 5: AI Review and Fix (Optional)

An extra AI pass for high-value jobs — the system reviews the sectioned content, moves misassigned items, removes duplicates, and ensures critical features appear in the correct sections. Every change is tracked and surfaced for writer review.

Hard constraint: the AI at this step cannot invent new factual content. It reorganizes only. This constraint is explicit in the system design, not just in the prompt engineering.

Step 6: Copy Generation (AI)

The system generates slide-ready copy for each section, following the section-specific guidelines baked into the P-Type. Truth-sensitive sections — anything involving numbers, claims, or product specifications — are restricted to the raw content only. If data is missing, the system marks it as a placeholder explicitly, rather than hallucinating a plausible-sounding substitute.
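The placeholder behavior can be sketched with simple slot filling. This stands in for the idea only: the `{field}` slot syntax, the `[MISSING: ...]` marker, and the sample sentence are assumptions, and the real step involves an LLM rather than a regex.

```python
import re


def fill_truth_sensitive(template: str, facts: dict) -> str:
    """Fill {field} slots from raw-content facts; mark missing ones explicitly."""
    def substitute(match):
        key = match.group(1)
        if key in facts:
            return str(facts[key])
        return f"[MISSING: {key}]"   # never invent a plausible value
    return re.sub(r"\{(\w+)\}", substitute, template)


line = fill_truth_sensitive(
    "Earn {rate} interest with a minimum balance of {min_balance}.",
    {"rate": "6.5%"},   # illustrative value; only facts found in raw content go here
)
print(line)
# Earn 6.5% interest with a minimum balance of [MISSING: min_balance].
```

A visible marker is easy for a reviewer to catch, whereas an invented number would read as fact.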

Step 7: Copy Editor + Human Approval (AI + Human)

A lint pass runs over the generated copy: casing, punctuation, formatting consistency. The AI flags issues but does not auto-fix. The writer makes every final call. This step also serves as the formal human approval gate before the job moves to slide generation.

Hard constraint: the editor cannot move content between sections, drop features, change numbers, or introduce new claims. Its scope is presentation polish, not content judgment.
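One part of this constraint, "cannot change numbers", lends itself to a mechanical check. The sketch below is an assumption about how such a guard might work, and it covers only the numeric part of the rule; moved content, dropped features, or new claims would need separate checks.

```python
import re
from collections import Counter

# Matches integers, decimals, and grouped numbers, with an optional percent sign.
NUMBER = re.compile(r"\d+(?:[.,]\d+)*%?")


def numbers_in(text: str) -> Counter:
    """Count every numeric token so additions and removals are both detected."""
    return Counter(NUMBER.findall(text))


def edit_preserves_numbers(before: str, after: str) -> bool:
    """True when a copy edit left every number in the text untouched."""
    return numbers_in(before) == numbers_in(after)


before = "premium starts at 1,200 per year. covers up to 50% of costs"
polished = "Premium starts at 1,200 per year. Covers up to 50% of costs."
altered = "Premium starts at 1,500 per year. Covers up to 50% of costs."

print(edit_preserves_numbers(before, polished))  # True
print(edit_preserves_numbers(before, altered))   # False
```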

The gate system

Every step has a logic gate. Some are purely technical — does the JSON validation pass? Are all required brief properties populated? Others require explicit human approval — the writer clicks "Approve sectioning" before Step 5 can run, and "Approve copy" before generation begins.

This dual-gate design — technical validation plus human sign-off — is what makes the system both reliable and accountable. Technical gates prevent malformed state from propagating downstream. Human gates prevent the system from proceeding on assumptions the writer hasn't confirmed. Neither alone is sufficient. Together, they create a workflow that can be trusted in a regulated environment.
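The two kinds of gate can be sketched as one check that must pass both ways. The function names, the approval labels, and the state shape are hypothetical; the sketch only shows technical validation and human sign-off being evaluated together.

```python
def technical_gate(state: dict, required: list) -> list:
    """Return the required properties that are missing or empty."""
    return [key for key in required if state.get(key) in (None, "", [])]


def can_run_step(state, required, approvals, needed_approval=None):
    """A step runs only if the state is well-formed AND any human gate is cleared."""
    missing = technical_gate(state, required)
    if missing:
        return False, f"missing properties: {', '.join(missing)}"
    if needed_approval and needed_approval not in approvals:
        return False, f"waiting for human approval: {needed_approval}"
    return True, "ok"


state = {"audience": "retail customers", "key_message": "", "sections": ["intro"]}
required = ["audience", "key_message", "sections"]

print(can_run_step(state, required, approvals=set(), needed_approval="approve_sectioning"))
# (False, 'missing properties: key_message')

state["key_message"] = "Higher returns, zero lock-in"
print(can_run_step(state, required, approvals=set(), needed_approval="approve_sectioning"))
# (False, 'waiting for human approval: approve_sectioning')

print(can_run_step(state, required, approvals={"approve_sectioning"},
                   needed_approval="approve_sectioning"))
# (True, 'ok')
```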

Section 8: Designing the Co-Pilot — AI as Assistant, Not Replacement

Section 7 described the pipeline architecture — what steps exist, how they connect, and what each one does. This section is about something different: how the content writer actually experiences the AI assistance at each step. The architecture is invisible infrastructure. The co-pilot is what the writer sees, talks to, and develops a working relationship with.

The design team's biggest fear, surfaced early in research, was that Craft would replace them. That the system would automate not just the drudge work, but the judgment — and quietly make them redundant. This wasn't an irrational fear. It's the fear that follows every AI workflow tool into a creative organization. And the way you address it isn't with a press release about "human-in-the-loop" — it's by designing every single interaction to reinforce one consistent message: the AI proposes, the human decides.

What the co-pilot helps writers do

The co-pilot's job isn't to write presentations. It's to eliminate the friction that stops writers from doing their best work:

Understand the brief — extracts structured properties from messy client inputs, asks the right questions based on the P-Type's brief schema, and surfaces what's missing before the writer commits to a direction
Work with raw content — understands the sub-industry, the product, the intended audience, and how the presentation will actually be used, so the content organization decisions make sense in context
Structure content into sections — follows the P-Type outline faithfully, respects section-level guidelines, handles deduplication without losing anything
Generate copy — produces slide-ready language section by section, following explicit guidelines rather than generating generic AI prose that sounds like it came from nowhere
Iterate quickly — regenerates a single slide's outline, adds a page mid-flow, changes title and content independently, all without breaking the rest of the job's state

Three key interaction patterns

Regeneration with context. When a writer wants to redo a slide, they don't start from scratch, and neither does the AI. The regeneration prompt includes the full existing outline, so the new slide is generated with awareness of what comes before and after it. The writer also chooses the scope: regenerate the title and outline together, or leave the title and only rework the content outline. That choice matters because sometimes the title is right and the structure is wrong, and sometimes they need to move together.

Adding pages intelligently. A new page can be blank (fully manual) or AI-generated from a text input. If AI-generated, the system reads the full existing outline before writing the new slide, so the addition lands coherently in context rather than feeling like a foreign object dropped into the middle of a presentation. The writer provides a short description of what the new slide should cover; the co-pilot handles the rest with full awareness of what's already there.

Advanced settings with deterministic overrides. When a writer explicitly sets a value (page count, content handling mode, any property in the advanced settings panel), that value is locked. The AI cannot suggest a different page count because it thinks the content warrants more slides. It cannot reinterpret a content handling mode that the writer has already decided. Explicit human choices are not inputs to be weighed against AI judgment. They are instructions.
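The regeneration pattern described above, with the full outline as context and a writer-chosen scope, might be assembled along these lines. The prompt wording, the scope labels, and the function name are all assumptions; Craft's actual prompts are not shown in this write-up.

```python
def build_regeneration_prompt(outline: list, index: int, scope: str) -> str:
    """Build a prompt that carries the whole outline and marks one slide to redo."""
    if scope not in ("title_and_outline", "outline_only"):
        raise ValueError(f"unknown scope: {scope}")
    lines = []
    for i, slide in enumerate(outline):
        marker = "  <-- regenerate" if i == index else ""
        lines.append(f"{i + 1}. {slide['title']}{marker}")
    target = outline[index]
    if scope == "title_and_outline":
        instruction = "Rewrite the title and the content outline of the marked slide."
    else:
        instruction = (f"Keep the title '{target['title']}' unchanged; "
                       "rewrite only its content outline.")
    return "Full outline:\n" + "\n".join(lines) + "\n\n" + instruction


outline = [{"title": "Overview"}, {"title": "Key Benefits"}, {"title": "How to Apply"}]
print(build_regeneration_prompt(outline, index=1, scope="outline_only"))
```

Passing the scope as an explicit argument keeps the writer's choice out of the model's hands: the title is either in play or it is not.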

The design principle underneath all of it

Every AI action in the co-pilot is visible, overridable, and explainable. The writer can always see what the AI did, understand why, and change it. There are no black boxes in the workflow. No moments where the system made a quiet decision and moved on. No outputs that appear without a clear record of what went into them.

This isn't just a philosophical stance — it's the practical condition for trust in a regulated industry. Writers who can see and override every AI action are writers who feel ownership over the output. That sense of ownership is what makes the co-pilot feel like an assistant rather than an autonomous agent. And an assistant — not an agent — is exactly what this design team needed.

Section 9: Configuring the Intelligence — The Admin System

Craft's intelligence isn't hardcoded in prompts — it lives in configurable knowledge bases. The admin experience directly determines how good the output is. A poorly configured P-Type produces bad presentations. I had to design an admin system that makes complex configuration accessible without dumbing it down.

What admins configure

P-Type Master Config (per presentation type)

Every P-Type carries four layers of configuration: the brief property schema (what questions to ask, default values, inference rules), the outline structure (mandatory and optional sections, conditions, section descriptions), section content guidelines (structure rules, style rules, duplication rules, example copy), and content handling heuristics (how the AI should approach raw content for this specific presentation type). These aren't generic instructions handed to the LLM and hoped for the best — they're structured properties the system reads deterministically at each step.
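One way to picture those four layers as structured properties is the Python sketch below. The class names, field names, and sample values are hypothetical placeholders chosen for illustration; the real schema lived in the Notion property model and may look quite different.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BriefProperty:
    # Layer 1: one question in the brief, with an optional default value.
    name: str
    question: str
    default: Optional[str] = None


@dataclass
class Section:
    # Layer 2 (outline structure) and layer 3 (section content guidelines).
    name: str
    description: str
    mandatory: bool = True
    guidelines: list[str] = field(default_factory=list)


@dataclass
class PTypeConfig:
    name: str
    brief_schema: list[BriefProperty]
    outline: list[Section]
    content_handling: list[str]  # Layer 4: heuristics for raw content.

    def mandatory_sections(self) -> list[str]:
        # Read deterministically by the pipeline, not inferred by the LLM.
        return [s.name for s in self.outline if s.mandatory]


# Placeholder values for illustration only.
brochure = PTypeConfig(
    name="product_brochure",
    brief_schema=[BriefProperty("audience", "Who is the deck for?")],
    outline=[
        Section("Overview", "What the product is", guidelines=["Keep sentences short."]),
        Section("FAQs", "Common questions", mandatory=False),
    ],
    content_handling=["Placeholder heuristic for handling raw content."],
)
```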

Brand DNA (company level)

Color palettes, font presets, and logos live here alongside something harder to systematize: text and visual dos and don'ts. "Do not use jargon." "Keep sentences short." "Always prefer images over illustrations." "Do not mix icon styles." These rules — the kind that live in brand books no one reads — had to become typed, queryable inputs. I also designed an image style context field: detailed descriptions of when to use icons, illustrations, or photos, written in enough depth that the AI can interpret visual intent from text rather than guessing.

Templates and Layouts

Upload flows with validation — the system checks that every Content Structure in a mapped P-Type has at least one matching Layout. Layout capacity attributes so the matching engine knows what fits where. Mapping history so admins can track what changed and when, and understand why a previous configuration produced different output.
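The coverage check described above could be sketched roughly as follows. This assumes a simplified data shape (a mapping from layout name to the Content Structures it supports); the function name and the sample identifiers are hypothetical.

```python
def missing_layouts(content_structures: list[str],
                    layouts: dict[str, list[str]]) -> list[str]:
    """Return the Content Structures that no uploaded Layout supports.

    `layouts` maps a layout name to the content structures it can render
    (an assumed shape, for illustration only).
    """
    covered = {cs for supported in layouts.values() for cs in supported}
    return [cs for cs in content_structures if cs not in covered]


# Hypothetical P-Type with three content structures and two uploaded layouts.
ptype_structures = ["title_and_bullets", "comparison_table", "quote"]
uploaded_layouts = {
    "layout_a": ["title_and_bullets"],
    "layout_b": ["comparison_table", "title_and_bullets"],
}
gaps = missing_layouts(ptype_structures, uploaded_layouts)
```

An empty result means the mapping is valid; anything else is a list of structures the admin still needs a layout for.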

The design challenge

The entity property model I designed was documented in Notion: 40+ properties across entities, each with specific validation rules, default values, edit states, empty states, tooltips, and visibility conditions. It specified which properties appear in list view, which appear in detail view, which appear in the upload summary, and which appear in master settings. This Notion-based system became the source of truth that engineering built from. It was the most detail-heavy design work on the project: not the most visually interesting, but arguably the most consequential.

Three key design decisions

Excel upload with validation. Admins were already comfortable in spreadsheets. Rather than forcing form-heavy UI, I designed an upload flow that validates against the schema and surfaces errors clearly, with the exact property, the exact row, and the exact rule it violated. This wasn't a compromise. It was meeting admins where they are, in the tool they already trusted, without sacrificing the structural rigidity the system required.

Property visibility by context. Not every property matters in every view. List view shows what you need to scan quickly: name, type, status, a few key attributes. Detail view shows everything. Keeping these distinct prevented the admin UI from becoming an undifferentiated wall of fields. Admins could orient fast in list view and go deep only when they needed to.

Super Admin versus regular admin. Certain properties, such as company-level text dos and don'ts and brand voice rules, are only editable by super admins. A regular admin configuring a new P-Type cannot accidentally overwrite the brand rules that every P-Type depends on. Scoping edit access by role wasn't just a security decision. It was a design decision about where accountability should live.

P-Type list view
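The upload validation described above, which reports the exact property, row, and rule for each error, might be sketched like this in Python. It assumes the spreadsheet has already been parsed into a list of row dictionaries; the `Rule` and `UploadError` types, the property names, and the rules themselves are hypothetical examples.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Rule:
    prop: str                      # property (column) the rule applies to
    description: str               # human-readable rule shown to the admin
    check: Callable[[Any], bool]   # returns True when the value is valid


@dataclass
class UploadError:
    row: int
    prop: str
    rule: str


def validate_rows(rows: list[dict[str, Any]], rules: list[Rule]) -> list[UploadError]:
    errors = []
    # Rows are numbered from 2, assuming row 1 of the sheet is the header.
    for row_number, row in enumerate(rows, start=2):
        for rule in rules:
            if not rule.check(row.get(rule.prop)):
                errors.append(UploadError(row_number, rule.prop, rule.description))
    return errors


rules = [
    Rule("name", "must not be empty", lambda v: isinstance(v, str) and v.strip() != ""),
    Rule("max_bullets", "must be a positive integer", lambda v: isinstance(v, int) and v > 0),
]
rows = [
    {"name": "Overview", "max_bullets": 5},
    {"name": "", "max_bullets": 0},
]
errors = validate_rows(rows, rules)
```

Each error carries enough detail for the admin to fix the cell in the spreadsheet and re-upload, without opening a form.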

Section 10: The Honest Reckoning — Why We Paused

Craft worked. The pipeline ran end-to-end. The entity model held up. The state machine moved jobs through every gate cleanly. And then we showed the output to the people who would actually use it.

Content writers — the people who build these decks every day for India's biggest banks — looked at the generated copy and said: "It's decent, but not good enough to ship to a bank."

The GPT-3.5 moment

Craft was where ChatGPT was at GPT-3.5. For anyone unfamiliar with the domain, the output looked impressive. For the experts who would actually use it, it was mediocre. And every single one of our users was an expert. A content writer who has built 200 product brochures for HDFC Bank has opinions so specific and so refined that generic AI output doesn't just fall short; it falls short in ways that are immediately obvious to them. Being decent at many things doesn't move the needle when your entire user base has strong opinions and will accept nothing short of excellent.

The diagnosis

The problem wasn't the UI. It wasn't the entity model. It wasn't the workflow. The pipeline was architecturally sound. The problem was that we'd built a generic presentation builder trying to serve a specialist domain, and we'd underestimated how much of that specialist domain lives in people's heads rather than in any doc we could read.

The AI was making content decisions that should have been driven by deeply codified domain knowledge — not LLM judgment. Which product feature gets the headline treatment for a savings account brochure targeted at a semi-urban audience. How to write a disclaimer that sounds firm without making the reader anxious. When to structure a benefits section as a list versus a narrative. These aren't hard rules anyone had ever written down. They were the hundreds of micro-decisions that experienced content writers make instinctively, and that instinct had never needed to be made explicit before.

What we learned

We understood the process but not the decisions inside it. Section 4 mapped the content team's workflow: brief, raw content, sectioning, copy, review. We understood that map. What we didn't understand were the dozens of judgment calls embedded at every node of that map, invisible because they were never a decision to the experts making them.

Tacit knowledge doesn't transfer by proximity. Sitting near the content team, reviewing their outputs, watching them work: none of that was sufficient. The micro-decisions were below the surface. Extracting them required a different kind of immersion: asking why at every step, writing content ourselves, getting it reviewed by the people who know what good looks like.

Good content is opinionated, and generic AI can't fake that. The content that works for financial services clients is written by people with strong opinions, reviewed by people with strong opinions, and used by people with strong opinions. A system generating plausible-sounding prose doesn't pass that filter. It has to generate opinionated prose, which means those opinions have to be encoded somewhere the system can use them.

The strategic shift

We decided to pause. The product team would spend two weeks fully immersed with the content team — shadowing sessions, structured question-asking, writing content ourselves and getting it torn apart, reverse-engineering every "obvious" decision until it was explicit. The goal was to extract every heuristic and micro-decision, structure them into consumable P-Type configs, and feed them back into the pipeline. We were moving from letting the LLM make judgment calls to building an orchestration system that encodes the domain expertise the LLM couldn't infer on its own.

The system architecture was sound. It just needed richer inputs. The pause wasn't a failure — it was the system working as intended, surfacing exactly where the knowledge base layer was thin.

Most portfolios only show wins. This case study shows something more valuable: the ability to recognize when a well-designed system isn't delivering, diagnose why without blaming the design, and course-correct strategically. The entity model and state machine are still the right answer. The knowledge layer just needed to catch up.

Section 11: What I Learned

Object-oriented thinking changes how you see problems. Every messy domain has objects hiding inside it. The moment I decomposed "a presentation" into P-Types, Content Structures, Templates, Layouts, and Elements, each with clear properties and relationships, the design space became navigable. OOUX isn't just a method at SharpSell; on Craft, it was the difference between a system that could scale and one that couldn't. I won't approach a complex domain problem any other way.

The hardest design problem is encoding tacit knowledge. The content team's expertise lived in their heads: years of instinct about what makes a good sales deck for a specific bank, built through feedback loops no one had ever written down. Designing a system that can absorb that knowledge without flattening it into generic rules was the real challenge. Not the UI. Not the AI integration. The knowledge layer.

AI products need constraints, not freedom. The instinct with AI is to give it more latitude, more context, more room to reason. Craft taught me the opposite. The more we constrained the AI (with section-level guidelines, explicit capacity attributes, locked deterministic properties, hard rules about what it could not invent), the better and more trustworthy the output became. For high-stakes, expert-reviewed content, constraint is the product.

Knowing when to pause is a design skill. Shipping mediocre output faster wouldn't have helped anyone. The content writers would have trusted it less, the account managers would have cleaned it up manually, and we'd have burned credibility while thinking we were shipping velocity. Recognizing that the architecture was right but the knowledge layer was thin, and choosing to pause rather than push, was the most important product decision we made.

Designing for experts is fundamentally different. Consumer products can delight with novelty. Enterprise tools for domain experts must earn trust through precision. Every AI suggestion in Craft had to pass a single implicit question: "Would a senior content writer trust this enough to send it to HDFC Bank?" That question, held in mind throughout every design decision, is what kept the product honest.

Section 12: The Road Ahead

Craft is paused, not abandoned. The system architecture, entity model, and seven-step state machine are built and validated. The pipeline runs. The admin layer is designed. The co-pilot interaction patterns are in place. What comes next isn't a rebuild — it's a deepening.

What's concretely next

The product team immerses with content writers for two weeks: shadowing sessions, structured question-asking, writing content ourselves and having it reviewed by the people who know what shippable looks like. Every heuristic that survives that process gets structured into P-Type configs — not as loose guidelines, but as typed, queryable properties the system can enforce. Then the same pipeline runs again, with 10x richer knowledge base inputs. The architecture doesn't change. The intelligence does.

The extensible pattern

The OOUX entity model and state-machine workflow aren't presentation-specific. The same pattern — define objects with properties and relationships, build a configurable knowledge base, orchestrate through gated steps with human checkpoints — can apply to any domain where structured, expert-quality content needs to be generated at scale. Within SharpSell, the same architecture could extend to posters, quizzes, and learning modules. The P-Type equivalent for a training flashcard. The Content Structure equivalent for a poster layout. The entity model is already general enough. It's the domain intelligence layer that needs to catch up for each new content type.

What this project changed for me

Craft pushed me from designing interfaces to designing systems. The skill I developed here — decomposing a messy domain into clean objects, relationships, and orchestrated workflows — is the one I want to keep sharpening. It sits at the bridge between product design and systems thinking: you need the designer's instinct for what humans actually need, and the systems thinker's discipline to make that need programmable, scalable, and trustworthy. That intersection is where I want to grow, and Craft is the clearest evidence I've found so far that I belong there.
