Structured Recipe Steps Beat Instruction Text
How to model cooking instructions as data for guided cooking, timers, AI assistants, and reliable recipe app UX.
Instructions are product behavior, not copy
A recipe app can display a paragraph of instructions and look complete. But the moment the product needs guided cooking, timers, prep grouping, temperature prompts, substitutions, or an AI assistant that understands what the cook is doing, instruction text becomes a bottleneck.
Builders should treat recipe steps as structured workflow data. The prose still matters for humans. The API contract needs enough structure for software to reason about sequence, timing, tools, heat, doneness, and the difference between active work and passive waiting.
Keep the readable instruction, then add structure
The safest model preserves the original step text and adds machine-readable fields around it. That gives editors and users a natural sentence while giving clients a stable shape for product features.
A practical step object can look like this:
{
  "stepNumber": 3,
  "phase": "cook",
  "text": "Simmer until the sauce thickens, about 12 minutes.",
  "action": "SIMMER",
  "duration": "PT12M",
  "temperature": null,
  "donenessCues": {
    "visual": "sauce coats the back of a spoon",
    "tactile": null
  }
}
The important design choice is separation. The sentence is for display. The action, phase, duration, and cues are for the application.
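One way to make that separation concrete in client code is a typed step record. The sketch below uses Python dataclasses. The class names and the `step_from_dict` helper are hypothetical, chosen for illustration, and the fields simply mirror the example object above rather than any published API.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class DonenessCues:
    visual: Optional[str] = None
    tactile: Optional[str] = None


@dataclass(frozen=True)
class RecipeStep:
    step_number: int
    phase: str
    text: str                 # human-readable sentence, for display only
    action: str               # controlled verb, for application logic
    duration: Optional[str]   # ISO 8601 duration string, or None
    temperature: Any          # null in the example; its non-null shape is not specified
    doneness_cues: DonenessCues


def step_from_dict(raw: dict) -> RecipeStep:
    """Build a RecipeStep from a decoded JSON object shaped like the example."""
    cues = raw.get("donenessCues") or {}
    return RecipeStep(
        step_number=raw["stepNumber"],
        phase=raw["phase"],
        text=raw["text"],
        action=raw["action"],
        duration=raw.get("duration"),
        temperature=raw.get("temperature"),
        doneness_cues=DonenessCues(
            visual=cues.get("visual"),
            tactile=cues.get("tactile"),
        ),
    )
```

With a record like this, a UI component can read `text` for rendering while a timer or analytics job reads `action` and `duration`, and neither has to parse the sentence.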
Sequence needs explicit order
Recipe instructions are inherently ordered. Relying on array position alone is usually fine inside one JSON response, but durable systems benefit from explicit step numbers because data gets copied into search indexes, analytics tables, AI prompts, voice flows, and offline caches.
Schema.org makes this same point in its how-to model. Recipe is a subtype of HowTo, and recipeInstructions can be expressed as text, an ordered list, HowToStep, or HowToSection values. The HowToStep type also notes that when order matters, list items should carry a position property rather than assuming markup order is enough.
For application APIs, the equivalent is simple: every step should have a stable sequence field, and grouped phases should not destroy the global order.
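As a rough illustration, the helper below merges phase-grouped steps back into one globally ordered list and rejects gaps or repeats. The function name is hypothetical, and the rule that numbering starts at 1 and is contiguous is an assumption made for this sketch, not something the text above requires.

```python
from typing import Dict, List


def flatten_in_order(grouped: Dict[str, List[dict]]) -> List[dict]:
    """Merge phase-grouped steps into one list ordered by stepNumber."""
    steps = [step for phase_steps in grouped.values() for step in phase_steps]
    steps.sort(key=lambda step: step["stepNumber"])

    # Assumption for this sketch: numbering is 1-based with no gaps or repeats.
    numbers = [step["stepNumber"] for step in steps]
    expected = list(range(1, len(steps) + 1))
    if numbers != expected:
        raise ValueError(f"expected step numbers {expected}, got {numbers}")
    return steps
```

Because the order lives in each step rather than in the container, the same check works whether the steps arrive as one array, as phase groups, or as rows pulled from a search index.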
Durations should be machine-readable
Cooking interfaces almost always need time values: prep time, simmer time, rest time, refrigerator storage time, reheating time, or total recipe time. If those values are stored as English phrases, every client has to parse language before it can start a timer or compare recipes.
Use standard duration strings. Schema.org's Recipe type defines prepTime, cookTime, totalTime, and related how-to time fields as Duration values in ISO 8601 duration format. Google's recipe structured data documentation also recommends ISO 8601 duration values for prepTime and cookTime.
That is a good pattern for API responses too:
{
  "activeTime": "PT20M",
  "passiveTime": "PT1H30M",
  "totalTime": "PT1H50M"
}
Clients can still render "1 hour 50 minutes" in friendly language, but the underlying contract stays easy to sort, filter, and compute.
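To show what "easy to compute" can mean in practice, here is a minimal sketch of a duration parser and a friendly formatter. Python's standard library has no ISO 8601 duration parser, so this one is hand-rolled and deliberately narrow: it only handles day, hour, minute, and second designators, and the function names are hypothetical.

```python
import re
from datetime import timedelta

# Handles only day/time designators (e.g. "PT20M", "PT1H30M", "P1DT2H").
# Years, months, weeks, and fractional values are out of scope for this sketch.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(value: str) -> timedelta:
    """Convert a supported ISO 8601 duration string into a timedelta."""
    match = _DURATION.match(value)
    # Reject strings with no components, such as "P", "PT", or "P1DT".
    if not match or value == "P" or value.endswith("T"):
        raise ValueError(f"unsupported duration: {value!r}")
    parts = {name: int(number) for name, number in match.groupdict().items() if number}
    return timedelta(**parts)


def friendly(value: str) -> str:
    """Render a duration as hours and minutes; leftover seconds are dropped."""
    total_minutes = int(parse_duration(value).total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    pieces = []
    if hours:
        pieces.append(f"{hours} hour" + ("s" if hours != 1 else ""))
    if minutes or not pieces:
        pieces.append(f"{minutes} minute" + ("s" if minutes != 1 else ""))
    return " ".join(pieces)
```

With these helpers, `parse_duration("PT20M") + parse_duration("PT1H30M")` equals `parse_duration("PT1H50M")`, so a client can verify that active and passive time add up to the total before rendering it with `friendly`.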
Phases make guided cooking easier
A single flat list is not enough for many cooking products. A guided-cooking UI often wants to group work into prep, cook, rest, garnish, and serve phases. A meal planning product may need to estimate active time separately from unattended oven time. A voice assistant may need to warn the user before a passive wait starts.
That works best when each step carries a phase field:
- prep for chopping, measuring, mixing, and staging
- cook for heat-driven actions
- rest for passive waits, cooling, marinating, or proofing
- finish for plating, garnishing, adjusting seasoning, or serving
The exact vocabulary can vary by product. What matters is that the vocabulary is controlled and documented. Free-form phases like "first part" or "next thing" are not much better than plain prose.
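A controlled vocabulary can be enforced with an enum. The sketch below assumes the four phase names listed above and a hypothetical `group_by_phase` helper. It splits ordered steps into consecutive phase sections, so a recipe that returns to the stove after a rest keeps its global order instead of being merged into one "cook" bucket.

```python
from enum import Enum
from itertools import groupby
from typing import List, Tuple


class Phase(str, Enum):
    PREP = "prep"
    COOK = "cook"
    REST = "rest"
    FINISH = "finish"


def group_by_phase(steps: List[dict]) -> List[Tuple[Phase, List[dict]]]:
    """Split steps into consecutive phase runs, preserving stepNumber order.

    Phase(...) raises ValueError for any value outside the vocabulary,
    so free-form phases fail loudly instead of reaching the UI.
    """
    ordered = sorted(steps, key=lambda step: step["stepNumber"])
    return [
        (phase, list(run))
        for phase, run in groupby(ordered, key=lambda step: Phase(step["phase"]))
    ]
```

A guided-cooking screen could render each returned pair as a section header followed by its steps.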
Doneness cues are better than optimistic timers
Timers are useful, but cooking is full of variable conditions: pan material, stove output, ingredient size, altitude, oven calibration, and user technique. A step that only says "cook for 8 minutes" can produce bad guidance when the food is not actually done.
Structured doneness cues make the API more useful:
{
  "duration": "PT8M",
  "donenessCues": {
    "visual": "edges are browned and center is just set",
    "temperatureF": 165,
    "texture": "firm but still springy"
  }
}
This lets a guided flow show both a timer and a real-world check. It also gives AI assistants safer context. Instead of inventing advice from one sentence, the assistant can point back to the recipe's own doneness criteria.
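A guided flow might combine the two roughly as follows. This is a sketch with a hypothetical function name and output wording. It treats `donenessCues` as a flat object of optional values and skips anything that is null, so it does not depend on a particular set of cue keys.

```python
from typing import List


def guidance_lines(step: dict) -> List[str]:
    """Turn a structured step into the lines a guided-cooking screen might show."""
    lines = []
    if step.get("duration"):
        lines.append(f"Timer: {step['duration']}")

    # Show every cue the recipe provides; null cues are simply omitted.
    cues = step.get("donenessCues") or {}
    for name, value in cues.items():
        if value is not None:
            lines.append(f"Check ({name}): {value}")
    return lines
```

The same list could be passed to an assistant as context, so its answer to "is it done yet?" quotes the recipe's cues rather than a guess.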
Schema markup is output, not the whole internal model
Public recipe markup is useful. Google explains that recipe structured data can help Search understand recipe pages and show richer results such as cooking time, nutrition, images, and instructions. Schema.org gives publishers a shared vocabulary for Recipe, HowToStep, ingredients, yield, timing, and nutrition.
But markup is not a complete product data model. A meal planner, grocery app, or AI cooking assistant may need fields that search engines do not require: controlled action verbs, equipment IDs, ingredient references, active versus passive time, structured doneness cues, safety warnings, substitutions, or step-level dependencies.
Treat public markup as one output format. The API response should remain the source of truth.
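In code, that usually means a one-way projection from the internal model to markup. The sketch below maps internal steps to schema.org `HowToStep` objects using only the `position` and `text` properties. The function name and the internal field names (`stepNumber`, `text`) are assumptions carried over from the earlier example object.

```python
from typing import List


def to_how_to_steps(steps: List[dict]) -> List[dict]:
    """Project internal steps onto schema.org HowToStep objects.

    Only position and text are emitted; internal-only fields such as
    phase, action, or doneness cues are intentionally left out.
    """
    ordered = sorted(steps, key=lambda step: step["stepNumber"])
    return [
        {"@type": "HowToStep", "position": step["stepNumber"], "text": step["text"]}
        for step in ordered
    ]
```

The result can be placed under `recipeInstructions` in a page's JSON-LD, while the richer internal record continues to drive the app.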
Generated recipes need the same step schema
AI recipe generation makes this more important, not less. If generated recipes return loose paragraphs while catalog recipes return structured steps, every client has to branch. Search, saved recipes, guided cooking, meal planning, and nutrition flows all become harder to maintain.
Generated recipes should pass through the same instruction contract as retrieved recipes:
- the same phase vocabulary
- the same duration format
- the same step numbering
- the same action fields
- the same doneness-cue shape
- the same null-handling rules
That keeps the frontend honest. It also makes generated content easier to validate before it reaches users.
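A validation gate for generated steps could look something like the sketch below. The required keys, the phase vocabulary, and the time-only duration pattern are assumptions drawn from the examples in this article, and `validate_step` is a hypothetical name. The null-handling rule assumed here is that every key must be present, with `null` allowed for `duration`.

```python
import re
from typing import List

# Assumed vocabulary and key set, taken from the examples in this article.
ALLOWED_PHASES = {"prep", "cook", "rest", "finish"}
REQUIRED_KEYS = ("stepNumber", "phase", "text", "action", "duration", "donenessCues")
# Time-only ISO 8601 durations such as "PT12M" or "PT1H30M".
DURATION_PATTERN = re.compile(r"^PT(?=\d)(?:\d+H)?(?:\d+M)?(?:\d+S)?$")


def validate_step(step: dict) -> List[str]:
    """Return a list of contract violations; an empty list means the step passes."""
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in step]
    if errors:
        return errors
    if not isinstance(step["stepNumber"], int) or step["stepNumber"] < 1:
        errors.append("stepNumber must be a positive integer")
    if not isinstance(step["phase"], str) or step["phase"] not in ALLOWED_PHASES:
        errors.append(f"unknown phase: {step['phase']!r}")
    if not isinstance(step["text"], str) or not step["text"].strip():
        errors.append("text must be a non-empty string")
    duration = step["duration"]
    if duration is not None and not DURATION_PATTERN.match(str(duration)):
        errors.append(f"duration is not an ISO 8601 time duration: {duration!r}")
    return errors
```

Running the same function over catalog and generated recipes is one way to confirm that both sources really do share a contract.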
What Recipe API optimizes for
Recipe API is built around structured cooking data, not just recipe text. Its public documentation describes phased cooking instructions with doneness cues, active and total timing as ISO 8601 durations, and a consistent schema across catalog recipes and generated recipes.
For teams building recipe apps, the practical checklist is straightforward:
- keep the human instruction text
- expose explicit step order
- use a controlled phase and action vocabulary
- store durations in ISO 8601 format
- include visual, tactile, and temperature doneness cues where available
- make generated recipes return the same instruction shape as catalog recipes
That is the difference between rendering a recipe card and building cooking software. A string can tell a person what to do. A structured step can also help the product do something useful.