Skip to content
AI-Native PM
8 min · 0 of 7 in The Model Layer

Give the model tools to fetch and act

The plant shop that agreed to pilot your ordering assistant is a real store with a point-of-sale system, and the owner is beside you when the first live question arrives at 4:12 pm: "do you still have the large monstera?" The reply takes two seconds: "Yes! We have 12 large monsteras in stock at $34." The owner does not need to check; she sold the last one eleven days ago and repriced the reorder at $42. The facts in your prompt are her inventory export, pasted the week you built the feature and true that day. The model answered from the copy, because the copy was all it had, and nothing in a prompt updates itself when a shelf empties.

Give the model the facts it wasn't trained on left your feature's facts riding in the prompt, stuffed or retrieved, and either way a copy made before the question existed. This chapter adds the rung above: a way to fetch the fact at the moment of the question, and a way to act.

One round trip, printed in full

A tool is a function your code owns that the model can request by name. Our fridge feature has never needed one, because the parent hands it the whole input in a photo, but "what can we cook tonight?" depends on whatever the kitchen holds at the moment of asking, so we wired the smallest tool worth printing, a pantry lookup in the same kitchen. Here is every message that moved.

The definition, sent along with the question. The request carries, beside the user's words, a name, a description, and the parameters the function accepts.

{
  "name": "get_pantry",
  "description": "Returns every item currently in the family's pantry, with quantities. Call this before answering any question about what the family has on hand.",
  "parameters": {
    "type": "object",
    "properties": {
      "category": { "type": "string", "description": "Optional filter, such as 'grains' or 'canned'" }
    }
  }
}

The description is not documentation, it is prompt work: those two sentences decide whether the tool fires at the right moments, never, or on everything, and writing them is Prompting is engineering, not wording in one new spot.

The question, as the parent typed it.

What can we cook tonight without a store run?

The model's reply, which is not an answer.

{ "type": "tool_call", "name": "get_pantry", "arguments": {} }

That block, give or take each provider's dressing, is the whole event: a structured request naming a function and its arguments. Nothing has run. The model cannot reach a database, a file, or the network; it produced text in a format your code parses, the same schema discipline you already run on its answers.

What our code did. get_pantry is ordinary code, a query against the pantry table, run on our server with our credentials, where the model call already runs. The result goes back as one more message:

{ "items": ["rice, 2 kg", "black beans, 3 cans", "canned tomatoes, 2", "pasta, 500 g", "onions, 4"] }

The second reply, grounded. Handed the result, the model produces the answer the parent sees:

Tonight you can make black bean and rice bowls, or pasta with
tomatoes and onions; everything for both is already in the pantry.

The model never runs anything. A tool call is a request: the model asks by name, your code runs the function and hands back the result, and the final answer is produced from what came back.

The meter runs twice per tool answer, two model calls around one function run, still a fraction of a cent at this size.

When a tool beats stuffing

The obvious objection comes straight from the last chapter: our code could run get_pantry itself and stuff the result into the prompt before every call. It could, and when one predictable fact serves every question, that is the right move; the fridge prompt ships with zero tools. A tool earns its wiring when stuffing cannot keep up.

  • Too fresh. The plant shop's get_stock(product): the right answer changed eleven days ago and will change again before closing, so only a lookup at answer time is ever current.
  • Too large. A support product's search_help_center(query) against 4,000 articles: the library cannot ride on every call, so a search tool looks things up only when a question needs it.
  • Behind an action. add_to_list(item): there is no fact to paste, because "rice is on the shopping list" only becomes true when code runs. A prompt can carry any fact you own; it cannot carry an action.

The third kind moves the model from fetching to acting: every tool you wire is authority you hand the model, and what a product owes its users before letting one act fills a part of The Practice.

MCP: the standard plug for tools

Each loop above is code somebody wrote against one provider's format. The Model Context Protocol, MCP, standardizes the plug: one open format for any product to expose tools and any client to call them, published in late 2024 and adopted across the major providers since. In 2026 you mostly meet it as one setting: the connectors page in Claude, ChatGPT, or Gemini, where switching on your calendar or your CRM hands the assistant that product's tools, no loop written by anyone. The wiring stays the same; who can plug into whom changes, and The action surface: every tool is delegated authority takes up the authority half in The Practice.

The ladder this part climbed

When the shopping list padded itself with yogurt no photo showed, the fix was instructions: rules and one worked example. When an answer needed facts the model was never trained on, the fix was context, stuffed or retrieved. When it needed the pantry as it stands right now, or a change made in the world, the fix was a tool. Each rung costs more than the last: a prompt edit is free and reverts in a keystroke, context adds tokens to every call and a store to keep current, and a tool adds code you write, secure, and maintain.

The ladder has a fourth rung, and you will almost never stand on it. Fine-tuning trains a model further on your own examples, and what comes back is a new model, not your old one improved: it costs real money and days where a prompt edit costs minutes, it teaches form and behavior rather than facts (a missing-facts problem climbs back down to context), and every check you have run so far runs again from scratch, because nothing you proved about the old model carries over. The providers' own guidance is to exhaust prompting and retrieval, with evidence, before training anything, and the Frontier's Choosing the right model covers the rare day fine-tuning earns its seat.

Customization runs from cheap and reversible to costly and locking, instructions, then context, then tools, then fine-tuning: reach for the cheapest rung that works and stop there.

The customization ladderFour rising rungs for customizing a model, from cheap and reversible to costly and locking. One, instructions, change the words you send. Two, context, add your facts through retrieval. Three, tools, let it call your functions. Four, fine-tuning, train a new model, marked last resort. Start at the cheapest and stop at the first rung that works.CUSTOMIZE FROM THE BOTTOM UP1INSTRUCTIONSchange the wordsyou send2CONTEXTadd your facts(retrieval)3TOOLSlet it callyour functions4FINE-TUNEtrain a newmodelLAST RESORTSTART HERECHEAP · FAST · REVERSIBLECOSTLY · SLOW · LOCKS YOU INReach for the cheapest lever that works, and stop there.

Try it now

This drill takes about twenty minutes; the no-setup path spends nothing, the tooled path a few cents. You arrive with your feature's facts riding in the prompt, the artifact from Give the model the facts it wasn't trained on. You leave with your feature's first tool, wired if you have tools, specced in full on paper if not: a name, a one-line description, the parameters, and what your code does when the call arrives.

No setup: Spec the tool on one page. Pick the fact in your prompt that goes stale fastest, or the one action your feature keeps wanting to take, then write a verb_noun name, a description saying what comes back and when to call it, written like the prompt it is, each parameter with a type, and what your code does when the model calls it: what it reads, what it returns, what it refuses. Then play the code role against our printed exchange: write the result message your function would return for a question your user would really type, and check that the answer you want back contains nothing your result did not hand over.

With your tools: Hand Claude Code the spec and ask it to wire the tool onto your feature's model call: the definition on the request, the function pointed at wherever the fact really lives, the round trip logged. Run one real question and read the log: the tool call out, the result in, the grounded answer. If the model answers without calling the tool, sharpen the description and rerun; it revises like the prompt it is. Same move in Codex or Cursor: paste the spec and ask for the definition, the function, and the logged round trip. If nothing is installed yet, the Setup Clinic gets you running.

Chapter Summary

  • A fact stuffed into a prompt is a copy made before the question arrived, and nothing in a prompt updates itself when the world moves.
  • A tool is a function your code owns that the model can request by name; its definition is a name, a description, and parameters, sent along with the question.
  • The model never runs anything. A tool call is a structured request, your code runs the function and hands back the result, and the final answer is produced from what came back.
  • The description is prompt work: it is what makes the tool fire at the right moments, and a tool that fires too rarely or too often is fixed by revising it.
  • A tool answer runs the meter twice, two model calls around one function run.
  • Tools earn their wiring when facts are too fresh to paste, too large to send every time, or behind an action that only code can take.
  • Wire reading tools before writing tools, and keep a check in code between the model's request and anything that changes the world.
  • MCP is the standard plug: connectors in the provider consoles hand an assistant another product's tools, and the authority questions that raises live in The Practice.
  • The customization ladder orders everything this part taught: instructions, then context, then tools, then fine-tuning, and you stop at the cheapest rung that works.
  • Fine-tuning produces a new model you re-check from scratch, so it comes last and rarely comes at all.

Part IV puts your build on the bench, starting with Choose the model you ship.

Sources

  • OpenAI, Anthropic, and Google Gemini tool use and function calling documentation (last verified July 2026).
  • Model Context Protocol specification, modelcontextprotocol.io (last verified July 2026).
  • OpenAI guidance on model optimization order: prompting, retrieval, then fine-tuning (last verified July 2026).
  • Qi and colleagues (2023), research finding that fine-tuning can weaken trained refusals.
  • FuelTheFam (fuelthefam.com), our family nutrition app; the fridge feature is live in the product, and the pantry exchange was wired and run for this chapter.
Marks this chapter complete on your course map. Reaching the end does this for you.