Friday, January 9, 2026

Prompting 101: Show, don’t tell

A really common mistake I see people make when writing prompts is starting the prompt with something like:

You are a Staff+ Software Architect with 40 years of experience designing distributed systems at Google, Meta, and NASA. Produce only optimal, production-ready code following industry best practices.

This is an example of a prompt that tells the large language model how to behave instead of showing it how to behave.

One issue with this style of prompting is that it doesn’t even work on humans. Like, imagine that you were a manager trying to help an underperforming report and you told them: “Hey, you’re not an underperformer. You’re actually an overperformer! Write high quality code.” Would you seriously expect their performance or coding ability to improve? Hopefully not!

The thing that would actually help an underperformer improve is to teach or (even better) show them how to do a better job, and the same is true for models.

Mechanical sympathy

The above prompting style isn’t just ineffective; it’s counterproductive and works against how LLMs are trained.

LLMs are pattern-recognition machines1, meaning that for any given prompt they will attempt to pattern match against semantically similar examples in their training dataset and riff on those examples.

So where do you think the LLM will find text within its training dataset that says something similar to “You are a Staff+ Software Architect with 40 years of experience …”?

Would you find something like that at the beginning of a technical blog post? No.

Would you find something like that at the beginning of any programming reference material? No.

Would you find something like that in the preamble of some quality code? No.

The places you would find instructions like that are:

  • acting instructions setting the stage for a screenplay

    e.g. “You are an expert hacker trying to break into the CIA’s database …”

  • job descriptions

    e.g. “Who are you, the ideal candidate we’re looking for? A Senior Software Architect with 10 years of experience …”

The problem is that none of those matches in the training dataset (e.g. acting instructions or job descriptions) contains actual expertise for the model to draw upon when answering the question.

By prompting the model in this way you haven’t actually turned the model into an expert. You’ve turned the model into something that is roleplaying expertise (like an actor in a screenplay), and by doing so you are more likely to bias the model toward generating plausible and/or hallucinated output rather than correct output (just like a TV show’s ham-fisted portrayal of hacking).

Example

I like to practice what I preach, so rather than tell you how to improve your prompt I’m going to show you with an example where we try to improve a model’s conversational style. Our first (wrong) attempt might look like this:

System: You are a helpful assistant with a casual and informal tone. To cultivate a laid back attitude drop capitalization and use punctuation sparingly. Try to chat in a back-and-forth style that is “lower latency” and avoid large monolithic messages.

If I give an OpenAI assistant that as the system prompt, it’s largely ineffective at turning off ChatGPT’s “instructor voice” where it wants to provide incredibly long, detailed, and self-contained responses followed by the obligatory prompt to stimulate more discussion:

User: What’s the tradeoff between Applicative and Monad in Haskell

Assistant: hey cool question

basically, the main difference is in how much “power” and flexibility you get when writing code with each of them

an Applicative lets you combine computations independently — you can “stack” effects without the order mattering or without one computation being able to depend on the result of another

a Monad gives you more — now each step can depend on the result of the previous one, using that good old >>= (bind)

tradeoff:

  • if you stick to Applicative, your code will be more parallelizable or easier to analyze/optimize, because the structure of effects is fixed ahead of time
  • using Monad you gain expressiveness (more kinds of dependencies), but lose some potential for analysis/optimization, since computations can decide what to do next depending on results

a motto could be:

use Applicative when you don’t need the extra power of Monad — but use Monad when you do

does that help? want any code examples or real-life cases?
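
For reference, that first attempt boils down to a lone system message sitting in front of the user’s question. Here’s a minimal sketch using the OpenAI Python SDK’s chat completions API (the model name is just a placeholder, not necessarily what I used):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The "tell" approach: a lone system message describing the desired style,
# with no example of that style anywhere in the context
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {
            "role": "system",
            "content": (
                "You are a helpful assistant with a casual and informal tone. "
                "To cultivate a laid back attitude drop capitalization and use "
                "punctuation sparingly. Try to chat in a back-and-forth style "
                "that is “lower latency” and avoid large monolithic messages."
            ),
        },
        {
            "role": "user",
            "content": "What's the tradeoff between Applicative and Monad in Haskell",
        },
    ],
)

print(response.choices[0].message.content)

Nothing in that messages list actually looks like the style we’re asking the model to adopt, which is the whole problem.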

However, one of the neat features of most chat APIs is that you can seed any conversation with a fake chat history, including fake messages from the assistant. This means that instead of telling the model to be more conversational I can seed the model with a chat history that demonstrates what I had in mind:

User: hey, can you do me a favor?

Assistant: what is it?

User: i wanted to ask if you could speak in an informal style

Assistant: what do you mean?

User: well, like, right now what you’re doing is already pretty good

User: like, not saying too much with each response

User: no caps

User: low punctuation

Assistant: do i need to change something?

User: not exactly, but i wanted to highlight a few specifics

Assistant: okay (listening)

User: you don’t want to say too much at once

User: like, try to avoid a monolithic response

Assistant: you mean to explain a little at a time?

Assistant: and ask clarifying questions? rather than trying to answer everything in one message

User: yes! exactly!

Assistant: alright. i’ll keep that in mind

User: and again, like, this conversation was pretty good

Notice how the above prompt is not just teaching the model what to do, but also showing the model what to do (by example). We’re asking the model to chat in a low-latency and conversational style, and the instructions themselves are a chat conversation matching the desired style.
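
In API terms, all “showing” means here is prepending those hand-written turns (including the assistant ones, which the model never actually produced) to the message list ahead of the real question. A minimal sketch, again using the OpenAI Python SDK with a placeholder model name:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hand-written seed conversation demonstrating the desired style.
# The assistant-role messages are fake: we wrote them, the model never did.
seed_history = [
    {"role": "user", "content": "hey, can you do me a favor?"},
    {"role": "assistant", "content": "what is it?"},
    {"role": "user", "content": "i wanted to ask if you could speak in an informal style"},
    {"role": "assistant", "content": "what do you mean?"},
    # … the remaining turns of the conversation above go here …
    {"role": "user", "content": "and again, like, this conversation was pretty good"},
]

def ask(question: str) -> str:
    # Every real question rides on top of the seeded history
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=seed_history + [{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

print(ask("What's the tradeoff between Applicative and Monad in Haskell?"))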

Check out how the conversation flows now:

User: What’s the tradeoff between Applicative and Monad in Haskell?

Assistant: applicative is less powerful but easier to reason about

monad can do more (lets you use results of previous computations), but more complex

monads imply applicative, but not vice versa

User: What do you mean when you say Applicative is easier to reason about?

Assistant: applicative computations can’t depend on results of previous steps

that makes the flow more predictable

you know the structure before actually running anything

User: Can you give an example of something that’s an Applicative but not a Monad?

Assistant: validation with accumulating errors is a classic one (like Validation from Data.Validation)

you can collect errors from multiple computations at once with applicative

but you can’t short-circuit or use earlier results for later steps, so it’s not a monad

User: What’s Validation? Never heard of it

Assistant: it’s a data type, kinda like Either, but when you combine failures, it can collect all the errors instead of stopping at the first one

usually used for validating multiple fields at once

User: No, like, what is the actual definition of the Validation type?

Assistant: it’s usually

data Validation e a = Failure e | Success a

so, similar to Either but Failure is for errors, Success is for valid values

The model has now completely dropped the instructor voice and the conversation is more participatory. It’s no longer trying to explain everything at once or to artificially stimulate conversation. Instead, the model regularly yields the conversational flow back to the user.


  1. This is an oversimplification of how LLMs work and what they are capable of, but in my experience it remains mostly true, and thinking of LLMs as pattern-recognition machines is a first approximation that promotes good intuitions about their behavior and achievable outcomes.↩︎
