From ‘write me a thing’ to outcome specs + evals

Longer context windows didn’t just allow bigger prompts. They changed the workflow. Instead of one-shotting tasks, you could build multi-pass pipelines: inventory → structure → draft → revision. Scaffolding became the main move because the model could now keep enough state in view to follow the scaffold from one pass to the next.
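
As a rough illustration of the multi-pass idea, here is a minimal sketch in Python. Everything in it is an assumption made for the example: `run_pipeline`, the `PASSES` instructions, and the `Model` type are hypothetical names, and `echo_model` is a toy stand-in for whatever real model call you would use.

```python
from typing import Callable

# Hypothetical stand-in for a model call: takes a prompt, returns text.
Model = Callable[[str], str]

# The four passes named above, each with an illustrative instruction.
PASSES = [
    ("inventory", "List every fact, source, and constraint relevant to the task."),
    ("structure", "Organize the inventory into an outline."),
    ("draft", "Write a full draft that follows the outline."),
    ("revision", "Revise the draft for accuracy, clarity, and length."),
]


def run_pipeline(task: str, model: Model) -> dict[str, str]:
    """Run each pass in order, feeding all earlier outputs back in as context."""
    outputs: dict[str, str] = {}
    for name, instruction in PASSES:
        # Earlier passes stay in view, which is what longer context makes possible.
        context = "\n\n".join(f"[{k}]\n{v}" for k, v in outputs.items())
        prompt = f"Task: {task}\n\n{context}\n\nPass: {name}\n{instruction}"
        outputs[name] = model(prompt)
    return outputs


def echo_model(prompt: str) -> str:
    # Toy model for demonstration only: reports which pass it was asked to run.
    pass_line = [line for line in prompt.splitlines() if line.startswith("Pass: ")][-1]
    return f"output of {pass_line.removeprefix('Pass: ')}"


result = run_pipeline("write a release note", echo_model)
print(list(result))
```

The design choice worth noticing is that each pass sees the outputs of all earlier passes, not only the previous one, so the revision pass can check the draft against the original inventory.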

This is where “prompting” starts to resemble managing a strong collaborator. You stop telling it what to write and start defining what good looks like: length, structure, inclusions, exclusions, and success criteria you can actually check.
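
One way to make “what good looks like” concrete is to write it down as data with checks attached. The sketch below is only an illustration: `OutcomeSpec` and its fields are hypothetical names chosen to mirror the list above (length, structure, inclusions, exclusions), and the checks are deliberately simple string tests.

```python
from dataclasses import dataclass, field


@dataclass
class OutcomeSpec:
    """A checkable description of a good output (hypothetical, for illustration)."""
    max_words: int
    required_headings: list[str] = field(default_factory=list)
    must_include: list[str] = field(default_factory=list)
    must_exclude: list[str] = field(default_factory=list)

    def failures(self, text: str) -> list[str]:
        """Return one message per unmet criterion; an empty list means success."""
        problems = []
        lowered = text.lower()
        word_count = len(text.split())
        if word_count > self.max_words:
            problems.append(f"too long: {word_count} words > {self.max_words}")
        # A heading counts as present only if it appears on a line of its own.
        lines = [line.strip() for line in text.splitlines()]
        for heading in self.required_headings:
            if heading not in lines:
                problems.append(f"missing heading: {heading}")
        for phrase in self.must_include:
            if phrase.lower() not in lowered:
                problems.append(f"missing required phrase: {phrase}")
        for phrase in self.must_exclude:
            if phrase.lower() in lowered:
                problems.append(f"contains excluded phrase: {phrase}")
        return problems


spec = OutcomeSpec(
    max_words=50,
    required_headings=["Summary"],
    must_include=["rollback"],
    must_exclude=["synergy"],
)
draft = "Summary\nWe shipped the fix and documented the rollback plan."
print(spec.failures(draft))
```

Returning a list of failure messages, instead of a single pass/fail flag, means the same spec can double as feedback for the next attempt.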

The modern version of this idea is evaluation-driven prompting: describe the test up front, then let the model iterate until its output passes.
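
The loop itself can be sketched in a few lines. This is an assumption-laden example, not a reference implementation: `iterate_until_pass`, `toy_model`, and `mentions_rollback` are hypothetical names, the retry cap of three rounds is arbitrary, and the toy model only pretends to respond to feedback.

```python
from typing import Callable

# Hypothetical types: a model maps a prompt to text; a check maps text to failure messages.
Model = Callable[[str], str]
Check = Callable[[str], list[str]]


def iterate_until_pass(
    task: str, model: Model, check: Check, max_rounds: int = 3
) -> tuple[str, int, list[str]]:
    """Call the model, run the check, and feed failures back until it passes.

    Returns (last output, rounds used, remaining failures).
    """
    prompt = task
    output = ""
    failures: list[str] = []
    for round_number in range(1, max_rounds + 1):
        output = model(prompt)
        failures = check(output)
        if not failures:
            return output, round_number, []
        # Turn the failed checks into explicit feedback for the next attempt.
        feedback = "\n".join(f"- {f}" for f in failures)
        prompt = (
            f"{task}\n\nPrevious attempt:\n{output}\n\n"
            f"Fix these failures:\n{feedback}"
        )
    return output, max_rounds, failures


def toy_model(prompt: str) -> str:
    # Toy stand-in: produces a better answer only once it has seen feedback.
    if "Fix these failures" in prompt:
        return "We shipped the fix and documented the rollback plan."
    return "We shipped the fix."


def mentions_rollback(text: str) -> list[str]:
    # The "test" in this example: the output must mention a rollback plan.
    return [] if "rollback" in text.lower() else ["must mention the rollback plan"]


output, rounds, failures = iterate_until_pass(
    "Write a release note.", toy_model, mentions_rollback
)
print(rounds, output)
```

The cap on rounds matters in practice: a check the model cannot satisfy should end with a reported failure, not an endless loop.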