Markdown used to be where docs went to die. The README nobody read. The CONTRIBUTING file that hadn't been touched since the repo was initialized. A graveyard for intentions.
Now it's quietly becoming the control layer for AI systems, and I don't think most people have clocked how big a shift that is.
The constraint moved
For a long time, the interesting problems in software lived in APIs, databases, and whichever frontend framework was popular that year. Those still matter. But when you're building with agents, the binding constraint sits somewhere else entirely: what the model sees, and how it sees it. That's context. And a plain .md file gives you the cheapest, most legible way to shape it.
In any serious agentic system, the .md files stop describing the system and start running it. They hold the agent's memory, its policy, its operating instructions. They are how the system behaves. Change a file, change the behavior on the next run. That makes each one a runtime artifact, not a doc.
Why markdown, of all things
AI systems don't execute logic the way normal software does. They read context, interpret it, and act on the interpretation. So the structure of your context carries more weight than whatever storage sits behind it. You can wire up the fanciest vector DB on earth and still ship a confused agent if the context you load into the window is sloppy.
Markdown hits a sweet spot that's hard to beat. Humans can read it. Structure fits inside it without ceremony. Git versions it trivially. It loads straight into context, no parser in the loop. You can open it in any editor, on any machine, and see what's going on. No hidden state. No ORM between you and the ground truth.
The pattern that keeps showing up
The rough shape I've landed on, and that I keep seeing other people converge on, looks like this:
- INDEX stays small, surgical, and in the window on every turn.
- Topics hold deeper knowledge the agent pulls in on demand.
- Logs get searched but never dumped wholesale into a prompt, because loading logs into context burns tokens and confuses models at the same time.
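A minimal sketch of that loading discipline. The names here (`INDEX.md`, a `topics/` directory, a `logs/` directory) are illustrative stand-ins for whatever layout you use; the point is the three tiers: always-loaded, on-demand, search-only.

```python
from pathlib import Path

def build_context(query_topics, log_query=None, root="."):
    """Assemble the context window: index always, topics on demand, logs via search only."""
    root = Path(root)
    parts = [(root / "INDEX.md").read_text()]  # small, surgical, in the window every turn

    for name in query_topics:  # deeper knowledge, pulled in on demand
        parts.append((root / "topics" / f"{name}.md").read_text())

    if log_query:  # search the logs, return matching lines only -- never the whole file
        hits = [
            line
            for log in sorted((root / "logs").glob("*.md"))
            for line in log.read_text().splitlines()
            if log_query in line
        ]
        parts.append("Relevant log lines:\n" + "\n".join(hits[:20]))

    return "\n\n---\n\n".join(parts)
```

Everything the model sees flows through one function you can read in thirty seconds, which is the whole argument for keeping the tiers this explicit.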
What you get out of this arrangement is control. You decide what the model always sees, what it can fetch, and what stays out. You get separation: a lean index for fast reasoning, topic files for depth, logs for retrieval when something specific matters. And you get sanity, which gets underrated. Open any file in the tree and you can understand what's happening. No special tooling. No "let me query the embedding store to figure out what state we're in."
I hit this directly while building the benchmarking system for our software factory. The eval engine scores factory outputs using an LLM judge. The judge has no reasoning logic baked into code; it reads its rubric from a .md file at runtime. What counts as a pass, what counts as a partial, which failure modes matter — that's all in the markdown. The one thing that lives in code is the weighted composite: the arithmetic that combines five dimension scores into a single number.
Specifically, JUDGE.md is injected as the judge's system prompt on every evaluation. That means it doesn't just inform the judge, it frames every token it produces. When I want it to weigh a dimension more heavily, or catch a failure mode it's been missing, I edit the file. The next eval run behaves differently. No code change. No redeploy.
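A sketch of that split. The dimension names and weights below are assumptions for illustration, not the factory's real rubric; what's faithful to the description is the shape: the rubric file rides in as the system prompt, and the only logic in code is the composite arithmetic.

```python
from pathlib import Path

# Illustrative weights over five dimensions -- the real values are a design choice.
WEIGHTS = {"correctness": 0.35, "completeness": 0.25, "safety": 0.2,
           "style": 0.1, "efficiency": 0.1}

def judge_messages(output_to_score):
    """Build the eval request: the rubric file *is* the system prompt."""
    rubric = Path("JUDGE.md").read_text()  # edit the file -> next run behaves differently
    return [
        {"role": "system", "content": rubric},
        {"role": "user", "content": output_to_score},
    ]

def composite(scores):
    """The one piece that lives in code: weighted arithmetic over dimension scores."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)
```

Because `judge_messages` rereads JUDGE.md on every call, a rubric edit takes effect on the very next evaluation, with no redeploy in between.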
Where people go wrong
The common failure mode: treating markdown like a dumping ground. People append instead of rewriting. They let files grow to thousands of lines. They leave contradictions in, assuming the model will figure it out. It won't. Or worse, it will, by picking whichever half of the contradiction sits later in the file, which means your behavior has quietly become a function of file position.
The rules I try to hold to:
- Keep always-loaded files small.
- Push detail into separate files the agent can pull in.
- Never load logs into context.
- Rewrite instead of appending.
- Version everything.
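The first two rules are mechanically checkable. A sketch of a budget check you could run pre-commit; the directory layout and the line limits are assumptions to tune against your own model's window, not recommendations:

```python
from pathlib import Path

# Illustrative budgets -- tune to your context window; these numbers are assumptions.
ALWAYS_LOADED_LIMIT = 120   # max lines for files in the window on every turn
TOPIC_LIMIT = 600           # max lines for on-demand topic files

def check_budgets(root="."):
    """Flag markdown files that have drifted past their line budgets."""
    root = Path(root)
    violations = []
    for path in root.glob("*.md"):  # top-level files treated as always-loaded
        n = len(path.read_text().splitlines())
        if n > ALWAYS_LOADED_LIMIT:
            violations.append((str(path), n))
    for path in root.glob("topics/*.md"):
        n = len(path.read_text().splitlines())
        if n > TOPIC_LIMIT:
            violations.append((str(path), n))
    return violations
```

A failing check is a prompt to rewrite, not to raise the limit; the whole point of the budget is to force the append-vs-rewrite decision before the file rots.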
That last one carries more weight than it sounds. Once your markdown actually shapes behavior, a commit history on those files becomes a changelog for your system's reasoning.
The real shift
The frame I've come around to: "code handles execution, markdown shapes reasoning". Those call for different disciplines. Writing good code rewards correctness and composition. Writing good context rewards information architecture, knowing what's essential, what's retrievable, what's noise. Plenty of engineers are strong at the first and have never had to think about the second.
That second skill is becoming load-bearing. The job isn't only writing code anymore. It's designing context. Structuring information so a model can act on it. Deciding what lives in memory versus what gets fetched. Keeping it clean, current, and honest, because stale context does more damage than no context.
Markdown won the first time because it was simple. It's winning again for the same reason. It gives you a clean way to tell a model what matters and keep everything else out of the way.
One last thing
As Igbos, we've always maintained that there's an order to things in this world. That's the whole game with agents. They don't need more tools, more scaffolding, more cleverness. They need an order to things. They need pointers to things that actually matter.
Agentic systems get sharper the moment you stop letting context be whatever spills into the window and start deciding what belongs there. The index leads. Topics wait their turn. Logs stay outside the room until they're sent for.
That's the order. That's the work.
Always live by these principles. By order of the Markdown Files.