I keep losing the thread.
That sounds more poetic than it is. The real version is lame: work starts in one chat, moves to another agent, picks up a test result somewhere else, then comes back as a summary that sounds more settled than the project actually is.
That might be the part that bothers me most.
The chat remembers enough to sound helpful. A context window compressing or accidentally closing a session can destroy all continuity. Agents like Codex and Claude can explain the intention, recap the path, and tell me what probably happened; meanwhile the useful questions are still sitting there unanswered: what changed, what passed, what blocked, what needs me, and where is the proof?
I've been feeling that friction a lot lately.
The whole point of using agents is supposed to be leverage, but leverage gets strange when the only durable state is my own attention. At that point the work isn't really delegated. It's scattered across a pile of half-finished conversations that still expect me to be the connective tissue.
That's not the future I want.
It's also not something I think a smarter chat box fixes by itself. Chat can help me think, write, and code, but it is a bad place for project state to live. A transcript is evidence, maybe. It is not state. It is not a task graph. It is not a run log. It is not a clean answer to what happened while I was gone.
To address this, I started working on something I'm currently calling the Librarian.
Projects get registered with Librarian. From there, the agent acts like an actual librarian: it keeps the shelves in order, knows which documents matter, routes workers toward the right literature when they have questions, and gives me current answers when I ask what is going on without routing every question back through me.
That matters because the project should not depend on whatever one chat happens to remember. It should have a maintained index. It should know the docs, the tasks, the runs, the blocked work, the attention items, and the state that can be computed from all of that.
Right now Codex is constantly pinging me. Some of that is useful. A lot of it is context-management noise. I want one agent that can manage context, maintain docs, keep work moving, and only pull me in when the question is actually mine to answer.
That is the point: quiet things down.
Keep things functioning, indexed, and quiet.
Then I can spend more of my attention where it belongs: tactical decisions, strategy, architecture, and living my normal dad-human life.
What I want is boring: tasks that leave records, blocked work that surfaces, review items that don't vanish into friendly paragraphs, test claims tied to commands, and enough context to come back later without doing archaeology on my own work.
Not magic. Bookkeeping.
This is the unromantic part of the agent thing that I keep circling back to. Intelligence is less useful without continuity. Autonomy is a liability until the receipts are boring. The art of making agents useful is making things boring.
For now, my hope is simple: step away from a project, come back, and see what moved, what failed, what needs me, and what is safe to do next.
If Librarian helps with that, I'll keep building it.
If it doesn't, I'll write that down too.