My work focuses on the infrastructure behind long-running AI agents: memory, evaluation, orchestration, and runtime design.
How do intelligent systems know what they need to know, at the right moment — without being told?
Agent Epistemic Integrity →
How do we become who we are, across everything we've lived through and mostly forgotten?
Three careers, one question
Runtime layer at scale — memory, orchestration / tool and skill quality, planning, long-running task lifecycle. Tens of millions of users. The hard problems are not the models.
Billion-scale academic knowledge graphs, adopted by the OECD and the Stanford AI Index. Knowledge representation is always also a question of what you think knowledge is. Microsoft Academic Graph; a web-scale scientific taxonomy; science-of-science studies.
The gap between a formally correct solution and a useful one is almost always a design problem, not a math problem. Vehicle routing problem; inventory systems.
The thread: how a system knows what it needs to know — and acts reliably on it.
What I build and research
A local-first longitudinal agent that succeeds only when you feel witnessed — never when a task is completed. trace ↗ · builder's note →
An architectural framework for how long-running agentic systems keep beliefs, actions, and commitments coherent and correctable across the knowing / doing / deciding axes.
Researcher and builder's notes
Frameworks in formation, questions still unresolved: thinking out loud at the seams between levels.
What I think about
Memory
Memory as a first-class architectural primitive — not a bolt-on. Prospective memory as a steerability surface. The four-tier taxonomy: semantic, episodic, procedural, prospective.
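To make the four tiers concrete, here is a minimal sketch of how they might be typed in code. The names `MemoryTier`, `MemoryRecord`, and `by_tier` are hypothetical, and the one-line gloss on each tier follows the usual cognitive-science reading of the terms, not necessarily the definitions used in the work described here.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryTier(Enum):
    """The four tiers named above (glosses are the common textbook reading)."""
    SEMANTIC = "semantic"        # general facts and concepts
    EPISODIC = "episodic"        # specific past events
    PROCEDURAL = "procedural"    # how to carry out a task
    PROSPECTIVE = "prospective"  # intentions to act later


@dataclass(frozen=True)
class MemoryRecord:
    """A hypothetical record: one piece of content tagged with its tier."""
    tier: MemoryTier
    content: str


def by_tier(records, tier):
    """Return the records that belong to the given tier, in original order."""
    return [r for r in records if r.tier is tier]


# Illustrative records only.
records = [
    MemoryRecord(MemoryTier.SEMANTIC, "The user prefers metric units."),
    MemoryRecord(MemoryTier.PROSPECTIVE, "Remind the user about the review on Friday."),
]
pending = by_tier(records, MemoryTier.PROSPECTIVE)
```

Typing the tier explicitly is one way to treat memory as a primitive: prospective items can be queried and changed on their own, separately from the other tiers.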
Epistemic Integrity
How do long-running agentic tasks maintain reliable self-knowledge across the knowing / doing / deciding axes? What breaks when sessions end and agents restart? white paper → · practitioner's note →
Evaluation
Evaluation frameworks that don't lie to you. The difference between a metric that feels good and a grader that catches real failure modes.
Orchestration
Why orchestration is harder than it looks — tool quality, skill triggering, the compound failure rate of pipelines nobody stress-tested end to end. tool and skill quality →
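The compound failure rate can be illustrated with a little arithmetic. The sketch below assumes that steps fail independently and uses made-up per-step success rates; `pipeline_success_rate` is a hypothetical helper, not part of any real system.

```python
import math


def pipeline_success_rate(step_success_rates):
    """Probability that every step succeeds, assuming independent failures."""
    return math.prod(step_success_rates)


# Ten steps that each succeed 95% of the time (illustrative numbers only).
steps = [0.95] * 10
end_to_end = pipeline_success_rate(steps)
print(f"end-to-end success: {end_to_end:.1%}")  # prints "end-to-end success: 59.9%"
```

Under these assumptions, a pipeline whose steps each look reliable in isolation completes end to end only about six times in ten, which is why testing steps individually can hide the real failure rate.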
Elsewhere
Research notes, half-baked ideas. Relentlessly overthought, definitely over-architected.