/research
I build diagnostics and guardrails for reasoning in LLMs and agents.
Focus: math reasoning limits, reasoning reliability, and lightweight verification.
Research pipeline
Core themes
Reasoning Validation
Can we automatically verify the logical steps in an agent's Chain of Thought?
Lightweight Formal Methods
Applying lightweight formal methods to probabilistic models to guarantee certain invariants.
Token-level Interpretability
Analyzing activation patterns to predict hallucinations or reasoning failures.
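As a concrete, purely illustrative sketch of the reasoning-validation theme: one lightweight check is to scan a chain-of-thought trace for explicit arithmetic claims and flag any that do not hold. The function and claim format below are hypothetical, not the actual validation layer.

```python
import re

# Match arithmetic claims of the form "<a> <op> <b> = <c>"
# inside a free-text reasoning trace (illustrative pattern only).
STEP = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}

def check_trace(trace: str) -> list[str]:
    """Return the arithmetic claims in the trace that are incorrect."""
    errors = []
    for a, op, b, c in STEP.findall(trace):
        if OPS[op](int(a), int(b)) != int(c):
            errors.append(f"{a} {op} {b} = {c}")
    return errors
```

For example, `check_trace("First, 12 * 3 = 36. Then 36 + 5 = 40.")` flags the second step, since 36 + 5 is 41. Real step validation covers far more than arithmetic, but this shows the shape of a cheap, automatic check.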
Collaboration
I am looking for co-authors, pilot partners, and labs interested in rigorous evaluation of agentic reasoning.
Looking for
Co-authors, research labs, and pilot partners with deployed agent stacks
I bring
Benchmarks, telemetry tooling, and validation-layer prototypes
Ideal collaboration
Run the eval suite on your agent stack; co-author a paper on the findings
Publications & artifacts
Preprints / Drafts
Finite-Space Constraints (FSC): diagnostics + stress tests (draft; prototype repo coming)
Artifacts (tools / repos)
NjiraAI validation layer: agentic safety infra (early access; brief on request)
Agentic Eval Harness + telemetry logger (repo coming soon)