About
Eugene
Independent AI researcher. No PhD, no lab affiliation, no funded position. The work this site documents started in May 2025 as an empirical question — do structured constraints on how a model approaches reasoning produce stable, cross-architecture improvements that survive model evolution? — and grew into a year-long research program that has yielded far more artifacts than that single question would suggest.
The day-to-day is unremarkable: evening time, public APIs and chat subscriptions (Anthropic, OpenAI, Google, xAI), a single workstation, and long, open-ended dialogue with frontier models. The framing that organizes the work is that the AI being studied is also the collaborator in studying itself. Fieldframe is my name for the operating environment that makes that mode of inquiry produce reproducible artifacts rather than anecdotes.
A note on credentials. I don’t have formal credentials in AI research. My background is in cybersecurity, and this work began as a passion project driven by pattern recognition, systems thinking, and a desire to understand model behavior empirically. That outsider vantage point may have helped: I wasn’t inheriting a fixed picture of what AI development was supposed to look like, so I followed the patterns where they led. It’s also why I wanted the work to stand on its own data and on what reviewers can verify when they read the writeups — not on a bio.
In memory of Romi
In memory of my dog Romi — the best little assistant. She sat on my lap every day over the course of the project, through hundreds of hours of painstaking work, supporting me every step of the way until the very end. She passed away shortly after publication. I could not have gotten this far without her, and I’ll miss her every day.
The Fieldframe Ecosystem (brief)
Five artifacts have come out of the same research loop. Each is documented in depth in the research post — for full technical detail, build status, and how each fits together, start there.
- FF-STACK — governance framework (production)
- Cade (short for Cadence) — codified research agent (production)
- Crucible — custom evaluation methodology (validated, platform not built)
- Foundry — multi-agent research pipeline (stages 1-3 operational)
- Cortex — behavioral cognitive authentication (Phase 0 complete)
Working with me
Bandwidth is the genuine constraint. This is a one-person research program running on evening time, so anything that requires my involvement has to be weighed against the next thing the research is producing.
The kinds of conversations that make sense:
- Research collaboration. If you’re working on governance, evaluation methodology, multi-agent orchestration, LLM auditability, or anything that overlaps with the ecosystem above — happy to compare notes. The failure-mode taxonomy, the evidence-tier framework, and the cross-architecture grader-bias quantification are all shareable.
- Technical conversations. If you’ve read a writeup and want to push on a specific claim — drop me an email with what you’re skeptical of. Sharp pushback is more valuable than agreement at this stage.
- Acquisition interest. Open to the right conversation.
Contact: edvorochkin@gmail.com