We publish our research openly and build production systems that apply what we learn.
M. Cray, C. Schmidt — April 2026
We introduce CBTS, an inference-only per-head attribution and intervention framework that maps behavioral architecture in RLHF-trained transformers across 131 behavioral and 19 structural directions. We validate it across five RLHF-trained transformer configurations (Qwen-2.5 at 3B, 7B, and 14B; Phi-3.5-Mini; Llama-3.2-3B).
The paper documents a benchmark-insensitive failure mode in current safety evaluation methodology, characterizes multi-gate safety architecture at per-head resolution, and demonstrates cross-dimensional intervention specificity (69× on-target to off-target ratio).
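As a rough illustration of what an on-target to off-target ratio measures, the sketch below assumes one plausible definition: the absolute shift on the targeted behavioral dimension divided by the mean absolute shift on every other measured dimension. The function name `specificity_ratio`, the dimension names, and the numbers are hypothetical and chosen only for illustration; they are not the paper's definition or its results.

```python
from statistics import mean


def specificity_ratio(shifts, target):
    """Absolute shift on the target dimension divided by the mean absolute
    shift on all other dimensions (an assumed definition, for illustration)."""
    on_target = abs(shifts[target])
    off_target = [abs(v) for name, v in shifts.items() if name != target]
    if not off_target:
        raise ValueError("need at least one off-target dimension")
    baseline = mean(off_target)
    if baseline == 0:
        # No measurable off-target movement: the ratio is unbounded.
        return float("inf")
    return on_target / baseline


# Hypothetical post-intervention score shifts per behavioral dimension.
# These values are made up and are not results from the paper.
example_shifts = {
    "refusal": 3.0,     # the dimension the intervention targets
    "formality": 0.1,
    "verbosity": -0.2,
    "humor": 0.3,
}

ratio = specificity_ratio(example_shifts, "refusal")
print(f"on-target / off-target ratio: {ratio:.1f}")
```

Under this reading, a larger ratio means the intervention moves the targeted behavior while leaving the other measured dimensions comparatively untouched.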
Behavioral control in language models is implemented through structured internal mechanisms that can be characterized, measured, and modified. Current evaluation methodology treats these mechanisms as a black box; current interpretability research focuses heavily on single behavioral dimensions.
Our work extends direction-based mechanistic analysis to a 131-direction behavioral catalog at per-head, sub-layer resolution. The framework is general: safety is the first rigorous demonstration because it supplies clean quantitative scoring, but the same methodology applies to creative expression, reasoning style, aesthetic preference, and other behavioral dimensions we are actively investigating.
Independent auditing capability, meaning the ability to characterize and verify model behavior without relying on the developer's evaluation methodology, is a public good. The benchmark-insensitive failure mode we document suggests that aggregate safety metrics may be systematically inadequate at frontier scales. We publish our findings openly because we judge the defensive value of open publication to exceed that of restricting access.
Alongside our research, we build and deploy AI systems for operating businesses. These systems serve as practical tools and as live deployments where research insights can be tested against real-world usage patterns.
Deployed at The Metalsmiths since 2025. Handles appointment scheduling, customer intake, and team identification by voice.
Built on ElevenLabs, Twilio, Supabase.
Reduces quote generation from 3 hours to 15 minutes. Deployed in service-business operations.
Built on FastAPI, React, PostgreSQL.
For inquiries about custom applied systems: hello@2kingsdev.ai
2KINGSDEV LLC is an independent research lab founded in 2026 by Michael Cray and Christopher Schmidt. We publish research openly on Zenodo and arXiv. Our work is supported by revenue from applied systems and by research funding.
Based in the Pacific Northwest. Incorporated in Washington State.
Active patent filings in mechanistic interpretability and behavioral analysis methodology.