PHINEAS | Jim Vinson - Forward Deployed / Applied AI

Context

The head of English at an ESL academy in Singapore needed a way to produce reading material at defined CEFR levels (the Common European Framework of Reference for Languages, the standard scale for language proficiency), at scale and with deterministic adherence to the target level. Off-the-shelf LLM output couldn’t be trusted to stay on-level, even approximately: a passage requested at B1 would drift into B2 vocabulary, slip back to A2 grammar, or include culture-specific references that broke the assessment.

I took on the build on a speculative basis. The head of English championed the work internally and committed beta-trial time from students and faculty.

Approach

The build is a six-step state machine that forces deterministic output from a probabilistic model. Corpus grounding does the heavy lifting: a custom CEFR corpus of roughly 14,000 core words and phrases, extended with extrapolated morphological variants to speed up processing and sharpen determinism. The six steps are:

1. Topic seed and CEFR target lock
2. Word-frequency check against the corpus for level-appropriate vocabulary
3. Grammar constraint check against CEFR descriptors
4. Draft generation with frozen vocabulary and grammar windows
5. Self-review by a second agent against the CEFR rubric
6. Final pass that re-checks frequency drift and outputs a confidence score
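
A minimal sketch of how the word-frequency check and the final drift re-check could work, assuming the corpus is a plain word-to-level lookup. The corpus entries, the names (CORPUS, off_level_words), and the tokenizer are illustrative placeholders, not the production code.

```python
from __future__ import annotations

import re

# CEFR levels from lowest to highest; list position is used for comparison.
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Hypothetical miniature corpus: word -> lowest level at which it is expected.
# Inflected forms are stored explicitly so a lookup needs no stemming step.
CORPUS = {
    "the": "A1", "a": "A1", "is": "A1", "was": "A1", "dog": "A1",
    "run": "A1", "runs": "A1", "running": "A1",
    "journey": "A2", "decide": "B1", "decided": "B1",
    "reluctant": "B2", "ubiquitous": "C1",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def off_level_words(text: str, target: str) -> dict[str, str]:
    """Return words above the target level or missing from the corpus."""
    limit = CEFR_ORDER.index(target)
    flagged: dict[str, str] = {}
    for word in tokenize(text):
        level = CORPUS.get(word)
        if level is None:
            flagged[word] = "unknown"  # not in the corpus: treat as off-level
        elif CEFR_ORDER.index(level) > limit:
            flagged[word] = level      # in the corpus, but above the target
    return flagged


print(off_level_words("The dog was reluctant.", "B1"))
```

Running the same function on the finished draft gives one plausible way to detect frequency drift: an empty result means every word is at or below the locked level.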

Each step has a defined input/output schema. The LLM produces probabilistic text inside hard guardrails.
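
As a rough sketch of the idea, the pipeline can be modeled as a fixed sequence of steps that share one typed state object, with an output check after each step. Everything named here (Step, PassageState, POSTCONDITIONS, run, demo_handlers) is hypothetical, and the handlers are stubs standing in for the real model calls and corpus lookups.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Step(Enum):
    TARGET_LOCK = 1
    VOCAB_CHECK = 2
    GRAMMAR_CHECK = 3
    DRAFT = 4
    SELF_REVIEW = 5
    FINAL_PASS = 6


@dataclass
class PassageState:
    """Assumed shared schema; each step fills in its own fields."""
    topic: str
    level: str
    vocabulary: frozenset = frozenset()
    grammar: frozenset = frozenset()
    draft: str = ""
    review_ok: bool = False
    confidence: float = -1.0  # -1.0 means "not scored yet"
    history: List[str] = field(default_factory=list)


Handler = Callable[[PassageState], PassageState]

# Output check each step must satisfy before the machine advances.
POSTCONDITIONS: Dict[Step, Callable[[PassageState], bool]] = {
    Step.TARGET_LOCK: lambda s: bool(s.topic)
    and s.level in {"A1", "A2", "B1", "B2", "C1", "C2"},
    Step.VOCAB_CHECK: lambda s: len(s.vocabulary) > 0,
    Step.GRAMMAR_CHECK: lambda s: len(s.grammar) > 0,
    Step.DRAFT: lambda s: bool(s.draft.strip()),
    Step.SELF_REVIEW: lambda s: s.review_ok,
    Step.FINAL_PASS: lambda s: 0.0 <= s.confidence <= 1.0,
}


def run(state: PassageState, handlers: Dict[Step, Handler]) -> PassageState:
    """Run every step in definition order; stop at the first failed check."""
    for step in Step:  # Enum members iterate in definition order
        state = handlers[step](state)
        if not POSTCONDITIONS[step](state):
            raise ValueError(f"{step.name} failed its output check")
        state.history.append(step.name)
    return state


def demo_handlers() -> Dict[Step, Handler]:
    """Stub handlers; a real build would call the model and the corpus."""
    def lock(s): return s
    def vocab(s): s.vocabulary = frozenset({"dog", "park", "walk"}); return s
    def grammar(s): s.grammar = frozenset({"past simple"}); return s
    def draft(s): s.draft = "The dog walked in the park."; return s
    def review(s): s.review_ok = True; return s
    def final(s): s.confidence = 0.9; return s
    return {
        Step.TARGET_LOCK: lock, Step.VOCAB_CHECK: vocab,
        Step.GRAMMAR_CHECK: grammar, Step.DRAFT: draft,
        Step.SELF_REVIEW: review, Step.FINAL_PASS: final,
    }


result = run(PassageState(topic="pets", level="B1"), demo_handlers())
print(result.history)
```

The design choice this illustrates is that the model never decides what happens next: the step order is fixed in code, and a step whose output fails its check halts the run instead of passing questionable text downstream.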

Outcome

The product is now at phineas.app, in formal product development. The pilot bar was outcome-based: a generated passage had to be classroom-ready without edits beyond formatting, and it had to clear staff review unanimously. The target was 60% of passages clearing that bar; the pilot reached just under 85%, which moved the project into open beta. Beta-user trials are running with students and faculty at the original academy.

The pattern (a rubric-grounded, multi-step state machine for deterministic LLM output) generalizes to any domain where the output needs to hit a precise level or category: medical literacy, legal writing, regulatory compliance, and technical documentation tiered by audience.