Kevin Fitzgibbon

Flagship Case Study

Wrench Scribe

AI-assisted maintenance documentation for technical users working in real plant workflows. Built with explicit focus on trust, editability, structured outputs, and reliability under operational constraints.

What it is

Wrench Scribe is an AI workflow product for maintenance teams. It converts unstructured service notes into structured work order language that technicians can review, edit, and submit quickly.

Users

The workflow was designed for the people who create, review, and depend on maintenance records across daily plant operations.

  • Maintenance technicians closing corrective and preventive work
  • Supervisors responsible for record quality and repeatability
  • Planners and reliability teams that depend on clean maintenance history

Problem

In many plants, maintenance notes are written under time pressure and vary by person, shift, and site. That inconsistency makes records hard to trust and hard to reuse. Technicians needed speed at close-out, while supervisors and planners needed clean records they could rely on.

Why this mattered

Documentation quality directly affects planning, repeat-failure analysis, and preventive maintenance decisions. If the workflow adds friction, adoption drops. If quality is weak, downstream teams lose signal. The product had to solve both.

Product approach

The product pairs structured input capture with AI drafting so outputs fit work order workflows instead of free-form chat. Human-in-the-loop review is built in so technicians keep final control. Access was designed to be anonymous-first and low-friction, with privacy-first handling for operational context.

Key product decisions

  • Optimized for structured outputs compatible with work order workflows, not open-ended chat responses
  • Invested in reliability hardening and error handling before broad rollout
  • Balanced latency and cost through iterative model, prompt, and orchestration tuning
  • Protected editability so users could correct edge cases without losing speed gains
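The structured-output decision can be illustrated with a minimal validation sketch. This is a hypothetical example, not Wrench Scribe's actual schema or code: the field names and checks below are assumptions made for illustration only. The idea is that an AI draft is only handed to the technician's review screen once it parses as JSON and matches the expected work order shape.

```python
import json

# Illustrative work order schema. Field names are hypothetical,
# not the product's actual output format.
REQUIRED_FIELDS = {
    "asset_id": str,
    "problem_description": str,
    "work_performed": str,
    "parts_used": list,
}

def validate_work_order(draft_json: str) -> list[str]:
    """Return a list of validation errors; an empty list means schema-valid."""
    try:
        draft = json.loads(draft_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in draft:
            errors.append(f"missing field: {field}")
        elif not isinstance(draft[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    return errors

# A passing draft moves to human review and edit; a failing draft
# is retried or flagged rather than shown to the user.
draft = json.dumps({
    "asset_id": "PUMP-12",
    "problem_description": "Seal leak on discharge side",
    "work_performed": "Replaced mechanical seal",
    "parts_used": ["seal kit"],
})
print(validate_work_order(draft))  # → []
```

A gate like this is what makes a claim such as "100% schema-valid outputs on an eval set" mechanically checkable rather than a matter of reviewer judgment.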

What Kevin owned

I owned product direction from scope definition through launch readiness, while working closely with engineering on system behavior, reliability, and evaluation quality.

  • Defined product scope, success criteria, and release priorities
  • Led user and domain framing based on maintenance operations constraints
  • Partnered with engineering on prompt design, schema strategy, evaluation, and reliability
  • Drove launch readiness, instrumentation, and post-launch iteration planning

Evidence and outcomes

  • Reduced p50 latency from about 9.0s to 3.1s
  • Reduced p95 latency from about 13.0s to 4.6s
  • Reduced cost per request by about 70%
  • Achieved 100% schema-valid outputs on a 100-prompt eval set
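For readers unfamiliar with the p50/p95 framing: these are percentiles of per-request latency, so p95 captures the slow tail that users actually feel. A minimal sketch of how such figures are computed from request logs, using a simple nearest-rank method and made-up sample data (the numbers below are illustrative, not the product's measurements):

```python
def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile over a non-empty list of samples."""
    ordered = sorted(samples)
    # Map the percentile to an index into the sorted samples.
    rank = round(pct / 100 * (len(ordered) - 1))
    return ordered[max(0, min(len(ordered) - 1, rank))]

# Hypothetical per-request latencies in seconds, e.g. from request logs.
latencies_s = [2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 4.1, 4.6]

p50 = percentile(latencies_s, 50)  # typical request
p95 = percentile(latencies_s, 95)  # slow tail
print(p50, p95)
```

Tracking both numbers matters because optimizations that improve the median can still leave a long tail; reporting p50 and p95 together shows the distribution tightened, not just its center.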

Tradeoffs and lessons

  • Fast output only matters when users trust the draft, so editability and consistency were essential adoption levers
  • Strict structure improved downstream usability but needed careful handling of uncommon edge-case language
  • Operational AI products require product judgment on reliability and workflow fit, not only model quality

Demo and screenshots

This public page focuses on product decisions and measurable results. Interface walkthroughs are shared directly in interviews because the workflow includes customer-sensitive operational detail.

  • Technician input capture and draft generation flow
  • Human-in-the-loop review and edit path before submission
  • Structured output handoff for work order and planning workflows