Agent Skill for Cost-Optimized Model Selection
Overview
A published agent skill that helps developers wire the cheapest model that still clears their own accuracy bar, behind one stable OpenAI-compatible endpoint. It is the agent-facing companion to an internal model-optimization / LLM-gateway product.
Why It Exists
Picking the right model for a reasoning step (extraction, classification, summarization, an agent step) is usually guesswork, and the “right” choice changes every time a new model ships. This skill turns that into a benchmark: point it at a step in your code, evaluate candidate models against your own examples, and route to the cheapest one that passes.
What We Built
A skills package installable via npx skills add, containing the optimized-model skill (a SKILL.md definition). The skill benchmarks candidate models against user-supplied examples, selects the cheapest model above the accuracy threshold, and writes a stable om_<id> routing endpoint into the caller’s pipeline, so the id never changes even when the underlying model is later swapped for a better one. Onboarding is intentionally lightweight (email plus a small one-time top-up to fund the benchmark), and each run produces a shareable benchmark report.
Technologies & Approach
A Markdown-defined agent skill following the npx skills convention, fronting a benchmarking-and-routing service exposed over an OpenAI-compatible endpoint. MIT-licensed. The design bet is stable, swappable endpoints that decouple “which model” from the calling code.
Outcome / Impact
Packages an opinionated, rerunnable workflow for keeping pipelines on the cheapest model that meets quality, letting teams continuously re-optimize cost without touching application code.
Capabilities Demonstrated
- Authoring reusable agent skills for the skills ecosystem
- Benchmark-driven, accuracy-gated model selection
- Stable endpoint routing that decouples model choice from code
- Cost optimization across an evolving model catalog