← All work
Tooling · 2026

Agent Skill for Cost-Optimized Model Selection

Overview

A published agent skill that helps developers wire the cheapest model that still clears their own accuracy bar, behind one stable OpenAI-compatible endpoint. It is the agent-facing companion to an internal model-optimization / LLM-gateway product.

Why It Exists

Picking the right model for a reasoning step (extraction, classification, summarization, an agent step) is usually guesswork, and the “right” choice changes every time a new model ships. This skill turns that into a benchmark: point it at a step in your code, evaluate candidate models against your own examples, and route to the cheapest one that passes.

What We Built

A skills package installable via npx skills add, containing the optimized-model skill (a SKILL.md definition). The skill benchmarks candidate models against user-supplied examples, selects the cheapest model above the accuracy threshold, and writes a stable om_<id> routing endpoint into the caller’s pipeline, so the id never changes even when the underlying model is later swapped for a better one. Onboarding is intentionally lightweight (email plus a small one-time top-up to fund the benchmark), and each run produces a shareable benchmark report.

Technologies & Approach

A Markdown-defined agent skill following the npx skills convention, fronting a benchmarking-and-routing service exposed over an OpenAI-compatible endpoint. MIT-licensed. The design bet is stable, swappable endpoints that decouple “which model” from the calling code.

Outcome / Impact

Packages an opinionated, rerunnable workflow for keeping pipelines on the cheapest model that meets quality, letting teams continuously re-optimize cost without touching application code.

Capabilities Demonstrated

  • Authoring reusable agent skills for the skills ecosystem
  • Benchmark-driven, accuracy-gated model selection
  • Stable endpoint routing that decouples model choice from code
  • Cost optimization across an evolving model catalog
More work See all →