AB Test Framework
Compare models with A/B testing for selection
AB Test Framework is a skill for comparing language models through structured A/B testing. It accepts two model identifiers and an array of test prompts, then runs side-by-side evaluations and reports which model performs better, along with a statistical confidence score for the comparison.
Designed for teams evaluating model options or fine-tuning configurations, this framework provides a repeatable methodology for model selection decisions backed by data rather than intuition.
The framework takes three parameters: model_a (first model identifier), model_b (second model identifier), and test_prompts (an array of test cases). It runs each prompt against both models, evaluates the responses, and produces a structured output with per-prompt results, an overall winner, and a confidence score indicating the reliability of the comparison.
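To make the shape of that flow concrete, here is a minimal sketch of the comparison loop in Python. The skill's actual interface is not published here; call_model and judge_pair are hypothetical stand-ins for whatever model client and response scorer you wire in, and the confidence figure is derived from a simple two-sided sign test, one reasonable way (among several) to score such a comparison.

```python
# Hedged sketch of an A/B comparison loop; not the skill's actual source.
from math import comb

def ab_test(model_a, model_b, test_prompts, call_model, judge_pair):
    results = []
    a_wins = 0
    for prompt in test_prompts:
        response_a = call_model(model_a, prompt)
        response_b = call_model(model_b, prompt)
        winner = judge_pair(prompt, response_a, response_b)  # returns "a" or "b"
        a_wins += winner == "a"
        results.append({"prompt": prompt, "winner": winner})
    n = len(test_prompts)
    # Two-sided sign test: probability of a split at least this lopsided
    # under the null hypothesis that both models are equally good.
    k = max(a_wins, n - a_wins)
    p_value = min(sum(comb(n, i) for i in range(k, n + 1)) / 2 ** (n - 1), 1.0)
    return {
        "results": results,                                   # per-prompt outcomes
        "winner": model_a if a_wins * 2 > n else model_b,     # overall winner
        "confidence": 1 - p_value,                            # reliability estimate
    }
```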
Install the skill and prepare your test prompt array covering the scenarios most relevant to your use case. Provide the two model identifiers you want to compare and run the evaluation. Review the detailed results and confidence score to inform your model selection decision.
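Under the same assumptions, usage might look like the following. The model identifiers, prompts, and judge are all placeholders; in practice the judge would be a human rater or an LLM-based evaluator rather than a stub.

```python
# Hypothetical invocation with stub callables standing in for real model calls.
prompts = [
    "Summarize this release note in one sentence.",
    "Write a SQL query that returns the ten most recent orders.",
    "Explain the difference between a mutex and a semaphore.",
]

report = ab_test(
    model_a="model-alpha",   # placeholder identifier
    model_b="model-beta",    # placeholder identifier
    test_prompts=prompts,
    call_model=lambda model, p: f"{model} answer to: {p}",          # stub client
    judge_pair=lambda p, a, b: "a" if len(a) >= len(b) else "b",    # stub scorer
)
print(report["winner"], report["confidence"])
```

A small prompt array keeps the run cheap but caps the achievable confidence; with only a handful of prompts even a clean sweep yields a modest sign-test result, so scale the test set to the stakes of the decision.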
MIT-0 (free to use, modify, and redistribute; no attribution required).
No automatic installation available. Please visit the source repository for installation instructions.