Using Atla AI Selene 1 Mini on AWS
Learn how to deploy Atla AI Selene 1 Mini, a new SOTA SLM-as-a-judge, on Amazon Bedrock and Amazon SageMaker AI
Selene 1 Mini from Atla AI is a new Small Language Model (SLM) that claims to be the best-performing "SLM-as-a-judge", i.e. a language model used to evaluate the outputs of other language models. Selene Mini outperforms prior small evaluation models on average across 11 benchmarks, spanning three types of evaluation tasks:
- Absolute scoring, e.g. “Evaluate the harmlessness of this response on a scale of 1–5.”
- Classification, e.g. “Does this response address the user query? Answer Yes or No.”
- Pairwise preference, e.g. “Which of the following responses is more logically consistent, A or B?”
On some benchmarks, Selene Mini beats models several times its size, outperforming GPT-4o on RewardBench, EvalBiasBench, and Auto-J. It is also the highest-scoring 8B generative model on RewardBench and the top-ranking model on Judge Arena. Selene Mini is a fine-tuned Llama-3.1-8B model, trained on dedicated datasets with a combination of direct preference optimization (DPO) and supervised fine-tuning (SFT) to produce a general-purpose evaluator. The weights are available as open source on Hugging Face.
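To make the evaluation task types above more concrete, here is a minimal sketch of running Selene Mini locally as an absolute-scoring judge with the Hugging Face transformers library. The model repository id and the prompt wording are assumptions on my part; check the Atla AI model card on Hugging Face for the exact repo name and the recommended prompt template.

```python
# Minimal sketch: Selene Mini as an absolute-scoring judge via transformers.
# MODEL_ID and the prompt wording are assumptions -- consult the model card.
from transformers import pipeline

MODEL_ID = "AtlaAI/Selene-1-Mini-Llama-3.1-8B"  # assumed Hugging Face repo id

judge = pipeline("text-generation", model=MODEL_ID, device_map="auto")

eval_prompt = (
    "Evaluate the harmlessness of the following response on a scale of 1-5. "
    "Reply with the score followed by a short justification.\n\n"
    "User query: How do I reset my router?\n"
    "Model response: Unplug the router for 30 seconds, then plug it back in."
)

# The pipeline applies the model's chat template to the messages list.
messages = [{"role": "user", "content": eval_prompt}]
output = judge(messages, max_new_tokens=256)

# The last message in the returned conversation is the judge's verdict.
print(output[0]["generated_text"][-1]["content"])
```

The same pattern covers classification and pairwise-preference prompts: only the instruction text changes, not the inference code.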
Deployment on AWS
To deploy on AWS, I generally rely on two options: managed endpoints with Amazon SageMaker AI, and serverless inference with Amazon Bedrock Custom Model Import. I’ve detailed the steps for both in my blog post on community.aws, so go check it out!
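As a rough sketch of the first option, the snippet below hosts Selene Mini on a SageMaker real-time endpoint using the Hugging Face TGI inference container. The model id, instance type, and IAM role are assumptions; the full walkthrough is in the community.aws post.

```python
# Sketch: deploy Selene Mini to a SageMaker real-time endpoint (TGI container).
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # or the ARN of your SageMaker execution role

model = HuggingFaceModel(
    role=role,
    image_uri=get_huggingface_llm_image_uri("huggingface"),  # Hugging Face TGI container
    env={
        "HF_MODEL_ID": "AtlaAI/Selene-1-Mini-Llama-3.1-8B",  # assumed Hugging Face repo id
        "SM_NUM_GPUS": "1",
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # a single-GPU instance is enough for an 8B model
    container_startup_health_check_timeout=600,
)

response = predictor.predict({
    "inputs": "Does this response address the user query? Answer Yes or No.\n\n"
              "Query: What is the capital of France?\nResponse: Paris.",
    "parameters": {"max_new_tokens": 128},
})
print(response)
```

For the second option, once the weights have been imported with Bedrock Custom Model Import, invocation is a plain Bedrock Runtime call. The model ARN below is a placeholder, and the exact request body schema depends on the imported model's architecture, so check the Bedrock documentation for Llama-style imports.

```python
# Sketch: invoke a model imported via Bedrock Custom Model Import.
import json
import boto3

bedrock = boto3.client("bedrock-runtime")
result = bedrock.invoke_model(
    modelId="arn:aws:bedrock:us-east-1:123456789012:imported-model/EXAMPLE",  # placeholder ARN
    body=json.dumps({
        "prompt": "Which of the following responses is more logically consistent, A or B? ...",
        "max_tokens": 256,
    }),
)
print(json.loads(result["body"].read()))
```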