
Selene
Overview of Selene
Selene by Atla AI: Frontier AI Evaluation Models
What is Selene?
Selene is a suite of open-source LLM Judge models developed by Atla AI, designed to provide precise and reliable evaluations of AI application performance. It helps developers build trust with customers by ensuring the reliability of their generative AI apps through detailed scores and actionable critiques.
How does Selene work?
Selene models act as an LLM-as-a-Judge: given an AI response, they return a score and a written critique. The models are available through Hugging Face Transformers, Ollama, or GitHub.
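As a rough illustration of the score-plus-critique pattern, here is a minimal sketch of building a judge prompt and parsing the judge's reply. The prompt wording, the 1 to 5 scale, the "Critique:/Score:" output format, and the helper names build_judge_prompt and parse_judgement are all assumptions made for this example; they are not Atla's official prompt template, so check the model card for the recommended format.

```python
import re
from typing import Optional, Tuple


def build_judge_prompt(user_input: str, response: str, criteria: str) -> str:
    """Assemble a hypothetical evaluation prompt (not Atla's official template)."""
    return (
        "You are evaluating an AI assistant's response.\n\n"
        f"Evaluation criteria:\n{criteria}\n\n"
        f"User input:\n{user_input}\n\n"
        f"Response to evaluate:\n{response}\n\n"
        "Write a short critique, then give a score from 1 to 5.\n"
        "Use exactly this format:\n"
        "Critique: <your reasoning>\n"
        "Score: <integer from 1 to 5>"
    )


def parse_judgement(text: str) -> Tuple[str, Optional[int]]:
    """Extract (critique, score) from judge output; score is None if missing."""
    score_match = re.search(r"Score:\s*([1-5])\b", text)
    critique_match = re.search(
        r"Critique:\s*(.*?)(?=\n\s*Score:|\Z)", text, re.DOTALL
    )
    score = int(score_match.group(1)) if score_match else None
    # Fall back to the raw text when the assumed format was not followed
    critique = critique_match.group(1).strip() if critique_match else text.strip()
    return critique, score
```

The string returned by build_judge_prompt would be sent to the model as the user message, and parse_judgement would be applied to the generated text. Returning None for a missing score lets the caller decide how to handle replies that do not follow the assumed format.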
Selene Models
Choose the size that fits your evaluation needs from two primary models:
- Selene 1: The flagship model offering industry-leading accuracy across a wide variety of evaluation tasks. Ideal for pre-production evaluations.
- Selene 1 Mini: A lean, optimized version perfect for running evaluations at inference time, prioritizing speed and efficiency.
Key Features and Benefits
- High Accuracy: Selene is designed to provide the most accurate evaluations available.
- Versatile Evaluation: Suitable for a wide variety of eval tasks.
- Optimized for Speed: Selene 1 Mini is optimized for running evals quickly during inference.
- Open Source: Use and contribute to the models through Hugging Face Transformers.
How to Use Selene
You can run Selene with the Hugging Face Transformers library. The example below loads Selene 1 Mini and generates a response:
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "AtlaAI/Selene-1-Mini-Llama-3.1-8B"
# device_map="auto" places the weights on the available device(s)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
prompt = "I heard you can evaluate my responses?"  # replace with your eval prompt
messages = [{"role": "user", "content": prompt}]
# Render the chat template as text, ending with the assistant turn marker
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Move inputs to the device the model was placed on instead of hard-coding "cuda"
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
# Passing the full encoding also supplies the attention mask
generated_ids = model.generate(**model_inputs, max_new_tokens=512, do_sample=True)
# Drop the prompt tokens so that only the newly generated tokens are decoded
generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
Use Cases
- Evaluating Agent Performance: Use Selene to evaluate the performance of AI agents, track errors, and gain instant insights.
- Building Trust: Ensure the reliability of your generative AI app to build trust with customers.
- Pre-Production Evals: Use Selene 1 for rigorous evaluations before deploying your AI application.
- Inference-Time Evals: Use Selene 1 Mini for quick evaluations during inference.
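An inference-time evaluation can be wired in as a simple gate around the application's reply. The sketch below is illustrative only: guarded_reply, the judge callable, the min_score threshold of 4, and the fallback message are hypothetical names and values chosen for this example, not part of any Atla API. The judge is passed in as a function so that any scorer (for instance one backed by Selene 1 Mini) can be plugged in.

```python
from typing import Callable, Optional


def guarded_reply(
    candidate: str,
    judge: Callable[[str], Optional[int]],
    min_score: int = 4,
    fallback: str = "Sorry, I could not produce a reliable answer.",
) -> str:
    """Return the candidate reply only if the judge scores it at or above min_score."""
    score = judge(candidate)
    # Treat a missing score (e.g. unparseable judge output) as a failed check
    if score is not None and score >= min_score:
        return candidate
    return fallback


# Usage with a stand-in judge; a real judge would call the evaluation model
print(guarded_reply("Paris is the capital of France.", judge=lambda reply: 5))
```

Treating a missing score as a failure is a conservative choice; an application that prefers availability over strictness could return the candidate in that case instead.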
Why is Selene important?
As AI applications become more prevalent, ensuring their reliability and trustworthiness is crucial. Selene provides a robust and accurate means of evaluating AI performance, empowering developers to create safer and more reliable AI systems. It is particularly important for building trust with customers, especially in generative AI applications where outputs can be unpredictable.
Where can I use Selene?
You can integrate Selene into your AI development workflow using Hugging Face Transformers. You can also explore Agent Evals by Atla to improve and track agents.
By providing open-source evaluation models, Atla AI contributes to a future with safe and reliable AI.