ML Engineer — Evaluation

Scale AI · San Francisco, CA

Machine Learning · Mid · Full-time

$190K-$300K · Posted 3 weeks ago

About the Role

Scale AI is seeking an ML Engineer to build automated evaluation systems for large language models. You will develop benchmarks, create model comparison tools, and design metrics that accurately measure model capabilities across diverse tasks.

Requirements

3+ years of ML engineering experience
Strong skills in Python and statistical analysis
Experience with LLM evaluation methodologies
Understanding of NLP benchmarks and metrics
Strong software engineering fundamentals

Nice to Have

Experience building leaderboards or benchmark platforms
Background in psychometrics or test design
Familiarity with Scale AI's evaluation platform
Experience with multi-turn evaluation frameworks

Benefits

Equity in a fast-growing AI company
Full benefits package
In-office perks (meals, gym)
Professional development budget
Annual team retreats
Commuter benefits

Skills

Python · LLM Evaluation · NLP · Benchmarking · Statistical Analysis · ML Engineering
