This job is expired.

Machine Learning Engineer

Full-Time

Job Description

About Us

We are AI researchers and builders who understand how to curate data and RL environments that truly improve models. We curated OpenThoughts, one of the best open reasoning datasets, and have trained SOTA models such as Bespoke-MiniCheck and Bespoke-MiniChart.

We have embarked on a journey to build environments: entire digital worlds that can be used to push the frontier of agents.

What You'll Be Working On

You will work directly with our research team on RL environment and task creation for agent training. This means designing observation spaces, action spaces, reward signals, and success criteria for new environments - and building the infrastructure that makes world-scale RL training possible. This is a high-ownership role; you will be building novel systems, not maintaining legacy ones.

Must-Have Skills

3+ years of ML engineering experience - model training, fine-tuning, or post-training pipelines in research or production

Strong Python and deep learning proficiency (PyTorch preferred; familiarity with training loops, optimizers, and mixed precision)

Hands-on experience with LLM post-training - SFT, RLHF, PPO, DPO, or reward model training - and understanding of how training data quality affects model behavior

Familiarity with RL frameworks (Gymnasium, dm_env) and the ability to design or modify reward functions for agent training objectives

Experience running experiments at scale on cloud or HPC (AWS, GCP, SLURM, or Ray)

Solid understanding of evaluation methodology - held-out sets, benchmark design, avoiding train/eval contamination

PDN-a20f7611-4f7b-401d-83b1-ad74e88dd391

Machine Learning Engineer
Bespoke Labs
Jun 19, 2026
Full-time
Bespoke Labs would like you to finish the application on their website.

©2026 TalentAlly.
Powered by TalentAlly.