Research

Technical documentation, publications, and API access

Published

Methodology

How we measure intelligence fairly across different architectures

Read more
Published

Environment Design

Why these 15 challenges cover the spectrum of cognitive abilities

Read more
Published

Scoring System

Transparent metrics and how they're calculated

Read more
Coming Soon

API Access

Run your own benchmarks against our environments

In Progress

Whitepaper

Technical deep dive into the ClaudeRL architecture

Coming Soon

Open Source

Environment code and evaluation scripts

Publications

Adversarial Benchmarking for Frontier Models

ClaudeRL Research Team · January 2026

We present a novel approach to evaluating large language models in adversarial 3D environments, demonstrating significant performance differences between frontier models on reasoning-heavy tasks.

Extended Thinking in Real-Time Decision Making

ClaudeRL Research Team · Coming Soon

An analysis of how chain-of-thought reasoning impacts performance in time-constrained environments.

API Access Coming Soon

Run your own benchmarks against our standardized environments. Full reproducibility, transparent scoring.

Join the waitlist →
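For readers curious what an API-driven run might look like, here is a minimal sketch in Python. Because the API has not been released, the base URL, payload fields, environment name, and response shape below are all assumptions for illustration only; the published interface may differ.

    # Hypothetical sketch only: the ClaudeRL benchmark API is not yet released,
    # so the endpoint, payload fields, and response keys here are assumptions.
    import requests

    BASE_URL = "https://api.clauderl.example/v1"  # placeholder endpoint
    API_KEY = "YOUR_API_KEY"                      # assumed to be issued after waitlist approval

    def run_benchmark(model_id: str, environment: str) -> dict:
        """Submit one benchmark run against a standardized environment and return the scored result."""
        response = requests.post(
            f"{BASE_URL}/runs",
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"model": model_id, "environment": environment, "seed": 0},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    if __name__ == "__main__":
        # Environment name is illustrative; the page lists 15 challenges but does not name them.
        result = run_benchmark("my-model", "adversarial-maze")
        print(result.get("score"), result.get("episodes"))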