AgentStack

Agentic Critical Training (ACT)

by Agentic Critical Training

research

Description

ACT is a **two-stage RL training paradigm** that targets a fundamental weakness of imitation learning (IL): IL teaches agents *what* expert actions look like, but never forces them to understand *why* those actions are better than the alternatives.

Steal Patterns

**GRPO training infrastructure** — Requires GPU cluster; inapplicable to API-only Forge stack

**DeepSpeed ZeRO-3 configuration** — Distributed training framework, not applicable

**Phase 1/Phase 2 training pipeline** — Fully inaccessible without local model weights and training hardware

**K-sample alternative collection** — Only relevant if training; sampling from API models for training data violates most provider ToS
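For readers unfamiliar with GRPO, the sketch below shows the group-relative advantage step that gives the method its name: each of the K sampled responses to a prompt is scored against the mean and standard deviation of its own group. This is a generic illustration of that one formula, not ACT's training code. The helper name `groupRelativeAdvantages`, the `eps` parameter, and the choice of population (rather than sample) standard deviation are assumptions made for the example.

```typescript
// Group-relative advantage, as commonly described for GRPO:
//   A_i = (r_i - mean(r)) / (std(r) + eps)
// Hypothetical helper for illustration; not taken from the ACT codebase.
function groupRelativeAdvantages(rewards: number[], eps: number = 1e-8): number[] {
  if (rewards.length === 0) {
    return [];
  }
  const mean = rewards.reduce((sum, r) => sum + r, 0) / rewards.length;
  // Population variance (divide by K); some implementations divide by K - 1.
  const variance =
    rewards.reduce((sum, r) => sum + (r - mean) * (r - mean), 0) / rewards.length;
  const std = Math.sqrt(variance);
  // eps avoids division by zero when every sample receives the same reward.
  return rewards.map((r) => (r - mean) / (std + eps));
}

// Example: rewards for K = 4 sampled responses to a single prompt.
const exampleAdvantages = groupRelativeAdvantages([1, 0, 0, 1]);
console.log(exampleAdvantages);
```

Responses that beat their group average get a positive advantage and the rest a negative one, so no separate value model is needed. That saving is in training compute only, which is why the pattern still requires local weights and GPUs.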

Tags

research, typescript, open-source, paper, llm
Added: 2026-03-09

Related Tools