Full Pipeline Guide¶
This guide walks through building a complete ACE pipeline from scratch — choosing components, defining an environment, running training, and saving results.
Components¶
A full pipeline needs four things:
- LLM Client — the language model powering all three roles
- Three Roles — Agent, Reflector, SkillManager
- Environment — evaluates agent outputs
- Samples — training data with questions and ground truth
Step 1: Create the LLM Client¶
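The simplest setup is a single LiteLLMClient, which routes to any LiteLLM-supported provider via the model string (the model below is just an example; use whichever provider you have credentials for):
from ace_next import LiteLLMClient

llm = LiteLLMClient(model="gpt-4o-mini")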
For robust JSON parsing with small models, wrap the client with Instructor (requires pip install ace-framework[instructor]):
from ace_next import LiteLLMClient, wrap_with_instructor
llm = wrap_with_instructor(LiteLLMClient(model="ollama/gemma3:1b"))
Step 2: Create the Roles¶
from ace_next import Agent, Reflector, SkillManager
agent = Agent(llm)
reflector = Reflector(llm)
skill_manager = SkillManager(llm)
Optionally, use a cheaper model for the learning roles; the Reflector and SkillManager run on every sample, so this cuts training cost without affecting the Agent's answers:
agent_llm = LiteLLMClient(model="gpt-4o")          # answers the questions
learning_llm = LiteLLMClient(model="gpt-4o-mini")  # reflects and curates skills

agent = Agent(agent_llm)
reflector = Reflector(learning_llm)
skill_manager = SkillManager(learning_llm)
Step 3: Define an Environment¶
The environment evaluates agent outputs. Extend TaskEnvironment and implement evaluate():
from ace_next import TaskEnvironment, EnvironmentResult

class MathEnvironment(TaskEnvironment):
    def evaluate(self, sample, agent_output):
        # Loose containment check: the ground truth just needs to appear in the answer
        correct = str(sample.ground_truth).lower() in str(agent_output.final_answer).lower()
        return EnvironmentResult(
            feedback="Correct!" if correct else f"Incorrect. Expected: {sample.ground_truth}",
            ground_truth=sample.ground_truth,
            metrics={"accuracy": 1.0 if correct else 0.0},
        )
Or use the built-in SimpleEnvironment for basic ground-truth matching:
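from ace_next import SimpleEnvironment

environment = SimpleEnvironment()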
Step 4: Prepare Samples¶
from ace_next import Sample
samples = [
    Sample(question="What is 2+2?", context="", ground_truth="4"),
    Sample(question="Capital of France?", context="", ground_truth="Paris"),
    Sample(question="Who wrote Hamlet?", context="", ground_truth="Shakespeare"),
]
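Each Sample pairs a question with its ground truth. The context field is injected into the agent's prompt alongside the question (see Custom Prompts below), so populate it with supporting text when a task needs it.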
Step 5: Build and Run the Pipeline¶
from ace_next import ACE
environment = MathEnvironment()  # from Step 3

runner = ACE.from_roles(
    agent=agent,
    reflector=reflector,
    skill_manager=skill_manager,
    environment=environment,
)
results = runner.run(samples, epochs=3)
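Each epoch makes one pass over all samples; every sample flows through the Agent, the environment, and then the learning roles, so the skillbook grows as training proceeds.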
Step 6: Save the Skillbook¶
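Persist the learned skillbook so later runs can build on it:
runner.save("trained.json")
For automatic saves during long runs, see Checkpoints below.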
Complete Example¶
from ace_next import (
    ACE, Agent, Reflector, SkillManager,
    LiteLLMClient, Sample, SimpleEnvironment,
)
# LLM and roles
llm = LiteLLMClient(model="gpt-4o-mini")
agent = Agent(llm)
reflector = Reflector(llm)
skill_manager = SkillManager(llm)
# Pipeline
runner = ACE.from_roles(
    agent=agent,
    reflector=reflector,
    skill_manager=skill_manager,
    environment=SimpleEnvironment(),
)
# Training data
samples = [
    Sample(question="What is 2+2?", context="", ground_truth="4"),
    Sample(question="Capital of France?", context="", ground_truth="Paris"),
]
# Train and save
results = runner.run(samples, epochs=3)
runner.save("trained.json")
Checkpoints¶
Save the skillbook automatically during long training runs:
runner = ACE.from_roles(
    agent=agent,
    reflector=reflector,
    skill_manager=skill_manager,
    environment=environment,
    checkpoint_dir="./checkpoints",
    checkpoint_interval=10,  # save every 10 samples
)
This creates:
- ace_checkpoint_10.json, ace_checkpoint_20.json, etc.
- ace_latest.json (always the most recent)
Deduplication¶
Prevent duplicate skills from accumulating (requires pip install ace-framework[deduplication]):
from ace_next import DeduplicationConfig, DeduplicationManager
dedup = DeduplicationManager(DeduplicationConfig(
    enabled=True,
    embedding_model="text-embedding-3-small",
    similarity_threshold=0.85,
))
runner = ACE.from_roles(
    ...,  # agent, reflector, skill_manager, environment as above
    dedup_manager=dedup,
    dedup_interval=10,
)
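With this configuration, skills whose embeddings score above the 0.85 similarity threshold are treated as duplicates; dedup_interval controls how often, in samples, the check runs.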
Custom Prompts¶
The default prompts are v2.1 and work well out of the box. You can pass your own templates via the prompt_template parameter:
agent = Agent(llm, prompt_template="Your custom agent prompt with {skillbook}, {question}, {context}")
reflector = Reflector(llm, prompt_template="Your custom reflector prompt ...")
skill_manager = SkillManager(llm, prompt_template="Your custom skill manager prompt ...")
See Prompt Engineering for template variables and more examples.
Testing Without API Calls¶
Use a mock to test pipeline wiring without making real LLM calls. Any object satisfying the LLMClientLike protocol (with complete() and complete_structured() methods) works:
from unittest.mock import MagicMock

mock_llm = MagicMock()
# Canned response: valid JSON for the fields the Agent expects
mock_llm.complete.return_value = '{"reasoning": "test", "final_answer": "4", "skill_ids": []}'

agent = Agent(mock_llm)
reflector = Reflector(mock_llm)
skill_manager = SkillManager(mock_llm)
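To smoke-test the full wiring, run the mocked roles through a single sample. A minimal sketch reusing the APIs from the earlier steps (the canned JSON above means the mocked agent always answers "4"):
from ace_next import ACE, Sample, SimpleEnvironment

runner = ACE.from_roles(
    agent=agent,
    reflector=reflector,
    skill_manager=skill_manager,
    environment=SimpleEnvironment(),
)
results = runner.run([Sample(question="What is 2+2?", context="", ground_truth="4")], epochs=1)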
Observability¶
Add Opik tracing to any pipeline via extra_steps (requires pip install ace-framework[observability]):
from ace_next import ACE, OpikStep, register_opik_litellm_callback
runner = ACE.from_roles(
    agent=agent,
    reflector=reflector,
    skill_manager=skill_manager,
    environment=environment,
    extra_steps=[OpikStep(project_name="my-experiment")],
)
# Optionally add per-LLM-call cost tracking
register_opik_litellm_callback(project_name="my-experiment")
See Opik Observability for full details.
Going Deeper: Manual Pipeline Composition¶
The ACE.from_roles() runner composes a Pipeline internally. You can build the
same pipeline yourself for full control over step ordering, branching, and
custom steps:
from ace_next import Pipeline, AgentStep, EvaluateStep, learning_tail
pipe = Pipeline([
    AgentStep(agent),
    EvaluateStep(environment),
    # learning_tail appends the Reflector and SkillManager steps that update the skillbook
    *learning_tail(reflector, skill_manager, skillbook),
])
See Composing Pipelines for the complete guide.
What to Read Next¶
- Composing Pipelines — compose custom pipelines from steps
- Async Learning — parallel Reflector execution
- Prompt Engineering — customize prompt templates
- Integration Pattern — wrap existing agents instead
- Opik Observability — monitor costs and traces