Simulators

API reference for chia.simulators. These pages are generated from the docstrings in the source, so they stay in sync with the code.

gem5

chia.simulators.gem5 — gem5 build / run / source-state primitives.

Functions are PATH-BASED and co-located: a gem5 binary is far too large to ship through the object store the way chipyard ships its Verilator binary as bytes, so build / run / capture / restore all operate on a gem5 checkout on the worker’s filesystem and must land on the SAME worker. Gem5Node enforces that via a placement group (see its docstring). Portability across workers is achieved by shipping the small source diff (Gem5Node.capture_gem5_source_state()) and rebuilding, not by moving the binary.

Also defines Gem5ToolServer, the LLM-facing MCP adapter over a Gem5Node (see Gem5Node.spawn_tool()). Because the tool subclasses ChiaTool, importing this module pulls in the MCP/FastMCP stack.

class chia.simulators.gem5.Gem5Isa(*values)

Bases: str, Enum

class chia.simulators.gem5.Gem5Variant(*values)

Bases: str, Enum

class chia.simulators.gem5.Gem5BuildArtifact(binary_path: str, isa: str, variant: str, gem5_root: str, base_rev: str, success: bool, returncode: int, build_duration_s: float, stdout_tail: str, stderr_tail: str)

Bases: object

Result of an scons build. Path-based: binary_path lives on the worker and is only reachable from a run pinned to the same node.

class chia.simulators.gem5.Gem5RunResult(workload_name: str, status: str, returncode: int, outdir: str, num_cycles: int | None, sim_insts: int | None, sim_seconds: float | None, host_seconds: float | None, wall_s: float | None, stats: dict[str, float] = <factory>, stdout_tail: str = '', error_messages: str = '', stats_content: str | None = None, debug_trace: bytes | None = None)

Bases: object

Result of a single gem5 invocation on one workload.

class chia.simulators.gem5.Gem5SourceState(base_rev: str, source_diff: str, config_contents: str = '')

Bases: object

Portable snapshot of edits to a gem5 checkout (ship this, not the binary).

class chia.simulators.gem5.Gem5Node(placement_group=None, require_colocated: bool = True, *, bundle_index: int = 0, reserve_bundle: dict | None = None, pg_strategy: str = 'STRICT_PACK', wait_for_pg: bool = True, pg_ready_timeout_s: float | None = None)

Bases: object

gem5 build / run / source-state primitives sharing one placement.

The four core operations are @staticmethod @ChiaFunction(resources={"gem5": 1.0}) members; __init__ re-binds each into a per-instance _PinnedChiaFn so node.<op>.chia_remote(...) lands on this node’s bundle. Gem5Node.<op>.chia_remote(...) (the class attribute) is the raw, unpinned form.

Co-location: because gem5 binaries stay on the worker filesystem, every member must run on the same node. Placement is decided once at construction:

  • placement_group given -> pin members to bundle_index of it (the node will NOT release a PG it did not create).

  • no placement_group + require_colocated=True -> reserve a 1-bundle {"CPU": 1, "gem5": 1.0} PG (owned + released by this node).

  • no placement_group + require_colocated=False -> no pinning; the caller schedules each .chia_remote / .options(...) call itself.

spawn_tool() (opt-in) creates a Gem5ToolServer co-located on this node’s bundle, exposing the same build/run logic to an LLM over MCP; spawned tools are stopped by close().

Usable as a context manager so spawned tools are stopped and a self-reserved PG is released on exit.

Set up placement and bind the member functions.

Parameters:
  • placement_group – an existing Ray PlacementGroup to schedule onto. If given, require_colocated is moot (placement is already fixed) and this node will not remove the PG on close.

  • require_colocated – when no PG is given, reserve one so all members co-locate. When False, leave placement to the caller.

  • bundle_index – which bundle of the (given or reserved) PG to pin to.

  • reserve_bundle – resource shape of a self-reserved bundle (default {"CPU": 1, "gem5": 1.0}); must provide gem5 >= each member’s requirement (1.0).

  • pg_strategy – placement strategy for a self-reserved PG.

  • wait_for_pg – block on pg.ready() for a self-reserved PG so the node is usable immediately.

  • pg_ready_timeout_s – optional timeout for that wait.
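
The placement rules above can be summarized as a small decision function. This is only an illustrative sketch: PlacementPlan and decide_placement are hypothetical names that do not exist in chia, and the real constructor additionally creates and waits on the Ray placement group.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PlacementPlan:
    """Hypothetical summary of the placement decision (not a chia type)."""
    mode: str                # "external", "self_reserved", or "unpinned"
    owns_pg: bool            # True only when the node reserved the PG itself
    bundle: Optional[dict]   # resource shape to reserve, if any


def decide_placement(placement_group: Any = None,
                     require_colocated: bool = True,
                     reserve_bundle: Optional[dict] = None) -> PlacementPlan:
    if placement_group is not None:
        # Caller-supplied PG: pin to it and never release it.
        return PlacementPlan("external", owns_pg=False, bundle=None)
    if require_colocated:
        # Self-reserved PG: must cover each member's gem5 requirement (1.0).
        bundle = reserve_bundle or {"CPU": 1, "gem5": 1.0}
        if bundle.get("gem5", 0) < 1.0:
            raise ValueError("reserve_bundle must provide gem5 >= 1.0")
        return PlacementPlan("self_reserved", owns_pg=True, bundle=bundle)
    # No pinning: the caller schedules every call itself.
    return PlacementPlan("unpinned", owns_pg=False, bundle=None)
```

In this sketch owns_pg mirrors what owns_placement_group is documented to report: it is True only in the self-reserved case.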

property placement_group

The PG members are pinned to (None when the caller handles placement).

property owns_placement_group: bool

True iff this node reserved its PG and will release it on close().

property task_options: dict

Scheduling opts to co-locate an actor (e.g. a ChiaTool) with this node’s bundle. Empty when the node has no placement group (require_colocated=False) — there is no shared placement to inherit, so an actor given these opts would not be pinned.

spawn_tool(name: str, *, gem5_root: str, config_script: str, workloads, **tool_kwargs)

Create a Gem5ToolServer co-located on this node’s bundle.

Exposes this node’s gem5 build/run/stats to an LLM over MCP, pinned to the same worker so the tool and node.<op>.chia_remote(...) operate on the same on-disk gem5 checkout. The returned tool is tracked and stopped by close().

Requires a placement group — construct the node with require_colocated=True or pass placement_group=...; otherwise there is no shared placement for the tool to join.

tool_kwargs are forwarded to Gem5ToolServer (e.g. config_args, isa, variant, stats_keys, excluded_workloads, timeouts).

close() -> None

Stop any spawned tools, then release the PG iff this node reserved it.

Tools are stopped BEFORE the PG is removed: a tool’s _ToolServerActor is pinned to this PG, so tearing the PG down first would orphan it. Idempotent.
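
The teardown order described above (tools first, then the PG, and nothing on a second call) can be illustrated with a toy stand-in. NodeLifecycle is a hypothetical class written for this example; it records actions in a list instead of touching Ray, and is not the Gem5Node implementation.

```python
class NodeLifecycle:
    """Toy model of the documented close() ordering (not a chia class)."""

    def __init__(self, owns_pg: bool):
        self.owns_pg = owns_pg
        self.tools = []        # names of spawned tools still running
        self.events = []       # ordered log of teardown actions
        self._closed = False

    def spawn_tool(self, name: str) -> None:
        self.tools.append(name)

    def close(self) -> None:
        if self._closed:       # idempotent: later calls do nothing
            return
        self._closed = True
        # Stop tools first, since their actors are pinned to the PG.
        while self.tools:
            self.events.append(("stop_tool", self.tools.pop(0)))
        # Release the PG only if this node reserved it.
        if self.owns_pg:
            self.events.append(("release_pg", None))

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
```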

static build_gem5(gem5_root: str, isa: str = 'RISCV', variant: str = 'opt', *, target: str | None = None, jobs: int | None = None, extra_scons_args: str = '', timeout_s: int = 3600) -> Gem5BuildArtifact

Incremental scons-build a gem5 binary in gem5_root.

Returns a path-based artifact; on failure stderr_tail carries the filtered compiler/linker diagnostics (so this doubles as a fast compile check). Records base_rev (git HEAD) for later diffing.

static run_gem5(gem5_bin: str, config_script: str, outdir: str, *, workload_name: str | None = None, config_args: list[str] | None = None, gem5_args: list[str] | None = None, debug_flags: str | None = None, debug_file: str | None = None, stats_keys: dict[str, list[str]] | None = None, stats_block: int | str = 'last', capture_stats: bool = False, capture_debug_trace: bool = False, cwd: str | None = None, env: dict[str, str] | None = None, timeout_s: int = 3600) -> Gem5RunResult

Run a single gem5 invocation and parse cycle/instruction counters.

Builds the command line as:

gem5_bin [gem5_args] --outdir=<outdir> [--debug-flags=..] [--debug-file=..]
         config_script [config_args]

--kernel / --memory-backend etc. are config-script arguments, so pass them via config_args (keeps this node agnostic to any specific config .py). After the run, reads <outdir>/stats.txt and fills num_cycles / sim_insts / stats (logical names from stats_keys, defaulting to DEFAULT_STATS_KEYS).
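
The argument ordering shown above can be sketched as a small helper. build_gem5_argv is a hypothetical name used only for illustration; the ordering follows the command template in this docstring, and the paths in the usage note are made up.

```python
from typing import List, Optional


def build_gem5_argv(gem5_bin: str, config_script: str, outdir: str,
                    config_args: Optional[List[str]] = None,
                    gem5_args: Optional[List[str]] = None,
                    debug_flags: Optional[str] = None,
                    debug_file: Optional[str] = None) -> List[str]:
    # gem5's own options come before the config script.
    argv = [gem5_bin, *(gem5_args or []), f"--outdir={outdir}"]
    if debug_flags:
        argv.append(f"--debug-flags={debug_flags}")
    if debug_file:
        argv.append(f"--debug-file={debug_file}")
    # Everything after the config script is parsed by that script.
    argv.append(config_script)
    argv.extend(config_args or [])
    return argv
```

For example, build_gem5_argv("build/RISCV/gem5.opt", "cfg.py", "m5out", config_args=["--kernel", "a.elf"]) places --kernel after cfg.py, so the config script receives it rather than gem5 itself.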

stats_block picks which dump to read (“last” = final cumulative dump; pass “first” for configs that reset stats at an ROI workbegin).

static capture_gem5_source_state(gem5_root: str, *, base_rev: str | None = None, diff_paths: list[str] | None = None, config_path: str | None = None) -> Gem5SourceState

Capture a portable snapshot of the current edits in gem5_root.

Computes git diff <base_rev> -- <diff_paths>, first marking untracked files under those paths as intent-to-add so brand-new files (e.g. a new SimObject’s .py/.cc/.hh) are included, then un-staging them to leave the index clean. Optionally snapshots config_path.
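
The git sequence described above might look roughly like the following. capture_commands is a hypothetical helper that only returns the commands in order, so it can be read without a checkout; the exact flags the real implementation uses are not shown in this reference and are assumed here.

```python
from typing import List


def capture_commands(base_rev: str, diff_paths: List[str],
                     untracked: List[str]) -> List[List[str]]:
    """Return the git commands, in order, for one capture.

    `untracked` is the list of untracked files under diff_paths, e.g. as
    printed by `git ls-files --others --exclude-standard -- <diff_paths>`.
    """
    cmds: List[List[str]] = []
    if untracked:
        # Intent-to-add makes brand-new files visible to `git diff`.
        cmds.append(["git", "add", "--intent-to-add", "--", *untracked])
    cmds.append(["git", "diff", base_rev, "--", *diff_paths])
    if untracked:
        # Un-stage again so the index is left as it was found.
        cmds.append(["git", "reset", "--quiet", "--", *untracked])
    return cmds
```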

static restore_gem5_source_state(gem5_root: str, state: Gem5SourceState, *, restore_paths: list[str] | None = None, config_path: str | None = None) -> tuple[bool, str]

Reset gem5_root to state.base_rev and apply state.source_diff.

git checkout <base_rev> -- <paths> + git clean -fd <paths> (to drop untracked leftovers from a prior apply that would collide with the next apply) + git apply, then optionally write state.config_contents to config_path. Caller is responsible for a subsequent build_gem5. Returns (ok, message).
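
A minimal sketch of the checkout / clean / apply sequence, stopping at the first failing step and returning (ok, message). restore_sketch and its run parameter are hypothetical; the sketch also assumes the diff has already been written to patch_file (an absolute path), which is a detail this reference does not specify.

```python
import subprocess
from typing import Callable, List, Optional, Tuple

Runner = Callable[[List[str]], Tuple[int, str]]


def restore_sketch(gem5_root: str, base_rev: str, paths: List[str],
                   patch_file: str,
                   run: Optional[Runner] = None) -> Tuple[bool, str]:
    def _default_run(cmd: List[str]) -> Tuple[int, str]:
        proc = subprocess.run(cmd, cwd=gem5_root,
                              capture_output=True, text=True)
        return proc.returncode, proc.stderr.strip()

    run = run or _default_run
    steps = [
        ["git", "checkout", base_rev, "--", *paths],  # reset tracked files
        ["git", "clean", "-fd", "--", *paths],        # drop untracked leftovers
        ["git", "apply", patch_file],                 # re-apply the saved diff
    ]
    for cmd in steps:
        returncode, err = run(cmd)
        if returncode != 0:
            return False, f"git {cmd[1]} failed: {err}"
    return True, "restored"
```

The injectable run callable is a design choice of this sketch only: it lets the sequence be exercised without a real gem5 checkout.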

static parse_gem5_stats(stats_text: str, stats_keys: dict[str, list[str]] | None = None, stats_block: int | str = 'last') -> dict[str, float]

Map logical name -> value from stats.txt text.

stats_keys maps a logical name to ordered candidate gem5 stat names (first present wins); defaults to DEFAULT_STATS_KEYS. stats_block selects which Begin/End Simulation Statistics dump to read (“first” | “last” | int index, negatives allowed).
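
The lookup rules above (block selection, then first-present-wins over candidate names) can be sketched as follows. This is not the library's parser: parse_stats_sketch and EXAMPLE_STATS_KEYS are hypothetical, and the candidate stat names are assumptions because the contents of DEFAULT_STATS_KEYS are not shown in this reference.

```python
from typing import Dict, List, Optional, Union

# Assumed mapping for illustration; the real DEFAULT_STATS_KEYS may differ.
EXAMPLE_STATS_KEYS = {
    "num_cycles": ["system.cpu.numCycles", "system.cpu0.numCycles"],
    "sim_insts": ["simInsts", "sim_insts"],
}


def split_blocks(text: str) -> List[List[str]]:
    """Split stats.txt text into one list of lines per statistics dump."""
    blocks, current = [], None
    for line in text.splitlines():
        if "Begin Simulation Statistics" in line:
            current = []
        elif "End Simulation Statistics" in line:
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None:
            current.append(line)
    return blocks


def parse_stats_sketch(text: str,
                       stats_keys: Optional[Dict[str, List[str]]] = None,
                       stats_block: Union[int, str] = "last") -> Dict[str, float]:
    blocks = split_blocks(text)
    index = {"first": 0, "last": -1}.get(stats_block, stats_block)
    try:
        block = blocks[index]
    except (IndexError, TypeError):
        return {}
    raw: Dict[str, float] = {}
    for line in block:
        parts = line.split()
        if len(parts) >= 2:
            try:
                raw[parts[0]] = float(parts[1])
            except ValueError:
                pass  # not a "name value" line
    out: Dict[str, float] = {}
    for logical, candidates in (stats_keys or EXAMPLE_STATS_KEYS).items():
        for name in candidates:  # first candidate present wins
            if name in raw:
                out[logical] = raw[name]
                break
    return out
```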

static parse_gem5_stats_file(stats_path: str, stats_keys: dict[str, list[str]] | None = None, stats_block: int | str = 'last') -> dict[str, float]

Like parse_gem5_stats() but reads from a path; missing file -> {}.

static truncate_gz_trace(path: str, max_decompressed_bytes: int) -> tuple[int, bool]

In-place head-truncate a gzipped trace to a decompressed-byte cap, ending on a line boundary; returns (retained_bytes, was_truncated). Stays valid gzip so downstream readers needn’t know it was trimmed.
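
One way to implement this kind of head truncation is sketched below. truncate_gz_sketch is a hypothetical stand-in, not the library function: it holds the retained head in memory for simplicity, and it assumes that a head containing no newline at all is reduced to an empty file.

```python
import gzip
import os
import tempfile
from typing import Tuple


def truncate_gz_sketch(path: str, max_decompressed_bytes: int) -> Tuple[int, bool]:
    # Read one byte past the cap to learn whether anything would be cut.
    with gzip.open(path, "rb") as src:
        head = src.read(max_decompressed_bytes + 1)
    if len(head) <= max_decompressed_bytes:
        return len(head), False

    # Trim to the cap, then back up to the last complete line.
    head = head[:max_decompressed_bytes]
    cut = head.rfind(b"\n")
    head = head[:cut + 1] if cut >= 0 else b""

    # Re-compress next to the original, then swap it into place.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".gz")
    os.close(fd)
    with gzip.open(tmp, "wb") as dst:
        dst.write(head)
    os.replace(tmp, path)
    return len(head), True
```

Because the result is re-compressed rather than byte-sliced, the output is a complete gzip stream that ordinary readers can open.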

static summarize_o3_pipeview(trace_path: str, max_first: int = 30, max_slowest: int = 10, tick_per_cycle: int = 1000, reservoir_size: int = 1000) -> str

Stream an O3PipeView --debug-file trace into a compact markdown digest (per-stage wait-cycle stats, slowest instructions, first-N table).

O3-CPU-specific. Memory stays bounded (a few tens of KB) regardless of trace length: percentiles come from a reservoir sample, mean/max/count are exact. The raw trace on disk remains the authoritative artifact.
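
The bounded-memory idea (exact count/mean/max plus a fixed-size reservoir for percentiles) can be sketched with classic reservoir sampling (Algorithm R). StreamingStats is a hypothetical class for illustration and says nothing about how the digest itself is formatted.

```python
import random
from typing import List


class StreamingStats:
    """Exact count/mean/max, approximate percentiles from a reservoir."""

    def __init__(self, reservoir_size: int = 1000, seed: int = 0):
        self.k = reservoir_size
        self.rng = random.Random(seed)
        self.sample: List[float] = []
        self.count = 0
        self.total = 0.0
        self.max = float("-inf")

    def add(self, x: float) -> None:
        self.count += 1
        self.total += x
        self.max = max(self.max, x)
        if len(self.sample) < self.k:
            self.sample.append(x)
        else:
            # Algorithm R: keep the new item with probability k / count.
            j = self.rng.randrange(self.count)
            if j < self.k:
                self.sample[j] = x

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0

    def percentile(self, p: float) -> float:
        """Approximate p-th percentile (0..100) from the reservoir."""
        if not self.sample:
            return 0.0
        ordered = sorted(self.sample)
        idx = min(len(ordered) - 1, int(p / 100 * len(ordered)))
        return ordered[idx]
```

Memory is proportional to reservoir_size rather than to the number of values seen, which is the property the docstring relies on.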

class chia.simulators.gem5.Gem5ToolServer(name: str, gem5_root: str, config_script: str, workloads: dict[str, str] | str, *, gem5_bin: str | None = None, isa: str = 'RISCV', variant: str = 'opt', config_args: list[str] | None = None, kernel_flag: str = '--kernel', stats_keys: dict[str, list[str]] | None = None, stats_block: int | str = 'last', excluded_workloads: set[str] | None = None, workload_glob: str = '*.elf', max_workloads_per_call: int = 5, build_timeout_s: int = 3600, run_timeout_per_workload_s: int = 3600, expose: tuple[str, ...] | None = None, task_options: dict | None = None)

Bases: ChiaTool

LLM-facing MCP tool server exposing Gem5Node’s build/run/stats.

A thin adapter over the node, not a reimplementation:

  • Each method calls the corresponding Gem5Node function locally (a @ChiaFunction invoked directly runs in-process — no extra Ray hop), so one battle-tested build/run implementation serves both the programmatic node path and this agentic path.

  • Operational arguments (gem5_root, gem5_bin, config_script, timeouts, …) are bound at construction, NOT exposed to the LLM — the model only chooses what to run, never where/how the environment is wired.

  • Results are rendered to compact text (the dataclasses are for code, not for a model to read), and long-running calls emit SSE keepalives so the CLI’s idle timer doesn’t abandon the stream and drop the result.
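
The keepalive pattern mentioned in the last bullet can be sketched with plain asyncio: wait on the long-running work in short slices and send a ping whenever a slice times out. with_keepalive and the ping callback are hypothetical; how the tool actually emits SSE keepalives through its MCP context is not shown in this reference.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_keepalive(work: Awaitable[T],
                         ping: Callable[[], Awaitable[None]],
                         interval_s: float = 15.0) -> T:
    """Await `work`, calling `ping()` every `interval_s` seconds meanwhile."""
    task = asyncio.ensure_future(work)
    while True:
        done, _pending = await asyncio.wait({task}, timeout=interval_s)
        if done:
            # Re-raises the work's exception, if it failed.
            return task.result()
        await ping()  # still running: keep the stream alive
```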

Co-location: the tool runs build/run against gem5_root on the worker’s filesystem, so pin the actor to the same bundle as whatever bash tool the LLM edits source through, via task_options (exactly as the loop pins QuickRunTool). Gem5Node.spawn_tool() wires this up automatically.

LLM-facing methods:
  • {name}_build() — incrementally rebuild gem5 from the current on-disk source and report OK / compiler errors (a fast compile check).

  • {name}_run(workloads, skip_build=False) — build (unless skipping) and run 1..N workloads, returning a small cycles/insts/status table.

  • {name}_stats(workload, pattern) — grep that workload’s most recent stats.txt for a regex (drill into gem5 counters).

  • {name}_list_workloads() — list the workloads available here.

Pass expose=(...) to register only a subset of the four tools — e.g. expose=("build",) makes this server a pure compile-check tool.

Initializes ChiaTool with a name and optional resource requirements.

async build(ctx: Context) -> str

Incrementally rebuild gem5 from the current on-disk source and report the result.

Use after a meaningful source edit to catch compile errors before spending a run on a binary that won’t build. Returns OK (<secs>) on a clean build, or FAIL ... followed by the relevant compiler/linker diagnostics. A TIMEOUT is inconclusive — the build did not finish in time, NOT proof your code is broken; narrow your edit or try again.

async run(workloads: list[str], ctx: Context, skip_build: bool = False) -> str

Build (unless skip_build) and run 1..N workloads against the current source/config, returning a markdown table of cycles + instructions + status.

Set skip_build=True to reuse the existing binary when you’ve only changed the gem5 config, not source. After this returns, call {name}_stats("<workload>", "<regex>") to drill into any workload’s gem5 counters. A single call runs at most max_workloads_per_call workloads.

stats(workload: str, pattern: str, max_lines: int = 40) -> str

Grep the most recent run of workload for stats.txt lines matching the regex pattern (capped at max_lines).

Use to read gem5 internal counters (e.g. IntDiv|FuncUnit, iqFullEvents|blockedCycles, dcache.demand, IPC|CPI).
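
The grep itself amounts to a capped regex filter over stats.txt lines. A minimal sketch, assuming the file text is already in hand: grep_stats is a hypothetical helper, and the wording of the "omitted" and "no match" messages is invented for the example.

```python
import re


def grep_stats(stats_text: str, pattern: str, max_lines: int = 40) -> str:
    rx = re.compile(pattern)
    hits = [ln for ln in stats_text.splitlines() if rx.search(ln)]
    shown = hits[:max_lines]
    if len(hits) > max_lines:
        # Tell the reader the cap was hit instead of silently dropping lines.
        shown.append(f"... ({len(hits) - max_lines} more lines omitted)")
    return "\n".join(shown) if shown else "(no matching stats)"
```

A pattern such as "IPC|CPI" uses regex alternation, so either counter name matches.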

list_workloads() -> str

List the workloads available on this worker (excluded ones hidden).