Quickstart: Say Hello (World) with CHIA
In this brief introduction to using CHIA, we explore 4 ways you can use CHIA to say Hello (World)!
What we will do in this tutorial
As we describe in chia-basics, a CHIA workflow consists of two parts: a flow and a cluster. The flow is a Python script which orchestrates loops and sequences of tasks, and a cluster is the compute substrate on which a flow executes.
In this tutorial, we will design a simple CHIA workflow (both flow and cluster) to say Hello.
First, we will build a very simple cluster (only requiring a single machine, with the option to use multiple) with a few worker nodes from Chipyard, and write a flow which prints “Hello World!” from a worker in the cluster.
Second, we will use that simple cluster to build a verilator simulator of an in-order RocketChip core in Chipyard, compile a C program which prints “Hello World!” for the RocketChip core, and run the program in our Verilator simulator.
Third, we will add an OpenCode AI agent to the cluster and flow, and ask it to say “Hello World!”
Fourth, and finally, we will ask the AI agent to edit our RocketChip’s RTL using an MCP tool. We will ask it to add a line to the RocketChip source which prints to the Verilator output “Hello World!” after 1000 instructions have been committed by the processor.
Requirements
For this tutorial you need a Linux machine with an SSH server. It needs to be able to use public key authentication, and you will need a key set up for SSHing into the machine. This machine must have Docker installed. Additionally, this machine must have CHIA installed.
Optionally, if you have multiple machines available to you on the same LAN (for example, multiple research computers at a single university), there will be sections of this tutorial which allow you to see CHIA’s ability to spread compute across multiple physical machines. Each of these machines must have Docker installed, but does not need CHIA installed.
The chia/examples/hello-world directory contains checkpoints of the code after each step.
0. Project Setup
Start by creating a directory for the project on your machine. Open a shell in that directory and activate the conda environment where you have installed CHIA. We will assume in this tutorial that this environment is named chia_env.
mkdir chia-hello-world
cd chia-hello-world
conda activate chia_env
You should assume that, unless otherwise specified, all commands in this tutorial are run from within the chia_env conda environment and in the project’s directory.
1. print() ing Hello from a worker
CHIA clusters are described in .yaml format. We will start by creating a file called cluster.yaml.
First, we need to add the provider key. The provider key is where we specify the head machine of the cluster. Provider only requires the subkey head_ip.
head_ip specifies the hostname or IP address of the physical machine from which the cluster will be managed. You should fill in this key with according to the hostname or private IP address of the machine on which you installed CHIA. CHIA always runs the head node on an existing machine; cloud workers are added via aws_nodes / gcp_nodes (not covered in this tutorial).
There is no worker_ips key: the machines eligible to host workers are taken from the compatible_ips you list on each node type below, so every worker machine is named exactly once next to the workers that run on it.
provider:
head_ip: ${THIS_MACHINE}
CHIA fills variables specified with ${VARIABLE} in the configuration from environment variables if they are defined, so you could copy this code block directly and just export a value for ${THIS_MACHINE} as an environment variable.
Next, we need to add the authentication credentials we will use to bring the cluster up. The auth key needs two subkeys: ssh_user and ssh_private_key. ssh_user is the username you use to login to the physical machines in the cluster, and ssh_private_key is a key for authenticating your user. If your keys are saved in your SSH agent, you do not need to specify ssh_private_key. CHIA allows you to override the username and key on a per machine basis.
auth:
ssh_user: ${USER}
ssh_private_key: ~/.ssh/<ssh_key_name> # Comment if credentials are in SSH agent
overrides: # Not needed for this tutorial
<hostname>:
ssh_user: otheruser
Now, we will specify our logical workers to map onto our physical machines. As described above, we will need a worker for Chipyard, so let’s add that into our configuration.
available_node_types:
hello_chipyard:
resources: {"chipyard": 1}
num_workers: 1
compatible_ips: [${THIS_MACHINE}] # Or [Another_host]
worker_setup_commands: ["source /home/ray/chipyard/env.sh"]
docker:
image: "ghcr.io/ucb-bar/chia-chisel-build:latest"
container_name: "chia-chipyard-${USER}"
run_options:
- --shm-size=10.24gb # More shared memory
pull_timeout: 7200 # default is 600 seconds but chia-chisel-build is large
Node types are specified under the available_node_types key, where subkeys are the names of different node types. Below, we introduce a node type named hello_chipyard. This node exposes a resource named “chipyard” broadcasting to the flow that it has chipyard available. We will look at how this is leveraged later in this tutorial. We also specify the number of workers we want of this type (1), the IP addresses/hostnames of machines which can support this worker type (your head node machine or optionally another machine you have available to you), and a set of commands which are run before executing any tasks on this worker (in this case we source the chipyard environment shell script).
Finally, we also specify a docker image (we will use CHIA provided Docker images for Chipyard and Verilator), a container_name, arguments to pass to Docker during container construction (run_options), and in the Chipyard case, a large pull_timeout because the Chipyard container is fairly large. Container names are appended with numbers so that multiple workers can be constructed of a given type without any container naming conflicts. When multiple CHIA users are sharing a system, we recommend including your username in the container name.
Let’s also add a Verilator node.
available_node_types:
# ...
hello_verilator:
resources: {"verilator_run": 1}
num_workers: 1
compatible_ips: [${THIS_MACHINE}] # Or [Another_host]
docker:
image: "ghcr.io/ucb-bar/chia-verilator-run:latest"
container_name: "chia-verilator-${USER}"
run_options:
- --ulimit nofile=65536:65536
- --shm-size=10.24gb
The final piece for the cluster is specifying commands which are run automatically at different moments when bringing up the cluster and running jobs on the cluster.
head_env_commands are run to set up the environment for all job’s run on the head node.
head_setup_commands are run to set up the environment on the head only when bringing up the cluster
head_start_ray_commands are run to start Ray on the head node.
worker_start_ray_commands are run to start Ray on the worker nodes.
In general, the these commands should source your shell environment scripts (like ~/.bashrc), and activate your conda CHIA environment as shown below. When activating Ray, setting --dashboard-agent-listen-port=0 ensures that the Ray dashboard agent will choose an open port instead of failing to open an in-use port, avoiding conflicts when multiple CHIA clusters have workers on the same machines. Currently, CHIA only supports one head node per physical machine, but a machine can host workers in many separate CHIA clusters.
head_env_commands: ["source ~/.bashrc && conda activate chia_env"]
head_setup_commands: ["source ~/.bashrc && conda activate chia_env"]
head_start_ray_commands:
- "source ~/.bashrc && conda activate chia_env && ray stop"
- "source ~/.bashrc && conda activate chia_env && \
ray start --head --port=6379 \
--include-dashboard=True --dashboard-agent-listen-port=0"
worker_start_ray_commands:
- "ray stop"
- "ray start --address=$RAY_HEAD_IP:6379 --dashboard-agent-listen-port=0"
That’s our cluster! With all of these keys in our cluster.yaml, we can now run the following command to create the cluster:
chia up cluster.yaml
Now let’s write the simplest CHIA flow we could write: a flow that dispatches a print() statement onto any available worker in the cluster. Create a script called hello-world.py, and add the following lines:
from chia.base.ChiaFunction import ChiaFunction, get
@ChiaFunction()
def print_hello_world():
print("Hello World (#1) from a remote call!")
Here we import the ChiaFunction decorator, and the get function. @ChiaFunction is how we declare that a function is a CHIA node. It gives us the ability to schedule the function onto the cluster, using fn.chia_remote(), as shown below. Additionally, any function annotated with @ChiaFunction, even when not called using fn.chia_remote(), is profiled by CHIA’s profiler. Next, add this to the python file:
def main():
get(print_hello_world.chia_remote())
if __name__ == "__main__":
main()
chia_remote() is non-blocking and returns a tag which is used to eventually wait for the function to finish and collect it’s results. The easiest way to wait for and collect these results is with get(), as we do above.
That’s the full script. We can schedule this script onto the cluster using the following command:
chia job submit --working-dir . -- python hello-world.py
In the output you should see lots of messages from the cluster, including a line towards the end of the run of the script which says:
(print_hello_world pid=979878) Hello World (#1) from a remote call!
2. Print Hello in Verilator
In this step, we will show how you can leverage CHIA to easily use existing hardware design tools. Our example uses Chipyard and Verilator. Specifically, we will add the following three things into our CHIA flow:
Compiling a single-file C program to run on RocketChip using a custom CHIA node.
Elaborating a RocketChip core and building a Verilator simulator for it using a provided CHIA node.
Running our C program on our RocketChip core in the Verilator simulator using a provided CHIA node.
We start by writing the simple C program we will use called helloworld.c, which just prints “Hello World (#2) from Verilator!”
#include <stdio.h>
int main(void)
{
printf("Hello World (#2) from Verilator!\n");
return 0;
}
We import CHIA’s pre-made Chipyard Chisel Build and Verilator nodes. We will also import some other libraries we will use in this section.
from chia.base.ChiaFunction import ChiaFunction, get # Already here
from chia.chipyard.chisel_build_node import *
from chia.chipyard.verilator_run_node import *
from pathlib import Path
import subprocess
from ray import ObjectRef # Only for type annotations
Next, we create a new CHIA node for compiling single-file C programs to run on our RocketChip core. CHIA provides a node for doing this, but we include it here for instructive purposes.
@ChiaFunction(resources={"chipyard": 1})
def compile_program(src_contents : str) -> bytes:
with open("temp.c", "w", encoding="utf-8") as src:
src.write(src_contents)
cmd = []
cmd += ["/home/ray/chipyard/.conda-env/riscv-tools/bin/riscv64-unknown-elf-gcc"]
cmd += ["-static"]
cmd += ["-specs=htif_nano.specs"]
cmd += ["temp.c", "-o", "temp.riscv"]
subprocess.run(
cmd,
capture_output=True,
text=True,
)
with open("temp.riscv", "rb") as elf:
elf_data = elf.read()
return elf_data
compile_program is annotated with @ChiaFunction, indicating that we want to be able to schedule it onto our cluster. In addition, it specifies resources={"chipyard": 1}, meaning that it must be scheduled onto a worker with an available exposed resource named chipyard, and will consume 1 of that resource for the duration of it’s running. Above, when creating our cluster, we defined our hello_chipyard worker with a resource called chipyard, so this node will be able to run on that worker.
compile_program takes as input the contents of a C source file. It writes the contents into a temporary file called temp.c on the worker, and then compiles it using a subprocess, into a file called temp.riscv, the contents of which are returned by the function.
Next, in our main function, we add the following code.
CHIPYARD_PATH = "/home/ray/chipyard" # Our Docker container's Chipyard path
CHIPYARD_CONFIG = "RocketConfig"
def main():
get(print_hello_world.chia_remote()) # Already here
# First new block
c_src = Path("helloworld.c").read_text(encoding="utf-8")
testBinFuture : ObjectRef[bytes] = compile_program.chia_remote(c_src)
# Second new block
cb_node = ChiselBuildNode(
CHIPYARD_PATH, CHIPYARD_CONFIG, target = BuildTarget.VERILATOR
)
buildFuture : ObjectRef[BuildArtifact] = cb_node.build.chia_remote(cb_node)
# Third new block
verilator_node = VerilatorRunNode()
run_output : RunResult = get(
verilator_node.run.chia_remote(
verilator_node, buildFuture, testBinFuture,
"helloworld.riscv", "/home/ray"
)
)
print("Output of Verilator run (run_output.log): ")
print(run_output.log)
The first new block reads in our C program and compiles it using the compile_program node we just made.
The second creates a ChiselBuildNode for a RocketConfig targeting a Verilator simulator, and builds the Verilator simulator.
The third block takes the results from the first two blocks, and runs our Verilator simulation in the home/ray directory on our hello_verilator worker, naming the passed in test binary helloworld.riscv for profiling purposes. Finally, we print the output (returned in run_output.log).
You may notice that we don’t call get() on the returns of the chia_remote() calls for compiling and building. This is because chia_remote() calls can wait for and collect the final results from past remote function executions. Any argument to a chia_remote() call can be of the parameter’s type T or can be an ObjectRef[T].
If you submit this script as above to the cluster, the output should now include the following (including our print from our C file):
(print_hello_world pid=996505) Hello World (#1) from a remote call!
Output of Verilator run (run_output.log):
Hello World (#2) from Verilator!
[UART] UART0 is here (stdin/stdout).
- /home/ray/chipyard/sims/verilator/generated-src/chipyard.harness.TestHarness.RocketConfig/gen-collateral/TestDriver.v:179: Verilog $finish
Note that this script will take a few minutes to build the RocketConfig and run Verilator. If you are working on a machine where you can open a web browser, you can monitor progress using the Ray dashboard by opening http://127.0.0.1:8265/#/overview. Alternatively, you can forward this address to another computer and look at it in your browser there (VSCode provides support for this when working in it’s Remote SSH extension).
3. Asking an LLM to say Hello
In this step, we add a new worker to the cluster which will provide the OpenCode CLI. OpenCode is a free and open source coding agent which comes with some free to use models. We will ask OpenCode to say “Hello World!”.
First, we will add a new hello_opencode worker type to our cluster.yaml configuration which uses our CHIA provided OpenCode docker image. The following block should be sufficient.
available_node_types:
# other worker types ...
hello_opencode:
resources: {"opencode_creds": 1}
worker_setup_commands: ["source ~/.bashrc"]
num_workers: 1
compatible_ips: [${THIS_MACHINE}]
docker:
image: ghcr.io/ucb-bar/chia-opencode:latest
container_name: "chia-opencode-${USER}"
We can expand our already running cluster with the following command.
chia up --add cluster.yaml
This version of chia up looks at the already running cluster, and compares it to the current configuration in cluster.yaml. CHIA will attempt to bring up any workers which were not yet instantiated or which have died.
Next, we will add a new import and a small block to our flow script which prompts OpenCode using the free big-pickle model.
from chia.base.ChiaFunction import ChiaFunction, get
from chia.chipyard.chisel_build_node import *
from chia.chipyard.verilator_run_node import *
from pathlib import Path
import subprocess
from ray import ObjectRef
from chia.models.opencode import * # New line
# ...
def main():
get(print_hello_world.chia_remote()) # Already here
llm = OpenCodeLLM(model="opencode/big-pickle")
resp : QueryResult = get(llm.prompt.chia_remote(llm,
"Can you please respond exactly \"Hello World (#3) from OpenCode!\""
))
print("LLM Responded:")
print(resp.result)
# Already here
c_src = Path("helloworld.c").read_text(encoding="utf-8")
testBinFuture : ObjectRef[bytes] = compile_program.chia_remote(c_src)
You can now submit the job. You can let it run to completion again if you want, but if you want to stop it early, you can use the following. First, give the job a submission id when you submit it:
chia job submit --submission-id RUNSTEP3 --working-dir . -- python hello-world.py
Then, after you’ve seen the LLM’s response, you can stop tailing the log of the run with CRTL+C and stop the job completely by running the following command:
chia job stop RUNSTEP3
Note that the job will continue running to completion unless you explicitly stop it.
In the output of this run, you should now see the following lines:
(print_hello_world pid=1057797) Hello World (#1) from a remote call!
LLM Responded:
Hello World (#3) from OpenCode!
Output of Verilator run (run_output.log):
Hello World (#2) from Verilator!
[UART] UART0 is here (stdin/stdout).
- /home/ray/chipyard/sims/verilator/generated-src/chipyard.harness.TestHarness.RocketConfig/gen-collateral/TestDriver.v:179: Verilog $finish
4. Editing RTL to say Hello
Finally, we get to tie all of this together with a nice demonstration of a very natural use-case for CHIA: LLMs writing RTL. Specifically, we are going to ask an LLM to have our RocketChip core print “Hello World (#4) from instruction 1000!” when the core has retired it’s 1000th instruction.
In our flow, let’s start with the following new import, the BashTool.
from chia.base.ChiaFunction import ChiaFunction, get
from chia.chipyard.chisel_build_node import *
from chia.chipyard.verilator_run_node import *
from pathlib import Path
import subprocess
from ray import ObjectRef
from chia.models.opencode import *
from chia.base.tools.BashTool import * # New line
# ...
In CHIA, just like any function can be a node, any function can be made into an MCP tool, by registering that function with an object of type ChiaTool. In this tutorial we do not create any new custom MCP tools. Any ChiaTool can be passed to any agent/LLM prompts in CHIA.
A BashTool is an MCP tool (its a child class of ChiaTool) which provides an agent/LLM with a bash interface to a specific worker.
We can instantiate the BashTool like this. We give it the name “chipyard_bash”, set it’s working directory to CHIPYARD_PATH, and set it’s task options so that it lands on our hello_chipyard worker by specifying that it requires a chipyard resource.
def main():
get(print_hello_world.chia_remote())
llm = OpenCodeLLM("opencode/big-pickle")
resp : QueryResult = get(llm.prompt.chia_remote(
llm, "Can you please respond just \"Hello World from OpenCode!\""))
print("LLM Responded:")
print(resp.result)
# ====== New code starts here ======
chipyBash : BashTool = BashTool(
"chipyard_bash", CHIPYARD_PATH,
task_options={"resources": {"chipyard": 1}}
)
Let’s write a prompt for the LLM and query it, passing in our chipyTool to the tools argument:
LLM_RTL_PROMPT = "" \
"Can you use the chipyard_bash tool to " \
"edit chipyard's generators/rocket-chip " \
"Chisel source and add a new print to the " \
"Chisel which says \"Hello World (#4) from " \
"instruction 1000!\" when instret reaches " \
"1000? If any message already exists, add " \
"the word \"again\" at the end of the existing " \
"message. Please respond explaining your change."
def main():
# ...
chipyBash : BashTool = BashTool(
"chipyard_bash", CHIPYARD_PATH,
task_options={"resources": {"chipyard": 1}}
)
resp : QueryResult = get(llm.prompt.chia_remote(
llm, LLM_RTL_PROMPT, tools=[chipyBash]
))
chipyBash.stop()
print(resp.result)
Note that it’s very important in this example to stop chipyBash because it consumes a chipyard resource. If you leave it running, the other nodes which use chipyard resources will be starved.
Finally, let’s parse the output of the Verilator run to check for the LLM’s change.
# Already Here
verilator_node = VerilatorRunNode()
run_output = get(verilator_node.run.chia_remote(verilator_node, buildFuture, get(testBinFuture), "helloworld.riscv", "/home/ray/verilator/"))
# Already Here
print("Output of Verilator run (run_output.log): ")
print(run_output.log)
# ===== New =====
# First line of run_output.out is "testing $random ..."
# Next 1000 lines are committed instruction trace
# 1002 line should be "Hello from instruction 1000!"
print("Parsed lines from simulation (run_output.out): ")
print(run_output.out.split('\n')[1001])
print(run_output.out.split('\n')[1002]) # Should have LLM's addition
print(run_output.out.split('\n')[1003])
Now, if all goes well, you should see in your output the following
(print_hello_world pid=1076432) Hello World (#1) from a remote call!
LLM Responded:
Hello World (#3) from OpenCode!
Followed by many messages related to the tool MCP server
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: Started server process [6063]
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: Waiting for application startup.
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: Application startup complete.
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: Uvicorn running on http://172.17.0.1:8000 (Press CTRL+C to quit)
chipyard_bash started at 172.17.0.1:8000 on node 07348f9a70938e126620044cda1e3f41554de0ad19fac5e54e9ead6e
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: ${IP ADDRESS} - "POST /chipyard_bash/mcp HTTP/1.1" 200 OK
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: ${IP ADDRESS} - "POST /chipyard_bash/mcp HTTP/1.1" 202 Accepted
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: ${IP ADDRESS} - "GET /chipyard_bash/mcp HTTP/1.1" 200 OK
...
Followed, eventually, by a response from the LLM, a tool shutdown message, and finally the verilator outputs. It should look something like this (though the LLM’s response will vary).
The change has been made successfully and compiles without errors. Here's a summary of what was done:
## Change Summary
**File modified:** `/home/ray/chipyard/generators/rocket-chip/src/main/scala/rocket/CSR.scala`
**What was added:** A Chisel `printf` statement that fires when the architecture counter `instret` (instruction retired count) reaches exactly 1000.
**The exact code added** (line 594, right after the `reg_instret` counter definition):
```scala
when (reg_instret === 1000.U) { printf("Hello World (#4) from instruction 1000!\n") }
```
**How it works:**
- The `reg_instret` signal is a `WideCounter(64, io.retire, ...)` that counts each retired instruction (line 593).
- The `when` block checks if `reg_instret` equals exactly `1000.U` (unsigned 1000).
- When the condition is true, Chisel's `printf` (which prints during simulation/emulation) outputs `"Hello World (#4) from instruction 1000!"`.
- Since no existing "Hello World" message was found in the file, the message was added fresh (no "again" suffix needed).
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: Waiting for application shutdown.
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: Application shutdown complete.
(_ToolServerActor pid=6063, ip=172.17.0.1) INFO: Finished server process [6063]
Output of Verilator run (run_output.log):
Hello World (#2) from Verilator!
[UART] UART0 is here (stdin/stdout).
- /home/ray/chipyard/sims/verilator/generated-src/chipyard.harness.TestHarness.RocketConfig/gen-collateral/TestDriver.v:179: Verilog $finish
Parsed lines from simulation (run_output.out):
C0: 2574 [1] pc=[0000000080000562] W[r 8=00000000800016c7][1] R[r10=00000000800016c0] R[r 0=0000000000000000] inst=[00750413] addi s0, a0, 7
Hello World (#4) from instruction 1000!
C0: 2575 [1] pc=[0000000080000566] W[r 8=00000000800016c0][1] R[r 8=00000000800016c7] R[r 0=0000000000000000] inst=[00009861] c.andi s0, -8
Where the line “Hello World (#4) from instruction 1000!” indicates that the LLM succeeded.
And with that, you’ve gotten CHIA to say Hello in 4 different ways! Thanks!