Introduction

Threadbare is a minimal demo of the Patchwork execution model. It shows how a program can interleave deterministic computation with non-deterministic LLM "thinking" - and crucially, how the LLM can call back into the interpreter to execute more code.

The name is a pun: it's a bare-threads implementation of Patchwork, held together by minimal threads.

The Core Idea

Patchwork programs mix two kinds of computation:

  1. Deterministic blocks - traditional code that always produces the same output
  2. Think blocks - prompts sent to an LLM, whose output is non-deterministic

The interesting part: a Think block can invoke deterministic subroutines (via a do tool), and those subroutines might themselves contain Think blocks. This creates a recursive interplay between the interpreter and the LLM.

Why This Matters

This execution model enables:

  • Auditability: You can trace exactly what decisions the LLM made and why
  • Composition: Deterministic scaffolding with LLM "escape hatches" where judgment is needed
  • Recursion: LLM decisions can trigger further LLM decisions, nested arbitrarily deep

What's in This Book

  • The AST - The three node types: Print, Block, and Think
  • An Example - A concrete program that categorizes documents
  • The Interpreter - How the interpreter executes Think nodes, with recursive call tracing
  • The Agent - How the agent manages concurrent LLM sessions and routes messages

The AST

Threadbare programs are represented as JSON. There are three node types:

Print

Outputs a message. The simplest node.

{
  "Print": {
    "message": "Hello, world!"
  }
}

Block

Executes a sequence of children in order.

{
  "Block": {
    "children": [
      { "Print": { "message": "First" } },
      { "Print": { "message": "Second" } }
    ]
  }
}

Think

The interesting one. Sends a prompt to an LLM and waits for a response. The LLM has access to a do tool that executes any of the node's children as subroutines.

{
  "Think": {
    "think": {
      "prompt": "Pick a greeting. Call do(0) for formal, do(1) for casual.",
      "children": [
        { "Print": { "message": "Good morning, esteemed colleague." } },
        { "Print": { "message": "Hey!" } }
      ]
    }
  }
}

When the LLM calls do { "number": 0 }, the interpreter evaluates children[0] and returns the result to the LLM. The LLM can call do multiple times, or not at all.

Composition

These nodes compose naturally. A Think's children can themselves contain Think nodes, enabling recursive LLM reasoning:

{
  "Think": {
    "think": {
      "prompt": "Analyze this document. Call do(0) to get a summary.",
      "children": [
        {
          "Think": {
            "think": {
              "prompt": "Summarize: ...",
              "children": []
            }
          }
        }
      ]
    }
  }
}

An Example: Document Categorization

Let's walk through a concrete example. Imagine a program that categorizes a document and takes different actions based on the category.

The Program

{
  "Think": {
    "think": {
      "prompt": "You are categorizing a document. Based on the content below, decide its type. Call do(0) if it's a RECEIPT, do(1) if it's a CONTRACT, or do(2) if it's PERSONAL correspondence.\n\nDocument content:\n[... invoice for $542.00 from Acme Corp ...]",
      "children": [
        {
          "Block": {
            "children": [
              { "Print": { "message": "Categorized as: RECEIPT" } },
              { "Print": { "message": "Extracting amount..." } }
            ]
          }
        },
        {
          "Block": {
            "children": [
              { "Print": { "message": "Categorized as: CONTRACT" } },
              { "Print": { "message": "Flagging for legal review..." } }
            ]
          }
        },
        {
          "Print": { "message": "Categorized as: PERSONAL" }
        }
      ]
    }
  }
}

What Happens

  1. The interpreter encounters the Think node
  2. It sends the prompt to the LLM via SACP
  3. The LLM reads the document content, decides it's a receipt
  4. The LLM calls do { "number": 0 }
  5. The interpreter executes children[0] - the Block that prints "RECEIPT" and "Extracting amount..."
  6. The result is returned to the LLM
  7. The LLM finishes its turn
  8. The interpreter continues

Sequence Diagram

sequenceDiagram
    participant I as Interpreter
    participant A as Agent (SACP)
    participant L as LLM

    I->>A: Think { prompt: "Categorize..." }
    A->>L: Start session, send prompt
    L->>L: Reads document, decides RECEIPT
    L->>A: Tool call: do { number: 0 }
    A->>I: ThinkResponse::Do { uuid: 0 }
    I->>I: Execute children[0] (Block)
    I->>A: Result: "Categorized as: RECEIPT\nExtracting amount..."
    A->>L: Tool result
    L->>A: End turn
    A->>I: ThinkResponse::Complete

Nested Thinking

What if the "extract amount" step also needs LLM judgment? We can nest a Think inside the receipt handler:

{
  "Think": {
    "think": {
      "prompt": "Categorize this document. do(0)=RECEIPT, do(1)=CONTRACT",
      "children": [
        {
          "Block": {
            "children": [
              { "Print": { "message": "Categorized as: RECEIPT" } },
              {
                "Think": {
                  "think": {
                    "prompt": "Extract the dollar amount from this receipt. do(0) to confirm extraction.",
                    "children": [
                      { "Print": { "message": "Amount: $542.00" } }
                    ]
                  }
                }
              }
            ]
          }
        },
        { "Print": { "message": "Categorized as: CONTRACT" } }
      ]
    }
  }
}

Now when the outer LLM calls do(0), the interpreter runs the Block, which includes another Think. This spawns a second LLM session to extract the amount.

The next two chapters walk through how this nesting works in detail - first from the interpreter's perspective, then from the agent's.

The Interpreter

This chapter walks through how the interpreter executes the nested document categorization example. We'll trace the call stack and message flow step by step.

The Example Program

We're executing this program (simplified for clarity):

Think "Categorize. do(0)=RECEIPT"
  [0]: Block
         Print "RECEIPT"
         Think "Extract amount. do(0)=confirm"
           [0]: Print "$542.00"

The Interpreter Loop

The interpreter has two key methods:

  • interpret(ast) - Pattern matches on the AST node type and executes it
  • think(think) - Sends a prompt to the agent and waits for responses

fn interpret(&mut self, ast: &Ast) -> Result<String> {
    match ast {
        Ast::Print { message } => /* append message to output */,
        Ast::Block { children } => /* interpret each child */,
        Ast::Think { think } => /* call self.think(think) */,
    }
}

fn think(&mut self, think: &Think) -> Result<String> {
    // Send prompt to agent
    self.agent.send_prompt(AcpActorMessage::Think { prompt, tx });

    // Wait for responses
    for response in rx {
        match response {
            ThinkResponse::Do { uuid, do_tx } => {
                // LLM wants us to execute a subroutine
                let result = self.interpret(&think.children[uuid])?;
                do_tx.send(result);  // Send result back to LLM
            }
            ThinkResponse::Complete { message } => {
                return Ok(message);
            }
        }
    }
}

The key insight: think can call interpret, which can call think again. This is how nesting works.

Execution Trace

Let's trace through our example. The colored boxes show recursive call frames - when you see a nested box, we've recursed into interpret() again:

sequenceDiagram
    participant I as Interpreter
    participant S1 as Session 1
    participant S2 as Session 2

    rect rgb(200, 200, 240)
        Note over I: interpret(outer Think)
        activate I
        I->>S1: think "Categorize..."
        deactivate I
        activate S1
        S1-->>I: Do uuid=0
        deactivate S1
        activate I

        rect rgb(200, 240, 200)
            Note over I: interpret Block
            Note over I: Print "RECEIPT"

            rect rgb(240, 200, 200)
                Note over I: interpret(inner Think)
                I->>S2: think "Extract..."
                deactivate I
                activate S2
                S2-->>I: Do uuid=0
                deactivate S2
                activate I

                rect rgb(240, 240, 200)
                    Note over I: Print "$542.00"
                end

                I->>S2: send result
                deactivate I
                activate S2
                S2-->>I: Complete
                deactivate S2
                activate I
            end
        end

        I->>S1: send result
        deactivate I
        activate S1
        S1-->>I: Complete
        deactivate S1
        activate I
    end
    deactivate I

Each nested rect represents a recursive call to interpret(). The activation bars on the Interpreter show when it's actively running vs blocked waiting for a response:

  • The blue outer frame is the original Think
  • The green frame is the Block executed when the LLM calls do(0)
  • The red frame is the nested Think inside that Block - Session 2 activates here
  • The yellow frame is the Print inside the inner Think

Session 1 remains active (blocked) while we recursively handle Session 2.

The Call Stack

At the deepest point of execution, the call stack looks like:

interpret(outer Think)
  └─ think("Categorize...")        // waiting for outer LLM
       └─ interpret(Block)
            └─ interpret(Print "RECEIPT")
            └─ interpret(inner Think)
                 └─ think("Extract...")  // waiting for inner LLM
                      └─ interpret(Print "$542.00")

Notice that:

  1. The outer think() is blocked, waiting for its rx channel
  2. While blocked, it called interpret() which called another think()
  3. The inner think() is now the active one
  4. When inner completes, we unwind back to outer

The Channel Dance

Each think() call creates a channel pair (tx, rx):

  • tx is sent to the agent (so it can send ThinkResponse messages back)
  • rx is used in the for response in rx loop

When the interpreter calls interpret() recursively during a do, the outer think() is still holding its rx - it's just not reading from it yet. It will resume reading after the recursive call returns.

The Agent

This chapter explains how the agent manages concurrent LLM sessions and routes messages to the right place. We'll trace through the same nested example, but from the agent's perspective.

The Challenge

When the interpreter sends a Think request, the agent needs to:

  1. Start an LLM session
  2. Route incoming messages (notifications, tool calls, responses) to the right handler
  3. Handle nested Think requests that arrive while an outer one is still active

The tricky part is #3. When the outer LLM calls do(0), the interpreter might execute another Think, creating an inner LLM session. Messages from the inner session shouldn't go to the outer handler.

Architecture Overview

The agent has four concurrent pieces:

graph TD
    I[Interpreter] -->|AcpActorMessage::Think| CL[Client Loop]
    CL -->|spawns| TM[think_message task]
    TM -->|PushThinker| RA[Redirect Actor]

    LLM[LLM / SACP] -->|SessionNotification| RA
    LLM -->|PromptResponse| RA
    MCP[MCP Server] -->|DoInvocation| RA

    RA -->|routes to top of stack| TM
    TM -->|ThinkResponse| I

  1. Client Loop - Receives Think requests from the interpreter and spawns think_message tasks
  2. Redirect Actor - Maintains a stack of thinkers and routes all incoming messages to the top of the stack
  3. think_message task - One per active Think; manages a single LLM session
  4. MCP Server - Handles do tool calls from the LLM

The Client Loop

When the agent starts, it enters a loop waiting for messages from the interpreter:

while let Some(message) = rx.recv().await {
    match message {
        AcpActorMessage::Think { prompt, tx } => {
            cx.spawn(Self::think_message(
                cx.clone(),
                prompt,
                tx,           // channel back to interpreter
                redirect_tx,  // channel to redirect actor
                mcp_registry,
            ))?;
        }
    }
}

Each Think request spawns a new think_message task. This is important: multiple thinks can be in flight concurrently (though in our example, they nest rather than run in parallel).

The think_message Task

Each think_message task:

  1. Creates an LLM session
  2. Registers itself with the redirect actor (push onto stack)
  3. Sends the prompt to the LLM
  4. Processes messages from its channel until the LLM completes
  5. Unregisters from the redirect actor (pop from stack)
  6. Sends the final result back to the interpreter

async fn think_message(...) {
    // 1. Create session
    let session_id = cx.send_request(NewSessionRequest { ... }).await?;

    // 2. Push onto stack
    let (think_tx, mut think_rx) = channel(128);
    redirect_tx.send(RedirectMessage::PushThinker(think_tx));

    // 3. Send prompt (response will arrive via redirect actor)
    cx.send_request(PromptRequest { session_id, prompt })
      .await_when_result_received(|response| {
          redirect_tx.send(PromptResponse(response))
      });

    // 4. Process messages
    while let Some(message) = think_rx.recv().await {
        match message {
            SessionNotification(n) => /* accumulate text */,
            DoInvocation(arg, do_tx) => {
                // Tell interpreter to execute subroutine
                interpreter_tx.send(ThinkResponse::Do { uuid: arg.number, do_tx });
            }
            PromptResponse(r) => break,
        }
    }

    // 5. Pop from stack
    redirect_tx.send(RedirectMessage::PopThinker);

    // 6. Send result to interpreter
    interpreter_tx.send(ThinkResponse::Complete { message: result });
}

The Redirect Actor

The redirect actor is a simple loop that maintains a stack:

async fn redirect_actor(mut rx: Receiver<RedirectMessage>) {
    let mut stack: Vec<Sender<PerSessionMessage>> = vec![];

    while let Some(message) = rx.recv().await {
        match message {
            IncomingMessage(msg) => {
                // Route to top of stack
                if let Some(sender) = stack.last() {
                    sender.send(msg).await;
                }
            }
            PushThinker(sender) => stack.push(sender),
            PopThinker => { stack.pop(); }
        }
    }
}

All incoming messages go to whoever is on top of the stack. This is the key insight: when nested thinks are active, the inner one is on top, so it receives messages from its LLM session.

The MCP Server

When the LLM calls the do tool, the MCP server handles it:

McpServer::new().tool_fn("do", async move |arg: DoArg, _cx| {
    // Create a oneshot channel for the result
    let (do_tx, do_rx) = oneshot::channel();

    // Send through redirect actor to current thinker
    main_loop_tx.send(DoInvocation(arg, do_tx));

    // Wait for interpreter to execute and return result
    Ok(DoResult { text: do_rx.await? })
})

The MCP server doesn't know which thinker to send to - it just sends to the redirect actor, which routes to the top of the stack.

Full Trace: Nested Think

Let's trace through the nested example with all the agent components:

sequenceDiagram
    participant I as Interpreter
    participant CL as Client Loop
    participant RA as Redirect Actor
    participant T1 as Thinker 1
    participant T2 as Thinker 2
    participant MCP as MCP Server
    participant S1 as Session 1
    participant S2 as Session 2

    Note over RA: Stack: []

    activate I
    I->>CL: Think "Categorize..."
    deactivate I
    activate CL
    CL->>T1: spawn
    deactivate CL
    activate T1
    T1->>S1: NewSessionRequest
    deactivate T1
    activate S1
    S1-->>T1: session_id
    deactivate S1
    activate T1
    T1->>RA: PushThinker
    Note over RA: Stack: [T1]
    T1->>S1: PromptRequest
    deactivate T1

    activate S1
    S1->>RA: SessionNotification
    activate RA
    RA->>T1: forward
    deactivate RA

    S1->>MCP: tool call do(0)
    deactivate S1
    activate MCP
    MCP->>RA: DoInvocation
    deactivate MCP
    activate RA
    RA->>T1: forward
    deactivate RA
    activate T1
    T1->>I: Do uuid=0
    deactivate T1
    activate I

    Note over I: interpret(Block), hits inner Think

    I->>CL: Think "Extract..."
    deactivate I
    activate CL
    CL->>T2: spawn
    deactivate CL
    activate T2
    T2->>S2: NewSessionRequest
    deactivate T2
    activate S2
    S2-->>T2: session_id
    deactivate S2
    activate T2
    T2->>RA: PushThinker
    Note over RA: Stack: [T1, T2]
    T2->>S2: PromptRequest
    deactivate T2

    activate S2
    S2->>MCP: tool call do(0)
    deactivate S2
    activate MCP
    MCP->>RA: DoInvocation
    deactivate MCP
    activate RA
    RA->>T2: forward to T2!
    deactivate RA
    activate T2
    T2->>I: Do uuid=0
    deactivate T2
    activate I
    Note over I: interpret Print
    I->>T2: result
    deactivate I
    activate T2
    T2->>MCP: tool result
    deactivate T2
    activate MCP
    MCP->>S2: tool result
    deactivate MCP
    activate S2

    S2->>RA: PromptResponse
    deactivate S2
    activate RA
    RA->>T2: forward
    deactivate RA
    activate T2
    T2->>RA: PopThinker
    Note over RA: Stack: [T1]
    T2->>I: Complete
    deactivate T2
    activate I

    Note over I: inner Think done

    I->>T1: result
    deactivate I
    activate T1
    T1->>MCP: tool result
    deactivate T1
    activate MCP
    MCP->>S1: tool result
    deactivate MCP
    activate S1

    S1->>RA: PromptResponse
    deactivate S1
    activate RA
    RA->>T1: forward
    deactivate RA
    activate T1
    T1->>RA: PopThinker
    Note over RA: Stack: []
    T1->>I: Complete
    deactivate T1
    activate I
    deactivate I

Why This Design?

The stack-based routing is a workaround for a limitation: SACP doesn't currently provide per-session message routing. Ideally, each think_message would have its own isolated message stream, and we wouldn't need the redirect actor at all.

The comment in the code captures this sentiment:

"OK, I am a horrible monster and I pray for death."

A future version of the SACP client library will likely provide cleaner abstractions, eliminating the need for manual stack management.