Introducing Achronyme — a language for zero-knowledge proofs. Read the announcement arrow_right_alt

Lysis VM

Third VM: structural template instantiation with hash-consing and frame-spill heap. The canonical example of Bytecode-Oriented Compilation in Achronyme.

Why Lysis exists

Akron and Artik both execute once per proof. Lysis is different: it separates compile/instantiate from runtime emission. The walker emits a Lysis program that, when run, produces SSA IR — and the executor’s interning sink collapses structurally identical sub-trees during emission.

Concretely: the legacy eager template expander unrolled every loop and inlined every sub-template up front. Lysis defers that expansion to runtime and dedupes redundant sub-graphs via hash-consing. Current in-tree comments and tests identify SHA-256(64)-shaped circuits as the motivating scale target; exact wall-clock and constraint-delta figures are benchmark outputs, not architectural invariants.

Bytecode-Oriented Compilation (BOC)

Lysis is the canonical example of what we call Bytecode-Oriented Compilation (BOC) inside Achronyme: a compiler architecture where the compilation step itself is a bytecode program, and the output of compilation is the side-effect of running that bytecode on a VM.

The shape:

Source → Bytecode → Run on VM → Compile output

Two distinctions worth being explicit about, since BOC sits next to several closer-named patterns:

  • BOC is not “a compiler that emits bytecode.” Java’s javac and CPython produce bytecode as their final artifact — that bytecode is data, consumed later by a separate runtime. In BOC the bytecode is consumed as part of compilation itself: every step the VM executes pushes an entry into the output. In Lysis’s case, those entries are SSA IR nodes that go on to become R1CS constraints. The bytecode is the engine of compilation, not its product.
  • BOC is heterogeneous staged compilation. The closest neighbors in the literature are multi-stage programming (MetaOCaml, Template Haskell), compile-time function execution (Zig comptime, D CTFE), and metacompilation (Forth). All of them are typically homogeneous — stage-2 produces stage-1-shaped code, baked into the same target binary. BOC is heterogeneous: Lysis bytecode is a different ISA from its output, and the output is a constraint system, not an executable.

The practical consequence is that BOC pipelines fail and scale like programs, not like data structures. The frame-overflow handling, the spill heap, the validator’s single-static-store rule, the 8192 default call depth — none of these would exist for a passive IR. They exist because the meta-program is a real program subject to real resource limits, and the compiler’s compile-time is reified as this VM’s runtime. When Lysis pegs against a limit, the fix is rarely “change the data shape”; it is “give the executor more capacity” or “teach the executor a new opcode.” That is the signature of BOC.

Where Lysis sits

Flowchart diagram7 nodes, 6 edgesProveIRlysis_lift::walkerBTA · extract · diff · symbolicLysis bytecode.lysislysis::executeinterpreterInterningSinkhash-consVec<Instruction<F>>IR-SSAR1CS / Plonkish

The walker is the IR-side lifter: it consumes ExtendedInstruction<F> (from ir-forge) and produces Lysis bytecode. The executor is the VM side: it runs that bytecode and emits SSA via a sink.

Lift pipeline (ir-forge::lysis_lift)

File: ir-forge/src/lysis_lift/mod.rs. Five stages:

StageModuleDeliverable
Symbolic walkir-forge/src/lysis_lift/symbolic.rsSymbolicTree with placeholder slots for loop-var derivatives
Structural diffir-forge/src/lysis_lift/diff.rsDiff with AST-path slot identity
Binding-Time Analysisir-forge/src/lysis_lift/bta.rs3-point classification (Constant / Invariant / DataDependent)
Template extractionir-forge/src/lysis_lift/extract/Lambda lifting + frame size, capture layout
Walker (driver)ir-forge/src/lysis_lift/walker/Orchestrates the full lift pass

All five implement deferred evaluation — templates remain symbolic until proof-time.

Bytecode header

0..4    magic "LYSI"
4..6    version (u16 LE)          — current: 2 (v1 streams decode with heap_size_hint = 0)
6       field_family (u8)         — BnLike256 | Goldilocks64
7       flags (u8)                — bit 0: has_witness_calls
8..12   const_pool_len (u32 LE)
12..16  body_len (u32 LE)
16..20  heap_size_hint (u32 LE)   — spill heap size in entries (v2 only)
20..…   const_pool (FieldConst entries + bytecode blobs for Artik calls)
…..     body (instruction stream)

The decoder dispatches on version and reads the v1-shaped header (16 bytes) when it sees a v1 stream, defaulting heap_size_hint to 0.

Register / frame model

  • Registers: 256 per frame, stored as Vec<Option<FieldElement<F>>> (undefined until written).
  • Frame stack: Vec<Frame> with pc, body_start_idx, body_end_idx, template_id, loop_stack: Vec<LoopState>.
  • Loop state: each LoopUnroll pushes a LoopState tracking iteration bounds; the dispatcher’s advance_loops() step increments and detects completion.
  • Default call depth: 8192 (raised from the original 64 to accommodate deep template trees like SHA-256 message schedules).

Spill heap

The frame heap is a flat Vec<FieldElement<F>> allocated per program execution, sized by heap_size_hint. It holds values that outlive the frame that wrote them — captured loop-invariant values, materialised arrays from rolled loops, witness-call buffers — that don’t fit into the 256-register window of the active frame.

Each heap slot is governed by a single-static-store invariant: a slot is written exactly once per execution. The validator rejects programs that emit two StoreHeap to the same slot, and the executor traps on a LoadHeap from an unwritten slot. This makes the heap behave like SSA registers extended into a wider address space, with the same dataflow guarantees the rest of the VM relies on.

Opcode table (37 opcodes)

File: lysis/src/bytecode/opcode.rs.

CodeMnemonicOperandsSemantics
0x01LoadCapturedst, idxR[dst] = captures[idx]
0x02LoadConstdst, idxR[dst] = const_pool[idx]
0x03LoadInputdst, name_idx, visR[dst] = inputs[name_idx] (Public/Witness)
0x04EnterScopescope_depth += 1
0x05ExitScopescope_depth −= 1
0x10JumpoffsetPC += offset
0x11JumpIfcond, offsetconditional
0x12Returnpop frame
0x13Haltend execution
0x14Trapcodeabort
0x20LoopUnrolliter_var, start, end, body_leninline body N times
0x21LoopRollediter_var, start, end, template_idcall template repeatedly
0x22LoopRangeiter_var, R(end), template_iddynamic-bound loop
0x30DefineTemplatetemplate_id, frame_size, n_params, body_offset, body_lenregister template metadata
0x31InstantiateTemplatetemplate_id, capture_regs, output_regspush frame + call
0x32TemplateOutputoutput_idx, srcpublish output value
0x40EmitConstdst, srcemit Const(R[src])
0x41EmitAdddst, lhs, rhsemit Add
0x42EmitSubdst, lhs, rhsemit Sub
0x43EmitMuldst, lhs, rhsemit Mul
0x44EmitNegdst, srcemit Neg
0x45EmitMuxdst, cond, t, femit Mux
0x46EmitDecomposedst_arr, src, n_bitsemit Decompose
0x47EmitAssertEqlhs, rhsemit AssertEq
0x48EmitRangeCheckvar, max_bitsemit RangeCheck
0x49EmitWitnessCallbytecode_idx, in_regs, out_regsemit WitnessCall (Artik)
0x4AEmitPoseidonHashdst, in_regsemit PoseidonHash
0x4BEmitIsEqdst, lhs, rhsemit IsEq
0x4CEmitIsLtdst, lhs, rhsemit IsLt
0x4DEmitIntDivdst, lhs, rhs, max_bitsemit IntDiv
0x4EEmitIntModdst, lhs, rhs, max_bitsemit IntMod
0x4FEmitDivdst, lhs, rhsemit field-Div
0x50StoreHeapslot, srcspill R[src] into heap slot (single-static-store)
0x51LoadHeapdst, slotread heap slot back into a register
0x52EmitWitnessCallHeapbytecode_idx, in_descs, out_regsWitnessCall whose inputs may live in heap slots
0x53EmitAssertEqMsglhs, rhs, msg_idxAssertEq carrying a constant-pool message id
0x54EmitIsLtBoundeddst, lhs, rhs, max_bitsemit bounded IsLt

Emit* opcodes don’t compute results — they push entries into the sink.

Sinks

The dispatcher writes IR-emitting opcodes into a IrSink trait object. Two implementations exist:

  • StubSink — no-op, used by skeleton tests that exercise control flow without materialising IR.
  • InterningSink — hash-cons via IndexMap<NodeKey, NodeId>. Default in production. Materialises a flat Vec<Instruction<F>> via sink.materialize().

Dispatcher loop

File: lysis/src/execute/mod.rs. Outer loop over frames + inner step:

loop {
    check_instruction_budget()?;
    let instr = program.body[frame.pc];
    match dispatch(instr, ...)? {
        Step::Next => frame.pc += 1,
        Step::JumpToIndex(idx) => frame.pc = idx,
        Step::PushFrame(f) => stack.push(f),
        Step::PopFrame => stack.pop(),
        Step::Halt => break,
    }
    advance_loops()?;
}

Memory model

No GC. Frame registers live in the frame struct. The interner internally uses IndexMap<NodeKey, NodeId> borrowed mutably during execution. The spill heap is a single Vec<FieldElement<F>> owned by the executor for the duration of one program run, reset between executions.

Lift output → walker

File: ir-forge/src/lysis_lift/walker/. Driver that consumes Vec<ExtendedInstruction<F>>:

VariantWalker action
Plain(ir::Instruction)emit corresponding Emit* opcode
LoopUnroll { iter_var, start, end, body }emit LoopUnroll opcode
TemplateCall {…}emit InstantiateTemplate; spill long-lived captures via StoreHeap
TemplateBody {…}bound to a DefineTemplate block; rolled-loop bodies live here

When a body is too wide to keep its values in the 256-register window — the SHA-256 round body is the canonical example — the walker drops to per-iteration unroll and routes loop-carried values through heap slots so each iteration starts with a fresh register file.

E2E coverage

CircuitPipelineResult
SHA-256(64)walker (per-iter unroll, heap-routed) → optimize → R1CSdigest equivalence and full R1CS witness verification are covered by ignored regression tests in circom/tests/e2e_sha256/witness.rs
Poseidon(2)structural lift → R1CS → Groth16proof-generation path covered by Groth16 and Solidity E2E tests
EdDSAPoseidonstructural lift + heap capturescompile + R1CS clean
MiMCSponge(2,220,1)structural lift → R1CScovered by circomlib E2E and benchmark harnesses; exact constraint counts belong in benchmark output
Adversarial constraint preservationbenchmark + cross-mode testsround-trip and preservation tests check byte-identical R1CS where that property is required

Open questions

  • Current benchmark numbers for SHA-256(64), MiMCSponge, and circom O2 deltas should be regenerated from the benchmark harness before publishing exact counts or wall-clock timings as facts.

Source files

ComponentFile
Bytecode opcodeslysis/src/bytecode/opcode.rs
Encodinglysis/src/bytecode/encoding.rs
Validatorlysis/src/bytecode/validate.rs
Const poollysis/src/bytecode/const_pool.rs
Header (v1/v2)lysis/src/header.rs
Dispatcherlysis/src/execute/mod.rs
Framelysis/src/execute/frame.rs
Heap dispatchlysis/src/execute/dispatch/heap.rs
Interning sinklysis/src/execute/interning_sink.rs
Internerlysis/src/intern/interner/
Builderlysis/src/builder.rs
Walker (lift)ir-forge/src/lysis_lift/walker/
BTA classifierir-forge/src/lysis_lift/bta.rs
Symbolic treeir-forge/src/lysis_lift/symbolic.rs
Structural diffir-forge/src/lysis_lift/diff.rs
Template extractionir-forge/src/lysis_lift/extract/

See ProveIR for the layer feeding Lysis. See Akron VM and Artik Witness VM for the other two VMs.

Navigation