·building·project·active

canvas-engineering

A type system for multimodal latent dynamics that turns video diffusion models into structured world-modeling substrates.

  • Declares latent regions, temporal frequencies, connectivity, and loss roles as a schema instead of leaving them implicit
  • Introduces looped attention, where frozen video backbones can be upgraded with lightweight recurrent compute
  • Frames multimodal diffusion as typed latent computation rather than flat token processing

Overview

canvas-engineering is the clearest expression on this site of a recurring idea in my work: models get easier to steer when their internal spaces are given structure instead of being treated as undifferentiated tensors.

The core move is simple. Instead of asking a diffusion transformer to discover on its own which latent positions should represent vision, action, reward, proprioception, or thought, you declare that layout directly. The layout becomes the schema, the topology becomes the compute graph, and the resulting model behaves more like a typed program than a bag of embeddings.

That view matters because it makes multimodal world models compositional. Different regions can run at different temporal frequencies, participate in different losses, and connect through different attention operations, while still living on the same spatiotemporal canvas.

Canvas engineering layouts

Looped attention

The second idea in the project is looped attention: reusing transformer blocks across multiple learned iterations. In the current writeup, that is framed less as "reasoning depth" and more as a weight-sharing regularizer that makes frozen backbones substantially more parameter-efficient. That pairing is what makes the project interesting to me: one part is about giving latent space semantics, and the other is about making the compute over that space cheaper and more reusable.

Looped attention diagram

The published docs at jacobfv.github.io/canvas-engineering also make clear that this is meant to be a real design language, not just a one-off research note. The library has concepts for RegionSpec, CanvasTopology, semantic types, carriers, clocks, and executable examples spanning control, agentic computer use, and brain-computer-interface settings.

Repo: JacobFV/canvas-engineering

Neighborhood

Related

LimboidLimboidCookie Baker 3D PrinterCookie Baker 3D PrinterCookie Cutter CNCCookie Cutter CNCFocus StatementFocus StatementRapid prototyping and hardware thinkingRapid prototyping and hardw...canvas-engineering