nixir/README.md
NotAShelf e36693ac3f
docs: document project architechture
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Iaa99d706d61857fbd51d3b757b5066ab6a6a6964
2026-02-22 00:07:50 +03:00

# Nixir - Import-Resolving IR Plugin
Nixir, for the lack of a more imaginative name, is a Nix plugin with a fancy
hybrid compilation architecture for optimized evaluation. It provides two
complementary paths into the Nix evaluator: **on-the-fly compilation**, where
the plugin parses and compiles Nix code at runtime, and **ahead-of-time
compilation**, where the `nix-irc` tool pre-compiles `.nix` files into `.nixir`
files.
## Supported Nix Constructs
- Literals: integers, strings, booleans, null, paths
- Attrsets: `{ name = value; }`
- Recursive attrsets: `rec { ... }`
- Let bindings: `let x = 1; in x`
- Recursive let: `let x = y; y = 1; in x` (bindings may reference each other)
- Conditionals: `if cond then a else b`
- Lambdas: basic support (patterns coming in Phase 5)
- Applications: function calls
- Selections: `attrset.attribute`
- Assertions: `assert condition; expr`
- With expressions: `with attrs; expr`
- Operators:
  - Binary: `+`, `-`, `*`, `/`, `++`, `==`, `!=`, `<`, `>`, `<=`, `>=`, `&&`,
    `||`, `->`
  - Unary: `-`, `!`
## Overview
Nixir is a Nix evaluator plugin that compiles Nix expressions to a custom binary
intermediate representation (IR). Think of it like a compiler for Nix: it
translates human-readable Nix code into a compact, fast-to-execute format that
runs in a custom virtual machine.
The plugin works in two ways:
1. **Ahead-of-time**: Use the `nix-irc` tool to compile `.nix` files to `.nixir`
once, then load them instantly
2. **On-the-fly**: Let the plugin parse and compile Nix code at runtime when you
need it
While Nixir _is_ designed as a toy research project, I _envision_[^1] a few
_potential_ use cases built around working with Nix. Sure, you _probably_ would
not work with Nix willingly, but science is not about why; it is about _why not_.
Some potential use cases for Nixir _might_ include:
- **CI/CD Acceleration**: Pre-compile stable Nix expressions to `.nixir` for
faster repeated evaluation in CI pipelines
- **Embedded Nix**: Use Nix as a configuration language in C++ applications
without bundling the full evaluator
- **Plugin Ecosystem**: Extend Nix with custom evaluation strategies via the
plugin API
- **Build Caching**: Cache compiled IR alongside source for instant startup of
Nix-based tools
[^1]: I'm not entirely convinced either, do not ask.
### Architecture
```mermaid
flowchart TD
subgraph Source["User Source"]
A[".nix Source Files"]
end
subgraph Compiler["External Tool: nix-irc"]
B1["Parse Nix"]
B2["Static Import Resolution"]
B3["Flatten Import Graph"]
B4["Desugar + De Bruijn Conversion"]
B5["Emit Versioned IR Bundle (.nixir)"]
end
subgraph IR["IR Bundle"]
C1["Binary IR Format"]
C2["Versioned Header"]
C3["No Names, Indexed Vars"]
end
subgraph Plugin["nix-ir-plugin.so"]
D1["Primop Registration"]
D2["prim_loadIR"]
D3["prim_compileNix"]
D4["prim_info"]
end
subgraph CompilePath["On-the-fly Path"]
E1["Parse Source String"]
E2["IR Generation"]
end
subgraph LoadPath["Pre-compiled Path"]
F1["Deserialize .nixir"]
end
subgraph VM["Custom Lazy VM"]
G1["Heap-Allocated Thunks"]
G2["Memoization"]
G3["Cycle Detection"]
G4["Closure Environments (Array-Based)"]
G5["FORCE / THUNK Execution"]
end
A --> B1
B1 --> B2
B2 --> B3
B3 --> B4
B4 --> B5
B5 --> C1
C1 --> D1
D2 -->|explicit| F1
F1 --> G1
D3 -->|explicit| E1
E1 --> E2
E2 --> G1
G1 -.-> G2 -.-> G3 -.-> G4 -.-> G5
```
The same compiler code runs both in the standalone `nix-irc` CLI tool and inside
the plugin for on-the-fly compilation. This ensures consistent behavior between
pre-compiled and runtime-compiled paths. The intermediate representation (IR)
uses De Bruijn indices instead of names for variable binding, which eliminates
string lookups. The binary format starts with a versioned header whose magic
number is `0x4E495258`. In addition, we make use of string interning for
repeated identifiers and type-tagged nodes for efficient dispatching.
The runtime implements lazy evaluation using heap-allocated thunks. Each thunk
holds a delayed computation and is evaluated at most once through memoization.
Recursive definitions are handled through a blackhole mechanism that detects
cycles at runtime. Variable lookup uses array-based closure environments,
providing O(1) access by index rather than name-based lookup.
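
The thunk behavior described above can be sketched roughly as follows. This is an illustrative Python model under assumptions, not the plugin's C++ runtime: the `Thunk` class, its three states, and the `InfiniteRecursion` exception are hypothetical names, and the environment is modelled as a plain list indexed by position.

```python
class InfiniteRecursion(Exception):
    """Raised when a thunk is forced while it is already being evaluated."""


_PENDING, _BLACKHOLE, _DONE = range(3)


class Thunk:
    def __init__(self, compute):
        self._compute = compute  # zero-argument callable, run at most once
        self._state = _PENDING
        self._value = None

    def force(self):
        if self._state == _DONE:
            return self._value  # memoized result
        if self._state == _BLACKHOLE:
            raise InfiniteRecursion("thunk depends on its own value")
        self._state = _BLACKHOLE  # mark as in progress before evaluating
        try:
            self._value = self._compute()
        except Exception:
            self._state = _PENDING  # assumption: failed thunks can be retried
            raise
        self._state = _DONE
        self._compute = None  # drop the closure so it can be collected
        return self._value


# Memoization: the computation runs once no matter how often it is forced.
calls = []
env = [Thunk(lambda: calls.append("ran") or 42)]  # slot 0 of an array-based env
first = env[0].force()
second = env[0].force()

# Cycle detection: slot 0 needs slot 1, which needs slot 0 again.
cyclic = []
cyclic.append(Thunk(lambda: cyclic[1].force()))
cyclic.append(Thunk(lambda: cyclic[0].force()))
try:
    cyclic[0].force()
    cycle_detected = False
except InfiniteRecursion:
    cycle_detected = True
```

Because variables are already indices, looking one up is a single list access (`env[0]`), which is where the O(1) claim comes from.
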
The plugin integrates with Nix through the `RegisterPrimOp` API, exposing three
operations: `nixIR_loadIR` for loading pre-compiled `.nixir` bundles,
`nixIR_compile` for on-the-fly compilation, and `nixIR_info` for metadata. This
integration path is compatible with Nix 2.32+.
### IR Format
The `.nixir` files use a versioned binary format:
```plaintext
Header:
- Magic: 0x4E495258 ("NIRX")
- Version: 1 (uint32)
- Source count: uint32
- Import count: uint32
- String table size: uint32
String Table:
- Interned strings for efficient storage
Nodes:
- Binary encoding of IR nodes
- Each node has type tag + inline data
Entry:
- Main expression node index
```
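
A reader for the header could look roughly like the sketch below. The layout above does not state byte order or field widths beyond `uint32`, so this assumes five consecutive little-endian `uint32` fields in the listed order; `parse_header` and the dictionary keys are hypothetical names, not part of Nixir.

```python
import struct

MAGIC = 0x4E495258  # "NIRX" when read as big-endian ASCII

# ASSUMPTION: five little-endian uint32 fields in the order listed above.
HEADER = struct.Struct("<5I")


def parse_header(data: bytes) -> dict:
    """Validate and decode the fixed-size header of a .nixir bundle."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, version, sources, imports, strtab_size = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError(f"bad magic: {magic:#010x}")
    if version != 1:
        # ASSUMPTION: a reader rejects versions it does not know about.
        raise ValueError(f"unsupported IR version: {version}")
    return {
        "version": version,
        "source_count": sources,
        "import_count": imports,
        "string_table_size": strtab_size,
    }


# Round-trip a synthetic header instead of reading a real bundle.
sample = HEADER.pack(MAGIC, 1, 1, 0, 16)
header = parse_header(sample)
```

If the real writer is big-endian, only the format string changes (`">5I"`). Under the little-endian assumption the magic appears on disk as the bytes `58 52 49 4E`, so a `hexdump` of the first line is a quick way to tell which one applies.
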
## Usage
### Building
```bash
# Configure
$ cmake -B build
# Build
$ cmake --build build
# The nix-irc executable will be in the project root
$ ./nix-irc --help
```
### Compiling Nix to IR
```bash
# Basic compilation
$ nix-irc input.nix output.nixir
# With import search paths
$ nix-irc -I ./lib -I /nix/store/... input.nix output.nixir
# Disable import resolution
$ nix-irc --no-imports input.nix output.nixir
```
### Runtime Evaluation (Plugin)
<!--markdownlint-disable MD013-->
```bash
# Load the plugin and evaluate IR
$ nix --plugin-files ./nix-ir-plugin.so eval --expr 'builtins.nixIR_loadIR "output.nixir"'
# On-the-fly compilation and evaluation
$ nix --plugin-files ./nix-ir-plugin.so eval --expr 'builtins.nixIR_compile "1 + 2 * 3"'
# Get plugin info
$ nix --plugin-files ./nix-ir-plugin.so eval --expr 'builtins.nixIR_info'
```
<!--markdownlint-enable MD013-->
### Running Tests
```bash
# Test all sample files
for f in tests/*.nix; do
  ./nix-irc "$f" "${f%.nix}.nixir"
done
# Verify IR format
$ hexdump -C tests/simple.nixir | head -3
```
## Contributing
This is a research project (with no formal association, i.e., no thesis or
anything) that I'm working on entirely for fun and out of curiosity. It is
extremely experimental and could change at any time. While I do not suggest
running this project in a serious context, I am happy to receive any kind of
feedback you might have. You will notice _very_ quickly that I'm a little out of
my depth, and the code is in rough shape. Areas where help is needed:
- Compiler semantics
- Performance optimization
- Test coverage
- Documentation improvements
- Expanding parser to handle more Nix syntax (module system in particular)
Contributions _are_ welcome!