Compare commits

...

37 commits

Author SHA1 Message Date
531855d91a
tests: cover flake refs and lexer/parser regressions
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I1b5f90bbb210262a9287a9b8eac02e9d6a6a6964
2026-04-24 23:13:24 +03:00
b319ef6f3f
irc/parser: fix lexer ownership, errors, and implication parsing
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I12a6b52ec1c0edff605d02393eafde896a6a6964
2026-04-24 23:13:23 +03:00
760094a2b7
irc: add flake reference support
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Icc96d297f02d3aad03b0373727d57f316a6a6964
2026-04-24 23:13:22 +03:00
66c0d5bb99
nix: add Nix 2.32 back to devshell; assign to nixForRuntime
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I64d4b0a66fb647f335862213c8a3f5646a6a6964
2026-04-24 23:13:21 +03:00
554f7f21f1
meta: CTest integration; expand test coverage
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I3e6bf966985f6212e99d146cb18e4e2f6a6a6964
2026-04-24 23:13:20 +03:00
584d84542e
irc: add serialization support for patterns and string interpolation
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I244ae722016b5b49915e23522a1fb72e6a6a6964
2026-04-24 23:13:19 +03:00
0a5920adaf
irc: split parser into lexer and parser components
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I4e73459a02caff5335d690656fd6f1396a6a6964
2026-04-24 23:13:19 +03:00
feb247f64a
irc: extract inline constructors; deduplicate value-copy logic
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ifc74f0bfe621a05fa7b91a5f6be1ea976a6a6964
2026-04-24 23:13:18 +03:00
28de44c598
irc: PrimOp memory leak and IR_VERSION mismatch
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Iad057cd5f51ef26e7de93ccca7b3d3156a6a6964
2026-04-24 23:13:17 +03:00
f6481b3c01
irc/parser: use explicit IR nodes for patterns and interpolation
Replaces lambda pattern desugaring with direct `LambdaPatternNode`
generation, which:

- Separates required and optional fields in pattern structure
- Preserves @-binding and ellipsis information in IR
- Removes about 50 lines of Let-binding desugaring logic

and replaces the string interpolation concatenation tree with
`StringInterpolationNode` to:

- Use `StringPart::make_literal/make_expr` for cleaner representation
- Optimize single-literal strings to `ConstStringNode`
- Remove `toString` builtin wrapper generation

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I7d3fa038f743d02b9caae0979b79f5086a6a6964
2026-04-24 23:13:16 +03:00
359a707663
irc/types: IR type extensions
Updates IR_VERSION to 3. In an effort to support more features of the
Nix language, implements 5 new node type constants:

- `LAMBDA_PATTERN` = `0x70` - Lambda with pattern matching
- `INHERIT` = `0x71` - Simple inherit expressions
- `INHERIT_FROM` = `0x72` - Inherit from source expression
- `STRING_INTERPOLATION` = `0x73` - String with interpolated parts
- `BUILTIN_CALL` = `0x74` - Builtin function call

and some struct definitions before the Node class, such as `PatternField`.

Those will come in handy for supporting the entire Nixpkgs library.

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I1f0e81a120a0c956b8068d81c42796616a6a6964
2026-04-24 23:13:15 +03:00
56f15d749e
docs: initial specification; we yap
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I885e6317d186ccdc847195957dba4ab26a6a6964
2026-04-24 23:13:14 +03:00
14bbc09280
docs: document using just in the codebase
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I56d1de8a88bb28e49e6387a320f318c86a6a6964
2026-04-24 23:13:13 +03:00
13a38f707b
meta: switch to justfile for task organization
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ib4000ab597f94b2dd3dccf1e31fce3a76a6a6964
2026-04-24 23:13:12 +03:00
2c9ad890b2
tests: move fixtures to dedicated dir
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I9d6ce6a264780f215b1b57d947b5264c6a6a6964
2026-04-24 23:13:11 +03:00
2a005574d3
tests/benchmark: make benchmark cases... bigger
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Iabd307b475f6568cff4d1ae6e5ae56ef6a6a6964
2026-04-24 23:13:10 +03:00
1ceb889a16
irc: add timing measurements; formatting
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Id4402547e18b6569464850c3753383396a6a6964
2026-04-24 23:13:09 +03:00
54892c3121
irc/evaluator: fix variable lookup, recursive let, and value handling
Bunch of things:

- Decode depth and offset from encoded variable indices
- Pre-allocate Values for recursive let bindings before eval
- Use mk* methods for value copying instead of direct assignment
- Evaluate attrset values immediately to avoid dangling thunks

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I4dd40c93d74df5973a642fb9f123e70e6a6a6964
2026-04-24 23:13:08 +03:00
e8fcaccacc
irc: add ListNode support; fix recursive attrset scoping
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I1657bc6a05c264f0ae0dd2c94d32b1046a6a6964
2026-04-24 23:13:07 +03:00
51305298ee
irc/parser: fix list parsing and function application
Fixes bug where `concat [1 2 3] [4 5 6]` tried to apply integer 1
as a function.

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I6f373dd83bcac9e59286b0448472200b6a6a6964
2026-04-24 23:13:06 +03:00
85a865af47
tests: initial integration tests
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I09ed2eea568edfaecdb800197bc36c416a6a6964
2026-04-24 23:13:05 +03:00
b3dcba607b
tests/benchmark: fine-grain timing reports
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ia481b0129193540636665340bd257d136a6a6964
2026-04-24 23:13:04 +03:00
4ea090cf33
tests/benchmark: rename runner script; compare compilation with native eval
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I6ef30732f875ab134a35282eb2cd66a36a6a6964
2026-04-24 23:13:03 +03:00
d272dc589e
tests: initial benchmarking setup
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: If0ed2dd4279abf155a8ddc678ca047736a6a6964
2026-04-24 23:13:02 +03:00
8a093aa9e8
irc: improve multi-line strings; complete list concat and dynamic attrs
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I64e53c68d90b62f3ca306865ceda32af6a6a6964
2026-04-24 23:13:01 +03:00
e6231f546d
tests: update test cases for newer syntax items; drop old artifacts
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I8640148e8e7597924f9c776750c856266a6a6964
2026-04-24 23:13:00 +03:00
dd79db1f86
irc: more syntax support
Indented strings, ancient let bindings and a bit more

Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ib86c2d8ca4402dfa0c5c536a9959f4006a6a6964
2026-04-24 23:12:59 +03:00
5e41a7cb37
tests: add tests for lookup paths and imports
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I7e54691aa3e81efcb495124d13e8c24a6a6a6964
2026-04-24 23:12:58 +03:00
775bb42c63
irc: support lookup paths and import keyword
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I0d16726646aef82ce675c4f8d209029a6a6a6964
2026-04-24 23:12:57 +03:00
dc7b3305db
tests: add test fixture for merge operator
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ie8d8e5fb817349469fed194773120ce86a6a6964
2026-04-24 23:12:56 +03:00
3387e0d822
irc: support merge operator
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Icfb0cc81542e637d4b91c6a5788370fb6a6a6964
2026-04-24 23:12:55 +03:00
95baf44a9c
irc: add Float and URI literal support
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I40c59d94f650e7b9e68f77598492d7ab6a6a6964
2026-04-24 23:12:54 +03:00
3dd2d604ce
nix: format test fixtures via nix fmt
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ia9c1e9b0a8cd9c6d834f153609baa5426a6a6964
2026-04-24 23:12:53 +03:00
30a3304171 chore: run clang-tidy with --fix
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I84fc0804ceeb652b26ee385b26132e816a6a6964
2026-02-24 19:40:05 +03:00
af17da34da chore: run through 'clang-tidy' with '-fix' 2026-02-24 19:40:05 +03:00
79f99f189f nix: inline env set; add clang-tools to devshell & name shell
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: I5cf7ad8f9fc8c568e53e6cf8dda12b746a6a6964
2026-02-24 19:40:05 +03:00
ea20aaab00 various: format with clang-format
Signed-off-by: NotAShelf <raf@notashelf.dev>
Change-Id: Ib9abc9d2dcd036d3680c5aa3dc919bfa6a6a6964
2026-02-24 19:40:05 +03:00
95 changed files with 5565 additions and 2068 deletions


@@ -17,9 +17,11 @@ pkg_check_modules(NIX_MAIN REQUIRED IMPORTED_TARGET nix-main)
 add_executable(nix-irc
 src/irc/main.cpp
 src/irc/parser.cpp
+src/irc/lexer.cpp
 src/irc/resolver.cpp
 src/irc/ir_gen.cpp
 src/irc/serializer.cpp
+src/irc/types.cpp
 )
 target_include_directories(nix-irc PRIVATE
@@ -38,10 +40,12 @@ target_link_libraries(nix-irc PRIVATE
 add_library(nix-ir-plugin MODULE
 src/plugin.cpp
 src/irc/parser.cpp
+src/irc/lexer.cpp
 src/irc/resolver.cpp
 src/irc/ir_gen.cpp
 src/irc/serializer.cpp
 src/irc/evaluator.cpp
+src/irc/types.cpp
 )
 # Include directories from pkg-config
@@ -65,6 +69,10 @@ target_link_libraries(nix-ir-plugin PRIVATE
 ${NIX_MAIN_LINK_LIBRARIES}
 )
+# Set output directories to build/
+set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})
+set(CMAKE_LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})
 # Set output name
 set_target_properties(nix-ir-plugin PROPERTIES
 PREFIX ""
@@ -78,6 +86,9 @@ install(TARGETS nix-ir-plugin LIBRARY DESTINATION "${CMAKE_INSTALL_PREFIX}/lib/n
 add_executable(regression_test
 tests/regression_test.cpp
 src/irc/serializer.cpp
+src/irc/parser.cpp
+src/irc/lexer.cpp
+src/irc/types.cpp
 )
 target_include_directories(regression_test PRIVATE
@@ -92,3 +103,7 @@ target_link_libraries(regression_test PRIVATE
 ${NIX_EXPR_LINK_LIBRARIES}
 ${NIX_UTIL_LINK_LIBRARIES}
 )
+# CTest integration
+enable_testing()
+add_test(NAME regression_test COMMAND regression_test)


@@ -169,27 +169,44 @@ Entry:
 ### Building
 ```bash
-# Configure
-$ cmake -B build
+# Using just (recommended)
+$ just build
-# Build
-$ make
+# Or manually with CMake
+$ cmake -B build -G Ninja
+$ cmake --build build
-# The nix-irc executable will be in the project root
-$ ./nix-irc --help
+# The nix-irc executable will be in build/
+$ ./build/nix-irc --help
 ```
+### Available Tasks
+Run `just` to see all available tasks:
+- `just build` - Build all targets
+- `just test` - Run all tests (unit, compile, integration)
+- `just bench` - Run performance benchmarks
+- `just clean` - Clean build artifacts
+- `just smoke` - Run quick smoke test
+- `just stats` - Show project statistics
+See `just --list` for the complete list of available commands.
 ### Compiling Nix to IR
 ```bash
 # Basic compilation
-$ nix-irc input.nix output.nixir
+$ ./build/nix-irc input.nix output.nixir
 # With import search paths
-$ nix-irc -I ./lib -I /nix/store/... input.nix output.nixir
+$ ./build/nix-irc -I ./lib -I /nix/store/... input.nix output.nixir
 # Disable import resolution
-$ nix-irc --no-imports input.nix output.nixir
+$ ./build/nix-irc --no-imports input.nix output.nixir
+# Using just
+$ just compile input.nix output.nixir
 ```
 ### Runtime Evaluation (Plugin)
@@ -212,13 +229,21 @@ $ nix --plugin-files ./nix-ir-plugin.so eval --expr 'builtins.nixIR_info'
 ### Running Tests
 ```bash
-# Test all sample files
-for f in tests/*.nix; do
-./nix-irc "$f" "${f%.nix}.nixir"
+# Run all tests
+$ just test
+# Run specific test suites
+$ just test-unit # Unit tests only
+$ just test-compile # Compilation tests only
+$ just test-integration # Integration tests only
+# Manually test all fixtures
+$ for f in tests/fixtures/*.nix; do
+./build/nix-irc "$f" "${f%.nix}.nixir"
 done
 # Verify IR format
-$ hexdump -C tests/simple.nixir | head -3
+$ hexdump -C tests/fixtures/simple.nixir | head -3
 ```
 ## Contributing

265
docs/SPEC.md Normal file

@ -0,0 +1,265 @@
# Nixir Technical Specification
This is a distillation of my personal notes on my "research" within the Nix
codebase and the subsequent design notes on Nixir. While some of those,
naturally, belong in the README I have elected to compile a list of noteworthy
details into a "specification document" for those possibly interested, for some
reason, in integrating with Nixir.
Beware, here be observations.
## What This Project Is
Nixir is, most simply (and elegantly) put, a Nix compiler _and runtime_ packaged
as a plugin. The compiler component compiles a subset of Nix source to a custom
binary intermediate representation (IR); the runtime component then executes
that IR inside a virtual machine running within the plugin process. Hence it's
called Nix-ir.
As you might've caught on from the README already, the project consists of two
artifacts: a standalone compiler tool called `nix-irc` that transforms `.nix`
files into `.nixir` bundles, and a plugin library (`nix-ir-plugin.so`) that Nix
loads to provide three primops for interacting with compiled IR.
The architecture handles the full compilation pipeline. Static imports are
resolved at compile time and inlined into the output bundle, while the compiled
VM handles all evaluation at runtime. This mirrors how Nixpkgs itself
distinguishes between stable library code and application-specific expressions.
The plugin does not intercept evaluation automatically. Instead, it exposes
primops that users invoke explicitly. This design exists because Nix's plugin
API does not provide hooks into the core evaluation loop. Unfortunate, but 'tis
life.
## Why Compile Nix
Every invocation of `nix eval` or `nix build` must parse, type-check, and
evaluate expressions from scratch. For large codebases, this overhead is
measurable.
Nix does provide a persistent evaluation cache, stored in SQLite. However, this
cache only applies to flake-based workflows. Direct imports like
`import ./foo.nix` do not benefit from the cache and re-parse on each
invocation.
For example, a NixOS configuration using direct imports to `nixpkgs.lib`
re-parses source files on every rebuild. The compiler front-end accounts for
substantial wall-clock time before evaluation begins.
Precompiled IR eliminates, or rather, attempts to eliminate this cost. A
`.nixir` bundle contains serialized AST nodes with all variable names converted
to numeric indices. Loading skips parsing entirely and begins directly with the
VM executing pre-processed code.
The project _also_ serves as an implementation study. I say also, but it is
actually the main goal of this project. Reimplementing Nix's evaluation
semantics reveals details that the upstream C++ code obscures. The thunk
mechanism, environment model, and cycle detection become tangible when you can
read and step through the implementation. I don't expect to get a better
understanding of the Nix language, but I now have more reasons to badmouth it.
## The IR Format
The binary format uses a 36-byte fixed header followed by variable-length
sections. All multi-byte integers use little-endian byte order.
The header layout:
```plaintext
0x00-0x03: Magic identifier, value 0x4E495258
0x04-0x07: Version number, currently 2
0x08-0x0B: Flags field, reserved
0x0C-0x0F: Offset to string table
0x10-0x13: Offset to primop table
0x14-0x17: Offset to IR blob
0x18-0x1B: String count
0x1C-0x1F: Primop count
0x20-0x23: Reserved
```
The magic value `0x4E495258` corresponds to the bytes N I R X when read in
big-endian order.
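To make the layout above concrete, here is a minimal sketch of a header reader. The `IRHeader` struct, `read_u32_le`, and `parse_header` are hypothetical names for illustration, not the project's actual deserializer, and the sketch assumes the magic is stored little-endian like every other integer in the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical mirror of the nine 32-bit header fields listed above.
struct IRHeader {
  uint32_t magic;
  uint32_t version;
  uint32_t flags;
  uint32_t string_table_offset;
  uint32_t primop_table_offset;
  uint32_t ir_blob_offset;
  uint32_t string_count;
  uint32_t primop_count;
  uint32_t reserved;
};

constexpr uint32_t kMagic = 0x4E495258;  // "NIRX" when written big-endian
constexpr std::size_t kHeaderSize = 36;  // 9 fields * 4 bytes

// Read one little-endian u32 starting at byte offset `off`.
inline uint32_t read_u32_le(const uint8_t* data, std::size_t off) {
  assert(data != nullptr);
  return static_cast<uint32_t>(data[off]) |
         (static_cast<uint32_t>(data[off + 1]) << 8) |
         (static_cast<uint32_t>(data[off + 2]) << 16) |
         (static_cast<uint32_t>(data[off + 3]) << 24);
}

// Returns std::nullopt when the buffer is too short or the magic is wrong.
// Assumption: the magic is stored little-endian like the other fields.
inline std::optional<IRHeader> parse_header(const uint8_t* data, std::size_t size) {
  if (data == nullptr || size < kHeaderSize) return std::nullopt;
  IRHeader h{};
  h.magic = read_u32_le(data, 0x00);
  h.version = read_u32_le(data, 0x04);
  h.flags = read_u32_le(data, 0x08);
  h.string_table_offset = read_u32_le(data, 0x0C);
  h.primop_table_offset = read_u32_le(data, 0x10);
  h.ir_blob_offset = read_u32_le(data, 0x14);
  h.string_count = read_u32_le(data, 0x18);
  h.primop_count = read_u32_le(data, 0x1C);
  h.reserved = read_u32_le(data, 0x20);
  if (h.magic != kMagic) return std::nullopt;
  return h;
}
```

Reading byte by byte rather than casting the buffer to a struct keeps the sketch independent of host endianness and alignment.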
The string table follows the header. Each entry encodes length as a varint, then
that many UTF-8 bytes. All attribute names, identifiers, and string literals in
the source are de-duplicated at compile time and stored here. References
throughout the IR use indices into this table rather than inline strings.
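As a rough illustration of the entry encoding, the sketch below decodes one length-prefixed string. The text above does not pin down the varint flavour, so unsigned LEB128 (7 payload bits per byte, high bit as continuation flag) is an assumption here, and `read_varint` and `read_string_entry` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Assumption: varints are unsigned LEB128. The spec text only says "varint".
inline std::optional<uint64_t> read_varint(const std::vector<uint8_t>& buf,
                                           std::size_t& pos) {
  uint64_t value = 0;
  for (unsigned shift = 0; shift < 64; shift += 7) {
    if (pos >= buf.size()) return std::nullopt;  // truncated input
    const uint8_t byte = buf[pos++];
    value |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return value;  // continuation bit clear: done
  }
  return std::nullopt;  // more than 10 bytes: malformed
}

// Decode one string table entry: varint length, then that many bytes.
inline std::optional<std::string> read_string_entry(const std::vector<uint8_t>& buf,
                                                    std::size_t& pos) {
  assert(pos <= buf.size());
  const auto len = read_varint(buf, pos);
  if (!len || *len > buf.size() - pos) return std::nullopt;
  std::string s(reinterpret_cast<const char*>(buf.data()) + pos,
                static_cast<std::size_t>(*len));
  pos += static_cast<std::size_t>(*len);
  return s;
}
```

Calling `read_string_entry` once per entry, up to the header's string count, would rebuild the table in index order.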
The primop table defines built-in operations. Each entry contains the string
table index for the operation name, its arity, and optional flags. This table
enables the VM to dispatch operations by index without string comparison.
The IR blob contains the actual program. Each node begins with a type byte
followed by type-specific payload.
Node type enumeration from the source:
```plaintext
0x01: CONST_INT - Signed 64-bit integer
0x02: CONST_STRING - String table index
0x03: CONST_PATH - String table index
0x04: CONST_BOOL - 0x00 or 0x01
0x05: CONST_NULL - No payload
0x06: CONST_FLOAT - IEEE 754 double
0x07: CONST_URI - String table index
0x08: CONST_LOOKUP_PATH - String table index for <nixpkgs>
0x10: VAR - Two varints: depth and index
0x20: LAMBDA - Arity and body offset
0x21: APP - Function and argument offsets
0x22: BINARY_OP - Operation enum and operands
0x23: UNARY_OP - Operation enum and operand
0x24: IMPORT - String table index for file path
0x30: ATTRSET - Count and recursive flag
0x31: SELECT - Expression, attribute, optional default
0x32: WITH - Attribute set and body offsets
0x33: LIST - Count and element offsets
0x34: HAS_ATTR - Expression and attribute
0x40: IF - Condition, then, and else offsets
0x50: LET - Binding count and body offset
0x51: LETREC - Binding count and body offset
0x52: ASSERT - Condition and body offsets
0x60: THUNK - Expression offset
0x61: FORCE - Expression offset
0xFF: ERROR - Error marker
```
Binary operations supported:
```plaintext
ADD, SUB, MUL, DIV - Arithmetic on integers
CONCAT - List concatenation (++)
EQ, NE - Equality comparison
LT, GT, LE, GE - Ordering comparison
AND, OR, IMPL - Boolean logic
MERGE - Attribute set override (//)
```
## Variable Representation
The compiler converts variable names to De Bruijn indices during IR generation.
Rather than storing strings like "x" in the output, each variable reference
encodes two numbers: the lexical depth and the position within that scope.
The depth indicates how many lambda boundaries enclose the reference. A variable
in the outermost scope has depth zero. A variable referenced from inside one
lambda that refers to the outer scope has depth one.
The index indicates the position in that scope's environment array. The first
bound variable in a scope has index zero, the second has index one, and so
forth.
During evaluation, the VM combines these two numbers into a single 32-bit value
where the high 16 bits encode depth and the low 16 bits encode index. Lookup
traverses the environment chain depth times, then indexes into the resulting
scope's binding array. Resolution therefore costs one pointer hop per enclosing
scope plus a constant-time array access, with no string comparison.
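The packing described above can be sketched in a few lines. `encode_var`, `decode_var`, and `VarRef` are illustrative names rather than the project's own; only the 16/16 bit split comes from the text.

```cpp
#include <cassert>
#include <cstdint>

// Decoded form of a variable reference (hypothetical helper type).
struct VarRef {
  uint32_t depth;  // number of scopes to walk up
  uint32_t index;  // slot within that scope's binding array
};

// Pack depth into the high 16 bits and index into the low 16 bits.
inline uint32_t encode_var(uint32_t depth, uint32_t index) {
  assert(depth <= 0xFFFF && index <= 0xFFFF);  // each half must fit in 16 bits
  return (depth << 16) | index;
}

// Inverse of encode_var.
inline VarRef decode_var(uint32_t encoded) {
  return VarRef{encoded >> 16, encoded & 0xFFFF};
}
```

One consequence of this split is a hard ceiling of 65,536 nested scopes and 65,536 bindings per scope.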
## The Virtual Machine
The VM implements lazy evaluation using an explicit thunk mechanism. Every
unevaluated expression and function argument is wrapped in a Thunk structure
containing the expression AST node and a pointer to the captured environment.
When the VM needs a value, it calls `force()` on the thunk. The force operation
checks whether the thunk is already being evaluated. If evaluation attempts to
force a thunk that is currently evaluating, the VM detects the cycle and raises
"infinite recursion encountered". This matches Nix's behavior for recursive
definitions.
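The cycle check can be illustrated with a toy thunk that marks itself as "evaluating" before running its computation. This is a simplified sketch, not the plugin's evaluator: the `Thunk` type here holds a `std::function<int()>` instead of an AST node and environment, and all names are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Toy thunk: a deferred int computation plus an evaluation state.
struct Thunk {
  enum class State { Pending, Evaluating, Done };
  State state = State::Pending;
  std::function<int()> compute;
  int value = 0;
};

// Force a thunk, detecting re-entrant evaluation of the same thunk.
inline int force(Thunk& t) {
  switch (t.state) {
    case Thunk::State::Done:
      return t.value;  // already evaluated: reuse the cached value
    case Thunk::State::Evaluating:
      throw std::runtime_error("infinite recursion encountered");
    case Thunk::State::Pending:
      break;
  }
  assert(t.compute);
  t.state = Thunk::State::Evaluating;  // mark before running the body
  try {
    t.value = t.compute();
  } catch (...) {
    t.state = Thunk::State::Pending;  // leave the thunk retryable
    throw;
  }
  t.state = Thunk::State::Done;
  return t.value;
}
```

A thunk whose computation forces itself hits the `Evaluating` branch on re-entry, which is the same trick often called blackholing in lazy runtimes.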
The environment structure is an array-based chain. Each scope holds a pointer to
its parent scope and a vector of bound values. Looking up a variable traverses
parent pointers until reaching the scope at the correct depth, then indexes into
that scope's value array. This replaces string comparison with pointer traversal
and array indexing.
Function application follows currying. When applying a function to an argument,
the VM checks whether the function's arity is satisfied. If yes, it extends the
environment with the new binding and evaluates the body. If not, it returns a
partial application awaiting additional arguments.
The evaluator handles binary operations with type-specific dispatch. Addition
supports integers, strings, and paths with appropriate type coercion rules.
Comparison operators work on integers and strings. The merge operator combines
two attribute sets with right-side precedence.
## Plugin Primops
The plugin registers three primops through Nix's `RegisterPrimOp` interface:
`__nixIR_loadIR` accepts a file path string, deserializes the `.nixir` bundle,
evaluates the entry expression, and returns the resulting value. The VM measures
deserialization time and evaluation time separately, printing timing data to
stderr.
`__nixIR_compile` accepts a string containing Nix source code, parses it
in-memory, generates IR, and evaluates the result. This enables runtime
compilation without external tooling.
`__nixIR_info` returns an attribute set containing the plugin name
"nix-ir-plugin", version "0.1.0", and status "runtime-active". This is a
development-only primop that will be removed eventually.
The primops use the double-underscore prefix internally. Users access them
through `builtins.nixIR_loadIR`, `builtins.nixIR_compile`, and
`builtins.nixIR_info` in their expressions.
## Import Handling
The compiler performs static import resolution when the import path meets
specific conditions. The path must be a string literal in the source,
not an interpolation or variable. The path must not use home directory
expansion. The resolved path must remain within the project root for security.
The target file must exist and be readable at compile time.
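The path-based conditions can be approximated with `std::filesystem`. The sketch below is an illustration under stated assumptions, not the resolver's actual code: `stays_within_root` and its parameters are hypothetical, the check is purely lexical (symlinks are not resolved), and the "exists and is readable" test is left out so the example does not depend on the filesystem.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: true when `import_path`, resolved against `base_dir`,
// avoids home expansion and lexically stays inside `root`.
inline bool stays_within_root(const fs::path& root, const fs::path& base_dir,
                              const std::string& import_path) {
  assert(root.is_absolute() && base_dir.is_absolute());
  // Reject home directory expansion such as "~/foo.nix".
  if (!import_path.empty() && import_path[0] == '~') return false;
  // An absolute import_path replaces base_dir, per operator/ semantics.
  const fs::path resolved = (base_dir / import_path).lexically_normal();
  const fs::path rel = resolved.lexically_relative(root.lexically_normal());
  // An empty result or a leading ".." means the path escapes the root.
  return !rel.empty() && *rel.begin() != fs::path("..");
}
```

A real check would probably canonicalize both paths first, so that a symlink inside the project cannot point outside it.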
When these conditions hold, the compiler reads the imported file, recursively
processes its imports, and embeds the resulting IR into the output bundle. The
final `.nixir` file is self-contained and requires no additional file lookups at
load time.
When conditions do not hold, the compiler records the import as dynamic and
emits an IMPORT node containing the string table index. At runtime, the VM
evaluates the import expression to obtain the actual file path, then uses Nix's
standard evaluator to load that file.
## What Works And What Does Not
The implementation covers a substantial subset of Nix's expression language.
Literals work across all types including integers, floats, strings, paths, URIs,
booleans, and null. Lambda expressions, function application, and currying are
implemented. Attribute sets with both static and dynamic keys are supported. The
let and letrec forms work with proper recursive binding semantics. The if
expression, assert statement, with expression, and list literals are all
functional.
The implementation does not cover derivations, builtins other than those
required for basic operation, or the full module system. These require
integration with Nix's store and download mechanisms that the VM does not
replicate.
## Building And Using
Create a build directory and configure with CMake:
```
cmake -B build -G Ninja
cmake --build build
```
This produces `nix-irc` in the build directory and `nix-ir-plugin.so` in the
project root.
Compile a Nix file to IR:
```
./build/nix-irc input.nix output.nixir
```
Load and evaluate the compiled bundle through Nix:
```
nix --plugin-files ./nix-ir-plugin.so eval --expr 'builtins.nixIR_loadIR "output.nixir"'
```
Compile and evaluate source at runtime:
```
nix --plugin-files ./nix-ir-plugin.so eval --expr 'builtins.nixIR_compile "1 + 2"'
```


@@ -4,22 +4,28 @@
 outputs = {nixpkgs, ...}: let
 systems = ["x86_64-linux" "aarch64-linux"];
 forAllSystems = nixpkgs.lib.genAttrs systems;
+pkgsFor = system: nixpkgs.legacyPackages.${system};
 in {
 devShells = forAllSystems (system: let
-pkgs = nixpkgs.legacyPackages.${system};
+pkgs = pkgsFor system;
 in {
 default = pkgs.mkShell {
-buildInputs = with pkgs; [
+name = "nixir";
+buildInputs = with pkgs; let
+nixForLinking = nixVersions.nixComponents_2_32;
+nixForRuntime = nixVersions.nix_2_32;
+in [
 boost.dev
 libblake3.dev
+pegtl
-nixVersions.nixComponents_2_32.nix-store
-nixVersions.nixComponents_2_32.nix-expr
-nixVersions.nixComponents_2_32.nix-cmd
-nixVersions.nixComponents_2_32.nix-fetchers
-nixVersions.nixComponents_2_32.nix-main
-nixVersions.nixComponents_2_32.nix-util
-nixVersions.nix_2_32
+nixForRuntime
+nixForLinking.nix-store
+nixForLinking.nix-expr
+nixForLinking.nix-cmd
+nixForLinking.nix-fetchers
+nixForLinking.nix-main
+nixForLinking.nix-util
 ];
 nativeBuildInputs = with pkgs; [
@@ -27,12 +33,29 @@
 pkg-config
 ninja
 bear
+clang-tools
+just
+entr
 ];
-env = {
-NIX_PLUGINABI = "0.2";
-};
+env.NIX_PLUGINABI = "0.2";
 };
 });
+formatter = forAllSystems (system: let
+pkgs = pkgsFor system;
+in
+pkgs.writeShellApplication {
+name = "nix3-fmt-wrapper";
+runtimeInputs = [
+pkgs.alejandra
+pkgs.fd
+];
+text = ''
+fd "$@" -t f -e nix -x alejandra -q '{}'
+'';
+});
 };
 }

98
justfile Normal file

@ -0,0 +1,98 @@
# Default recipe, show available commands
default:
@just --list
# Build all targets
build:
cmake --build build
# Clean build artifacts
clean:
rm -rf build
find tests -name '*.nixir' -delete
# Configure and build from scratch
rebuild: clean
cmake -B build -G Ninja
cmake --build build
# Run unit tests
test-unit:
./build/regression_test
# Run compilation tests (do all fixtures compile?)
test-compile:
#!/usr/bin/env bash
total=0
success=0
for f in tests/fixtures/*.nix; do
total=$((total+1))
if ./build/nix-irc "$f" "${f%.nix}.nixir" 2>&1 | grep -q "Done!"; then
success=$((success+1))
fi
done
echo "Compiled: $success/$total test files"
[ $success -eq $total ]
# Run integration tests
test-integration:
./tests/integration/run.sh
# Run all tests
test: test-unit test-compile test-integration
@echo "All tests passed"
# Run benchmarks
bench:
./tests/benchmark/run.sh
# Compile a single Nix file to IR
compile FILE OUTPUT="":
#!/usr/bin/env bash
if [ -z "{{OUTPUT}}" ]; then
file="{{FILE}}"
output="${file%.nix}.nixir"
else
output="{{OUTPUT}}"
fi
./build/nix-irc "{{FILE}}" "$output"
# Load plugin and evaluate Nix expression
eval FILE:
nix-instantiate --plugin-files ./build/nix-ir-plugin.so --eval --strict "{{FILE}}"
# Format C++ code with clang-format
format:
find src tests -name '*.cpp' -o -name '*.h' | xargs clang-format -i
# Run clang-tidy on source files
lint:
find src -name '*.cpp' | xargs clang-tidy --fix
# Show project statistics
stats:
@echo "Lines of code:"
@find src -name '*.cpp' -o -name '*.h' | xargs wc -l | tail -1
@echo ""
@echo "Test files:"
@find tests/fixtures -name '*.nix' | wc -l
@echo ""
@echo "Build status:"
@ls -lh build/nix-irc build/nix-ir-plugin.so build/regression_test 2>/dev/null || echo "Not built"
# Run a quick smoke test
smoke:
./build/nix-irc tests/fixtures/simple.nix /tmp/smoke.nixir
nix-instantiate --plugin-files ./build/nix-ir-plugin.so --eval tests/integration/simple_eval.nix
# Generate IR from a Nix file and inspect it
inspect FILE:
./build/nix-irc "{{FILE}}" /tmp/inspect.nixir
@echo "IR bundle size:"
@ls -lh /tmp/inspect.nixir | awk '{print $5}'
@echo "Magic number:"
@xxd -l 4 /tmp/inspect.nixir
# Watch mode, rebuild on file changes
watch:
find src tests -name '*.cpp' -o -name '*.h' | entr -c just build test-unit


@@ -5,9 +5,7 @@
 #include "evaluator.h"
 #include "nix/expr/eval.hh"
 #include "nix/expr/value.hh"
-#include "nix/util/error.hh"
-#include <stdexcept>
+#include "nix/util/url.hh"
 #include <unordered_map>
 namespace nix_irc {
@@ -23,15 +21,20 @@ struct IREnvironment {
 void bind(Value* val) { bindings.push_back(val); }
-Value* lookup(uint32_t index) {
+Value* lookup(uint32_t encoded_index) {
+// Decode the index: high 16 bits = depth, low 16 bits = offset
+uint32_t depth = encoded_index >> 16;
+uint32_t offset = encoded_index & 0xFFFF;
 IREnvironment* env = this;
-while (env) {
-if (index < env->bindings.size()) {
-return env->bindings[index];
-}
-index -= env->bindings.size();
+// Skip 'depth' levels to get to the right scope
+for (uint32_t i = 0; i < depth && env; i++) {
 env = env->parent;
 }
+if (env && offset < env->bindings.size()) {
+return env->bindings[offset];
+}
 return nullptr;
 }
@@ -66,10 +69,34 @@ struct Evaluator::Impl {
 explicit Impl(EvalState& s) : state(s) {}
-~Impl() {
-for (auto& env : environments) {
-delete env.release();
+static std::string escape_nix_string(std::string_view value) {
+std::string escaped;
+escaped.reserve(value.size());
+for (char ch : value) {
+switch (ch) {
+case '\\':
+escaped += "\\\\";
+break;
+case '"':
+escaped += "\\\"";
+break;
+case '\n':
+escaped += "\\n";
+break;
+case '\r':
+escaped += "\\r";
+break;
+case '\t':
+escaped += "\\t";
+break;
+default:
+escaped.push_back(ch);
+break;
+}
 }
+return escaped;
 }
 IREnvironment* make_env(IREnvironment* parent = nullptr) {
@@ -100,6 +127,39 @@ struct Evaluator::Impl {
 thunks.erase(v);
 }
+// Copy a forced value into a destination Value
+void copy_value(Value& dest, Value* src) {
+if (!src)
+return;
+force(src);
+state.forceValue(*src, noPos);
+switch (src->type()) {
+case nInt:
+dest.mkInt(src->integer());
+break;
+case nBool:
+dest.mkBool(src->boolean());
+break;
+case nString:
+dest.mkString(src->c_str());
+break;
+case nPath:
+dest.mkPath(src->path());
+break;
+case nNull:
+dest.mkNull();
+break;
+case nFloat:
+dest.mkFloat(src->fpoint());
+break;
+default:
+// For attrs, lists, functions, etc., direct assignment is safe
+// as they use reference counting internally
+dest = *src;
+break;
+}
+}
 void eval_node(const std::shared_ptr<Node>& node, Value& v, IREnvironment* env) {
 if (!node) {
 v.mkNull();
@ -108,14 +168,42 @@ struct Evaluator::Impl {
if (auto* n = node->get_if<ConstIntNode>()) { if (auto* n = node->get_if<ConstIntNode>()) {
v.mkInt(n->value); v.mkInt(n->value);
} else if (auto* n = node->get_if<ConstFloatNode>()) {
v.mkFloat(n->value);
} else if (auto* n = node->get_if<ConstStringNode>()) { } else if (auto* n = node->get_if<ConstStringNode>()) {
v.mkString(n->value); v.mkString(n->value);
} else if (auto* n = node->get_if<ConstPathNode>()) { } else if (auto* n = node->get_if<ConstPathNode>()) {
v.mkPath(state.rootPath(CanonPath(n->value))); std::string path = n->value;
// Expand ~/ to home directory
if (path.size() >= 2 && path[0] == '~' && path[1] == '/') {
const char* home = getenv("HOME");
if (home) {
path = std::string(home) + path.substr(1);
}
}
v.mkPath(state.rootPath(CanonPath(path)));
} else if (auto* n = node->get_if<ConstBoolNode>()) { } else if (auto* n = node->get_if<ConstBoolNode>()) {
v.mkBool(n->value); v.mkBool(n->value);
} else if (auto* n = node->get_if<ConstNullNode>()) { // NOLINT(bugprone-branch-clone) } else if (auto* n = node->get_if<ConstNullNode>()) { // NOLINT(bugprone-branch-clone)
v.mkNull(); v.mkNull();
} else if (auto* n = node->get_if<ConstURINode>()) {
// Parse and validate URI, then create string with URI context
auto parsed = parseURL(n->value, true);
// Store URI with context - use simple mkString with context
v.mkString(parsed.to_string(), nix::NixStringContext{});
} else if (auto* n = node->get_if<ConstLookupPathNode>()) {
// Lookup path like <nixpkgs>; resolve via Nix search path
// We can use EvalState's searchPath to resolve
auto path = state.findFile(n->value);
v.mkPath(path);
} else if (auto* n = node->get_if<ListNode>()) {
// Evaluate list - allocate and populate
auto builder = state.buildList(n->elements.size());
for (size_t i = 0; i < n->elements.size(); i++) {
builder.elems[i] = state.allocValue();
eval_node(n->elements[i], *builder.elems[i], env);
}
v.mkList(builder);
} else if (auto* n = node->get_if<VarNode>()) { } else if (auto* n = node->get_if<VarNode>()) {
Value* bound = env ? env->lookup(n->index) : nullptr; Value* bound = env ? env->lookup(n->index) : nullptr;
if (!bound && env && n->name.has_value()) { if (!bound && env && n->name.has_value()) {
@@ -124,8 +212,7 @@ struct Evaluator::Impl {
if (!bound) { if (!bound) {
state.error<EvalError>("variable not found").debugThrow(); state.error<EvalError>("variable not found").debugThrow();
} }
force(bound); copy_value(v, bound);
v = *bound;
} else if (auto* n = node->get_if<LambdaNode>()) { } else if (auto* n = node->get_if<LambdaNode>()) {
auto lambda_env = env; auto lambda_env = env;
auto body = n->body; auto body = n->body;
@@ -216,6 +303,22 @@ struct Evaluator::Impl {
v.mkInt((left->integer() + right->integer()).valueWrapping()); v.mkInt((left->integer() + right->integer()).valueWrapping());
} else if (left->type() == nString && right->type() == nString) { } else if (left->type() == nString && right->type() == nString) {
v.mkString(std::string(left->c_str()) + std::string(right->c_str())); v.mkString(std::string(left->c_str()) + std::string(right->c_str()));
} else if (left->type() == nPath && right->type() == nString) {
// Path + string = path
std::string leftPath = std::string(left->path().path.abs());
std::string result = leftPath + std::string(right->c_str());
v.mkPath(state.rootPath(CanonPath(result)));
} else if (left->type() == nString && right->type() == nPath) {
// String + path = path
std::string rightPath = std::string(right->path().path.abs());
std::string result = std::string(left->c_str()) + rightPath;
v.mkPath(state.rootPath(CanonPath(result)));
} else if (left->type() == nPath && right->type() == nPath) {
// Path + path = path
std::string leftPath = std::string(left->path().path.abs());
std::string rightPath = std::string(right->path().path.abs());
std::string result = leftPath + rightPath;
v.mkPath(state.rootPath(CanonPath(result)));
} else { } else {
state.error<EvalError>("type error in addition").debugThrow(); state.error<EvalError>("type error in addition").debugThrow();
} }
@@ -286,10 +389,60 @@ struct Evaluator::Impl {
state.error<EvalError>("type error in comparison").debugThrow(); state.error<EvalError>("type error in comparison").debugThrow();
} }
break; break;
case BinaryOp::CONCAT: case BinaryOp::CONCAT: {
// ++ is list concatenation in Nix; string concat uses ADD (+) // List concatenation: left ++ right
state.error<EvalError>("list concatenation not yet implemented").debugThrow(); if (left->type() != nList || right->type() != nList) {
state.error<EvalError>("list concatenation requires two lists").debugThrow();
}
size_t left_size = left->listSize();
size_t right_size = right->listSize();
size_t total_size = left_size + right_size;
auto builder = state.buildList(total_size);
auto left_view = left->listView();
auto right_view = right->listView();
// Copy elements from left list
size_t idx = 0;
for (auto elem : left_view) {
builder.elems[idx++] = elem;
}
// Copy elements from right list
for (auto elem : right_view) {
builder.elems[idx++] = elem;
}
v.mkList(builder);
break; break;
}
case BinaryOp::MERGE: {
// // is attrset merge - right overrides left
if (left->type() != nAttrs || right->type() != nAttrs) {
state.error<EvalError>("attrset merge requires two attrsets").debugThrow();
}
// Build a map of right attrs first (these have priority)
std::unordered_map<Symbol, Value*> right_attrs;
for (auto& attr : *right->attrs()) {
right_attrs[attr.name] = attr.value;
}
// Copy right attrs to result
auto builder = state.buildBindings(left->attrs()->size() + right->attrs()->size());
for (auto& attr : *right->attrs()) {
builder.insert(attr.name, attr.value);
}
// Add left attrs that don't exist in right
for (auto& attr : *left->attrs()) {
if (right_attrs.find(attr.name) == right_attrs.end()) {
builder.insert(attr.name, attr.value);
}
}
v.mkAttrs(builder.finish());
break;
}
default: default:
state.error<EvalError>("unknown binary operator").debugThrow(); state.error<EvalError>("unknown binary operator").debugThrow();
} }
@@ -334,42 +487,72 @@ struct Evaluator::Impl {
} }
} else if (auto* n = node->get_if<LetNode>()) { } else if (auto* n = node->get_if<LetNode>()) {
auto let_env = make_env(env); auto let_env = make_env(env);
// Nix's let is recursive: bind all names first, then evaluate
// We allocate Values immediately and evaluate into them
std::vector<Value*> values;
for (const auto& [name, expr] : n->bindings) { for (const auto& [name, expr] : n->bindings) {
Value* val = make_thunk(expr, env); Value* val = state.allocValue();
values.push_back(val);
let_env->bind(val); let_env->bind(val);
} }
// Now evaluate each binding expression into its pre-allocated Value
size_t idx = 0;
for (const auto& [name, expr] : n->bindings) {
eval_node(expr, *values[idx++], let_env);
}
eval_node(n->body, v, let_env); eval_node(n->body, v, let_env);
} else if (auto* n = node->get_if<LetRecNode>()) { } else if (auto* n = node->get_if<LetRecNode>()) {
auto letrec_env = make_env(env); auto letrec_env = make_env(env);
std::vector<Value*> thunk_vals; // Same as LetNode - both are recursive in Nix
std::vector<Value*> values;
for (const auto& [name, expr] : n->bindings) { for (const auto& [name, expr] : n->bindings) {
Value* val = make_thunk(expr, letrec_env); Value* val = state.allocValue();
thunk_vals.push_back(val); values.push_back(val);
letrec_env->bind(val); letrec_env->bind(val);
} }
size_t idx = 0;
for (const auto& [name, expr] : n->bindings) {
eval_node(expr, *values[idx++], letrec_env);
}
eval_node(n->body, v, letrec_env); eval_node(n->body, v, letrec_env);
} else if (auto* n = node->get_if<AttrsetNode>()) { } else if (auto* n = node->get_if<AttrsetNode>()) {
auto bindings = state.buildBindings(n->attrs.size()); auto bindings = state.buildBindings(n->attrs.size());
IREnvironment* attr_env = env; IREnvironment* attr_env = env;
if (n->recursive) { if (n->recursive) {
// For recursive attrsets, create environment where all bindings can
// see each other
attr_env = make_env(env); attr_env = make_env(env);
for (const auto& [key, val] : n->attrs) { for (const auto& binding : n->attrs) {
Value* thunk = make_thunk(val, attr_env); if (!binding.is_dynamic()) {
attr_env->bind(thunk); Value* thunk = make_thunk(binding.value, attr_env);
attr_env->bind(thunk);
}
} }
} }
for (const auto& [key, val] : n->attrs) { // Evaluate attribute values immediately to avoid dangling thunks
// Our thunk system is tied to the Evaluator lifetime, so we can't
// return lazy thunks that outlive the evaluator
for (const auto& binding : n->attrs) {
Value* attr_val = state.allocValue(); Value* attr_val = state.allocValue();
if (n->recursive) { eval_node(binding.value, *attr_val, attr_env);
eval_node(val, *attr_val, attr_env);
if (binding.is_dynamic()) {
// Evaluate key expression to get attribute name
Value* key_val = state.allocValue();
eval_node(binding.dynamic_name, *key_val, attr_env);
force(key_val);
if (key_val->type() != nString) {
state.error<EvalError>("dynamic attribute name must evaluate to a string").debugThrow();
}
std::string key_str = std::string(key_val->c_str());
bindings.insert(state.symbols.create(key_str), attr_val);
} else { } else {
eval_node(val, *attr_val, env); bindings.insert(state.symbols.create(binding.static_name.value()), attr_val);
} }
bindings.insert(state.symbols.create(key), attr_val);
} }
v.mkAttrs(bindings.finish()); v.mkAttrs(bindings.finish());
@@ -394,9 +577,7 @@ struct Evaluator::Impl {
auto attr = obj->attrs()->get(sym); auto attr = obj->attrs()->get(sym);
if (attr) { if (attr) {
Value* val = attr->value; copy_value(v, attr->value);
force(val);
v = *val;
} else if (n->default_expr) { } else if (n->default_expr) {
eval_node(*n->default_expr, v, env); eval_node(*n->default_expr, v, env);
} else { } else {
@@ -446,6 +627,42 @@ struct Evaluator::Impl {
} }
eval_node(n->body, v, env); eval_node(n->body, v, env);
} else if (auto* n = node->get_if<ImportNode>()) {
// Evaluate path expression to get the file path
Value* path_val = state.allocValue();
eval_node(n->path, *path_val, env);
force(path_val);
// Path should be a string or path type, convert to SourcePath
if (path_val->type() == nPath) {
state.evalFile(path_val->path(), v);
} else if (path_val->type() == nString) {
auto path = state.rootPath(CanonPath(path_val->c_str()));
state.evalFile(path, v);
} else {
state.error<EvalError>("import argument must be a path or string").debugThrow();
}
} else if (auto* n = node->get_if<BuiltinCallNode>()) {
std::vector<Value*> args;
args.reserve(n->args.size());
for (const auto& arg_node : n->args) {
Value* arg = state.allocValue();
eval_node(arg_node, *arg, env);
args.push_back(arg);
}
if (n->builtin_name == "getFlake") {
if (args.size() != 1) {
state.error<EvalError>("getFlake expects exactly one argument").debugThrow();
}
auto flake_ref = state.forceStringNoCtx(*args[0], noPos, "while evaluating getFlake");
std::string expr = "builtins.getFlake \"" + escape_nix_string(flake_ref) + "\"";
auto* parsed = state.parseExprFromString(expr, state.rootPath(CanonPath::root));
state.eval(parsed, v);
} else {
state.error<EvalError>("unsupported builtin call: %s", n->builtin_name).debugThrow();
}
} else { } else {
v.mkNull(); v.mkNull();
} }
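
The `BinaryOp::MERGE` case above gives the right-hand attrset priority: all of its attributes are kept, and left-hand attributes are added only when the right side does not define them. A minimal standalone sketch of that rule follows. It does not use the Nix API; `Attrs`, `merge_attrs`, and `merge_attrs_demo` are hypothetical names, and a plain `std::map` stands in for the bindings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for an attribute set: attribute name -> value.
using Attrs = std::map<std::string, int>;

// Model of `left // right`: start from the right-hand attributes, then add
// the left-hand ones that the right side does not define.
Attrs merge_attrs(const Attrs& left, const Attrs& right) {
  Attrs result = right; // right-hand side has priority
  for (const auto& [name, value] : left) {
    // std::map::insert leaves an existing key untouched, so attributes
    // already taken from `right` are not overwritten.
    result.insert({name, value});
  }
  return result;
}

// Small usage example: `b` is defined on both sides, so the right value wins.
void merge_attrs_demo() {
  Attrs merged = merge_attrs({{"a", 1}, {"b", 2}}, {{"b", 20}, {"c", 30}});
  assert(merged.at("b") == 20);
}
```

The sketch relies on `insert` being a no-op for existing keys, which plays the role of the `right_attrs` lookup in the evaluator code.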


@@ -9,7 +9,7 @@ namespace nix {
class EvalState; class EvalState;
class Value; class Value;
class PosIdx; class PosIdx;
} } // namespace nix
namespace nix_irc { namespace nix_irc {
@@ -18,18 +18,17 @@ class IREnvironment;
class Evaluator { class Evaluator {
public: public:
explicit Evaluator(nix::EvalState& state); explicit Evaluator(nix::EvalState& state);
~Evaluator(); ~Evaluator();
void eval_to_nix(const std::shared_ptr<Node>& ir_node, void eval_to_nix(const std::shared_ptr<Node>& ir_node, nix::Value& result,
nix::Value& result, IREnvironment* env = nullptr);
IREnvironment* env = nullptr);
private: private:
struct Impl; struct Impl;
std::unique_ptr<Impl> pImpl; std::unique_ptr<Impl> pImpl;
}; };
} } // namespace nix_irc
#endif #endif
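
The `Evaluator` header above hides its state behind `struct Impl` and a `std::unique_ptr<Impl>`, with the destructor declared in the header and defined out of line. A minimal single-file sketch of the same pattern is shown here. `Counter` and its members are hypothetical names chosen for illustration and are not part of this codebase.

```cpp
#include <cassert>
#include <memory>

// "Header" part: only a forward declaration of Impl is visible to users.
class Counter {
public:
  Counter();
  ~Counter(); // declared here, defined once Impl is a complete type
  void increment();
  int value() const;

private:
  struct Impl;
  std::unique_ptr<Impl> pImpl;
};

// "Source" part: the full definition of Impl lives next to the methods.
struct Counter::Impl {
  int count = 0;
};

Counter::Counter() : pImpl(std::make_unique<Impl>()) {}
Counter::~Counter() = default; // Impl is complete here, so unique_ptr can delete it
void Counter::increment() { ++pImpl->count; }
int Counter::value() const { return pImpl->count; }

// Small usage example.
void counter_demo() {
  Counter c;
  c.increment();
  assert(c.value() == 1);
}
```

Defining the destructor where `Impl` is complete is what lets `std::unique_ptr<Impl>` be a member while `Impl` is only forward-declared in the header.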


@@ -1,219 +1,256 @@
#include "ir_gen.h" #include "ir_gen.h"
#include <algorithm>
#include <iostream>
#include <stack> #include <stack>
#include <unordered_map> #include <unordered_map>
#include <algorithm>
namespace nix_irc { namespace nix_irc {
struct NameResolver::Impl { struct NameResolver::Impl {
std::vector<std::unordered_map<std::string, uint32_t>> scopes; std::vector<std::unordered_map<std::string, uint32_t>> scopes;
std::vector<std::vector<std::string>> scope_names; std::vector<std::vector<std::string>> scope_names;
Impl() { Impl() {
scopes.push_back({}); scopes.push_back({});
scope_names.push_back({}); scope_names.push_back({});
} }
}; };
NameResolver::NameResolver() : pImpl(std::make_unique<Impl>()) {} NameResolver::NameResolver() : pImpl(std::make_unique<Impl>()) {}
NameResolver::~NameResolver() = default; NameResolver::~NameResolver() = default;
void NameResolver::enter_scope() { void NameResolver::enter_scope() {
pImpl->scopes.push_back({}); pImpl->scopes.push_back({});
pImpl->scope_names.push_back({}); pImpl->scope_names.push_back({});
} }
void NameResolver::exit_scope() { void NameResolver::exit_scope() {
if (!pImpl->scopes.empty()) { if (!pImpl->scopes.empty()) {
pImpl->scopes.pop_back(); pImpl->scopes.pop_back();
pImpl->scope_names.pop_back(); pImpl->scope_names.pop_back();
} }
} }
void NameResolver::bind(const std::string& name) { void NameResolver::bind(const std::string& name) {
if (pImpl->scopes.empty()) return; if (pImpl->scopes.empty())
uint32_t idx = pImpl->scope_names.back().size(); return;
pImpl->scopes.back()[name] = idx; uint32_t idx = pImpl->scope_names.back().size();
pImpl->scope_names.back().push_back(name); pImpl->scopes.back()[name] = idx;
pImpl->scope_names.back().push_back(name);
} }
uint32_t NameResolver::resolve(const std::string& name) { uint32_t NameResolver::resolve(const std::string& name) {
for (int i = (int)pImpl->scopes.size() - 1; i >= 0; --i) { for (int i = (int) pImpl->scopes.size() - 1; i >= 0; --i) {
auto it = pImpl->scopes[i].find(name); auto it = pImpl->scopes[i].find(name);
if (it != pImpl->scopes[i].end()) { if (it != pImpl->scopes[i].end()) {
uint32_t depth = pImpl->scopes.size() - 1 - i; uint32_t depth = pImpl->scopes.size() - 1 - i;
uint32_t offset = it->second; uint32_t offset = it->second;
return depth << 16 | offset; return depth << 16 | offset;
}
} }
return 0xFFFFFFFF; }
return 0xFFFFFFFF;
} }
bool NameResolver::is_bound(const std::string& name) const { bool NameResolver::is_bound(const std::string& name) const {
for (auto it = pImpl->scopes.rbegin(); it != pImpl->scopes.rend(); ++it) { for (auto it = pImpl->scopes.rbegin(); it != pImpl->scopes.rend(); ++it) {
if (it->count(name)) return true; if (it->count(name))
} return true;
return false; }
return false;
} }
struct IRGenerator::Impl { struct IRGenerator::Impl {
std::unordered_map<std::string, uint32_t> string_table; std::unordered_map<std::string, uint32_t> string_table;
uint32_t next_string_id = 0; uint32_t next_string_id = 0;
NameResolver name_resolver; NameResolver name_resolver;
Impl() {} Impl() {}
uint32_t add_string(const std::string& str) { uint32_t add_string(const std::string& str) {
auto it = string_table.find(str); auto it = string_table.find(str);
if (it != string_table.end()) { if (it != string_table.end()) {
return it->second; return it->second;
}
uint32_t id = next_string_id++;
string_table[str] = id;
return id;
} }
uint32_t id = next_string_id++;
string_table[str] = id;
return id;
}
std::shared_ptr<Node> convert(const std::shared_ptr<Node>& node_ptr) { std::shared_ptr<Node> convert(const std::shared_ptr<Node>& node_ptr) {
if (!node_ptr) return std::make_shared<Node>(ConstNullNode{}); if (!node_ptr)
return std::make_shared<Node>(ConstNullNode{});
const Node& node = *node_ptr; const Node& node = *node_ptr;
if (auto* n = node.get_if<ConstIntNode>()) { if (auto* n = node.get_if<ConstIntNode>()) {
return std::make_shared<Node>(*n); return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<ConstStringNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<ConstPathNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<ConstBoolNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<ConstNullNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<VarNode>()) {
uint32_t idx = name_resolver.resolve(n->name.value_or(""));
VarNode converted(idx);
converted.name = n->name;
converted.line = n->line;
return std::make_shared<Node>(converted);
}
if (auto* n = node.get_if<LambdaNode>()) {
name_resolver.enter_scope();
if (n->param_name) {
name_resolver.bind(*n->param_name);
}
auto body = convert(n->body);
name_resolver.exit_scope();
LambdaNode lambda(n->arity, body, n->line);
lambda.param_name = n->param_name;
return std::make_shared<Node>(lambda);
}
if (auto* n = node.get_if<AppNode>()) {
auto func = convert(n->func);
auto arg = convert(n->arg);
return std::make_shared<Node>(AppNode(func, arg, n->line));
}
if (auto* n = node.get_if<AttrsetNode>()) {
AttrsetNode attrs(n->recursive, n->line);
name_resolver.enter_scope();
for (const auto& [key, val] : n->attrs) {
name_resolver.bind(key);
}
for (const auto& [key, val] : n->attrs) {
attrs.attrs.push_back({key, convert(val)});
}
name_resolver.exit_scope();
return std::make_shared<Node>(attrs);
}
if (auto* n = node.get_if<SelectNode>()) {
auto expr = convert(n->expr);
auto attr = convert(n->attr);
SelectNode select(expr, attr, n->line);
if (n->default_expr) {
select.default_expr = convert(*n->default_expr);
}
return std::make_shared<Node>(select);
}
if (auto* n = node.get_if<HasAttrNode>()) {
auto expr = convert(n->expr);
auto attr = convert(n->attr);
return std::make_shared<Node>(HasAttrNode(expr, attr, n->line));
}
if (auto* n = node.get_if<WithNode>()) {
auto attrs = convert(n->attrs);
auto body = convert(n->body);
return std::make_shared<Node>(WithNode(attrs, body, n->line));
}
if (auto* n = node.get_if<IfNode>()) {
auto cond = convert(n->cond);
auto then_b = convert(n->then_branch);
auto else_b = convert(n->else_branch);
return std::make_shared<Node>(IfNode(cond, then_b, else_b, n->line));
}
if (auto* n = node.get_if<LetNode>()) {
name_resolver.enter_scope();
for (const auto& [key, val] : n->bindings) {
name_resolver.bind(key);
}
std::vector<std::pair<std::string, std::shared_ptr<Node>>> new_bindings;
for (const auto& [key, val] : n->bindings) {
new_bindings.push_back({key, convert(val)});
}
auto body = convert(n->body);
name_resolver.exit_scope();
LetNode let(body, n->line);
let.bindings = std::move(new_bindings);
return std::make_shared<Node>(let);
}
if (auto* n = node.get_if<LetRecNode>()) {
name_resolver.enter_scope();
for (const auto& [key, val] : n->bindings) {
name_resolver.bind(key);
}
std::vector<std::pair<std::string, std::shared_ptr<Node>>> new_bindings;
for (const auto& [key, val] : n->bindings) {
new_bindings.push_back({key, convert(val)});
}
auto body = convert(n->body);
name_resolver.exit_scope();
LetRecNode letrec(body, n->line);
letrec.bindings = std::move(new_bindings);
return std::make_shared<Node>(letrec);
}
if (auto* n = node.get_if<AssertNode>()) {
auto cond = convert(n->cond);
auto body = convert(n->body);
return std::make_shared<Node>(AssertNode(cond, body, n->line));
}
if (auto* n = node.get_if<BinaryOpNode>()) {
auto left = convert(n->left);
auto right = convert(n->right);
return std::make_shared<Node>(BinaryOpNode(n->op, left, right, n->line));
}
if (auto* n = node.get_if<UnaryOpNode>()) {
auto operand = convert(n->operand);
return std::make_shared<Node>(UnaryOpNode(n->op, operand, n->line));
}
return std::make_shared<Node>(ConstNullNode{});
} }
if (auto* n = node.get_if<ConstStringNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<ConstPathNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<ConstBoolNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<ConstNullNode>()) {
return std::make_shared<Node>(*n);
}
if (auto* n = node.get_if<VarNode>()) {
std::string var_name = n->name.value_or("");
uint32_t idx = name_resolver.resolve(var_name);
VarNode converted(idx);
converted.name = n->name;
converted.line = n->line;
return std::make_shared<Node>(converted);
}
if (auto* n = node.get_if<LambdaNode>()) {
name_resolver.enter_scope();
if (n->param_name) {
name_resolver.bind(*n->param_name);
}
auto body = convert(n->body);
name_resolver.exit_scope();
LambdaNode lambda(n->arity, body, n->line);
lambda.param_name = n->param_name;
return std::make_shared<Node>(lambda);
}
if (auto* n = node.get_if<AppNode>()) {
auto func = convert(n->func);
auto arg = convert(n->arg);
return std::make_shared<Node>(AppNode(func, arg, n->line));
}
if (auto* n = node.get_if<AttrsetNode>()) {
AttrsetNode attrs(n->recursive, n->line);
// Only enter a new scope for recursive attrsets
if (n->recursive) {
name_resolver.enter_scope();
for (const auto& binding : n->attrs) {
if (!binding.is_dynamic()) {
name_resolver.bind(binding.static_name.value());
}
}
}
for (const auto& binding : n->attrs) {
if (binding.is_dynamic()) {
attrs.attrs.push_back(AttrBinding(convert(binding.dynamic_name), convert(binding.value)));
} else {
attrs.attrs.push_back(AttrBinding(binding.static_name.value(), convert(binding.value)));
}
}
if (n->recursive) {
name_resolver.exit_scope();
}
return std::make_shared<Node>(attrs);
}
if (auto* n = node.get_if<SelectNode>()) {
auto expr = convert(n->expr);
auto attr = convert(n->attr);
SelectNode select(expr, attr, n->line);
if (n->default_expr) {
select.default_expr = convert(*n->default_expr);
}
return std::make_shared<Node>(select);
}
if (auto* n = node.get_if<HasAttrNode>()) {
auto expr = convert(n->expr);
auto attr = convert(n->attr);
return std::make_shared<Node>(HasAttrNode(expr, attr, n->line));
}
if (auto* n = node.get_if<WithNode>()) {
auto attrs = convert(n->attrs);
auto body = convert(n->body);
return std::make_shared<Node>(WithNode(attrs, body, n->line));
}
if (auto* n = node.get_if<IfNode>()) {
auto cond = convert(n->cond);
auto then_b = convert(n->then_branch);
auto else_b = convert(n->else_branch);
return std::make_shared<Node>(IfNode(cond, then_b, else_b, n->line));
}
if (auto* n = node.get_if<LetNode>()) {
name_resolver.enter_scope();
for (const auto& [key, val] : n->bindings) {
name_resolver.bind(key);
}
std::vector<std::pair<std::string, std::shared_ptr<Node>>> new_bindings;
new_bindings.reserve(n->bindings.size());
for (const auto& [key, val] : n->bindings) {
new_bindings.push_back({key, convert(val)});
}
auto body = convert(n->body);
name_resolver.exit_scope();
LetNode let(body, n->line);
let.bindings = std::move(new_bindings);
return std::make_shared<Node>(let);
}
if (auto* n = node.get_if<LetRecNode>()) {
name_resolver.enter_scope();
for (const auto& [key, val] : n->bindings) {
name_resolver.bind(key);
}
std::vector<std::pair<std::string, std::shared_ptr<Node>>> new_bindings;
new_bindings.reserve(n->bindings.size());
for (const auto& [key, val] : n->bindings) {
new_bindings.push_back({key, convert(val)});
}
auto body = convert(n->body);
name_resolver.exit_scope();
LetRecNode letrec(body, n->line);
letrec.bindings = std::move(new_bindings);
return std::make_shared<Node>(letrec);
}
if (auto* n = node.get_if<AssertNode>()) {
auto cond = convert(n->cond);
auto body = convert(n->body);
return std::make_shared<Node>(AssertNode(cond, body, n->line));
}
if (auto* n = node.get_if<BinaryOpNode>()) {
auto left = convert(n->left);
auto right = convert(n->right);
return std::make_shared<Node>(BinaryOpNode(n->op, left, right, n->line));
}
if (auto* n = node.get_if<UnaryOpNode>()) {
auto operand = convert(n->operand);
return std::make_shared<Node>(UnaryOpNode(n->op, operand, n->line));
}
if (auto* n = node.get_if<ListNode>()) {
std::vector<std::shared_ptr<Node>> elements;
elements.reserve(n->elements.size());
for (const auto& elem : n->elements) {
elements.push_back(convert(elem));
}
return std::make_shared<Node>(ListNode(std::move(elements), n->line));
}
if (auto* n = node.get_if<BuiltinCallNode>()) {
std::vector<std::shared_ptr<Node>> args;
args.reserve(n->args.size());
for (const auto& arg : n->args) {
args.push_back(convert(arg));
}
return std::make_shared<Node>(BuiltinCallNode(n->builtin_name, std::move(args), n->line));
}
return std::make_shared<Node>(ConstNullNode{});
}
}; };
IRGenerator::IRGenerator() : pImpl(std::make_unique<Impl>()) {} IRGenerator::IRGenerator() : pImpl(std::make_unique<Impl>()) {}
IRGenerator::~IRGenerator() = default; IRGenerator::~IRGenerator() = default;
void IRGenerator::set_string_table(const std::unordered_map<std::string, uint32_t>& table) { void IRGenerator::set_string_table(const std::unordered_map<std::string, uint32_t>& table) {
pImpl->string_table = table; pImpl->string_table = table;
} }
uint32_t IRGenerator::add_string(const std::string& str) { uint32_t IRGenerator::add_string(const std::string& str) {
return pImpl->add_string(str); return pImpl->add_string(str);
} }
std::shared_ptr<Node> IRGenerator::generate(const std::shared_ptr<Node>& ast) { std::shared_ptr<Node> IRGenerator::generate(const std::shared_ptr<Node>& ast) {
return pImpl->convert(ast); return pImpl->convert(ast);
} }
} } // namespace nix_irc
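
`NameResolver::resolve` above walks the scope stack from innermost to outermost and packs the result as `depth << 16 | offset`, returning `0xFFFFFFFF` when the name is unbound. The following standalone sketch mirrors that encoding. The names `pack_index`, `index_depth`, `index_offset`, `resolve_name`, and `kUnresolved` are hypothetical, and the sketch assumes that depths and offsets fit in 16 bits, which the code in the diff does not check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Sentinel for an unbound name, matching the value returned in the diff.
constexpr uint32_t kUnresolved = 0xFFFFFFFF;

// Hypothetical helpers: scope depth in the upper 16 bits, slot offset in the
// lower 16 bits. Assumes both values are below 65536.
constexpr uint32_t pack_index(uint32_t depth, uint32_t offset) {
  return (depth << 16) | offset;
}
constexpr uint32_t index_depth(uint32_t encoded) { return encoded >> 16; }
constexpr uint32_t index_offset(uint32_t encoded) { return encoded & 0xFFFF; }

using Scope = std::unordered_map<std::string, uint32_t>;

// Search from the innermost scope (back of the vector) outwards.
uint32_t resolve_name(const std::vector<Scope>& scopes, const std::string& name) {
  for (std::size_t i = scopes.size(); i-- > 0;) {
    auto it = scopes[i].find(name);
    if (it != scopes[i].end()) {
      auto depth = static_cast<uint32_t>(scopes.size() - 1 - i);
      return pack_index(depth, it->second);
    }
  }
  return kUnresolved;
}

// Small usage example: a name in the innermost scope has depth 0.
void resolve_name_demo() {
  Scope inner;
  inner["z"] = 0;
  std::vector<Scope> scopes{inner};
  assert(resolve_name(scopes, "z") == pack_index(0, 0));
}
```

One consequence of this layout is that a binding at depth `0xFFFF` and offset `0xFFFF` would be indistinguishable from the unbound sentinel, so a production version might want an explicit range check.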


@@ -2,44 +2,44 @@
#define NIX_IRC_IR_GEN_H #define NIX_IRC_IR_GEN_H
#include "types.h" #include "types.h"
#include <memory>
#include <string> #include <string>
#include <unordered_map> #include <unordered_map>
#include <vector> #include <vector>
#include <memory>
namespace nix_irc { namespace nix_irc {
class IRGenerator { class IRGenerator {
public: public:
IRGenerator(); IRGenerator();
~IRGenerator(); ~IRGenerator();
void set_string_table(const std::unordered_map<std::string, uint32_t>& table); void set_string_table(const std::unordered_map<std::string, uint32_t>& table);
uint32_t add_string(const std::string& str); uint32_t add_string(const std::string& str);
std::shared_ptr<Node> generate(const std::shared_ptr<Node>& ast); std::shared_ptr<Node> generate(const std::shared_ptr<Node>& ast);
private: private:
struct Impl; struct Impl;
std::unique_ptr<Impl> pImpl; std::unique_ptr<Impl> pImpl;
}; };
class NameResolver { class NameResolver {
public: public:
NameResolver(); NameResolver();
~NameResolver(); ~NameResolver();
void enter_scope(); void enter_scope();
void exit_scope(); void exit_scope();
void bind(const std::string& name); void bind(const std::string& name);
uint32_t resolve(const std::string& name); uint32_t resolve(const std::string& name);
bool is_bound(const std::string& name) const; bool is_bound(const std::string& name) const;
private: private:
struct Impl; struct Impl;
std::unique_ptr<Impl> pImpl; std::unique_ptr<Impl> pImpl;
}; };
} } // namespace nix_irc
#endif #endif

src/irc/lexer.cpp Normal file

@@ -0,0 +1,598 @@
#include "lexer.h"
#include <cctype>
#include <stdexcept>
namespace nix_irc {
Lexer::Lexer(std::string input) : input(std::move(input)), pos(0), line(1), col(1) {}
std::vector<Token> Lexer::tokenize() {
#define TOKEN(t) \
Token { \
Token::t, "", line, col \
}
while (pos < input.size()) {
skip_whitespace();
if (pos >= input.size())
break;
char c = input[pos];
if (c == '(') {
emit(TOKEN(LPAREN));
} else if (c == ')') {
emit(TOKEN(RPAREN));
} else if (c == '{') {
emit(TOKEN(LBRACE));
} else if (c == '}') {
emit(TOKEN(RBRACE));
} else if (c == '[') {
emit(TOKEN(LBRACKET));
} else if (c == ']') {
emit(TOKEN(RBRACKET));
} else if (c == ';') {
emit(TOKEN(SEMICOLON));
} else if (c == ':') {
emit(TOKEN(COLON));
} else if (c == '@') {
emit(TOKEN(AT));
} else if (c == ',') {
emit(TOKEN(COMMA));
} else if (c == '\'' && pos + 1 < input.size() && input[pos + 1] == '\'') {
tokenize_indented_string();
} else if (c == '"') {
tokenize_string();
}
// Two-char operators
else if (c == '=' && pos + 1 < input.size() && input[pos + 1] == '=') {
tokens.push_back(TOKEN(EQEQ));
pos += 2;
col += 2;
} else if (c == '=') {
emit(TOKEN(EQUALS));
} else if (c == '!' && pos + 1 < input.size() && input[pos + 1] == '=') {
tokens.push_back(TOKEN(NE));
pos += 2;
col += 2;
} else if (c == '<' && pos + 1 < input.size() && input[pos + 1] == '=') {
tokens.push_back(TOKEN(LE));
pos += 2;
col += 2;
} else if (c == '>' && pos + 1 < input.size() && input[pos + 1] == '=') {
tokens.push_back(TOKEN(GE));
pos += 2;
col += 2;
} else if (c == '+' && pos + 1 < input.size() && input[pos + 1] == '+') {
tokens.push_back(TOKEN(CONCAT));
pos += 2;
col += 2;
} else if (c == '/' && pos + 1 < input.size() && input[pos + 1] == '/') {
tokens.push_back(TOKEN(MERGE));
pos += 2;
col += 2;
} else if (c == '&' && pos + 1 < input.size() && input[pos + 1] == '&') {
tokens.push_back(TOKEN(AND));
pos += 2;
col += 2;
} else if (c == '|' && pos + 1 < input.size() && input[pos + 1] == '|') {
tokens.push_back(TOKEN(OR));
pos += 2;
col += 2;
} else if (c == '-' && pos + 1 < input.size() && input[pos + 1] == '>') {
tokens.push_back(TOKEN(IMPL));
pos += 2;
col += 2;
}
// Single-char operators
else if (c == '+') {
emit(TOKEN(PLUS));
} else if (c == '*') {
emit(TOKEN(STAR));
} else if (c == '/') {
// Check if it's a path or division
if (pos + 1 < input.size() && (isalnum(input[pos + 1]) || input[pos + 1] == '.')) {
tokenize_path();
} else {
emit(TOKEN(SLASH));
}
} else if (c == '<') {
// Check for lookup path <nixpkgs> vs comparison operator
size_t end = pos + 1;
bool is_lookup_path = false;
// Scan for valid lookup path characters until >
while (end < input.size() && (isalnum(input[end]) || input[end] == '-' || input[end] == '_' ||
input[end] == '/' || input[end] == '.')) {
end++;
}
// If we found > and there's content, it's a lookup path
if (end < input.size() && input[end] == '>' && end > pos + 1) {
std::string path = input.substr(pos + 1, end - pos - 1);
size_t consumed = end - pos + 1;
tokens.push_back({Token::LOOKUP_PATH, path, line, col});
pos = end + 1;
col += consumed;
is_lookup_path = true;
}
if (!is_lookup_path) {
emit(TOKEN(LT));
}
} else if (c == '>') {
emit(TOKEN(GT));
} else if (c == '!') {
emit(TOKEN(NOT));
} else if (c == '.') {
// Relative paths: ./foo and ../foo
if (pos + 1 < input.size() && input[pos + 1] == '/') {
tokenize_path();
} else if (pos + 2 < input.size() && input[pos + 1] == '.' && input[pos + 2] == '/') {
tokenize_path();
}
// Check for ellipsis (...)
else if (pos + 2 < input.size() && input[pos + 1] == '.' && input[pos + 2] == '.') {
tokens.push_back(TOKEN(ELLIPSIS));
pos += 3;
col += 3;
} else {
emit(TOKEN(DOT));
}
} else if (c == '?') {
emit(TOKEN(QUESTION));
} else if (c == '~') {
// Home-relative path ~/...
if (pos + 1 < input.size() && input[pos + 1] == '/') {
tokenize_home_path();
} else {
// Just ~ by itself is an identifier
tokenize_ident();
}
} else if (c == '-') {
// Check if it's a negative number or minus operator
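if (pos + 1 < input.size() && isdigit(input[pos + 1])) {
// Check for negative float: skip the whole digit run, then look for '.'
size_t digits_end = pos + 1;
while (digits_end < input.size() && isdigit(input[digits_end]))
digits_end++;
if (digits_end < input.size() && input[digits_end] == '.') {
tokenize_float();
} else {
tokenize_int();
}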
} else {
emit(TOKEN(MINUS));
}
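} else if (isdigit(c)) {
// Check if it's a float: skip the whole digit run, then look for '.'
size_t digits_end = pos;
while (digits_end < input.size() && isdigit(input[digits_end]))
digits_end++;
if (digits_end < input.size() && input[digits_end] == '.') {
tokenize_float();
} else {
tokenize_int();
}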
} else if (isalpha(c)) {
// Check if it's a URI (contains ://) - look ahead
size_t lookahead = pos;
while (lookahead < input.size() &&
(isalnum(input[lookahead]) || input[lookahead] == '_' || input[lookahead] == '-' ||
input[lookahead] == '+' || input[lookahead] == '.'))
lookahead++;
if (lookahead + 2 < input.size() && input[lookahead] == ':' && input[lookahead + 1] == '/' &&
input[lookahead + 2] == '/') {
// It's a URI, consume the whole thing
tokenize_uri();
} else {
tokenize_ident();
}
} else {
throw std::runtime_error("Unexpected character '" + std::string(1, c) + "' at " +
std::to_string(line) + ":" + std::to_string(col));
}
}
tokens.push_back({Token::EOF_, "", line, col});
#undef TOKEN
return tokens;
}
void Lexer::emit(const Token& t) {
tokens.push_back(t);
pos++;
col++;
}
void Lexer::skip_whitespace() {
while (pos < input.size()) {
char c = input[pos];
if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
if (c == '\n') {
line++;
col = 1;
} else {
col++;
}
pos++;
} else if (c == '#') {
// Line comment - skip until newline
while (pos < input.size() && input[pos] != '\n')
pos++;
} else if (c == '/' && pos + 1 < input.size() && input[pos + 1] == '*') {
// Block comment /* ... */
// Note: Nix block comments do NOT nest
size_t start_line = line;
size_t start_col = col;
bool terminated = false;
pos += 2; // Skip /*
col += 2;
while (pos + 1 < input.size()) {
if (input[pos] == '*' && input[pos + 1] == '/') {
pos += 2; // Skip */
col += 2;
terminated = true;
break;
}
if (input[pos] == '\n') {
line++;
col = 1;
} else {
col++;
}
pos++;
}
if (!terminated) {
throw std::runtime_error("Unterminated block comment at " + std::to_string(start_line) +
":" + std::to_string(start_col));
}
} else {
break;
}
}
}
void Lexer::tokenize_string() {
size_t start_line = line;
size_t start_col = col;
pos++;
col++;
std::string s;
bool has_interp = false;
while (pos < input.size() && input[pos] != '"') {
if (input[pos] == '\\' && pos + 1 < input.size()) {
pos++;
col++;
switch (input[pos]) {
case 'n':
s += '\n';
break;
case 't':
s += '\t';
break;
case 'r':
s += '\r';
break;
case '"':
s += '"';
break;
case '\\':
s += '\\';
break;
case '$':
s += '$';
break; // Escaped $
default:
s += input[pos];
break;
}
pos++;
col++;
} else if (input[pos] == '$' && pos + 1 < input.size() && input[pos + 1] == '{') {
// Found interpolation marker
has_interp = true;
s += input[pos]; // Keep $ in raw string
pos++;
col++;
} else {
if (input[pos] == '\n') {
s += input[pos];
pos++;
line++;
col = 1;
continue;
}
s += input[pos];
pos++;
col++;
}
}
if (pos >= input.size()) {
throw std::runtime_error("Unterminated string at " + std::to_string(start_line) + ":" +
std::to_string(start_col));
}
pos++;
col++;
Token::Type type = has_interp ? Token::STRING_INTERP : Token::STRING;
tokens.push_back({type, s, start_line, start_col});
}
void Lexer::tokenize_indented_string() {
pos += 2; // Skip opening ''
std::string raw_content;
bool has_interp = false;
size_t start_line = line;
// Collect raw content until closing ''
while (pos < input.size()) {
// Check for escape sequences
if (pos + 1 < input.size() && input[pos] == '\'' && input[pos + 1] == '\'') {
// Check if it's an escape or the closing delimiter
if (pos + 2 < input.size() && input[pos + 2] == '\'') {
// ''' -> escape for ''
raw_content += "''";
pos += 3;
continue;
} else if (pos + 2 < input.size() && input[pos + 2] == '$') {
// ''$ -> escape for $
raw_content += '$';
pos += 3;
continue;
} else if (pos + 2 < input.size() && input[pos + 2] == '\\') {
// ''\ -> check what follows
if (pos + 3 < input.size()) {
char next = input[pos + 3];
if (next == 'n') {
raw_content += '\n';
pos += 4;
continue;
} else if (next == 'r') {
raw_content += '\r';
pos += 4;
continue;
} else if (next == 't') {
raw_content += '\t';
pos += 4;
continue;
} else if (next == ' ' || next == '\t') {
// ''\ before whitespace - preserve the whitespace by prepending a marker
// We use a special escape sequence that won't appear in normal text
raw_content += "\x1F\x1F"; // Unit separator pair as marker for preserved whitespace
raw_content += next;
pos += 4;
continue;
}
}
// Default: literal backslash
raw_content += '\\';
pos += 3;
continue;
} else {
// Just closing ''
pos += 2;
break;
}
}
// Check for interpolation
if (input[pos] == '$' && pos + 1 < input.size() && input[pos + 1] == '{') {
has_interp = true;
raw_content += input[pos]; // Keep $ in raw string
pos++; // input[pos] is now '{'; it is copied on the next iteration
continue;
}
// Track newlines
if (input[pos] == '\n') {
line++;
raw_content += input[pos];
pos++;
} else {
raw_content += input[pos];
pos++;
}
}
// Strip common indentation
std::string stripped = strip_indentation(raw_content);
Token::Type type = has_interp ? Token::INDENTED_STRING_INTERP : Token::INDENTED_STRING;
tokens.push_back({type, stripped, start_line, col});
}
std::string Lexer::strip_indentation(const std::string& s) {
if (s.empty())
return s;
// Split into lines
std::vector<std::string> lines;
std::string current_line;
for (char c : s) {
if (c == '\n') {
lines.push_back(current_line);
current_line.clear();
} else {
current_line += c;
}
}
if (!current_line.empty() || (!s.empty() && s.back() == '\n')) {
lines.push_back(current_line);
}
// Find minimum indentation (spaces/tabs at start of non-empty lines)
// \x1F\x1F marker indicates preserved whitespace (from ''\ escape)
size_t min_indent = std::string::npos;
for (const auto& line : lines) {
if (line.empty())
continue; // Skip empty lines when calculating indentation
size_t indent = 0;
for (size_t i = 0; i < line.size(); i++) {
char c = line[i];
// If we hit the preserved whitespace marker, stop counting indentation
if (c == '\x1F' && i + 1 < line.size() && line[i + 1] == '\x1F') {
break;
}
if (c == ' ' || c == '\t')
indent++;
else
break;
}
if (indent < min_indent)
min_indent = indent;
}
if (min_indent == std::string::npos)
min_indent = 0;
// Strip min_indent from all lines and remove \x1F\x1F markers
std::string result;
for (size_t i = 0; i < lines.size(); i++) {
const auto& line = lines[i];
if (line.empty()) {
// Preserve empty lines
if (i + 1 < lines.size())
result += '\n';
} else {
// Strip indentation, being careful about \x1F\x1F markers
size_t skip = 0;
size_t pos = 0;
while (skip < min_indent && pos < line.size()) {
if (line[pos] == '\x1F' && pos + 1 < line.size() && line[pos + 1] == '\x1F') {
// Hit preserved whitespace marker - don't strip any more
break;
}
skip++;
pos++;
}
// Add the rest of the line, removing \x1F\x1F markers
for (size_t j = pos; j < line.size(); j++) {
if (line[j] == '\x1F' && j + 1 < line.size() && line[j + 1] == '\x1F') {
j++; // Skip both marker bytes
continue;
}
result += line[j];
}
if (i + 1 < lines.size())
result += '\n';
}
}
return result;
}
void Lexer::tokenize_path() {
size_t start = pos;
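// Cast to unsigned char: passing a negative char to isspace() is undefined behavior
while (pos < input.size() && !isspace(static_cast<unsigned char>(input[pos])) &&
input[pos] != '(' && input[pos] != ')' && input[pos] != '{' && input[pos] != '}' &&
input[pos] != '[' && input[pos] != ']' && input[pos] != ';') {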
pos++;
}
std::string path = input.substr(start, pos - start);
tokens.push_back({Token::PATH, path, line, col});
col += path.size();
}
void Lexer::tokenize_home_path() {
size_t start = pos;
pos++; // Skip ~
if (pos < input.size() && input[pos] == '/') {
// Home-relative path ~/something
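// Cast to unsigned char: passing a negative char to isspace() is undefined behavior
while (pos < input.size() && !isspace(static_cast<unsigned char>(input[pos])) &&
input[pos] != '(' && input[pos] != ')' && input[pos] != '{' && input[pos] != '}' &&
input[pos] != '[' && input[pos] != ']' && input[pos] != ';') {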
pos++;
}
}
std::string path = input.substr(start, pos - start);
tokens.push_back({Token::PATH, path, line, col});
col += path.size();
}
void Lexer::tokenize_int() {
size_t start = pos;
if (input[pos] == '-')
pos++;
while (pos < input.size() && isdigit(static_cast<unsigned char>(input[pos])))
pos++;
std::string num = input.substr(start, pos - start);
tokens.push_back({Token::INT, num, line, col});
col += num.size();
}
void Lexer::tokenize_float() {
size_t start = pos;
if (input[pos] == '-')
pos++;
while (pos < input.size() && isdigit(static_cast<unsigned char>(input[pos])))
pos++;
if (pos < input.size() && input[pos] == '.') {
pos++;
while (pos < input.size() && isdigit(static_cast<unsigned char>(input[pos])))
pos++;
}
std::string num = input.substr(start, pos - start);
tokens.push_back({Token::FLOAT, num, line, col});
col += num.size();
}
void Lexer::tokenize_uri() {
size_t start = pos;
while (pos < input.size() && !isspace(static_cast<unsigned char>(input[pos])) &&
input[pos] != ')' && input[pos] != ']' && input[pos] != ';') {
pos++;
}
std::string uri = input.substr(start, pos - start);
tokens.push_back({Token::URI, uri, line, col});
col += uri.size();
}
void Lexer::tokenize_ident() {
size_t start = pos;
// Note: Don't include '.' here - it's used for selection (a.b.c)
// URIs are handled separately by checking for '://' pattern
while (pos < input.size() && (isalnum(static_cast<unsigned char>(input[pos])) ||
input[pos] == '_' || input[pos] == '-'))
pos++;
std::string ident = input.substr(start, pos - start);
// Defensive URI check (contains ://). The loop above never consumes ':' or '/',
// so this branch cannot match for text scanned here.
size_t scheme_end = ident.find("://");
if (scheme_end != std::string::npos && scheme_end > 0) {
tokens.push_back({Token::URI, ident, line, col});
col += ident.size();
return;
}
Token::Type type = Token::IDENT;
if (ident == "let")
type = Token::LET;
else if (ident == "in")
type = Token::IN;
else if (ident == "rec")
type = Token::REC;
else if (ident == "if")
type = Token::IF;
else if (ident == "then")
type = Token::THEN;
else if (ident == "else")
type = Token::ELSE;
else if (ident == "assert")
type = Token::ASSERT;
else if (ident == "with")
type = Token::WITH;
else if (ident == "inherit")
type = Token::INHERIT;
else if (ident == "import")
type = Token::IMPORT;
else if (ident == "true")
type = Token::BOOL;
else if (ident == "false")
type = Token::BOOL;
tokens.push_back({type, ident, line, col});
col += ident.size();
}
} // namespace nix_irc
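
The common-indentation step in `strip_indentation` above can be tried in isolation. The following is a minimal sketch, not the code from this diff: `strip_common_indent` is a hypothetical name, and the `\x1F\x1F` preserved-whitespace marker is deliberately left out. Like the original, it counts whitespace-only lines as non-empty when computing the minimum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the diff): removes the smallest leading
// run of spaces/tabs shared by all non-empty lines.
std::string strip_common_indent(const std::string& text) {
  // Split on '\n'; a trailing newline yields a final empty line.
  std::vector<std::string> lines;
  std::string current;
  for (char c : text) {
    if (c == '\n') {
      lines.push_back(current);
      current.clear();
    } else {
      current += c;
    }
  }
  lines.push_back(current);

  // Minimum indentation over non-empty lines.
  std::size_t min_indent = std::string::npos;
  for (const auto& line : lines) {
    if (line.empty())
      continue;
    std::size_t indent = line.find_first_not_of(" \t");
    if (indent == std::string::npos)
      indent = line.size(); // whitespace-only line
    min_indent = std::min(min_indent, indent);
  }
  if (min_indent == std::string::npos)
    min_indent = 0;

  // Rebuild the text, dropping min_indent characters from non-empty lines.
  std::string result;
  for (std::size_t i = 0; i < lines.size(); i++) {
    if (!lines[i].empty()) {
      // Holds because min_indent is a minimum taken over exactly these lines.
      assert(lines[i].size() >= min_indent);
      result += lines[i].substr(min_indent);
    }
    if (i + 1 < lines.size())
      result += '\n';
  }
  return result;
}
```

Only the shared prefix is removed, so relative indentation between lines survives: a line indented four spaces inside a block indented two keeps two spaces.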

src/irc/lexer.h (new file, 94 lines)

@ -0,0 +1,94 @@
#pragma once
#include <string>
#include <vector>
namespace nix_irc {
struct Token {
enum Type {
LPAREN,
RPAREN,
LBRACE,
RBRACE,
LBRACKET,
RBRACKET,
IDENT,
STRING,
STRING_INTERP,
INDENTED_STRING,
INDENTED_STRING_INTERP,
PATH,
LOOKUP_PATH,
INT,
FLOAT,
URI,
BOOL,
LET,
IN,
REC,
IF,
THEN,
ELSE,
ASSERT,
WITH,
INHERIT,
IMPORT,
DOT,
SEMICOLON,
COLON,
EQUALS,
AT,
COMMA,
QUESTION,
ELLIPSIS,
// Operators
PLUS,
MINUS,
STAR,
SLASH,
CONCAT,
MERGE,
EQEQ,
NE,
LT,
GT,
LE,
GE,
AND,
OR,
IMPL,
NOT,
EOF_
} type;
std::string value;
size_t line;
size_t col;
};
class Lexer {
public:
explicit Lexer(std::string input);
std::vector<Token> tokenize();
private:
std::vector<Token> tokens;
std::string input;
size_t pos;
size_t line;
size_t col;
void emit(const Token& t);
void skip_whitespace();
void tokenize_string();
void tokenize_indented_string();
std::string strip_indentation(const std::string& s);
void tokenize_path();
void tokenize_home_path();
void tokenize_int();
void tokenize_float();
void tokenize_uri();
void tokenize_ident();
};
} // namespace nix_irc


@ -1,150 +1,297 @@
-#include <iostream>
-#include "parser.h"
-#include "resolver.h"
#include "ir_gen.h"
+#include "parser.h"
#include "serializer.h"
+#include <cctype>
+#include <cstring>
+#include <filesystem>
+#include <iostream>
+#include <stdexcept>
#include <string>
#include <vector>
-#include <cstring>
namespace nix_irc {
+namespace fs = std::filesystem;
void print_usage(const char* prog) {
-std::cout << "Usage: " << prog << " [options] <input.nix> [output.nixir]\n"
+std::cout << "Usage: " << prog << " [options] <input.nix|flake#attr> [output.nixir]\n"
<< "\nOptions:\n"
<< " -I <path> Add search path for imports\n"
<< " --no-imports Disable import resolution\n"
<< " --help Show this help\n";
}
static bool is_flake_reference(const std::string& input) {
return input.find('#') != std::string::npos;
}
static std::string sanitize_output_stem(const std::string& input) {
std::string stem;
stem.reserve(input.size());
for (char ch : input) {
if (std::isalnum(static_cast<unsigned char>(ch))) {
stem.push_back(ch);
} else if (stem.empty() || stem.back() != '-') {
stem.push_back('-');
}
}
while (!stem.empty() && stem.back() == '-') {
stem.pop_back();
}
return stem.empty() ? "bundle" : stem;
}
static std::string default_output_path_for(const std::string& input) {
if (!is_flake_reference(input)) {
return input + "ir";
}
return sanitize_output_stem(input) + ".nixir";
}
static std::string normalize_local_flake_path(const std::string& raw_path) {
fs::path path = raw_path.empty() ? fs::current_path() : fs::path(raw_path);
fs::path absolute = path.is_absolute() ? path : fs::absolute(path);
fs::path normalized = absolute.lexically_normal();
if (!fs::exists(normalized)) {
throw std::runtime_error("Flake path does not exist: " + normalized.string());
}
if (fs::is_directory(normalized) && !fs::exists(normalized / "flake.nix")) {
throw std::runtime_error("Flake directory does not contain flake.nix: " + normalized.string());
}
return normalized.string();
}
static std::string normalize_flake_ref_source(const std::string& ref) {
if (ref.empty()) {
return normalize_local_flake_path(".");
}
if (ref.rfind("path:", 0) == 0) {
return "path:" + normalize_local_flake_path(ref.substr(5));
}
if (ref[0] == '.' || ref[0] == '/') {
return normalize_local_flake_path(ref);
}
if (fs::exists(ref)) {
return normalize_local_flake_path(ref);
}
return ref;
}
static std::vector<std::string> parse_flake_attr_path(const std::string& raw_attr_path) {
if (raw_attr_path.empty()) {
throw std::runtime_error("Flake reference is missing an attribute path after '#'");
}
std::vector<std::string> segments;
std::string current;
bool in_quotes = false;
bool escaping = false;
for (char ch : raw_attr_path) {
if (escaping) {
current.push_back(ch);
escaping = false;
continue;
}
if (in_quotes) {
if (ch == '\\') {
escaping = true;
} else if (ch == '"') {
in_quotes = false;
} else {
current.push_back(ch);
}
continue;
}
if (ch == '"') {
in_quotes = true;
} else if (ch == '.') {
if (current.empty()) {
throw std::runtime_error("Flake attribute path contains an empty segment");
}
segments.push_back(current);
current.clear();
} else {
current.push_back(ch);
}
}
if (escaping || in_quotes) {
throw std::runtime_error("Unterminated quoted segment in flake attribute path");
}
if (current.empty()) {
throw std::runtime_error("Flake attribute path contains an empty segment");
}
segments.push_back(current);
return segments;
}
static std::shared_ptr<Node> build_flake_ref_ast(const std::string& input) {
size_t hash_pos = input.find('#');
if (hash_pos == std::string::npos) {
throw std::runtime_error("Not a flake reference: " + input);
}
std::string flake_source = normalize_flake_ref_source(input.substr(0, hash_pos));
auto attr_path = parse_flake_attr_path(input.substr(hash_pos + 1));
auto expr = std::make_shared<Node>(BuiltinCallNode(
"getFlake",
std::vector<std::shared_ptr<Node>>{std::make_shared<Node>(ConstStringNode(flake_source))}));
for (const auto& attr : attr_path) {
expr = std::make_shared<Node>(SelectNode(expr, std::make_shared<Node>(ConstStringNode(attr))));
}
return expr;
}
int run_compile(int argc, char** argv) {
std::string input_file;
std::string output_file;
std::vector<std::string> search_paths;
bool resolve_imports = true;
int i = 1;
while (i < argc) {
std::string arg = argv[i];
if (arg == "-I") {
if (i + 1 >= argc) {
std::cerr << "Error: -I requires a path argument\n";
return 1;
}
search_paths.push_back(argv[++i]);
} else if (arg == "--no-imports") {
resolve_imports = false;
} else if (arg == "--help" || arg == "-h") {
print_usage(argv[0]);
return 0;
} else if (arg[0] != '-') {
input_file = arg;
if (i + 1 < argc && argv[i + 1][0] != '-') {
output_file = argv[++i];
}
} else {
std::cerr << "Unknown option: " << arg << "\n";
print_usage(argv[0]);
return 1;
}
i++;
}
if (input_file.empty()) {
std::cerr << "Error: No input file specified\n";
print_usage(argv[0]);
return 1;
}
if (output_file.empty()) {
-output_file = input_file + "r";
+output_file = default_output_path_for(input_file);
}
try {
Parser parser;
-Resolver resolver;
-for (const auto& path : search_paths) {
-resolver.add_search_path(path);
-}
-std::cout << "Parsing: " << input_file << "\n";
-auto ast = parser.parse_file(input_file);
+(void) search_paths;
+(void) resolve_imports;
+std::shared_ptr<Node> ast;
+if (is_flake_reference(input_file)) {
+std::cout << "Compiling flake reference: " << input_file << "\n";
+ast = build_flake_ref_ast(input_file);
+} else {
+std::cout << "Parsing: " << input_file << "\n";
+ast = parser.parse_file(input_file);
+}
if (!ast) {
std::cerr << "Error: Failed to parse input\n";
return 1;
}
std::cout << "Resolving imports...\n";
IRGenerator ir_gen;
std::cout << "Generating IR...\n";
auto ir = ir_gen.generate(ast);
IRModule module;
module.version = IR_VERSION;
module.entry = ir;
std::cout << "Serializing to: " << output_file << "\n";
Serializer serializer;
serializer.serialize(module, output_file);
std::cout << "Done!\n";
return 0;
} catch (const std::exception& e) {
std::cerr << "Error: " << e.what() << "\n";
return 1;
}
}
void print_decompile_usage(const char* prog) {
std::cout << "Usage: " << prog << " decompile <input.nixir>\n";
}
int run_decompile(int argc, char** argv) {
if (argc < 3) {
print_decompile_usage(argv[0]);
return 1;
}
std::string input_file = argv[2];
try {
Deserializer deserializer;
auto module = deserializer.deserialize(input_file);
std::cout << "IR Version: " << module.version << "\n";
std::cout << "Sources: " << module.sources.size() << "\n";
std::cout << "Imports: " << module.imports.size() << "\n";
return 0;
} catch (const std::exception& e) {
std::cerr << "Error: " << e.what() << "\n";
return 1;
}
}
-}
+} // namespace nix_irc
int main(int argc, char** argv) {
if (argc < 2) {
nix_irc::print_usage(argv[0]);
return 1;
}
std::string cmd = argv[1];
if (cmd == "compile" || cmd == "c") {
return nix_irc::run_compile(argc - 1, argv + 1);
} else if (cmd == "decompile" || cmd == "d") {
return nix_irc::run_decompile(argc, argv);
} else if (cmd == "help" || cmd == "--help" || cmd == "-h") {
nix_irc::print_usage(argv[0]);
return 0;
} else {
return nix_irc::run_compile(argc, argv);
}
}
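
The attribute-path splitting used for `flake#attr` inputs is easier to follow with concrete inputs. Below is a minimal standalone sketch that mirrors the rules of `parse_flake_attr_path` as shown in this diff: dots separate segments, double quotes protect dots, and a backslash inside quotes escapes the next character. The name `split_attr_path` and the shortened error messages are assumptions made for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical standalone restatement of the splitting rules; not the
// function from the diff itself.
std::vector<std::string> split_attr_path(const std::string& raw) {
  std::vector<std::string> segments;
  std::string current;
  bool in_quotes = false;
  bool escaping = false;
  for (char ch : raw) {
    if (escaping) {
      // Character after a backslash inside quotes is taken literally.
      current.push_back(ch);
      escaping = false;
      continue;
    }
    if (in_quotes) {
      if (ch == '\\')
        escaping = true;
      else if (ch == '"')
        in_quotes = false;
      else
        current.push_back(ch);
      continue;
    }
    if (ch == '"') {
      in_quotes = true;
    } else if (ch == '.') {
      // An unquoted dot ends the current segment.
      if (current.empty())
        throw std::runtime_error("empty segment");
      segments.push_back(current);
      current.clear();
    } else {
      current.push_back(ch);
    }
  }
  if (escaping || in_quotes)
    throw std::runtime_error("unterminated quoted segment");
  if (current.empty())
    throw std::runtime_error("empty segment");
  segments.push_back(current);
  assert(!segments.empty()); // at least one segment on every non-throwing path
  return segments;
}
```

With these rules, `packages.x86_64-linux.default` yields three segments, and `a."b.c".d` keeps `b.c` together as one segment.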

File diff suppressed because it is too large


@ -2,24 +2,24 @@
#define NIX_IRC_PARSER_H
#include "types.h"
-#include <string>
#include <memory>
+#include <string>
namespace nix_irc {
class Parser {
public:
Parser();
~Parser();
std::shared_ptr<Node> parse(const std::string& source, const std::string& path = "<stdin>");
std::shared_ptr<Node> parse_file(const std::string& path);
private:
struct Impl;
std::unique_ptr<Impl> pImpl;
};
-}
+} // namespace nix_irc
#endif


@ -1,111 +1,114 @@
#include "resolver.h"
#include "parser.h"
-#include <iostream>
-#include <fstream>
-#include <sstream>
#include <filesystem>
+#include <fstream>
+#include <iostream>
#include <regex>
+#include <sstream>
namespace nix_irc {
namespace fs = std::filesystem;
struct Resolver::Impl {
ResolverConfig config;
std::vector<std::pair<std::string, std::string>> resolved_imports;
std::unordered_set<std::string> visited;
Parser parser;
Impl(const ResolverConfig& cfg) : config(cfg) {}
std::string resolve_path(const std::string& path, const std::string& from_file) {
fs::path p(path);
if (p.is_absolute()) {
if (fs::exists(p))
return path;
return "";
}
fs::path from_dir = fs::path(from_file).parent_path();
fs::path candidate = from_dir / p;
if (fs::exists(candidate))
return candidate.string();
for (const auto& search : config.search_paths) {
candidate = fs::path(search) / p;
if (fs::exists(candidate))
return candidate.string();
}
return "";
}
ImportResult do_resolve(const std::string& path, const std::string& from_file) {
std::string resolved = resolve_path(path, from_file);
if (resolved.empty()) {
return {false, "", "Cannot find file: " + path, nullptr};
}
if (visited.count(resolved)) {
return {true, resolved, "", nullptr};
}
visited.insert(resolved);
try {
auto ast = parser.parse_file(resolved);
return {true, resolved, "", ast};
} catch (const std::exception& e) {
return {false, "", e.what(), nullptr};
}
}
};
Resolver::Resolver(const ResolverConfig& config) : pImpl(std::make_unique<Impl>(config)) {}
Resolver::~Resolver() = default;
void Resolver::add_search_path(const std::string& path) {
pImpl->config.search_paths.push_back(path);
}
void Resolver::set_search_paths(const std::vector<std::string>& paths) {
pImpl->config.search_paths = paths;
}
ImportResult Resolver::resolve_import(const std::string& path, const std::string& from_file) {
auto result = pImpl->do_resolve(path, from_file);
if (result.success && result.ast) {
pImpl->resolved_imports.push_back({path, result.path});
}
return result;
}
ImportResult Resolver::resolve_import(const Node& import_node, const std::string& from_file) {
const ConstPathNode* path_node = import_node.get_if<ConstPathNode>();
if (!path_node) {
return {false, "", "Dynamic import not supported", nullptr};
}
return resolve_import(path_node->value, from_file);
}
std::vector<std::string> Resolver::get_resolved_files() const {
std::vector<std::string> files;
for (const auto& [orig, resolved] : pImpl->resolved_imports) {
(void) orig;
files.push_back(resolved);
}
return files;
}
std::vector<std::pair<std::string, std::string>> Resolver::get_imports() const {
return pImpl->resolved_imports;
}
bool is_static_import(const Node& node) {
return node.holds<ConstPathNode>();
}
std::string normalize_path(const std::string& path) {
fs::path p(path);
return fs::absolute(p).string();
}
-}
+} // namespace nix_irc
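
The lookup order in `resolve_path` above is: an absolute path is used as-is, then the directory of the importing file is tried, then each search path in order. A minimal sketch of that order follows. `resolve_import_path` is a hypothetical free function, and the injected `exists` predicate is an assumption added so the order can be exercised without touching the filesystem. The test values assume POSIX-style paths.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical free-function variant of the lookup order; `exists` stands in
// for fs::exists so callers can supply a fake filesystem.
std::string resolve_import_path(const std::string& path, const std::string& from_file,
                                const std::vector<std::string>& search_paths,
                                const std::function<bool(const fs::path&)>& exists) {
  assert(static_cast<bool>(exists) && "exists predicate must be callable");
  fs::path p(path);
  // 1. Absolute paths are never searched for; they either exist or not.
  if (p.is_absolute()) {
    if (exists(p))
      return path;
    return "";
  }
  // 2. Relative to the directory of the importing file.
  fs::path candidate = fs::path(from_file).parent_path() / p;
  if (exists(candidate))
    return candidate.string();
  // 3. Each search path, in the order given.
  for (const auto& search : search_paths) {
    candidate = fs::path(search) / p;
    if (exists(candidate))
      return candidate.string();
  }
  return ""; // empty string signals "not found", as in the original
}
```

Passing `fs::exists` wrapped in a lambda as the predicate would give the on-disk behavior; a fake predicate makes the precedence between the importing directory and the search paths easy to check.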


@ -2,47 +2,47 @@
#define NIX_IRC_RESOLVER_H
#include "types.h"
-#include <string>
-#include <vector>
-#include <unordered_set>
#include <filesystem>
+#include <string>
+#include <unordered_set>
+#include <vector>
namespace nix_irc {
struct ImportResult {
bool success;
std::string path;
std::string error;
std::shared_ptr<Node> ast;
};
struct ResolverConfig {
std::vector<std::string> search_paths;
bool resolve_imports = true;
};
class Resolver {
public:
Resolver(const ResolverConfig& config = {});
~Resolver();
void add_search_path(const std::string& path);
void set_search_paths(const std::vector<std::string>& paths);
ImportResult resolve_import(const std::string& path, const std::string& from_file);
ImportResult resolve_import(const Node& import_node, const std::string& from_file);
std::vector<std::string> get_resolved_files() const;
std::vector<std::pair<std::string, std::string>> get_imports() const;
private:
struct Impl;
std::unique_ptr<Impl> pImpl;
};
bool is_static_import(const Node& node);
std::string normalize_path(const std::string& path);
-}
+} // namespace nix_irc
#endif


@ -1,392 +1,632 @@
#include "serializer.h"
#include <cstring>
-#include <sstream>
#include <iostream>
namespace nix_irc {
struct Serializer::Impl {
std::vector<uint8_t> buffer;
void write_u32(uint32_t val) {
buffer.push_back((val >> 0) & 0xFF);
buffer.push_back((val >> 8) & 0xFF);
buffer.push_back((val >> 16) & 0xFF);
buffer.push_back((val >> 24) & 0xFF);
}
void write_u64(uint64_t val) {
for (int i = 0; i < 8; i++) {
buffer.push_back((val >> (i * 8)) & 0xFF);
} }
}
void write_u64(uint64_t val) { void write_u8(uint8_t val) { buffer.push_back(val); }
for (int i = 0; i < 8; i++) {
buffer.push_back((val >> (i * 8)) & 0xFF); void write_string(const std::string& str) {
write_u32(str.size());
buffer.insert(buffer.end(), str.begin(), str.end());
}
NodeType get_node_type(const Node& node) {
if (node.holds<ConstIntNode>())
return NodeType::CONST_INT;
if (node.holds<ConstFloatNode>())
return NodeType::CONST_FLOAT;
if (node.holds<ConstStringNode>())
return NodeType::CONST_STRING;
if (node.holds<ConstPathNode>())
return NodeType::CONST_PATH;
if (node.holds<ConstBoolNode>())
return NodeType::CONST_BOOL;
if (node.holds<ConstNullNode>())
return NodeType::CONST_NULL;
if (node.holds<ConstURINode>())
return NodeType::CONST_URI;
if (node.holds<ConstLookupPathNode>())
return NodeType::CONST_LOOKUP_PATH;
if (node.holds<VarNode>())
return NodeType::VAR;
if (node.holds<LambdaNode>())
return NodeType::LAMBDA;
if (node.holds<AppNode>())
return NodeType::APP;
if (node.holds<BinaryOpNode>())
return NodeType::BINARY_OP;
if (node.holds<UnaryOpNode>())
return NodeType::UNARY_OP;
if (node.holds<ImportNode>())
return NodeType::IMPORT;
if (node.holds<AttrsetNode>())
return NodeType::ATTRSET;
if (node.holds<SelectNode>())
return NodeType::SELECT;
if (node.holds<HasAttrNode>())
return NodeType::HAS_ATTR;
if (node.holds<WithNode>())
return NodeType::WITH;
if (node.holds<ListNode>())
return NodeType::LIST;
if (node.holds<IfNode>())
return NodeType::IF;
if (node.holds<LetNode>())
return NodeType::LET;
if (node.holds<LetRecNode>())
return NodeType::LETREC;
if (node.holds<AssertNode>())
return NodeType::ASSERT;
if (node.holds<LambdaPatternNode>())
return NodeType::LAMBDA_PATTERN;
if (node.holds<StringInterpolationNode>())
return NodeType::STRING_INTERPOLATION;
if (node.holds<BuiltinCallNode>())
return NodeType::BUILTIN_CALL;
return NodeType::ERROR;
}
uint32_t get_node_line(const Node& node) {
return std::visit([](const auto& n) { return n.line; }, node.data);
}
void write_node(const Node& node) {
write_u8(static_cast<uint8_t>(get_node_type(node)));
write_u32(get_node_line(node));
if (auto* n = node.get_if<ConstIntNode>()) {
write_u64(static_cast<uint64_t>(n->value));
} else if (auto* n = node.get_if<ConstFloatNode>()) {
double val = n->value;
uint64_t bits = 0;
std::memcpy(&bits, &val, sizeof(bits));
write_u64(bits);
} else if (auto* n = node.get_if<ConstStringNode>()) {
write_string(n->value);
} else if (auto* n = node.get_if<ConstPathNode>()) {
write_string(n->value);
} else if (auto* n = node.get_if<ConstBoolNode>()) {
write_u8(n->value ? 1 : 0);
} else if (auto* n = node.get_if<ConstNullNode>()) {
// No data for null
} else if (auto* n = node.get_if<ConstURINode>()) {
write_string(n->value);
} else if (auto* n = node.get_if<ConstLookupPathNode>()) {
write_string(n->value);
} else if (auto* n = node.get_if<VarNode>()) {
write_u32(n->index);
} else if (auto* n = node.get_if<LambdaNode>()) {
write_u32(n->arity);
if (n->body)
write_node(*n->body);
} else if (auto* n = node.get_if<AppNode>()) {
if (n->func)
write_node(*n->func);
if (n->arg)
write_node(*n->arg);
} else if (auto* n = node.get_if<BinaryOpNode>()) {
write_u8(static_cast<uint8_t>(n->op));
if (n->left)
write_node(*n->left);
if (n->right)
write_node(*n->right);
} else if (auto* n = node.get_if<UnaryOpNode>()) {
write_u8(static_cast<uint8_t>(n->op));
if (n->operand)
write_node(*n->operand);
} else if (auto* n = node.get_if<ImportNode>()) {
if (n->path)
write_node(*n->path);
} else if (auto* n = node.get_if<AttrsetNode>()) {
write_u8(n->recursive ? 1 : 0);
write_u32(n->attrs.size());
for (const auto& binding : n->attrs) {
if (binding.is_dynamic()) {
write_u8(1); // Dynamic flag
write_node(*binding.dynamic_name);
} else {
write_u8(0); // Static flag
write_string(binding.static_name.value());
} }
} if (binding.value)
write_node(*binding.value);
}
} else if (auto* n = node.get_if<SelectNode>()) {
if (n->expr)
write_node(*n->expr);
if (n->attr)
write_node(*n->attr);
if (n->default_expr && *n->default_expr) {
write_u8(1);
write_node(**n->default_expr);
} else {
write_u8(0);
}
} else if (auto* n = node.get_if<HasAttrNode>()) {
if (n->expr)
write_node(*n->expr);
if (n->attr)
write_node(*n->attr);
} else if (auto* n = node.get_if<WithNode>()) {
if (n->attrs)
write_node(*n->attrs);
if (n->body)
write_node(*n->body);
} else if (auto* n = node.get_if<ListNode>()) {
write_u32(n->elements.size());
for (const auto& elem : n->elements) {
if (elem)
write_node(*elem);
}
} else if (auto* n = node.get_if<IfNode>()) {
if (n->cond)
write_node(*n->cond);
if (n->then_branch)
write_node(*n->then_branch);
if (n->else_branch)
write_node(*n->else_branch);
} else if (auto* n = node.get_if<LetNode>()) {
write_u32(n->bindings.size());
for (const auto& [key, val] : n->bindings) {
write_string(key);
if (val)
write_node(*val);
}
if (n->body)
write_node(*n->body);
} else if (auto* n = node.get_if<LetRecNode>()) {
write_u32(n->bindings.size());
for (const auto& [key, val] : n->bindings) {
write_string(key);
if (val)
write_node(*val);
}
if (n->body)
write_node(*n->body);
} else if (auto* n = node.get_if<AssertNode>()) {
if (n->cond)
write_node(*n->cond);
if (n->body)
write_node(*n->body);
} else if (auto* n = node.get_if<LambdaPatternNode>()) {
// Required fields
write_u32(n->required_fields.size());
for (const auto& field : n->required_fields) {
write_string(field.name);
write_u8(0); // No default
}
// Optional fields
write_u32(n->optional_fields.size());
for (const auto& field : n->optional_fields) {
write_string(field.name);
if (field.default_value && *field.default_value) {
write_u8(1);
write_node(**field.default_value);
} else {
write_u8(0);
void write_u8(uint8_t val) {
buffer.push_back(val);
}
void write_string(const std::string& str) {
write_u32(str.size());
buffer.insert(buffer.end(), str.begin(), str.end());
}
NodeType get_node_type(const Node& node) {
if (node.holds<ConstIntNode>()) return NodeType::CONST_INT;
if (node.holds<ConstStringNode>()) return NodeType::CONST_STRING;
if (node.holds<ConstPathNode>()) return NodeType::CONST_PATH;
if (node.holds<ConstBoolNode>()) return NodeType::CONST_BOOL;
if (node.holds<ConstNullNode>()) return NodeType::CONST_NULL;
if (node.holds<VarNode>()) return NodeType::VAR;
if (node.holds<LambdaNode>()) return NodeType::LAMBDA;
if (node.holds<AppNode>()) return NodeType::APP;
if (node.holds<BinaryOpNode>()) return NodeType::BINARY_OP;
if (node.holds<UnaryOpNode>()) return NodeType::UNARY_OP;
if (node.holds<AttrsetNode>()) return NodeType::ATTRSET;
if (node.holds<SelectNode>()) return NodeType::SELECT;
if (node.holds<HasAttrNode>()) return NodeType::HAS_ATTR;
if (node.holds<WithNode>()) return NodeType::WITH;
if (node.holds<IfNode>()) return NodeType::IF;
if (node.holds<LetNode>()) return NodeType::LET;
if (node.holds<LetRecNode>()) return NodeType::LETREC;
if (node.holds<AssertNode>()) return NodeType::ASSERT;
return NodeType::ERROR;
}
uint32_t get_node_line(const Node& node) {
return std::visit([](const auto& n) { return n.line; }, node.data);
}
void write_node(const Node& node) {
write_u8(static_cast<uint8_t>(get_node_type(node)));
write_u32(get_node_line(node));
if (auto* n = node.get_if<ConstIntNode>()) {
write_u64(static_cast<uint64_t>(n->value));
} else if (auto* n = node.get_if<ConstStringNode>()) {
write_string(n->value);
} else if (auto* n = node.get_if<ConstPathNode>()) {
write_string(n->value);
} else if (auto* n = node.get_if<ConstBoolNode>()) {
write_u8(n->value ? 1 : 0);
} else if (auto* n = node.get_if<ConstNullNode>()) {
// No data for null
} else if (auto* n = node.get_if<VarNode>()) {
write_u32(n->index);
} else if (auto* n = node.get_if<LambdaNode>()) {
write_u32(n->arity);
if (n->body) write_node(*n->body);
} else if (auto* n = node.get_if<AppNode>()) {
if (n->func) write_node(*n->func);
if (n->arg) write_node(*n->arg);
} else if (auto* n = node.get_if<BinaryOpNode>()) {
write_u8(static_cast<uint8_t>(n->op));
if (n->left) write_node(*n->left);
if (n->right) write_node(*n->right);
} else if (auto* n = node.get_if<UnaryOpNode>()) {
write_u8(static_cast<uint8_t>(n->op));
if (n->operand) write_node(*n->operand);
} else if (auto* n = node.get_if<AttrsetNode>()) {
write_u8(n->recursive ? 1 : 0);
write_u32(n->attrs.size());
for (const auto& [key, val] : n->attrs) {
write_string(key);
if (val) write_node(*val);
}
} else if (auto* n = node.get_if<SelectNode>()) {
if (n->expr) write_node(*n->expr);
if (n->attr) write_node(*n->attr);
if (n->default_expr && *n->default_expr) {
write_u8(1);
write_node(**n->default_expr);
} else {
write_u8(0);
}
} else if (auto* n = node.get_if<HasAttrNode>()) {
if (n->expr) write_node(*n->expr);
if (n->attr) write_node(*n->attr);
} else if (auto* n = node.get_if<WithNode>()) {
if (n->attrs) write_node(*n->attrs);
if (n->body) write_node(*n->body);
} else if (auto* n = node.get_if<IfNode>()) {
if (n->cond) write_node(*n->cond);
if (n->then_branch) write_node(*n->then_branch);
if (n->else_branch) write_node(*n->else_branch);
} else if (auto* n = node.get_if<LetNode>()) {
write_u32(n->bindings.size());
for (const auto& [key, val] : n->bindings) {
write_string(key);
if (val) write_node(*val);
}
if (n->body) write_node(*n->body);
} else if (auto* n = node.get_if<LetRecNode>()) {
write_u32(n->bindings.size());
for (const auto& [key, val] : n->bindings) {
write_string(key);
if (val) write_node(*val);
}
if (n->body) write_node(*n->body);
} else if (auto* n = node.get_if<AssertNode>()) {
if (n->cond) write_node(*n->cond);
if (n->body) write_node(*n->body);
} }
}
// At-binding
if (n->at_binding) {
write_u8(1);
write_string(*n->at_binding);
} else {
write_u8(0);
}
// Allow extra
write_u8(n->allow_extra ? 1 : 0);
// Body
if (n->body)
write_node(*n->body);
} else if (auto* n = node.get_if<StringInterpolationNode>()) {
write_u32(n->parts.size());
for (const auto& part : n->parts) {
write_u8(static_cast<uint8_t>(part.type));
if (part.type == StringPart::Type::LITERAL) {
write_string(part.literal);
} else { // EXPR
if (part.expr)
write_node(*part.expr);
}
}
} else if (auto* n = node.get_if<BuiltinCallNode>()) {
write_string(n->builtin_name);
write_u32(n->args.size());
for (const auto& arg : n->args) {
if (arg)
write_node(*arg);
}
} }
}
};
Serializer::Serializer() : pImpl(std::make_unique<Impl>()) {}
Serializer::~Serializer() = default;
void Serializer::serialize(const IRModule& module, const std::string& path) {
auto bytes = serialize_to_bytes(module);
std::ofstream out(path, std::ios::binary);
out.write(reinterpret_cast<const char*>(bytes.data()), bytes.size());
}
std::vector<uint8_t> Serializer::serialize_to_bytes(const IRModule& module) {
pImpl->buffer.clear();
pImpl->write_u32(IR_MAGIC);
pImpl->write_u32(IR_VERSION);
pImpl->write_u32(module.sources.size());
for (const auto& src : module.sources) {
pImpl->write_string(src.path);
pImpl->write_string(src.content);
}
pImpl->write_u32(module.imports.size());
for (const auto& [from, to] : module.imports) {
pImpl->write_string(from);
pImpl->write_string(to);
}
pImpl->write_u32(module.string_table.size());
for (const auto& [str, id] : module.string_table) {
pImpl->write_string(str);
pImpl->write_u32(id);
}
if (module.entry && module.entry != nullptr) {
pImpl->write_u8(1);
pImpl->write_node(*module.entry);
} else {
pImpl->write_u8(0);
}
return pImpl->buffer;
}
struct Deserializer::Impl {
std::vector<uint8_t> buffer;
size_t pos = 0;
uint32_t read_u32() {
uint32_t val = 0;
val |= buffer[pos + 0];
val |= (uint32_t) buffer[pos + 1] << 8;
val |= (uint32_t) buffer[pos + 2] << 16;
val |= (uint32_t) buffer[pos + 3] << 24;
pos += 4;
return val;
}
uint64_t read_u64() {
uint64_t val = 0;
for (int i = 0; i < 8; i++) {
val |= (uint64_t) buffer[pos + i] << (i * 8);
}
pos += 8;
return val;
}
uint8_t read_u8() { return buffer[pos++]; }
std::string read_string() {
uint32_t len = read_u32();
std::string str(reinterpret_cast<const char*>(&buffer[pos]), len);
pos += len;
return str;
}
std::shared_ptr<Node> read_node() {
NodeType type = static_cast<NodeType>(read_u8());
uint32_t line = read_u32();
switch (type) {
case NodeType::CONST_INT: {
int64_t val = static_cast<int64_t>(read_u64());
return std::make_shared<Node>(ConstIntNode(val, line));
}
case NodeType::CONST_FLOAT: {
uint64_t bits = read_u64();
double val = 0.0;
std::memcpy(&val, &bits, sizeof(val));
return std::make_shared<Node>(ConstFloatNode(val, line));
}
case NodeType::CONST_STRING: {
std::string val = read_string();
return std::make_shared<Node>(ConstStringNode(val, line));
}
case NodeType::CONST_PATH: {
std::string val = read_string();
return std::make_shared<Node>(ConstPathNode(val, line));
}
case NodeType::CONST_BOOL: {
bool val = read_u8() != 0;
return std::make_shared<Node>(ConstBoolNode(val, line));
}
case NodeType::CONST_NULL:
return std::make_shared<Node>(ConstNullNode(line));
case NodeType::CONST_URI: {
std::string val = read_string();
return std::make_shared<Node>(ConstURINode(val, line));
}
case NodeType::CONST_LOOKUP_PATH: {
std::string val = read_string();
return std::make_shared<Node>(ConstLookupPathNode(val, line));
}
case NodeType::BUILTIN_CALL: {
std::string builtin_name = read_string();
uint32_t num_args = read_u32();
std::vector<std::shared_ptr<Node>> args;
args.reserve(num_args);
for (uint32_t i = 0; i < num_args; i++) {
args.push_back(read_node());
}
return std::make_shared<Node>(
BuiltinCallNode(std::move(builtin_name), std::move(args), line));
}
case NodeType::VAR: {
uint32_t index = read_u32();
return std::make_shared<Node>(VarNode(index, "", line));
}
case NodeType::LAMBDA: {
uint32_t arity = read_u32();
auto body = read_node();
return std::make_shared<Node>(LambdaNode(arity, body, line));
}
case NodeType::APP: {
auto func = read_node();
auto arg = read_node();
return std::make_shared<Node>(AppNode(func, arg, line));
}
case NodeType::BINARY_OP: {
BinaryOp op = static_cast<BinaryOp>(read_u8());
auto left = read_node();
auto right = read_node();
return std::make_shared<Node>(BinaryOpNode(op, left, right, line));
}
case NodeType::UNARY_OP: {
UnaryOp op = static_cast<UnaryOp>(read_u8());
auto operand = read_node();
return std::make_shared<Node>(UnaryOpNode(op, operand, line));
}
case NodeType::IMPORT: {
auto path = read_node();
return std::make_shared<Node>(ImportNode(path, line));
}
case NodeType::ATTRSET: {
bool recursive = read_u8() != 0;
uint32_t num_attrs = read_u32();
AttrsetNode attrs(recursive, line);
for (uint32_t i = 0; i < num_attrs; i++) {
uint8_t is_dynamic = read_u8();
if (is_dynamic) {
auto key_expr = read_node();
auto val = read_node();
attrs.attrs.push_back(AttrBinding(key_expr, val));
} else {
std::string key = read_string();
auto val = read_node();
attrs.attrs.push_back(AttrBinding(key, val));
}
}
return std::make_shared<Node>(std::move(attrs));
}
case NodeType::SELECT: {
auto expr = read_node();
auto attr = read_node();
uint8_t has_default = read_u8();
std::optional<std::shared_ptr<Node>> default_expr;
if (has_default) {
default_expr = read_node();
}
SelectNode select_node(expr, attr, line);
select_node.default_expr = default_expr;
return std::make_shared<Node>(std::move(select_node));
}
case NodeType::HAS_ATTR: {
auto expr = read_node();
auto attr = read_node();
return std::make_shared<Node>(HasAttrNode(expr, attr, line));
}
case NodeType::WITH: {
auto attrs = read_node();
auto body = read_node();
return std::make_shared<Node>(WithNode(attrs, body, line));
}
case NodeType::LIST: {
uint32_t num_elements = read_u32();
std::vector<std::shared_ptr<Node>> elements;
elements.reserve(num_elements);
for (uint32_t i = 0; i < num_elements; i++) {
elements.push_back(read_node());
}
return std::make_shared<Node>(ListNode(std::move(elements), line));
}
case NodeType::IF: {
auto cond = read_node();
auto then_branch = read_node();
auto else_branch = read_node();
return std::make_shared<Node>(IfNode(cond, then_branch, else_branch, line));
}
case NodeType::LET: {
uint32_t num_bindings = read_u32();
std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings;
for (uint32_t i = 0; i < num_bindings; i++) {
std::string key = read_string();
auto val = read_node();
bindings.push_back({key, val});
}
auto body = read_node();
LetNode let(body, line);
let.bindings = std::move(bindings);
return std::make_shared<Node>(std::move(let));
}
case NodeType::LETREC: {
uint32_t num_bindings = read_u32();
std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings;
for (uint32_t i = 0; i < num_bindings; i++) {
std::string key = read_string();
auto val = read_node();
bindings.push_back({key, val});
}
auto body = read_node();
LetRecNode letrec(body, line);
letrec.bindings = std::move(bindings);
return std::make_shared<Node>(std::move(letrec));
}
case NodeType::ASSERT: {
auto cond = read_node();
auto body = read_node();
return std::make_shared<Node>(AssertNode(cond, body, line));
}
case NodeType::LAMBDA_PATTERN: {
// Read required fields
uint32_t num_required = read_u32();
std::vector<PatternField> required_fields;
required_fields.reserve(num_required);
for (uint32_t i = 0; i < num_required; i++) {
std::string name = read_string();
read_u8(); // Discard has_default (always 0)
required_fields.emplace_back(name, std::nullopt);
}
// Read optional fields
uint32_t num_optional = read_u32();
std::vector<PatternField> optional_fields;
optional_fields.reserve(num_optional);
for (uint32_t i = 0; i < num_optional; i++) {
std::string name = read_string();
uint8_t has_default = read_u8();
std::optional<std::shared_ptr<Node>> default_val;
if (has_default) {
default_val = read_node();
std::shared_ptr<Node> read_node() {
NodeType type = static_cast<NodeType>(read_u8());
uint32_t line = read_u32();
switch (type) {
case NodeType::CONST_INT: {
int64_t val = static_cast<int64_t>(read_u64());
return std::make_shared<Node>(ConstIntNode(val, line));
}
case NodeType::CONST_STRING: {
std::string val = read_string();
return std::make_shared<Node>(ConstStringNode(val, line));
}
case NodeType::CONST_PATH: {
std::string val = read_string();
return std::make_shared<Node>(ConstPathNode(val, line));
}
case NodeType::CONST_BOOL: {
bool val = read_u8() != 0;
return std::make_shared<Node>(ConstBoolNode(val, line));
}
case NodeType::CONST_NULL:
return std::make_shared<Node>(ConstNullNode(line));
case NodeType::VAR: {
uint32_t index = read_u32();
return std::make_shared<Node>(VarNode(index, "", line));
}
case NodeType::LAMBDA: {
uint32_t arity = read_u32();
auto body = read_node();
return std::make_shared<Node>(LambdaNode(arity, body, line));
}
case NodeType::APP: {
auto func = read_node();
auto arg = read_node();
return std::make_shared<Node>(AppNode(func, arg, line));
}
case NodeType::BINARY_OP: {
BinaryOp op = static_cast<BinaryOp>(read_u8());
auto left = read_node();
auto right = read_node();
return std::make_shared<Node>(BinaryOpNode(op, left, right, line));
}
case NodeType::UNARY_OP: {
UnaryOp op = static_cast<UnaryOp>(read_u8());
auto operand = read_node();
return std::make_shared<Node>(UnaryOpNode(op, operand, line));
}
case NodeType::ATTRSET: {
bool recursive = read_u8() != 0;
uint32_t num_attrs = read_u32();
AttrsetNode attrs(recursive, line);
for (uint32_t i = 0; i < num_attrs; i++) {
std::string key = read_string();
auto val = read_node();
attrs.attrs.push_back({key, val});
}
return std::make_shared<Node>(std::move(attrs));
}
case NodeType::SELECT: {
auto expr = read_node();
auto attr = read_node();
uint8_t has_default = read_u8();
std::optional<std::shared_ptr<Node>> default_expr;
if (has_default) {
default_expr = read_node();
}
SelectNode select_node(expr, attr, line);
select_node.default_expr = default_expr;
return std::make_shared<Node>(std::move(select_node));
}
case NodeType::HAS_ATTR: {
auto expr = read_node();
auto attr = read_node();
return std::make_shared<Node>(HasAttrNode(expr, attr, line));
}
case NodeType::WITH: {
auto attrs = read_node();
auto body = read_node();
return std::make_shared<Node>(WithNode(attrs, body, line));
}
case NodeType::IF: {
auto cond = read_node();
auto then_branch = read_node();
auto else_branch = read_node();
return std::make_shared<Node>(IfNode(cond, then_branch, else_branch, line));
}
case NodeType::LET: {
uint32_t num_bindings = read_u32();
std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings;
for (uint32_t i = 0; i < num_bindings; i++) {
std::string key = read_string();
auto val = read_node();
bindings.push_back({key, val});
}
auto body = read_node();
LetNode let(body, line);
let.bindings = std::move(bindings);
return std::make_shared<Node>(std::move(let));
}
case NodeType::LETREC: {
uint32_t num_bindings = read_u32();
std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings;
for (uint32_t i = 0; i < num_bindings; i++) {
std::string key = read_string();
auto val = read_node();
bindings.push_back({key, val});
}
auto body = read_node();
LetRecNode letrec(body, line);
letrec.bindings = std::move(bindings);
return std::make_shared<Node>(std::move(letrec));
}
case NodeType::ASSERT: {
auto cond = read_node();
auto body = read_node();
return std::make_shared<Node>(AssertNode(cond, body, line));
}
default:
throw std::runtime_error("Unknown node type in IR");
} }
optional_fields.emplace_back(name, default_val);
}
// Read at-binding
std::optional<std::string> at_binding;
if (read_u8()) {
at_binding = read_string();
}
// Read allow_extra
bool allow_extra = read_u8() != 0;
// Read body
auto body = read_node();
// Construct node
LambdaPatternNode lambda_pattern(body, line);
lambda_pattern.required_fields = std::move(required_fields);
lambda_pattern.optional_fields = std::move(optional_fields);
lambda_pattern.at_binding = at_binding;
lambda_pattern.allow_extra = allow_extra;
return std::make_shared<Node>(std::move(lambda_pattern));
} }
case NodeType::STRING_INTERPOLATION: {
uint32_t num_parts = read_u32();
std::vector<StringPart> parts;
parts.reserve(num_parts);
for (uint32_t i = 0; i < num_parts; i++) {
uint8_t type_byte = read_u8();
StringPart::Type type = static_cast<StringPart::Type>(type_byte);
if (type == StringPart::Type::LITERAL) {
std::string literal = read_string();
parts.push_back(StringPart::make_literal(std::move(literal)));
} else { // EXPR
auto expr = read_node();
parts.push_back(StringPart::make_expr(expr));
}
}
return std::make_shared<Node>(StringInterpolationNode(std::move(parts), line));
}
default:
throw std::runtime_error("Unknown node type in IR");
}
}
};
Deserializer::Deserializer() : pImpl(std::make_unique<Impl>()) {}
Deserializer::~Deserializer() = default;
IRModule Deserializer::deserialize(const std::string& path) {
std::ifstream in(path, std::ios::binary | std::ios::ate);
size_t size = in.tellg();
in.seekg(0);
pImpl->buffer.resize(size);
in.read(reinterpret_cast<char*>(pImpl->buffer.data()), size);
pImpl->pos = 0;
return deserialize(pImpl->buffer);
}
IRModule Deserializer::deserialize(const std::vector<uint8_t>& data) {
pImpl->buffer = data;
pImpl->pos = 0;
IRModule module;
uint32_t magic = pImpl->read_u32();
if (magic != IR_MAGIC) {
throw std::runtime_error("Invalid IR file");
}
uint32_t version = pImpl->read_u32();
if (version != IR_VERSION) {
throw std::runtime_error("Unsupported IR version");
}
uint32_t num_sources = pImpl->read_u32();
for (uint32_t i = 0; i < num_sources; i++) {
SourceFile src;
src.path = pImpl->read_string();
src.content = pImpl->read_string();
module.sources.push_back(src);
}
uint32_t num_imports = pImpl->read_u32();
for (uint32_t i = 0; i < num_imports; i++) {
module.imports.push_back({pImpl->read_string(), pImpl->read_string()});
}
uint32_t num_strings = pImpl->read_u32();
for (uint32_t i = 0; i < num_strings; i++) {
std::string str = pImpl->read_string();
uint32_t id = pImpl->read_u32();
module.string_table[str] = id;
}
if (pImpl->read_u8()) {
module.entry = pImpl->read_node();
}
return module;
}
} // namespace nix_irc
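The readers in `Deserializer::Impl` index `buffer[pos + i]` directly, so a truncated or corrupt IR file would read past the end of the vector. The following is a minimal sketch of the same little-endian, length-prefixed encoding with a bounds check added. `ByteWriter`, `ByteReader`, and `require` are hypothetical names used only for illustration; they are not part of `nix_irc`, and the error message is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper types for illustration; not the project's classes.
struct ByteWriter {
  std::vector<uint8_t> buffer;

  void write_u8(uint8_t val) { buffer.push_back(val); }

  // Little-endian: least significant byte first.
  void write_u32(uint32_t val) {
    for (int i = 0; i < 4; i++)
      buffer.push_back(static_cast<uint8_t>((val >> (i * 8)) & 0xFF));
  }

  // Length-prefixed string: u32 byte count, then the raw bytes.
  void write_string(const std::string& str) {
    write_u32(static_cast<uint32_t>(str.size()));
    buffer.insert(buffer.end(), str.begin(), str.end());
  }
};

struct ByteReader {
  const std::vector<uint8_t>& buffer;
  std::size_t pos = 0;

  explicit ByteReader(const std::vector<uint8_t>& b) : buffer(b) {}

  // Throws instead of reading past the end. pos never exceeds
  // buffer.size(), so the subtraction cannot underflow.
  void require(std::size_t n) const {
    if (n > buffer.size() - pos)
      throw std::runtime_error("Truncated IR buffer");
  }

  uint8_t read_u8() {
    require(1);
    return buffer[pos++];
  }

  uint32_t read_u32() {
    require(4);
    uint32_t val = 0;
    for (int i = 0; i < 4; i++)
      val |= static_cast<uint32_t>(buffer[pos + i]) << (i * 8);
    pos += 4;
    return val;
  }

  std::string read_string() {
    uint32_t len = read_u32();
    require(len);
    // data() + pos stays valid even when len == 0 at the end of the buffer.
    std::string str(reinterpret_cast<const char*>(buffer.data() + pos), len);
    pos += len;
    return str;
  }
};
```

Checking the remaining byte count before each read keeps the wire format unchanged while turning out-of-range access into an exception the caller can handle, in the same way the existing code already throws on a bad magic number or version.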


@ -2,38 +2,38 @@
#define NIX_IRC_SERIALIZER_H
#include "types.h"
#include <fstream>
#include <string>
#include <vector>
namespace nix_irc {
class Serializer {
public:
Serializer();
~Serializer();
void serialize(const IRModule& module, const std::string& path);
std::vector<uint8_t> serialize_to_bytes(const IRModule& module);
private:
struct Impl;
std::unique_ptr<Impl> pImpl;
};
class Deserializer {
public:
Deserializer();
~Deserializer();
IRModule deserialize(const std::string& path);
IRModule deserialize(const std::vector<uint8_t>& data);
private:
struct Impl;
std::unique_ptr<Impl> pImpl;
};
} // namespace nix_irc
#endif
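Both classes in this header hold a forward-declared `Impl` behind a `std::unique_ptr`, which is why the destructors are declared here and defaulted in the `.cpp` file, where `Impl` is a complete type. A minimal single-file sketch of that arrangement follows; `Encoder`, `put`, and `size` are hypothetical names chosen for illustration and do not appear in the project.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// --- What would live in the header (hypothetical class) ---
class Encoder {
public:
  Encoder();
  ~Encoder();  // declared only: Impl is still incomplete at this point
  void put(uint8_t byte);
  std::size_t size() const;

private:
  struct Impl;
  std::unique_ptr<Impl> pImpl;
};

// --- What would live in the .cpp file ---
struct Encoder::Impl {
  std::vector<uint8_t> buffer;
};

Encoder::Encoder() : pImpl(std::make_unique<Impl>()) {}
Encoder::~Encoder() = default;  // Impl is complete here, so unique_ptr can delete it

void Encoder::put(uint8_t byte) { pImpl->buffer.push_back(byte); }
std::size_t Encoder::size() const { return pImpl->buffer.size(); }
```

Defaulting the destructor inline in the header instead would typically fail to compile, because `std::unique_ptr` needs the complete `Impl` type at the point where its deleter is instantiated.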

62
src/irc/types.cpp Normal file

@ -0,0 +1,62 @@
#include "types.h"
namespace nix_irc {
// LambdaNode constructor
LambdaNode::LambdaNode(uint32_t a, std::shared_ptr<Node> b, uint32_t l)
: arity(a), body(std::move(b)), line(l) {}
// AppNode constructor
AppNode::AppNode(std::shared_ptr<Node> f, std::shared_ptr<Node> a, uint32_t l)
: func(std::move(f)), arg(std::move(a)), line(l) {}
// BinaryOpNode constructor
BinaryOpNode::BinaryOpNode(BinaryOp o, std::shared_ptr<Node> l, std::shared_ptr<Node> r,
uint32_t ln)
: op(o), left(std::move(l)), right(std::move(r)), line(ln) {}
// UnaryOpNode constructor
UnaryOpNode::UnaryOpNode(UnaryOp o, std::shared_ptr<Node> operand_ptr, uint32_t l)
: op(o), operand(std::move(operand_ptr)), line(l) {}
// SelectNode constructor
SelectNode::SelectNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l)
: expr(std::move(e)), attr(std::move(a)), line(l) {}
// HasAttrNode constructor
HasAttrNode::HasAttrNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l)
: expr(std::move(e)), attr(std::move(a)), line(l) {}
// WithNode constructor
WithNode::WithNode(std::shared_ptr<Node> a, std::shared_ptr<Node> b, uint32_t l)
: attrs(std::move(a)), body(std::move(b)), line(l) {}
// IfNode constructor
IfNode::IfNode(std::shared_ptr<Node> c, std::shared_ptr<Node> t, std::shared_ptr<Node> e,
uint32_t l)
: cond(std::move(c)), then_branch(std::move(t)), else_branch(std::move(e)), line(l) {}
// LetNode constructor
LetNode::LetNode(std::shared_ptr<Node> b, uint32_t l) : body(std::move(b)), line(l) {}
// LetRecNode constructor
LetRecNode::LetRecNode(std::shared_ptr<Node> b, uint32_t l) : body(std::move(b)), line(l) {}
// AssertNode constructor
AssertNode::AssertNode(std::shared_ptr<Node> c, std::shared_ptr<Node> b, uint32_t l)
: cond(std::move(c)), body(std::move(b)), line(l) {}
// ImportNode constructor
ImportNode::ImportNode(std::shared_ptr<Node> p, uint32_t l) : path(std::move(p)), line(l) {}
// ThunkNode constructor
ThunkNode::ThunkNode(std::shared_ptr<Node> e, uint32_t l) : expr(std::move(e)), line(l) {}
// ForceNode constructor
ForceNode::ForceNode(std::shared_ptr<Node> e, uint32_t l) : expr(std::move(e)), line(l) {}
// LambdaPatternNode constructor
LambdaPatternNode::LambdaPatternNode(std::shared_ptr<Node> b, uint32_t l)
: allow_extra(false), body(std::move(b)), line(l) {}
} // namespace nix_irc
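Every constructor in this file takes its `std::shared_ptr<Node>` arguments by value and moves them into the members. A small sketch of what that buys the caller, using hypothetical `Leaf` and `Holder` types rather than the real node structs: passing an lvalue costs one reference-count increment, while passing an rvalue transfers ownership without touching the count.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins for a child node and a node that owns it.
struct Leaf {
  int value = 0;
};

struct Holder {
  std::shared_ptr<Leaf> child;

  // Sink parameter: taken by value, then moved into the member.
  explicit Holder(std::shared_ptr<Leaf> c) : child(std::move(c)) {}
};
```

With `auto leaf = std::make_shared<Leaf>();`, constructing `Holder a(leaf);` leaves the caller's pointer intact and shares ownership, whereas `Holder b(std::move(leaf));` hands the caller's reference over and leaves `leaf` empty.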


@ -2,289 +2,369 @@
#define NIX_IRC_TYPES_H
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>
namespace nix_irc {
constexpr uint32_t IR_MAGIC = 0x4E495258;
constexpr uint32_t IR_VERSION = 3; // was 2 before this change
enum class NodeType : uint8_t {
CONST_INT = 0x01,
CONST_FLOAT = 0x06,
CONST_STRING = 0x02,
CONST_PATH = 0x03,
CONST_BOOL = 0x04,
CONST_NULL = 0x05,
CONST_URI = 0x07,
CONST_LOOKUP_PATH = 0x08,
VAR = 0x10,
LAMBDA = 0x20,
APP = 0x21,
BINARY_OP = 0x22,
UNARY_OP = 0x23,
IMPORT = 0x24,
ATTRSET = 0x30,
SELECT = 0x31,
HAS_ATTR = 0x34,
WITH = 0x32,
LIST = 0x33,
IF = 0x40,
LET = 0x50,
LETREC = 0x51,
ASSERT = 0x52,
THUNK = 0x60,
FORCE = 0x61,
LAMBDA_PATTERN = 0x70,
INHERIT = 0x71,
INHERIT_FROM = 0x72,
STRING_INTERPOLATION = 0x73,
BUILTIN_CALL = 0x74,
ERROR = 0xFF
};
enum class BinaryOp : uint8_t {
ADD,
SUB,
MUL,
DIV,
CONCAT,
EQ,
NE,
LT,
GT,
LE,
GE,
AND,
OR,
IMPL,
MERGE
};
enum class UnaryOp : uint8_t { NEG, NOT };
// Forward declare Node for use in shared_ptr
class Node;
struct ConstIntNode {
int64_t value;
uint32_t line = 0;
ConstIntNode(int64_t v = 0, uint32_t l = 0) : value(v), line(l) {}
};
struct ConstStringNode {
std::string value;
uint32_t line = 0;
ConstStringNode(std::string v = "", uint32_t l = 0) : value(std::move(v)), line(l) {}
};
struct ConstPathNode {
std::string value;
uint32_t line = 0;
ConstPathNode(std::string v = "", uint32_t l = 0) : value(std::move(v)), line(l) {}
};
struct ConstBoolNode {
bool value;
uint32_t line = 0;
ConstBoolNode(bool v = false, uint32_t l = 0) : value(v), line(l) {}
};
struct ConstNullNode {
uint32_t line = 0;
ConstNullNode(uint32_t l = 0) : line(l) {}
};
struct ConstFloatNode {
double value;
uint32_t line = 0;
ConstFloatNode(double v = 0.0, uint32_t l = 0) : value(v), line(l) {}
};
struct ConstURINode {
std::string value;
uint32_t line = 0;
ConstURINode(std::string v = "", uint32_t l = 0) : value(std::move(v)), line(l) {}
};
struct ConstLookupPathNode {
std::string value; // e.g., "nixpkgs" or "nixpkgs/lib"
uint32_t line = 0;
ConstLookupPathNode(std::string v = "", uint32_t l = 0) : value(std::move(v)), line(l) {}
};
struct VarNode {
uint32_t index = 0;
std::optional<std::string> name;
uint32_t line = 0;
VarNode(uint32_t idx = 0, std::string n = "", uint32_t l = 0)
: index(idx), name(n.empty() ? std::nullopt : std::optional<std::string>(n)), line(l) {}
};
struct LambdaNode {
uint32_t arity = 1;
std::shared_ptr<Node> body;
std::optional<std::string> param_name;
bool strict_pattern = true;
uint32_t line = 0;
LambdaNode(uint32_t a, std::shared_ptr<Node> b, uint32_t l = 0);
};
struct PatternField {
std::string name;
std::optional<std::shared_ptr<Node>> default_value;
PatternField(std::string n, std::optional<std::shared_ptr<Node>> def = std::nullopt)
: name(std::move(n)), default_value(std::move(def)) {}
};
struct LambdaPatternNode {
std::vector<PatternField> required_fields;
std::vector<PatternField> optional_fields;
std::optional<std::string> at_binding;
bool allow_extra;
std::shared_ptr<Node> body;
uint32_t line = 0;
LambdaPatternNode(std::shared_ptr<Node> b, uint32_t l = 0);
};
struct AppNode {
std::shared_ptr<Node> func;
std::shared_ptr<Node> arg;
uint32_t line = 0;
AppNode(std::shared_ptr<Node> f, std::shared_ptr<Node> a, uint32_t l = 0);
};
struct BinaryOpNode {
BinaryOp op;
std::shared_ptr<Node> left;
std::shared_ptr<Node> right;
uint32_t line = 0;
BinaryOpNode(BinaryOp o, std::shared_ptr<Node> l, std::shared_ptr<Node> r, uint32_t ln = 0);
};
struct UnaryOpNode {
UnaryOp op;
std::shared_ptr<Node> operand;
uint32_t line = 0;
UnaryOpNode(UnaryOp o, std::shared_ptr<Node> operand, uint32_t l = 0);
};
struct AttrBinding {
std::optional<std::string> static_name; // Static key like "foo"
std::shared_ptr<Node> dynamic_name; // Dynamic key like ${expr}
std::shared_ptr<Node> value;
// Static attribute
AttrBinding(std::string name, std::shared_ptr<Node> val)
: static_name(std::move(name)), value(std::move(val)) {}
// Dynamic attribute
AttrBinding(std::shared_ptr<Node> name_expr, std::shared_ptr<Node> val)
: dynamic_name(std::move(name_expr)), value(std::move(val)) {}
bool is_dynamic() const { return !static_name.has_value(); }
}; };
struct AttrsetNode { struct AttrsetNode {
std::vector<std::pair<std::string, std::shared_ptr<Node>>> attrs; std::vector<AttrBinding> attrs;
bool recursive = false; bool recursive = false;
uint32_t line = 0; uint32_t line = 0;
AttrsetNode(bool rec = false, uint32_t l = 0) : recursive(rec), line(l) {} AttrsetNode(bool rec = false, uint32_t l = 0) : recursive(rec), line(l) {}
}; };
struct SelectNode { struct SelectNode {
std::shared_ptr<Node> expr; std::shared_ptr<Node> expr;
std::shared_ptr<Node> attr; std::shared_ptr<Node> attr;
std::optional<std::shared_ptr<Node>> default_expr; std::optional<std::shared_ptr<Node>> default_expr;
uint32_t line = 0; uint32_t line = 0;
SelectNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l = 0); SelectNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l = 0);
}; };
struct HasAttrNode { struct HasAttrNode {
std::shared_ptr<Node> expr; std::shared_ptr<Node> expr;
std::shared_ptr<Node> attr; std::shared_ptr<Node> attr;
uint32_t line = 0; uint32_t line = 0;
HasAttrNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l = 0); HasAttrNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l = 0);
}; };
struct WithNode { struct WithNode {
std::shared_ptr<Node> attrs; std::shared_ptr<Node> attrs;
std::shared_ptr<Node> body; std::shared_ptr<Node> body;
uint32_t line = 0; uint32_t line = 0;
WithNode(std::shared_ptr<Node> a, std::shared_ptr<Node> b, uint32_t l = 0); WithNode(std::shared_ptr<Node> a, std::shared_ptr<Node> b, uint32_t l = 0);
}; };
struct IfNode { struct IfNode {
std::shared_ptr<Node> cond; std::shared_ptr<Node> cond;
std::shared_ptr<Node> then_branch; std::shared_ptr<Node> then_branch;
std::shared_ptr<Node> else_branch; std::shared_ptr<Node> else_branch;
uint32_t line = 0; uint32_t line = 0;
IfNode(std::shared_ptr<Node> c, std::shared_ptr<Node> t, std::shared_ptr<Node> e, uint32_t l = 0); IfNode(std::shared_ptr<Node> c, std::shared_ptr<Node> t, std::shared_ptr<Node> e, uint32_t l = 0);
}; };
struct LetNode { struct LetNode {
std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings; std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings;
std::shared_ptr<Node> body; std::shared_ptr<Node> body;
uint32_t line = 0; uint32_t line = 0;
LetNode(std::shared_ptr<Node> b, uint32_t l = 0); LetNode(std::shared_ptr<Node> b, uint32_t l = 0);
}; };
struct LetRecNode { struct LetRecNode {
std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings; std::vector<std::pair<std::string, std::shared_ptr<Node>>> bindings;
std::shared_ptr<Node> body; std::shared_ptr<Node> body;
uint32_t line = 0; uint32_t line = 0;
LetRecNode(std::shared_ptr<Node> b, uint32_t l = 0); LetRecNode(std::shared_ptr<Node> b, uint32_t l = 0);
}; };
struct AssertNode { struct AssertNode {
std::shared_ptr<Node> cond; std::shared_ptr<Node> cond;
std::shared_ptr<Node> body; std::shared_ptr<Node> body;
uint32_t line = 0; uint32_t line = 0;
AssertNode(std::shared_ptr<Node> c, std::shared_ptr<Node> b, uint32_t l = 0); AssertNode(std::shared_ptr<Node> c, std::shared_ptr<Node> b, uint32_t l = 0);
};
struct ImportNode {
std::shared_ptr<Node> path; // Path expression to import
uint32_t line = 0;
ImportNode(std::shared_ptr<Node> p, uint32_t l = 0);
}; };
struct ThunkNode { struct ThunkNode {
std::shared_ptr<Node> expr; std::shared_ptr<Node> expr;
uint32_t line = 0; uint32_t line = 0;
ThunkNode(std::shared_ptr<Node> e, uint32_t l = 0); ThunkNode(std::shared_ptr<Node> e, uint32_t l = 0);
}; };
struct ForceNode { struct ForceNode {
std::shared_ptr<Node> expr; std::shared_ptr<Node> expr;
uint32_t line = 0; uint32_t line = 0;
ForceNode(std::shared_ptr<Node> e, uint32_t l = 0); ForceNode(std::shared_ptr<Node> e, uint32_t l = 0);
};
struct ListNode {
std::vector<std::shared_ptr<Node>> elements;
uint32_t line = 0;
ListNode(std::vector<std::shared_ptr<Node>> elems = {}, uint32_t l = 0)
: elements(std::move(elems)), line(l) {}
};
struct InheritNode {
std::vector<std::string> names;
uint32_t line = 0;
InheritNode(std::vector<std::string> n = {}, uint32_t l = 0) : names(std::move(n)), line(l) {}
};
struct InheritFromNode {
std::shared_ptr<Node> source;
std::vector<std::string> names;
uint32_t line = 0;
InheritFromNode(std::shared_ptr<Node> src, std::vector<std::string> n, uint32_t l = 0)
: source(std::move(src)), names(std::move(n)), line(l) {}
};
struct StringPart {
enum class Type { LITERAL, EXPR };
Type type;
std::string literal;
std::shared_ptr<Node> expr;
static StringPart make_literal(std::string lit) {
StringPart part;
part.type = Type::LITERAL;
part.literal = std::move(lit);
return part;
}
static StringPart make_expr(std::shared_ptr<Node> e) {
StringPart part;
part.type = Type::EXPR;
part.expr = std::move(e);
return part;
}
};
struct StringInterpolationNode {
std::vector<StringPart> parts;
uint32_t line = 0;
StringInterpolationNode(std::vector<StringPart> p = {}, uint32_t l = 0)
: parts(std::move(p)), line(l) {}
};
struct BuiltinCallNode {
std::string builtin_name;
std::vector<std::shared_ptr<Node>> args;
uint32_t line = 0;
BuiltinCallNode(std::string name, std::vector<std::shared_ptr<Node>> a = {}, uint32_t l = 0)
: builtin_name(std::move(name)), args(std::move(a)), line(l) {}
}; };
// Node wraps a variant for type-safe AST // Node wraps a variant for type-safe AST
class Node { class Node {
public: public:
using Variant = std::variant< using Variant =
ConstIntNode, std::variant<ConstIntNode, ConstFloatNode, ConstStringNode, ConstPathNode, ConstBoolNode,
ConstStringNode, ConstNullNode, ConstURINode, ConstLookupPathNode, VarNode, LambdaNode, AppNode,
ConstPathNode, BinaryOpNode, UnaryOpNode, ImportNode, AttrsetNode, SelectNode, HasAttrNode,
ConstBoolNode, WithNode, IfNode, LetNode, LetRecNode, AssertNode, ThunkNode, ForceNode,
ConstNullNode, ListNode, LambdaPatternNode, InheritNode, InheritFromNode,
VarNode, StringInterpolationNode, BuiltinCallNode>;
LambdaNode,
AppNode,
BinaryOpNode,
UnaryOpNode,
AttrsetNode,
SelectNode,
HasAttrNode,
WithNode,
IfNode,
LetNode,
LetRecNode,
AssertNode,
ThunkNode,
ForceNode
>;
Variant data; Variant data;
template<typename T> template <typename T> Node(T&& value) : data(std::forward<T>(value)) {}
Node(T&& value) : data(std::forward<T>(value)) {}
template<typename T> template <typename T> T* get_if() { return std::get_if<T>(&data); }
T* get_if() { return std::get_if<T>(&data); }
template<typename T> template <typename T> const T* get_if() const { return std::get_if<T>(&data); }
const T* get_if() const { return std::get_if<T>(&data); }
template<typename T> template <typename T> bool holds() const { return std::holds_alternative<T>(data); }
bool holds() const { return std::holds_alternative<T>(data); }
}; };
// Constructor implementations
inline LambdaNode::LambdaNode(uint32_t a, std::shared_ptr<Node> b, uint32_t l)
: arity(a), body(b), line(l) {}
inline AppNode::AppNode(std::shared_ptr<Node> f, std::shared_ptr<Node> a, uint32_t l)
: func(f), arg(a), line(l) {}
inline BinaryOpNode::BinaryOpNode(BinaryOp o, std::shared_ptr<Node> l, std::shared_ptr<Node> r, uint32_t ln)
: op(o), left(l), right(r), line(ln) {}
inline UnaryOpNode::UnaryOpNode(UnaryOp o, std::shared_ptr<Node> operand, uint32_t l)
: op(o), operand(operand), line(l) {}
inline SelectNode::SelectNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l)
: expr(e), attr(a), line(l) {}
inline HasAttrNode::HasAttrNode(std::shared_ptr<Node> e, std::shared_ptr<Node> a, uint32_t l)
: expr(e), attr(a), line(l) {}
inline WithNode::WithNode(std::shared_ptr<Node> a, std::shared_ptr<Node> b, uint32_t l)
: attrs(a), body(b), line(l) {}
inline IfNode::IfNode(std::shared_ptr<Node> c, std::shared_ptr<Node> t, std::shared_ptr<Node> e, uint32_t l)
: cond(c), then_branch(t), else_branch(e), line(l) {}
inline LetNode::LetNode(std::shared_ptr<Node> b, uint32_t l)
: body(b), line(l) {}
inline LetRecNode::LetRecNode(std::shared_ptr<Node> b, uint32_t l)
: body(b), line(l) {}
inline AssertNode::AssertNode(std::shared_ptr<Node> c, std::shared_ptr<Node> b, uint32_t l)
: cond(c), body(b), line(l) {}
inline ThunkNode::ThunkNode(std::shared_ptr<Node> e, uint32_t l)
: expr(e), line(l) {}
inline ForceNode::ForceNode(std::shared_ptr<Node> e, uint32_t l)
: expr(e), line(l) {}
struct SourceFile { struct SourceFile {
std::string path; std::string path;
std::string content; std::string content;
std::shared_ptr<Node> ast; std::shared_ptr<Node> ast;
}; };
struct IRModule { struct IRModule {
uint32_t version = IR_VERSION; uint32_t version = IR_VERSION;
std::vector<SourceFile> sources; std::vector<SourceFile> sources;
std::vector<std::pair<std::string, std::string>> imports; std::vector<std::pair<std::string, std::string>> imports;
std::shared_ptr<Node> entry; std::shared_ptr<Node> entry;
std::unordered_map<std::string, uint32_t> string_table; std::unordered_map<std::string, uint32_t> string_table;
}; };
} } // namespace nix_irc
#endif #endif
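
A minimal sketch of how the variant-backed `Node` above might be consumed. The types here are simplified stand-ins, not the real header: `ConstIntNode` with an integer `value` field is an assumption (its definition is outside this hunk), and `sum_ints` and `make_example` are hypothetical helpers that do not exist in the codebase.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <variant>
#include <vector>

struct Node;

// Assumed shape of an integer constant node (not shown in this diff).
struct ConstIntNode {
  int64_t value;
};

// Reduced copy of ListNode: only the element vector is kept.
struct ListNode {
  std::vector<std::shared_ptr<Node>> elements;
};

// Same wrapping idea as the real Node, restricted to two alternatives.
struct Node {
  using Variant = std::variant<ConstIntNode, ListNode>;
  Variant data;

  template <typename T> Node(T&& value) : data(std::forward<T>(value)) {}
  template <typename T> const T* get_if() const { return std::get_if<T>(&data); }
  template <typename T> bool holds() const { return std::holds_alternative<T>(data); }
};

// Hypothetical helper: sum every integer constant reachable from `node`,
// recursing into lists. A null pointer contributes 0.
int64_t sum_ints(const std::shared_ptr<Node>& node) {
  if (!node) return 0;
  if (const auto* i = node->get_if<ConstIntNode>()) return i->value;
  int64_t total = 0;
  if (const auto* list = node->get_if<ListNode>()) {
    for (const auto& elem : list->elements) total += sum_ints(elem);
  }
  return total;
}

// Hypothetical helper: build the IR for the list [1 2 3].
std::shared_ptr<Node> make_example() {
  ListNode list;
  list.elements.push_back(std::make_shared<Node>(ConstIntNode{1}));
  list.elements.push_back(std::make_shared<Node>(ConstIntNode{2}));
  list.elements.push_back(std::make_shared<Node>(ConstIntNode{3}));
  return std::make_shared<Node>(std::move(list));
}
```

Calling `sum_ints(make_example())` walks the list node and adds up its three constants. Dispatching with `get_if` keeps each case a plain pointer check; `std::visit` would be an alternative when every alternative must be handled.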


@ -5,20 +5,15 @@
#include "nix/expr/eval.hh" #include "nix/expr/eval.hh"
#include "nix/expr/primops.hh" #include "nix/expr/primops.hh"
#include "nix/expr/value.hh" #include "nix/expr/value.hh"
#include "nix/store/store-api.hh"
#include "nix/util/source-path.hh"
#include "irc/evaluator.h"
#include "irc/ir_gen.h" #include "irc/ir_gen.h"
#include "irc/parser.h" #include "irc/parser.h"
#include "irc/resolver.h"
#include "irc/serializer.h" #include "irc/serializer.h"
#include "irc/types.h" #include "irc/types.h"
#include "irc/evaluator.h"
#include <fstream> #include <chrono>
#include <iostream> #include <iostream>
#include <memory>
#include <optional>
namespace nix_ir_plugin { namespace nix_ir_plugin {
@ -29,50 +24,52 @@ using namespace nix_irc;
* Load and evaluate a pre-compiled IR bundle * Load and evaluate a pre-compiled IR bundle
* Usage: builtins.nixIR.loadIR "/path/to/file.nixir" * Usage: builtins.nixIR.loadIR "/path/to/file.nixir"
*/ */
static void prim_loadIR(EvalState &state, const PosIdx pos, Value **args, static void prim_loadIR(EvalState& state, const PosIdx pos, Value** args, Value& v) {
Value &v) {
auto path = state.forceStringNoCtx( auto path = state.forceStringNoCtx(
*args[0], pos, *args[0], pos, "while evaluating the first argument to builtins.nixIR.loadIR");
"while evaluating the first argument to builtins.nixIR.loadIR");
std::string pathStr(path); std::string pathStr(path);
auto t_start = std::chrono::high_resolution_clock::now();
Deserializer deserializer; Deserializer deserializer;
IRModule module; IRModule module;
try { try {
module = deserializer.deserialize(pathStr); module = deserializer.deserialize(pathStr);
} catch (const std::exception &e) { } catch (const std::exception& e) {
state.error<EvalError>("failed to deserialize IR bundle: %s", e.what()) state.error<EvalError>("failed to deserialize IR bundle: %s", e.what()).atPos(pos).debugThrow();
.atPos(pos)
.debugThrow();
} }
auto t_deser = std::chrono::high_resolution_clock::now();
if (!module.entry) { if (!module.entry) {
state.error<EvalError>("IR bundle has no entry point") state.error<EvalError>("IR bundle has no entry point").atPos(pos).debugThrow();
.atPos(pos)
.debugThrow();
} }
try { try {
Evaluator evaluator(state); Evaluator evaluator(state);
evaluator.eval_to_nix(module.entry, v); evaluator.eval_to_nix(module.entry, v);
} catch (const std::exception &e) { } catch (const std::exception& e) {
state.error<EvalError>("failed to evaluate IR: %s", e.what()) state.error<EvalError>("failed to evaluate IR: %s", e.what()).atPos(pos).debugThrow();
.atPos(pos)
.debugThrow();
} }
auto t_eval = std::chrono::high_resolution_clock::now();
auto deser_us = std::chrono::duration_cast<std::chrono::microseconds>(t_deser - t_start).count();
auto eval_us = std::chrono::duration_cast<std::chrono::microseconds>(t_eval - t_deser).count();
std::cerr << "nixIR timing: deser=" << deser_us << "us eval=" << eval_us
<< "us total=" << (deser_us + eval_us) << "us" << std::endl;
} }
/** /**
* Compile Nix source to IR on-the-fly * Compile Nix source to IR on-the-fly
* Usage: builtins.nixIR.compile "{ x = 1; }" * Usage: builtins.nixIR.compile "{ x = 1; }"
*/ */
static void prim_compileNix(EvalState &state, const PosIdx pos, Value **args, static void prim_compileNix(EvalState& state, const PosIdx pos, Value** args, Value& v) {
Value &v) {
auto source = state.forceStringNoCtx( auto source = state.forceStringNoCtx(
*args[0], pos, *args[0], pos, "while evaluating the first argument to builtins.nixIR.compile");
"while evaluating the first argument to builtins.nixIR.compile");
std::string sourceStr(source); std::string sourceStr(source);
@ -81,9 +78,7 @@ static void prim_compileNix(EvalState &state, const PosIdx pos, Value **args,
auto ast = parser.parse(sourceStr, "<inline>"); auto ast = parser.parse(sourceStr, "<inline>");
if (!ast) { if (!ast) {
state.error<EvalError>("failed to parse Nix expression") state.error<EvalError>("failed to parse Nix expression").atPos(pos).debugThrow();
.atPos(pos)
.debugThrow();
} }
IRGenerator ir_gen; IRGenerator ir_gen;
@ -92,10 +87,8 @@ static void prim_compileNix(EvalState &state, const PosIdx pos, Value **args,
Evaluator evaluator(state); Evaluator evaluator(state);
evaluator.eval_to_nix(ir, v); evaluator.eval_to_nix(ir, v);
} catch (const std::exception &e) { } catch (const std::exception& e) {
state.error<EvalError>("IR compilation failed: %s", e.what()) state.error<EvalError>("IR compilation failed: %s", e.what()).atPos(pos).debugThrow();
.atPos(pos)
.debugThrow();
} }
} }
@ -103,19 +96,18 @@ static void prim_compileNix(EvalState &state, const PosIdx pos, Value **args,
* Get information about the IR plugin * Get information about the IR plugin
* Usage: builtins.nixIR.info * Usage: builtins.nixIR.info
*/ */
static void prim_info(EvalState &state, const PosIdx pos, Value **args, static void prim_info(EvalState& state, const PosIdx pos, Value** args, Value& v) {
Value &v) {
auto bindings = state.buildBindings(3); auto bindings = state.buildBindings(3);
Value *vName = state.allocValue(); Value* vName = state.allocValue();
vName->mkString("nix-ir-plugin"); vName->mkString("nix-ir-plugin");
bindings.insert(state.symbols.create("name"), vName); bindings.insert(state.symbols.create("name"), vName);
Value *vVersion = state.allocValue(); Value* vVersion = state.allocValue();
vVersion->mkString("0.1.0"); vVersion->mkString("0.1.0");
bindings.insert(state.symbols.create("version"), vVersion); bindings.insert(state.symbols.create("version"), vVersion);
Value *vStatus = state.allocValue(); Value* vStatus = state.allocValue();
vStatus->mkString("runtime-active"); vStatus->mkString("runtime-active");
bindings.insert(state.symbols.create("status"), vStatus); bindings.insert(state.symbols.create("status"), vStatus);
@ -160,7 +152,7 @@ static RegisterPrimOp rp_info({
} // namespace nix_ir_plugin } // namespace nix_ir_plugin
// Plugin initialization message // Plugin initialization
__attribute__((constructor)) static void init_plugin() { __attribute__((constructor)) static void init_plugin() {
std::cerr << "nix-ir-plugin loaded" << std::endl; // Plugin loads silently...
} }
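
The per-phase timing added to `prim_loadIR` above could be factored into a small helper. This is only a sketch: `elapsed_us` is a hypothetical name, not part of the plugin, and it deliberately swaps in `std::chrono::steady_clock` (monotonic) where the diff uses `high_resolution_clock`.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <utility>

// Hypothetical helper (not in the plugin): run `fn` once and return the
// elapsed time in whole microseconds, measured on a monotonic clock.
template <typename Fn>
int64_t elapsed_us(Fn&& fn) {
  const auto start = std::chrono::steady_clock::now();
  std::forward<Fn>(fn)();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}
```

With such a helper, the deserialization and evaluation phases would each be wrapped in a lambda, and the two results summed for the `total=` figure that the benchmark script greps for.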

Binary file not shown.


@ -1,4 +0,0 @@
let
x = 10;
in
{ a = x; }

237
tests/benchmark/large.nix Normal file

@ -0,0 +1,237 @@
# Large benchmark for comprehensive stress testing
let
range = start: end:
if start >= end
then []
else [start] ++ range (start + 1) end;
concat = a: b: a ++ b;
factorial = n:
if n <= 1
then 1
else n * factorial (n - 1);
# Ackermann function (highly recursive)
ackermann = m: n:
if m == 0
then n + 1
else if n == 0
then ackermann (m - 1) 1
else ackermann (m - 1) (ackermann m (n - 1));
# Greatest common divisor
gcd = a: b:
if b == 0
then a
else gcd b (a - (a / b) * b);
# Power function
pow = base: exp:
if exp == 0
then 1
else if exp == 1
then base
else base * pow base (exp - 1);
compose = f: g: x: f (g x);
double = x: x * 2;
addTen = x: x + 10;
square = x: x * x;
pipeline = compose square (compose double addTen);
list_100 = range 1 101;
list_50 = range 1 51;
list_25 = range 1 26;
largeAttrs = {
a1 = 1;
a2 = 2;
a3 = 3;
a4 = 4;
a5 = 5;
a6 = 6;
a7 = 7;
a8 = 8;
a9 = 9;
a10 = 10;
b1 = 11;
b2 = 12;
b3 = 13;
b4 = 14;
b5 = 15;
b6 = 16;
b7 = 17;
b8 = 18;
b9 = 19;
b10 = 20;
c1 = 21;
c2 = 22;
c3 = 23;
c4 = 24;
c5 = 25;
c6 = 26;
c7 = 27;
c8 = 28;
c9 = 29;
c10 = 30;
d1 = 31;
d2 = 32;
d3 = 33;
d4 = 34;
d5 = 35;
d6 = 36;
d7 = 37;
d8 = 38;
d9 = 39;
d10 = 40;
e1 = 41;
e2 = 42;
e3 = 43;
e4 = 44;
e5 = 45;
e6 = 46;
e7 = 47;
e8 = 48;
e9 = 49;
e10 = 50;
};
# Very deep nesting (10 levels)
deepNest = {
level1 = {
level2 = {
level3 = {
level4 = {
level5 = {
level6 = {
level7 = {
level8 = {
level9 = {
level10 = {
treasure = "found";
value = 12345;
};
};
};
};
};
};
};
};
};
};
};
recursiveComplex = rec {
base = 10;
doubled = base * 2;
tripled = base * 3;
sum = doubled + tripled;
product = doubled * tripled;
x = base * 4;
y = x + doubled;
z = y * tripled;
total = sum + product + z;
final = total * base;
};
config1 = rec {
multiplier = 5;
base = 100;
result = base * multiplier;
};
config2 = rec {
offset = 50;
scaled = config1.result + offset;
doubled = scaled * 2;
};
config3 = rec {
factor = 3;
combined = config2.doubled * factor;
final = combined + config1.multiplier;
};
baseConfig = {
system = {
arch = "x86_64";
os = "linux";
};
settings = {
enabled = true;
level = 5;
};
};
overrides = {
system = {
kernel = "6.1";
};
settings = {
level = 10;
extra = "custom";
};
newSection = {
value = 42;
};
};
merged =
baseConfig
// overrides
// {
system = baseConfig.system // overrides.system;
settings =
baseConfig.settings
// overrides.settings
// {
combined = baseConfig.settings.level + overrides.settings.level;
};
};
fact10 = factorial 10;
fact7 = factorial 7;
ack_3_3 = ackermann 3 3;
gcd_48_18 = gcd 48 18;
gcd_100_35 = gcd 100 35;
pow_2_10 = pow 2 10;
pow_3_5 = pow 3 5;
pipelineResult = pipeline 5; # ((5 + 10) * 2)^2 = 900
# List operations
concatenated = concat [1 2 3] [4 5 6];
multilevel = concat (concat [1] [2 3]) [4 5];
in {
# Lists
inherit list_100 list_50 list_25 concatenated multilevel;
# Math results
inherit fact10 fact7 ack_3_3 gcd_48_18 gcd_100_35 pow_2_10 pow_3_5 pipelineResult;
# Data structures
inherit largeAttrs merged;
deepValue = deepNest.level1.level2.level3.level4.level5.level6.level7.level8.level9.level10.value;
deepTreasure = deepNest.level1.level2.level3.level4.level5.level6.level7.level8.level9.level10.treasure;
# Recursive attrsets
recursiveTotal = recursiveComplex.total;
recursiveFinal = recursiveComplex.final;
computedZ = recursiveComplex.z;
# Config chain
config1Result = config1.result;
config2Doubled = config2.doubled;
config3Final = config3.final;
# Merged config
mergedCombined = merged.settings.combined;
mergedArch = merged.system.arch;
mergedKernel = merged.system.kernel;
}


@ -0,0 +1,75 @@
let
# Recursive factorial
factorial = n:
if n <= 1
then 1
else n * factorial (n - 1);
# Fibonacci sequence generator
fib = n:
if n <= 1
then n
else fib (n - 1) + fib (n - 2);
# List concatenation test
range = start: end:
if start >= end
then []
else [start] ++ range (start + 1) end;
# Curried function application
add = x: y: x + y;
add5 = add 5;
# Complex computation
compute = x: y: let
a = x * 2;
b = y + 10;
c = a * b;
in
c / 2;
# Data structures
numbers = range 1 11;
# Nested attribute operations
base = {
config = {
enable = true;
value = 42;
};
data = {
items = [1 2 3];
};
};
extended =
base
// {
config =
base.config
// {
extra = "test";
multiplied = base.config.value * 2;
};
computed = base.config.value + 100;
};
# Recursive attrset with selections
recursive = rec {
x = 10;
y = x * 2;
z = y + x;
result = z * 3;
final = result + x;
};
in {
fact5 = factorial 5;
fib7 = fib 7;
sum15 = add5 10;
computed = compute 10 20;
inherit numbers extended;
deepValue = extended.config.multiplied;
recursiveResult = recursive.result;
recursiveFinal = recursive.final;
}

158
tests/benchmark/run.sh Executable file

@ -0,0 +1,158 @@
#!/usr/bin/env bash
set -e
echo "# Running benchmarks..."
echo ""
BENCH_DIR="$(pwd)/tests/benchmark"
IRC_BIN="$(pwd)/build/nix-irc"
GREEN='\033[0;32m'
BLUE='\033[0;34m'
YELLOW='\033[0;33m'
NC='\033[0m'
get_ms() {
local time_str="$1"
if [[ $time_str =~ ([0-9]+)m([0-9.]+)s ]]; then
local mins="${BASH_REMATCH[1]}"
local secs="${BASH_REMATCH[2]}"
local ms
ms=$(awk "BEGIN {printf \"%.1f\", ($mins * 60000) + ($secs * 1000)}")
echo "$ms"
else
echo "0"
fi
}
run_benchmark() {
local name="$1"
local file="$2"
echo -e "${BLUE}=== $name ===${NC}"
echo ""
# Measure compilation time only
echo -n " Compilation only: "
local compile_start
compile_start=$(date +%s%N)
"$IRC_BIN" "$file" /tmp/bench.nixir >/dev/null 2>&1
local compile_end
compile_end=$(date +%s%N)
local compile_ms=$(((compile_end - compile_start) / 1000000))
echo -e "${YELLOW}${compile_ms}ms${NC}"
# Measure IR loading only (deserialization + evaluation)
echo -n " IR load only: "
PLUGIN_PATH="$(pwd)/build/nix-ir-plugin.so"
if [ ! -f "$PLUGIN_PATH" ]; then
echo -e "${YELLOW}skipped${NC} (plugin not built)"
else
# Pre-compile the IR
"$IRC_BIN" "$file" /tmp/bench.nixir >/dev/null 2>&1
# Measure just the loading (average of 10 runs to reduce noise)
local total_load_us=0
for _ in {1..10}; do
local load_output
load_output=$(nix-instantiate --plugin-files "$PLUGIN_PATH" --eval --expr "builtins.nixIR_loadIR \"/tmp/bench.nixir\"" 2>&1 >/dev/null | grep "nixIR timing" | grep -oP 'total=\K[0-9]+')
total_load_us=$((total_load_us + load_output))
done
local avg_load_us=$((total_load_us / 10))
local avg_load_ms_frac=$(awk "BEGIN {printf \"%.3f\", $avg_load_us / 1000}")
echo -e "${GREEN}${avg_load_ms_frac}ms${NC} avg (10 runs)"
fi
# Measure full pipeline (compile + nix-instantiate overhead + IR load)
echo -n " Full pipeline: "
if [ ! -f "$PLUGIN_PATH" ]; then
echo -e "${YELLOW}skipped${NC}"
else
local pipeline_start
pipeline_start=$(date +%s%N)
"$IRC_BIN" "$file" /tmp/bench.nixir >/dev/null 2>&1
nix-instantiate --plugin-files "$PLUGIN_PATH" --eval --expr "builtins.nixIR_loadIR \"/tmp/bench.nixir\"" >/dev/null 2>&1
local pipeline_end
pipeline_end=$(date +%s%N)
local pipeline_ms=$(((pipeline_end - pipeline_start) / 1000000))
echo -e "${YELLOW}${pipeline_ms}ms${NC}"
fi
# Source and IR sizes
local src_size
src_size=$(stat -c%s "$file" 2>/dev/null || stat -f%z "$file" 2>/dev/null)
local ir_size
ir_size=$(stat -c%s /tmp/bench.nixir 2>/dev/null || stat -f%z /tmp/bench.nixir 2>/dev/null)
local ratio=0
if [[ "$src_size" -gt 0 ]]; then
ratio=$((ir_size * 100 / src_size))
fi
echo -e " Source size: ${src_size}B"
echo -e " IR bundle size: ${ir_size}B (${ratio}% of source)"
echo ""
# Native Nix evaluation (baseline)
echo -n " Native Nix eval: "
local native_total=0
for _ in {1..5}; do
local t
t=$( (time nix-instantiate --eval --strict "$file" >/dev/null 2>&1) 2>&1 | grep "real" | awk '{print $2}')
local ms
ms=$(get_ms "$t")
native_total=$(awk "BEGIN {print $native_total + $ms}")
done
local native_avg
native_avg=$(awk "BEGIN {printf \"%.1f\", $native_total / 5}")
echo -e "${GREEN}${native_avg}ms${NC} avg (5 runs)"
echo ""
}
echo "Measuring IR compilation speed and bundle size characteristics."
echo ""
run_benchmark "Simple Expression" "$BENCH_DIR/simple.nix"
run_benchmark "Medium Complexity" "$BENCH_DIR/medium.nix"
run_benchmark "Large Expression" "$BENCH_DIR/large.nix"
# Overall statistics
echo -e "${BLUE}=== Overall Statistics ===${NC}"
echo ""
testdir=$(mktemp -d)
total_nix=0
total_ir=0
total_compile_time=0
for f in "$BENCH_DIR"/*.nix; do
nixsize=$(stat -c%s "$f" 2>/dev/null || stat -f%z "$f" 2>/dev/null)
base=$(basename "$f" .nix)
irfile="${testdir}/${base}.nixir"
start=$(date +%s%N)
"$IRC_BIN" "$f" "$irfile" >/dev/null 2>&1
end=$(date +%s%N)
compile_time=$(((end - start) / 1000000))
if [ -f "$irfile" ]; then
irsize=$(stat -c%s "$irfile" 2>/dev/null || stat -f%z "$irfile" 2>/dev/null)
total_nix=$((total_nix + nixsize))
total_ir=$((total_ir + irsize))
total_compile_time=$((total_compile_time + compile_time))
fi
done
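total_ratio=$((total_ir * 100 / total_nix))
# Divide by the number of benchmark files rather than a hard-coded 3
bench_files=("$BENCH_DIR"/*.nix)
avg_compile_time=$((total_compile_time / ${#bench_files[@]}))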
# TBH these are entirely unnecessary. However, I'm a sucker for data
# and these are trivial to compute. Might as well. Who knows, maybe they'll
# come in handy in the future.
echo " Total source size: ${total_nix}B"
echo " Total IR size: ${total_ir}B"
echo " Compression ratio: ${total_ratio}% of source"
echo " Average compile time: ${avg_compile_time}ms"
echo ""
rm -rf "$testdir"
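
For reference, the conversion done by the `get_ms` shell helper above (bash `time` output such as `0m1.234s` to milliseconds) can be sketched in C++. `real_time_to_ms` is a hypothetical name used only for illustration, and the pattern is slightly stricter than the shell regex in that the seconds field must start with a digit.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical counterpart of get_ms: parse "<minutes>m<seconds>s" and
// return milliseconds. Like the shell helper, unmatched input yields 0.
double real_time_to_ms(const std::string& text) {
  static const std::regex pattern(R"((\d+)m(\d+(?:\.\d*)?)s)");
  std::smatch match;
  if (!std::regex_search(text, match, pattern)) return 0.0;
  const double minutes = std::stod(match[1].str());
  const double seconds = std::stod(match[2].str());
  return minutes * 60000.0 + seconds * 1000.0;
}
```

Returning 0 for unparseable input mirrors the script, which means a failed `time` capture silently lowers the reported average instead of aborting the run.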


@ -0,0 +1,13 @@
let
x = 10;
y = 20;
z = x + y;
in {
result = z * 2;
list = [1 2 3 4 5];
attrs = {
a = 1;
b = 2;
c = 3;
};
}

Binary file not shown.

8
tests/fixtures/ancient_let.nix vendored Normal file

@ -0,0 +1,8 @@
# Test ancient let syntax: let { bindings; body = expr; }
# This is equivalent to: let bindings in expr, but has been deprecated
# in newer Nix versions.
let {
x = 10;
y = 20;
body = x + y;
}

3
tests/fixtures/attrset_var.nix vendored Normal file

@ -0,0 +1,3 @@
let
x = 10;
in {a = x;}

24
tests/fixtures/block_comments.nix vendored Normal file

@ -0,0 +1,24 @@
# Test block comments /* */
/*
This is a block comment
*/
let
x = 42;
/*
inline block comment
*/
/*
Multi-line
block
comment
*/
y = 100;
in
/*
Comment before expression
*/
x + y
/*
Trailing comment
*/


@ -3,4 +3,6 @@ let
a = 10; a = 10;
b = 20; b = 20;
in in
if a < b then true else false if a < b
then true
else false

14
tests/fixtures/dynamic_attr_full.nix vendored Normal file

@ -0,0 +1,14 @@
# Test dynamic attribute names
let
key = "mykey";
value = 42;
in {
# Dynamic attribute with string interpolation
"${key}" = value;
# Another dynamic attribute
"${key}_suffix" = value + 1;
# Static attribute for comparison
static = 100;
}

15
tests/fixtures/dynamic_attrs.nix vendored Normal file

@ -0,0 +1,15 @@
# Test dynamic attribute names
# Note: Full dynamic attrs require runtime evaluation
# For now, testing that syntax is recognized
let
key = "mykey";
in {
# Static attribute for comparison
static = "value";
# Dynamic attribute name (basic string interpolation)
# "${key}" = "dynamic_value";
# For now, use workaround with static names
mykey = "works";
}

1
tests/fixtures/float_test.nix vendored Normal file

@ -0,0 +1 @@
1.5

11
tests/fixtures/home_path.nix vendored Normal file

@ -0,0 +1,11 @@
# Test home-relative paths
# Note: This will resolve to the actual home directory at evaluation time
let
# Example home path (will be expanded by evaluator)
config = ~/.config;
file = ~/.bashrc;
in {
# These are just path values that will be expanded
configPath = config;
filePath = file;
}

4
tests/fixtures/if.nix vendored Normal file

@ -0,0 +1,4 @@
# Conditional test
if true
then 1
else 2

3
tests/fixtures/import_lookup.nix vendored Normal file

@ -0,0 +1,3 @@
# Test import with lookup path
# Common pattern: import <nixpkgs> { }
import <nixpkgs>

9
tests/fixtures/import_simple.nix vendored Normal file

@ -0,0 +1,9 @@
# Test import expression
# Import evaluates the file and returns its value
# Import a file that returns a simple value (42)
import ./simple.nix
# Can also import lookup paths:
# import <nixpkgs> { }
# Import with path expressions:
# import (./dir + "/file.nix")

31
tests/fixtures/indented_string.nix vendored Normal file

@ -0,0 +1,31 @@
# Test indented strings (multi-line strings with '' delimiters)
let
# Simple indented string
simple = ''
Hello
World
'';
# Indented string with interpolation
name = "Nix";
greeting = ''
Welcome to ${name}!
This is indented.
'';
# Escape sequences
escapes = ''
Literal dollar: ''$
Literal quotes: '''
Regular text
'';
# Shell script example (common use case)
script = ''
#!/bin/bash
echo "Running script"
ls -la
'';
in {
inherit simple greeting escapes script;
}

20
tests/fixtures/inherit.nix vendored Normal file

@ -0,0 +1,20 @@
# Test inherit keyword
let
x = 10;
y = 20;
attrs = {
a = 1;
b = 2;
c = 3;
};
in {
# Basic inherit from outer scope
inherit x y;
# Inherit from expression
inherit (attrs) a b;
# Mixed
z = 30;
inherit (attrs) c;
}

3
tests/fixtures/inherit_from.nix vendored Normal file

@ -0,0 +1,3 @@
let
attrs = {a = 1;};
in {inherit (attrs) a;}

3
tests/fixtures/inherit_simple.nix vendored Normal file

@ -0,0 +1,3 @@
let
x = 10;
in {inherit x;}

61
tests/fixtures/lambda_pattern.nix vendored Normal file

@ -0,0 +1,61 @@
# Test lambda patterns
let
# Basic destructuring
f1 = {
a,
b,
}:
a + b;
# With default values
f2 = {
a,
b ? 10,
}:
a + b;
# With ellipsis (extra fields allowed)
f3 = {a, ...}: a * 2;
# Named pattern with ellipsis to allow extra fields
f4 = arg @ {
a,
b,
...
}:
a + b + arg.c;
# Simple lambda (not a pattern)
f5 = x: x + 1;
in {
# Test basic destructuring
test1 = f1 {
a = 3;
b = 4;
};
# Test with defaults (provide both)
test2a = f2 {
a = 5;
b = 6;
};
# Test with defaults (use default for b)
test2b = f2 {a = 5;};
# Test ellipsis (extra field ignored)
test3 = f3 {
a = 7;
extra = 999;
};
# Test named pattern
test4 = f4 {
a = 1;
b = 2;
c = 3;
};
# Test simple lambda
test5 = f5 10;
}

BIN
tests/fixtures/lambda_pattern.nixr vendored Normal file

Binary file not shown.


@ -2,4 +2,5 @@
let let
x = 10; x = 10;
y = 20; y = 20;
in x in
x

15
tests/fixtures/list_concat.nix vendored Normal file

@ -0,0 +1,15 @@
# Test list concatenation operator ++
let
list1 = [1 2 3];
list2 = [4 5 6];
empty = [];
in {
# Basic concatenation
combined = list1 ++ list2;
# Concatenate with empty list
with_empty = list1 ++ empty;
# Nested concatenation
triple = [1] ++ [2] ++ [3];
}

8
tests/fixtures/list_simple.nix vendored Normal file

@ -0,0 +1,8 @@
# Test basic list support
let
x = [1 2 3];
y = [4 5 6];
z = x ++ y; # List concatenation
in {
inherit x y z;
}


@ -3,4 +3,8 @@ let
x = true; x = true;
y = false; y = false;
in in
if x && y then 1 else if x || y then 2 else 3 if x && y
then 1
else if x || y
then 2
else 3

8
tests/fixtures/lookup_path.nix vendored Normal file

@ -0,0 +1,8 @@
# Test lookup path syntax
# Lookup paths resolve via NIX_PATH environment variable
# Example: <nixpkgs> -> /nix/var/nix/profiles/per-user/root/channels/nixpkgs
# Simple lookup path
<nixpkgs>
# Nested lookup path (common pattern)
# <nixpkgs/lib>

3
tests/fixtures/lookup_path_nested.nix vendored Normal file

@ -0,0 +1,3 @@
# Test nested lookup path
# Common pattern in Nix: <nixpkgs/lib> or <nixpkgs/pkgs/stdenv>
<nixpkgs/lib>

2
tests/fixtures/merge.nix vendored Normal file

@ -0,0 +1,2 @@
# Test attrset merge operator (//)
{a = {x = 1;} // {y = 2;};}

13
tests/fixtures/nested_attrs.nix vendored Normal file

@ -0,0 +1,13 @@
# Test nested attribute paths
{
# Simple nested path
a.b.c = 42;
# Multiple nested paths
x.y = 1;
x.z = 2;
# Mix of nested and non-nested
foo = "bar";
nested.deep.value = 100;
}

6
tests/fixtures/or_in_attrset.nix vendored Normal file

@ -0,0 +1,6 @@
# Test 'or' in attrset context
let
attrs = {a = 1;};
in {
test = attrs.a or 999;
}

5
tests/fixtures/or_simple.nix vendored Normal file

@ -0,0 +1,5 @@
# Simplest 'or' test
let
x = {a = 1;};
in
x.a or 2

13
tests/fixtures/path_concat.nix vendored Normal file

@ -0,0 +1,13 @@
# Test path concatenation
let
# Path + string = path
p1 = ./foo + "/bar";
# String + path = string (the left operand decides the result type)
p2 = "/prefix" + ./suffix;
# Path + path = path
p3 = ./dir + ./file;
in {
inherit p1 p2 p3;
}

12
tests/fixtures/precedence.nix vendored Normal file

@ -0,0 +1,12 @@
# Test operator precedence
let
a = 1 + 2 * 3; # Should be 1 + (2 * 3) = 7
b = 10 - 5 - 2; # Should be (10 - 5) - 2 = 3
c = true && false || true; # Should be (true && false) || true = true
d = 1 < 2 && 3 > 2; # Should be (1 < 2) && (3 > 2) = true
in {
a = a;
b = b;
c = c;
d = d;
}

19
tests/fixtures/select_or_default.nix vendored Normal file

@ -0,0 +1,19 @@
# Test selection with 'or' default
let
attrs = {
a = 1;
b = 2;
};
in {
# Attribute exists - should use value from attrs
has_attr = attrs.a or 999;
# Attribute doesn't exist - should use default
missing_attr = attrs.c or 100;
# Nested default expression
nested = attrs.d or (attrs.a + attrs.b);
# Default with literal
with_string = attrs.name or "default_name";
}

10
tests/fixtures/shortcircuit.nix vendored Normal file

@ -0,0 +1,10 @@
# Test short-circuit evaluation
let
alwaysFalse = false;
alwaysTrue = true;
x = 10;
in {
and_false = alwaysFalse && alwaysTrue;
or_true = alwaysTrue || alwaysFalse;
impl_false = alwaysFalse -> alwaysFalse;
}

1
tests/fixtures/simple_op.nix vendored Normal file

@ -0,0 +1 @@
1 + 2

20
tests/fixtures/string_interp.nix vendored Normal file

@ -0,0 +1,20 @@
# Test string interpolation
let
name = "world";
x = 42;
bool_val = true;
in {
# Simple interpolation
greeting = "Hello ${name}!";
# Multiple interpolations
multi = "x is ${x} and name is ${name}";
# Expression evaluation in interpolation
computed = "x + 10 = ${x + 10}";
bool_check = "${bool_val} is true!";
# Just a string, no interpolation
plain = "plain text";
}

BIN
tests/fixtures/string_interp.nixr vendored Normal file

Binary file not shown.


@ -2,5 +2,7 @@
let let
x = 10; x = 10;
y = true; y = true;
in in {
{ neg = -x; not = !y; } neg = -x;
not = !y;
}

3
tests/fixtures/uri_test.nix vendored Normal file

@ -0,0 +1,3 @@
https://example.com/path?query=1
#frag


@ -1,2 +0,0 @@
# Conditional test
if true then 1 else 2

Binary file not shown.


@ -1,17 +0,0 @@
# Test inherit keyword
let
x = 10;
y = 20;
attrs = { a = 1; b = 2; c = 3; };
in
{
# Basic inherit from outer scope
inherit x y;
# Inherit from expression
inherit (attrs) a b;
# Mixed
z = 30;
inherit (attrs) c;
}


@ -1,4 +0,0 @@
let
attrs = { a = 1; };
in
{ inherit (attrs) a; }


@ -1,4 +0,0 @@
let
x = 10;
in
{ inherit x; }


@ -0,0 +1,19 @@
{
description = "Local flake fixture for nixir integration tests";
inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.11";
outputs = { self, nixpkgs }: {
value = 42;
nixosConfigurations.demo = nixpkgs.lib.nixosSystem {
system = "x86_64-linux";
modules = [
({ ... }: {
networking.hostName = "nixir-demo";
system.stateVersion = "24.11";
services.openssh.enable = true;
})
];
};
};
}


@ -0,0 +1,7 @@
# Test that import builtin still works
let
imported = import ./imported_module.nix;
in {
value = imported.foo + 100;
nested = imported.bar.baz;
}


@ -0,0 +1,7 @@
# Module to be imported
{
foo = 42;
bar = {
baz = "hello";
};
}


@ -0,0 +1,13 @@
# Test our custom IR builtins
let
# Test nixIR_info
info = builtins.nixIR_info;
# Test nixIR_compile
compiled = builtins.nixIR_compile "let x = 10; in x + 5";
# Test that normal builtins still work
list = builtins.map (x: x * 2) [1 2 3];
in {
inherit info compiled list;
}


@ -0,0 +1,39 @@
# Test that normal Nix evaluation is not broken
# This file should work identically with or without the plugin
let
# Basic arithmetic
math = 1 + 2 * 3;
# String operations
str = "hello" + " " + "world";
# List operations
list = [1 2 3] ++ [4 5 6];
# Attrset operations
attrs =
{
a = 1;
b = 2;
}
// {c = 3;};
# Functions
double = x: x * 2;
result = double 21;
# Conditionals
cond =
if true
then "yes"
else "no";
# Let bindings
nested = let
x = 10;
y = 20;
in
x + y;
in {
inherit math str list attrs result cond nested;
}

103
tests/integration/run.sh Executable file

@ -0,0 +1,103 @@
#!/usr/bin/env bash
set -euo pipefail
echo ""
PLUGIN_PATH="$(pwd)/build/nix-ir-plugin.so"
TEST_DIR="$(pwd)/tests/integration"
if [ ! -f "$PLUGIN_PATH" ]; then
echo "ERROR: Plugin not found at $PLUGIN_PATH"
exit 1
fi
echo "Plugin path: $PLUGIN_PATH"
echo ""
echo "Test 1: Plugin Loading"
echo "----------------------"
if nix-instantiate --plugin-files "$PLUGIN_PATH" --eval "$TEST_DIR/simple_eval.nix" 2>&1 | grep -q "30"; then
echo "[PASS] Plugin loads and evaluates correctly"
else
echo "[FAIL] Plugin failed to load or evaluate"
exit 1
fi
echo ""
echo "Test 2: Normal Nix Evaluation (No Plugin)"
echo "------------------------------------------"
result=$(nix-instantiate --eval --strict --json "$TEST_DIR/regression_normal_nix.nix" 2>&1)
if echo "$result" | grep -q '"math":7'; then
echo "[PASS] Normal Nix evaluation works without plugin"
else
echo "[FAIL] Normal Nix evaluation broken"
echo "$result"
exit 1
fi
echo ""
echo "Test 3: Normal Nix Evaluation (With Plugin)"
echo "--------------------------------------------"
result=$(nix-instantiate --plugin-files "$PLUGIN_PATH" --eval --strict --json "$TEST_DIR/regression_normal_nix.nix" 2>&1)
if echo "$result" | grep -q '"math":7'; then
echo "[PASS] Normal Nix evaluation works with plugin loaded"
else
echo "[FAIL] Plugin breaks normal Nix evaluation"
echo "$result"
exit 1
fi
echo ""
echo "Test 4: Import Builtin"
echo "----------------------"
cd "$TEST_DIR"
result=$(nix-instantiate --plugin-files "$PLUGIN_PATH" --eval --strict --json import_test.nix 2>&1)
if echo "$result" | grep -q '"value":142'; then
echo "[PASS] Import builtin works correctly"
else
echo "[FAIL] Import builtin broken"
echo "$result"
exit 1
fi
cd - >/dev/null
echo ""
echo "Test 5: IR Builtins Available"
echo "------------------------------"
result=$(nix-instantiate --plugin-files "$PLUGIN_PATH" --eval "$TEST_DIR/ir_builtins_test.nix" 2>&1)
if echo "$result" | grep -q "info.*="; then
echo "[PASS] IR builtins test file evaluates (attribute 'info' present in output)"
else
echo "[WARN] IR builtins may not be available (check plugin initialization)"
fi
echo ""
echo "Test 6: Flake Reference Compilation"
echo "-----------------------------------"
flake_ir=$(mktemp /tmp/nixir-flake-value-XXXXXX.nixir)
"$(pwd)/build/nix-irc" "$TEST_DIR/flake_ref#value" "$flake_ir"
result=$(nix-instantiate --plugin-files "$PLUGIN_PATH" --eval --strict --json --expr "builtins.nixIR_loadIR \"$flake_ir\"" 2>&1)
if echo "$result" | grep -q '^42$'; then
echo "[PASS] Flake reference compiles and evaluates correctly"
else
echo "[FAIL] Flake reference compilation broken"
echo "$result"
exit 1
fi
echo ""
echo "Test 7: NixOS Configuration Attribute Path"
echo "------------------------------------------"
config_ir=$(mktemp /tmp/nixir-flake-config-XXXXXX.nixir)
"$(pwd)/build/nix-irc" "$TEST_DIR/flake_ref#nixosConfigurations.demo.config.networking.hostName" "$config_ir"
result=$(nix-instantiate --plugin-files "$PLUGIN_PATH" --eval --strict --json --expr "builtins.nixIR_loadIR \"$config_ir\"" 2>&1)
if echo "$result" | grep -q '"nixir-demo"'; then
echo "[PASS] Nested flake attribute selection works for nixosConfigurations"
else
echo "[FAIL] NixOS configuration flake selection broken"
echo "$result"
exit 1
fi
echo ""
echo "Integration Tests Complete"


@ -0,0 +1,6 @@
# Simple expression to test plugin loading
let
x = 10;
y = 20;
in
x + y


@ -1,36 +0,0 @@
# Test lambda patterns
let
# Basic destructuring
f1 = { a, b }: a + b;
# With default values
f2 = { a, b ? 10 }: a + b;
# With ellipsis (extra fields allowed)
f3 = { a, ... }: a * 2;
# Named pattern with ellipsis to allow extra fields
f4 = arg@{ a, b, ... }: a + b + arg.c;
# Simple lambda (not a pattern)
f5 = x: x + 1;
in
{
# Test basic destructuring
test1 = f1 { a = 3; b = 4; };
# Test with defaults (provide both)
test2a = f2 { a = 5; b = 6; };
# Test with defaults (use default for b)
test2b = f2 { a = 5; };
# Test ellipsis (extra field ignored)
test3 = f3 { a = 7; extra = 999; };
# Test named pattern
test4 = f4 { a = 1; b = 2; c = 3; };
# Test simple lambda
test5 = f5 10;
}


@ -0,0 +1,2 @@
# Test string interpolation
let x = "world"; in "Hello ${x}!"


@ -0,0 +1,5 @@
# Test lambda patterns
({
name,
version ? "1.0",
}: "${name}-${version}") {name = "test";}

Binary file not shown.

Binary file not shown.

Binary file not shown.


@ -1,8 +0,0 @@
# Test operator precedence
let
a = 1 + 2 * 3; # Should be 1 + (2 * 3) = 7
b = 10 - 5 - 2; # Should be (10 - 5) - 2 = 3
c = true && false || true; # Should be (true && false) || true = true
d = 1 < 2 && 3 > 2; # Should be (1 < 2) && (3 > 2) = true
in
{ a = a; b = b; c = c; d = d; }

Binary file not shown.


@ -1,3 +1,5 @@
#include "irc/lexer.h"
#include "irc/parser.h"
#include "irc/serializer.h" #include "irc/serializer.h"
#include "irc/types.h" #include "irc/types.h"
#include <cassert> #include <cassert>
@ -7,21 +9,21 @@ using namespace nix_irc;
int failures = 0; int failures = 0;
#define TEST_CHECK(cond, msg) \ #define TEST_CHECK(cond, msg) \
do { \ do { \
if (!(cond)) { \ if (!(cond)) { \
std::cerr << " FAIL: " << msg << std::endl; \ std::cerr << " FAIL: " << msg << std::endl; \
failures++; \ failures++; \
} else { \ } else { \
std::cout << " PASS: " << msg << std::endl; \ std::cout << " PASS: " << msg << std::endl; \
} \ } \
} while (0) } while (0)
#define TEST_PASS(msg) std::cout << " PASS: " << msg << std::endl #define TEST_PASS(msg) std::cout << " PASS: " << msg << std::endl
#define TEST_FAIL(msg) \ #define TEST_FAIL(msg) \
do { \ do { \
std::cerr << " FAIL: " << msg << std::endl; \ std::cerr << " FAIL: " << msg << std::endl; \
failures++; \ failures++; \
} while (0) } while (0)
void test_enum_compatibility() { void test_enum_compatibility() {
@ -30,33 +32,27 @@ void test_enum_compatibility() {
if (static_cast<uint8_t>(NodeType::WITH) == 0x32) { if (static_cast<uint8_t>(NodeType::WITH) == 0x32) {
std::cout << " PASS: WITH has correct value 0x32" << std::endl; std::cout << " PASS: WITH has correct value 0x32" << std::endl;
} else { } else {
std::cerr << " FAIL: WITH should be 0x32, got " std::cerr << " FAIL: WITH should be 0x32, got " << static_cast<uint8_t>(NodeType::WITH)
<< static_cast<uint8_t>(NodeType::WITH) << std::endl; << std::endl;
} }
if (static_cast<uint8_t>(NodeType::HAS_ATTR) == 0x34) { if (static_cast<uint8_t>(NodeType::HAS_ATTR) == 0x34) {
std::cout << " PASS: HAS_ATTR has value 0x34 (new slot after WITH bump)" std::cout << " PASS: HAS_ATTR has value 0x34 (new slot after WITH bump)" << std::endl;
<< std::endl;
} else if (static_cast<uint8_t>(NodeType::HAS_ATTR) == 0x33 && } else if (static_cast<uint8_t>(NodeType::HAS_ATTR) == 0x33 &&
static_cast<uint8_t>(NodeType::WITH) == 0x32) { static_cast<uint8_t>(NodeType::WITH) == 0x32) {
std::cout << " PASS: HAS_ATTR has value 0x33 (restored original with WITH " std::cout << " PASS: HAS_ATTR has value 0x33 (restored original with WITH "
"at 0x32)" "at 0x32)"
<< std::endl; << std::endl;
} else { } else {
std::cerr << " FAIL: HAS_ATTR value is " std::cerr << " FAIL: HAS_ATTR value is " << static_cast<uint8_t>(NodeType::HAS_ATTR)
<< static_cast<uint8_t>(NodeType::HAS_ATTR)
<< " (expected 0x34 or 0x33 with WITH=0x32)" << std::endl; << " (expected 0x34 or 0x33 with WITH=0x32)" << std::endl;
} }
if (IR_VERSION == 2) { if (IR_VERSION == 3) {
std::cout << " PASS: IR_VERSION bumped to 2 for breaking change" std::cout << " PASS: IR_VERSION is 3" << std::endl;
<< std::endl;
} else if (static_cast<uint8_t>(NodeType::WITH) == 0x32) {
std::cout << " PASS: IR_VERSION unchanged but WITH restored to 0x32"
<< std::endl;
} else { } else {
std::cerr << " FAIL: Either bump IR_VERSION or fix enum values" std::cerr << " FAIL: IR_VERSION should be 3, got " << IR_VERSION << std::endl;
<< std::endl; failures++;
} }
} }
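Review note on the enum check above: streaming a `static_cast<uint8_t>` value into `std::cerr` selects the character overload of `operator<<`, so a failing run would likely print a raw byte rather than a number such as `0x32`. A minimal sketch of one way to format the value, assuming a hypothetical helper `hex_byte` that is not part of the nix_irc sources:

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not in the nix_irc sources): render a one-byte enum
// value as "0x32" instead of letting operator<< treat it as a character.
inline std::string hex_byte(std::uint8_t value) {
  std::ostringstream out;
  out << "0x" << std::hex << std::setw(2) << std::setfill('0')
      << static_cast<int>(value); // widen so the stream prints a number
  return out.str();
}

// Usage example: the failure message now carries a readable value.
inline std::string with_mismatch_message(std::uint8_t actual) {
  std::string msg = "WITH should be 0x32, got " + hex_byte(actual);
  assert(!msg.empty());
  return msg;
}
```

The same widening cast (`static_cast<int>`) could be applied directly at the existing call sites; the helper only keeps the hex formatting in one place.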
@ -80,19 +76,16 @@ void test_serializer_select_with_default() {
Deserializer deser; Deserializer deser;
auto loaded = deser.deserialize(bytes); auto loaded = deser.deserialize(bytes);
auto *loaded_select = loaded.entry->get_if<SelectNode>(); auto* loaded_select = loaded.entry->get_if<SelectNode>();
if (loaded_select && loaded_select->default_expr && if (loaded_select && loaded_select->default_expr && *loaded_select->default_expr) {
*loaded_select->default_expr) { auto* def_val = (*loaded_select->default_expr)->get_if<ConstIntNode>();
auto *def_val = (*loaded_select->default_expr)->get_if<ConstIntNode>();
if (def_val && def_val->value == 100) { if (def_val && def_val->value == 100) {
std::cout << " PASS: SELECT with default_expr round-trips correctly" std::cout << " PASS: SELECT with default_expr round-trips correctly" << std::endl;
<< std::endl;
} else { } else {
std::cerr << " FAIL: default_expr value incorrect" << std::endl; std::cerr << " FAIL: default_expr value incorrect" << std::endl;
} }
} else { } else {
std::cerr << " FAIL: default_expr not deserialized (missing u8 flag read)" std::cerr << " FAIL: default_expr not deserialized (missing u8 flag read)" << std::endl;
<< std::endl;
} }
} }
@ -114,11 +107,9 @@ void test_serializer_select_without_default() {
Deserializer deser; Deserializer deser;
auto loaded = deser.deserialize(bytes); auto loaded = deser.deserialize(bytes);
auto *loaded_select = loaded.entry->get_if<SelectNode>(); auto* loaded_select = loaded.entry->get_if<SelectNode>();
if (loaded_select && if (loaded_select && (!loaded_select->default_expr || !*loaded_select->default_expr)) {
(!loaded_select->default_expr || !*loaded_select->default_expr)) { std::cout << " PASS: SELECT without default_expr round-trips correctly" << std::endl;
std::cout << " PASS: SELECT without default_expr round-trips correctly"
<< std::endl;
} else { } else {
std::cerr << " FAIL: default_expr should be null/absent" << std::endl; std::cerr << " FAIL: default_expr should be null/absent" << std::endl;
} }
@ -127,38 +118,617 @@ void test_serializer_select_without_default() {
void test_parser_brace_depth_in_strings() { void test_parser_brace_depth_in_strings() {
std::cout << "> Parser brace depth handling in strings..." << std::endl; std::cout << "> Parser brace depth handling in strings..." << std::endl;
std::string test_input = R"( std::string test_input = R"(let s = "test}"; in s)";
let s = "test}"; in ${s}
)";
std::cout << " Test input contains '}' inside string - should not end " try {
"interpolation" Parser parser;
<< std::endl; auto ast = parser.parse(test_input);
std::cout << " NOTE: This test requires running through actual parser" TEST_PASS("Brace inside string does not confuse parser");
<< std::endl; } catch (const std::exception& e) {
TEST_FAIL("Parser should handle '}' inside strings");
}
} }
void test_parser_has_ellipsis_usage() { void test_parser_has_ellipsis_usage() {
std::cout << "> Parser has_ellipsis usage..." << std::endl; std::cout << "> Parser has_ellipsis usage..." << std::endl;
std::cout << " NOTE: LambdaNode should have strict_pattern field when " std::string with_ellipsis = "{ a, ... }: a";
"has_ellipsis is false" std::string without_ellipsis = "{ a, b }: a + b";
<< std::endl;
std::cout << " This requires checking the parser output for strict patterns" try {
<< std::endl; Parser parser1;
auto ast1 = parser1.parse(with_ellipsis);
TEST_PASS("Pattern with ellipsis parses correctly");
Parser parser2;
auto ast2 = parser2.parse(without_ellipsis);
TEST_PASS("Pattern without ellipsis parses correctly");
} catch (const std::exception& e) {
TEST_FAIL("Pattern parsing failed");
}
} }
void test_parser_expect_in_speculative_parsing() { void test_parser_expect_in_speculative_parsing() {
std::cout << "> Parser expect() in speculative parsing..." << std::endl; std::cout << "> Parser expect() in speculative parsing..." << std::endl;
std::cout << " NOTE: try_parse_lambda should not throw on non-lambda input" std::string not_a_lambda = "1 + 2";
<< std::endl; std::string actual_lambda = "x: x + 1";
std::cout << " This requires testing parser with invalid lambda patterns"
<< std::endl; try {
Parser parser1;
auto ast1 = parser1.parse(not_a_lambda);
TEST_PASS("Non-lambda input does not cause parser to throw");
Parser parser2;
auto ast2 = parser2.parse(actual_lambda);
TEST_PASS("Actual lambda parses correctly");
} catch (const std::exception& e) {
TEST_FAIL("Parser should handle both lambda and non-lambda input");
}
}
void test_implication_right_associativity() {
std::cout << "> Implication right associativity..." << std::endl;
Parser parser;
auto ast = parser.parse("a -> b -> c");
auto* outer = ast->get_if<BinaryOpNode>();
TEST_CHECK(outer != nullptr, "Top-level node is BinaryOpNode");
TEST_CHECK(outer && outer->op == BinaryOp::IMPL, "Top-level operator is implication");
if (outer) {
auto* left = outer->left->get_if<VarNode>();
auto* right = outer->right->get_if<BinaryOpNode>();
TEST_CHECK(left != nullptr && left->name && *left->name == "a", "Left branch is variable 'a'");
TEST_CHECK(right != nullptr && right->op == BinaryOp::IMPL,
"Right branch is nested implication");
}
}
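For readers unfamiliar with why `a -> b -> c` must produce a nested node on the right, the following is a self-contained sketch of right-associative parsing. It uses a hypothetical miniature AST (`Expr`) and parser (`ImplParser`) over pre-split tokens; none of these names come from the nix_irc parser, whose implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature AST, unrelated to the real nix_irc node types.
struct Expr {
  std::string name;                  // set for variables
  std::shared_ptr<Expr> left, right; // set for implications
  bool is_impl() const { return left && right; }
};

// Tokens are pre-split for brevity: identifiers and the "->" operator.
class ImplParser {
public:
  explicit ImplParser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

  std::shared_ptr<Expr> parse() {
    auto expr = parse_impl();
    assert(pos_ == tokens_.size() && "unexpected trailing tokens");
    return expr;
  }

private:
  // impl := atom ( "->" impl )?
  // Recursing on the right-hand side makes the operator right-associative;
  // a loop that folds to the left would instead build (a -> b) -> c.
  std::shared_ptr<Expr> parse_impl() {
    auto lhs = parse_atom();
    if (pos_ < tokens_.size() && tokens_[pos_] == "->") {
      ++pos_;
      auto node = std::make_shared<Expr>();
      node->left = lhs;
      node->right = parse_impl();
      return node;
    }
    return lhs;
  }

  std::shared_ptr<Expr> parse_atom() {
    assert(pos_ < tokens_.size() && tokens_[pos_] != "->");
    auto node = std::make_shared<Expr>();
    node->name = tokens_[pos_++];
    return node;
  }

  std::vector<std::string> tokens_;
  std::size_t pos_ = 0;
};
```

Parsing the tokens `a`, `->`, `b`, `->`, `c` with this sketch yields an implication whose left child is `a` and whose right child is the implication `b -> c`, which is the shape the test above asserts.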
void test_lookup_path_lexer_position() {
std::cout << "> Lookup path lexer position..." << std::endl;
Lexer lexer("<nixpkgs> x");
auto tokens = lexer.tokenize();
TEST_CHECK(tokens.size() >= 3, "Lexer produced lookup path, identifier, and EOF");
TEST_CHECK(tokens[0].type == Token::LOOKUP_PATH, "First token is LOOKUP_PATH");
TEST_CHECK(tokens[1].type == Token::IDENT && tokens[1].value == "x",
"Second token is identifier 'x'");
TEST_CHECK(tokens[1].col == 11, "Identifier column reflects consumed lookup path width");
}
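The column assertion above (`col == 11`) follows from 1-based columns: `<nixpkgs>` occupies columns 1 to 9, the space is column 10, and `x` starts at column 11. A minimal sketch of that bookkeeping, assuming a hypothetical `lex_columns` function and `Tok` struct that are not the real `Lexer` API (this sketch also keeps the angle brackets in the token text, which the real lexer may not do):

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token record; the real Token type in irc/lexer.h may differ.
struct Tok {
  std::string kind;
  std::string text;
  std::size_t col; // 1-based column of the token's first character
};

inline std::vector<Tok> lex_columns(const std::string& src) {
  std::vector<Tok> out;
  std::size_t pos = 0;
  std::size_t col = 1;
  while (pos < src.size()) {
    const char c = src[pos];
    if (c == ' ') { // whitespace advances the column but emits nothing
      ++pos;
      ++col;
      continue;
    }
    const std::size_t start = pos;
    const std::size_t start_col = col;
    std::string kind = "OTHER";
    const std::size_t close = (c == '<') ? src.find('>', pos) : std::string::npos;
    if (close != std::string::npos) {
      pos = close + 1; // consume through the closing '>'
      kind = "LOOKUP_PATH";
    } else if (std::isalpha(static_cast<unsigned char>(c))) {
      while (pos < src.size() && std::isalnum(static_cast<unsigned char>(src[pos]))) ++pos;
      kind = "IDENT";
    } else {
      ++pos; // sketch only: any other character becomes a one-char token
    }
    col += pos - start; // advance by the full width of the consumed token
    out.push_back({kind, src.substr(start, pos - start), start_col});
  }
  return out;
}
```

The regression the test guards against would correspond to advancing `pos` past the lookup path without the matching `col += pos - start` update.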
void test_unterminated_block_comment_rejected() {
std::cout << "> Unterminated block comment rejection..." << std::endl;
try {
Lexer lexer("/* unterminated");
auto tokens = lexer.tokenize();
(void) tokens;
TEST_FAIL("Lexer should reject unterminated block comments");
} catch (const std::exception& e) {
TEST_PASS("Lexer rejects unterminated block comments");
}
}
void test_unknown_character_rejected() {
std::cout << "> Unknown character rejection..." << std::endl;
try {
Lexer lexer("1 $ 2");
auto tokens = lexer.tokenize();
(void) tokens;
TEST_FAIL("Lexer should reject unexpected characters");
} catch (const std::exception& e) {
TEST_PASS("Lexer rejects unexpected characters");
}
}
void test_lookup_path_node() {
std::cout << "> Lookup path serialization..." << std::endl;
auto lookup = std::make_shared<Node>(ConstLookupPathNode("nixpkgs"));
IRModule module;
module.entry = lookup;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
Deserializer deser;
auto loaded = deser.deserialize(bytes);
auto* loaded_lookup = loaded.entry->get_if<ConstLookupPathNode>();
TEST_CHECK(loaded_lookup != nullptr, "Deserialized node is ConstLookupPathNode");
TEST_CHECK(loaded_lookup && loaded_lookup->value == "nixpkgs", "Lookup path value is 'nixpkgs'");
}
void test_import_node() {
std::cout << "> Import node serialization..." << std::endl;
auto path = std::make_shared<Node>(ConstPathNode("./test.nix"));
auto import_node = std::make_shared<Node>(ImportNode(path));
IRModule module;
module.entry = import_node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
Deserializer deser;
auto loaded = deser.deserialize(bytes);
auto* loaded_import = loaded.entry->get_if<ImportNode>();
TEST_CHECK(loaded_import != nullptr, "Deserialized node is ImportNode");
TEST_CHECK(loaded_import && loaded_import->path != nullptr, "Import node has path");
if (loaded_import && loaded_import->path) {
auto* path_node = loaded_import->path->get_if<ConstPathNode>();
TEST_CHECK(path_node != nullptr, "Import path is ConstPathNode");
TEST_CHECK(path_node && path_node->value == "./test.nix", "Import path value is './test.nix'");
}
}
void test_import_with_lookup_path() {
std::cout << "> Import with lookup path..." << std::endl;
auto lookup = std::make_shared<Node>(ConstLookupPathNode("nixpkgs"));
auto import_node = std::make_shared<Node>(ImportNode(lookup));
IRModule module;
module.entry = import_node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
Deserializer deser;
auto loaded = deser.deserialize(bytes);
auto* loaded_import = loaded.entry->get_if<ImportNode>();
TEST_CHECK(loaded_import != nullptr, "Deserialized node is ImportNode");
if (loaded_import && loaded_import->path) {
auto* lookup_node = loaded_import->path->get_if<ConstLookupPathNode>();
TEST_CHECK(lookup_node != nullptr, "Import path is ConstLookupPathNode");
TEST_CHECK(lookup_node && lookup_node->value == "nixpkgs", "Lookup path value is 'nixpkgs'");
}
}
void test_relative_path_import_parsing() {
std::cout << "> Relative path import parsing..." << std::endl;
Parser parser;
auto ast = parser.parse("import ./simple.nix");
auto* import_node = ast->get_if<ImportNode>();
TEST_CHECK(import_node != nullptr, "Parsed expression is ImportNode");
if (import_node && import_node->path) {
auto* path_node = import_node->path->get_if<ConstPathNode>();
TEST_CHECK(path_node != nullptr, "Import argument is ConstPathNode");
TEST_CHECK(path_node && path_node->value == "./simple.nix",
"Relative path is preserved as './simple.nix'");
}
}
void test_builtin_call_node() {
std::cout << "> BuiltinCallNode serialization..." << std::endl;
auto arg = std::make_shared<Node>(ConstStringNode("/tmp/example-flake"));
auto builtin =
std::make_shared<Node>(BuiltinCallNode("getFlake", std::vector<std::shared_ptr<Node>>{arg}));
IRModule module;
module.entry = builtin;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
Deserializer deser;
auto loaded = deser.deserialize(bytes);
auto* loaded_builtin = loaded.entry->get_if<BuiltinCallNode>();
TEST_CHECK(loaded_builtin != nullptr, "Deserialized node is BuiltinCallNode");
TEST_CHECK(loaded_builtin && loaded_builtin->builtin_name == "getFlake",
"Builtin name is 'getFlake'");
TEST_CHECK(loaded_builtin && loaded_builtin->args.size() == 1, "Builtin has one argument");
if (loaded_builtin && loaded_builtin->args.size() == 1) {
auto* loaded_arg = loaded_builtin->args[0]->get_if<ConstStringNode>();
TEST_CHECK(loaded_arg != nullptr, "Builtin argument is ConstStringNode");
TEST_CHECK(loaded_arg && loaded_arg->value == "/tmp/example-flake",
"Builtin argument value round-trips");
}
}
void test_uri_node() {
std::cout << "> URI node serialization..." << std::endl;
auto uri = std::make_shared<Node>(ConstURINode("https://example.com"));
IRModule module;
module.entry = uri;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
Deserializer deser;
auto loaded = deser.deserialize(bytes);
auto* loaded_uri = loaded.entry->get_if<ConstURINode>();
TEST_CHECK(loaded_uri != nullptr, "Deserialized node is ConstURINode");
TEST_CHECK(loaded_uri && loaded_uri->value == "https://example.com",
"URI value is 'https://example.com'");
}
void test_float_node() {
std::cout << "> Float node serialization..." << std::endl;
auto float_val = std::make_shared<Node>(ConstFloatNode(3.14159));
IRModule module;
module.entry = float_val;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
Deserializer deser;
auto loaded = deser.deserialize(bytes);
auto* loaded_float = loaded.entry->get_if<ConstFloatNode>();
TEST_CHECK(loaded_float != nullptr, "Deserialized node is ConstFloatNode");
TEST_CHECK(loaded_float && loaded_float->value > 3.14 && loaded_float->value < 3.15,
"Float value is approximately 3.14159");
}
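The float test above accepts any value between 3.14 and 3.15. If the serializer stores the IEEE-754 bit pattern unchanged, an exact comparison would be a stricter check. The sketch below illustrates a bit-preserving round trip under an assumed little-endian 8-byte encoding; `encode_f64` and `decode_f64` are hypothetical names, and the actual wire format used by `Serializer` is not shown in this diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

// Hypothetical encoding: copy the bit pattern, then emit it little-endian.
inline std::array<std::uint8_t, 8> encode_f64(double value) {
  std::uint64_t bits = 0;
  std::memcpy(&bits, &value, sizeof bits);
  std::array<std::uint8_t, 8> out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = static_cast<std::uint8_t>((bits >> (8 * i)) & 0xFF);
  }
  return out;
}

// Inverse of encode_f64: rebuild the bit pattern and reinterpret it.
inline double decode_f64(const std::array<std::uint8_t, 8>& bytes) {
  std::uint64_t bits = 0;
  for (std::size_t i = 0; i < bytes.size(); ++i) {
    bits |= static_cast<std::uint64_t>(bytes[i]) << (8 * i);
  }
  double value = 0.0;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}
```

Because no arithmetic is performed on the value, `decode_f64(encode_f64(x)) == x` holds exactly for any non-NaN `x`; a range check is only needed when the format goes through text or a narrower type.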
// LambdaPatternNode Tests
void test_lambda_pattern_simple() {
std::cout << "> LambdaPatternNode simple ({ a, b }: a + b)..." << std::endl;
// Body: a + b (using VarNode for a and b)
auto var_a = std::make_shared<Node>(VarNode(0, "a"));
auto var_b = std::make_shared<Node>(VarNode(0, "b"));
auto body = std::make_shared<Node>(BinaryOpNode(BinaryOp::ADD, var_a, var_b));
// Create lambda pattern with two required fields
LambdaPatternNode lambda_pattern(body);
lambda_pattern.required_fields.emplace_back("a", std::nullopt);
lambda_pattern.required_fields.emplace_back("b", std::nullopt);
lambda_pattern.allow_extra = false;
auto node = std::make_shared<Node>(std::move(lambda_pattern));
// Serialize
IRModule module;
module.entry = node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
// Deserialize
Deserializer deser;
auto loaded = deser.deserialize(bytes);
// Verify
auto* loaded_node = loaded.entry->get_if<LambdaPatternNode>();
TEST_CHECK(loaded_node != nullptr, "Type is LambdaPatternNode");
TEST_CHECK(loaded_node && loaded_node->required_fields.size() == 2, "Has 2 required fields");
TEST_CHECK(loaded_node && loaded_node->optional_fields.size() == 0, "Has 0 optional fields");
TEST_CHECK(loaded_node && loaded_node->required_fields[0].name == "a", "First field is 'a'");
TEST_CHECK(loaded_node && loaded_node->required_fields[1].name == "b", "Second field is 'b'");
TEST_CHECK(loaded_node && !loaded_node->at_binding.has_value(), "No at-binding");
TEST_CHECK(loaded_node && !loaded_node->allow_extra, "No ellipsis");
TEST_CHECK(loaded_node && loaded_node->body != nullptr, "Has body");
}
void test_lambda_pattern_with_defaults() {
std::cout << "> LambdaPatternNode with defaults ({ a, b ? 10 }: a + b)..." << std::endl;
// Default value for b
auto default_b = std::make_shared<Node>(ConstIntNode(10));
// Body: a + b
auto var_a = std::make_shared<Node>(VarNode(0, "a"));
auto var_b = std::make_shared<Node>(VarNode(0, "b"));
auto body = std::make_shared<Node>(BinaryOpNode(BinaryOp::ADD, var_a, var_b));
// Create lambda pattern
LambdaPatternNode lambda_pattern(body);
lambda_pattern.required_fields.emplace_back("a", std::nullopt);
lambda_pattern.optional_fields.emplace_back("b", default_b);
lambda_pattern.allow_extra = false;
auto node = std::make_shared<Node>(std::move(lambda_pattern));
// Serialize
IRModule module;
module.entry = node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
// Deserialize
Deserializer deser;
auto loaded = deser.deserialize(bytes);
// Verify
auto* loaded_node = loaded.entry->get_if<LambdaPatternNode>();
TEST_CHECK(loaded_node != nullptr, "Type is LambdaPatternNode");
TEST_CHECK(loaded_node && loaded_node->required_fields.size() == 1, "Has 1 required field");
TEST_CHECK(loaded_node && loaded_node->optional_fields.size() == 1, "Has 1 optional field");
TEST_CHECK(loaded_node && loaded_node->required_fields[0].name == "a", "Required field is 'a'");
TEST_CHECK(loaded_node && loaded_node->optional_fields[0].name == "b", "Optional field is 'b'");
TEST_CHECK(loaded_node && loaded_node->optional_fields[0].default_value.has_value(),
"Optional field has default");
if (loaded_node && loaded_node->optional_fields[0].default_value) {
auto* def_val = (*loaded_node->optional_fields[0].default_value)->get_if<ConstIntNode>();
TEST_CHECK(def_val && def_val->value == 10, "Default value is 10");
}
}
void test_lambda_pattern_at_binding() {
std::cout << "> LambdaPatternNode with at-binding (args@{ a, b }: args.a)..." << std::endl;
// Body: args.a (select expression)
auto var_args = std::make_shared<Node>(VarNode(0, "args"));
auto attr = std::make_shared<Node>(ConstStringNode("a"));
auto body = std::make_shared<Node>(SelectNode(var_args, attr));
// Create lambda pattern with at-binding
LambdaPatternNode lambda_pattern(body);
lambda_pattern.required_fields.emplace_back("a", std::nullopt);
lambda_pattern.required_fields.emplace_back("b", std::nullopt);
lambda_pattern.at_binding = "args";
lambda_pattern.allow_extra = false;
auto node = std::make_shared<Node>(std::move(lambda_pattern));
// Serialize
IRModule module;
module.entry = node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
// Deserialize
Deserializer deser;
auto loaded = deser.deserialize(bytes);
// Verify
auto* loaded_node = loaded.entry->get_if<LambdaPatternNode>();
TEST_CHECK(loaded_node != nullptr, "Type is LambdaPatternNode");
TEST_CHECK(loaded_node && loaded_node->at_binding.has_value(), "Has at-binding");
TEST_CHECK(loaded_node && loaded_node->at_binding.value() == "args", "At-binding is 'args'");
}
void test_lambda_pattern_ellipsis() {
std::cout << "> LambdaPatternNode with ellipsis ({ a, ... }: a)..." << std::endl;
// Body: a
auto body = std::make_shared<Node>(VarNode(0, "a"));
// Create lambda pattern with ellipsis
LambdaPatternNode lambda_pattern(body);
lambda_pattern.required_fields.emplace_back("a", std::nullopt);
lambda_pattern.allow_extra = true;
auto node = std::make_shared<Node>(std::move(lambda_pattern));
// Serialize
IRModule module;
module.entry = node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
// Deserialize
Deserializer deser;
auto loaded = deser.deserialize(bytes);
// Verify
auto* loaded_node = loaded.entry->get_if<LambdaPatternNode>();
TEST_CHECK(loaded_node != nullptr, "Type is LambdaPatternNode");
TEST_CHECK(loaded_node && loaded_node->allow_extra, "Has ellipsis (allow_extra=true)");
}
void test_lambda_pattern_complete() {
std::cout << "> LambdaPatternNode complete (args@{ a, b ? 5, ... }: body)..." << std::endl;
// Default value for b
auto default_b = std::make_shared<Node>(ConstIntNode(5));
// Body: simple var
auto body = std::make_shared<Node>(VarNode(0, "x"));
// Create lambda pattern with all features
LambdaPatternNode lambda_pattern(body);
lambda_pattern.required_fields.emplace_back("a", std::nullopt);
lambda_pattern.optional_fields.emplace_back("b", default_b);
lambda_pattern.at_binding = "args";
lambda_pattern.allow_extra = true;
auto node = std::make_shared<Node>(std::move(lambda_pattern));
// Serialize
IRModule module;
module.entry = node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
// Deserialize
Deserializer deser;
auto loaded = deser.deserialize(bytes);
// Verify all fields
auto* loaded_node = loaded.entry->get_if<LambdaPatternNode>();
TEST_CHECK(loaded_node != nullptr, "Type is LambdaPatternNode");
TEST_CHECK(loaded_node && loaded_node->required_fields.size() == 1, "Has 1 required field");
TEST_CHECK(loaded_node && loaded_node->optional_fields.size() == 1, "Has 1 optional field");
TEST_CHECK(loaded_node && loaded_node->at_binding.has_value(), "Has at-binding");
TEST_CHECK(loaded_node && loaded_node->at_binding.value() == "args", "At-binding is 'args'");
TEST_CHECK(loaded_node && loaded_node->allow_extra, "Has ellipsis");
}
void test_lambda_pattern_empty() {
std::cout << "> LambdaPatternNode empty ({ }: body)..." << std::endl;
// Body: simple constant
auto body = std::make_shared<Node>(ConstIntNode(42));
// Create empty lambda pattern
LambdaPatternNode lambda_pattern(body);
lambda_pattern.allow_extra = false;
auto node = std::make_shared<Node>(std::move(lambda_pattern));
// Serialize
IRModule module;
module.entry = node;
Serializer ser;
auto bytes = ser.serialize_to_bytes(module);
// Deserialize
Deserializer deser;
auto loaded = deser.deserialize(bytes);
// Verify
auto* loaded_node = loaded.entry->get_if<LambdaPatternNode>();
TEST_CHECK(loaded_node != nullptr, "Type is LambdaPatternNode");
TEST_CHECK(loaded_node && loaded_node->required_fields.size() == 0, "Has 0 required fields");
TEST_CHECK(loaded_node && loaded_node->optional_fields.size() == 0, "Has 0 optional fields");
TEST_CHECK(loaded_node && !loaded_node->at_binding.has_value(), "No at-binding");
TEST_CHECK(loaded_node && !loaded_node->allow_extra, "No ellipsis");
}
// StringInterpolationNode Tests
void test_string_interpolation_simple() {
  std::cout << "> StringInterpolationNode simple (\"hello ${name}\")..." << std::endl;

  // "hello ${name}" = literal "hello " + expr(name)
  std::vector<StringPart> parts;
  parts.push_back(StringPart::make_literal("hello "));
  parts.push_back(StringPart::make_expr(std::make_shared<Node>(VarNode(0, "name"))));
  auto node = std::make_shared<Node>(StringInterpolationNode(std::move(parts)));

  // Serialize
  IRModule module;
  module.entry = node;
  Serializer ser;
  auto bytes = ser.serialize_to_bytes(module);

  // Deserialize
  Deserializer deser;
  auto loaded = deser.deserialize(bytes);

  // Verify
  auto* loaded_node = loaded.entry->get_if<StringInterpolationNode>();
  TEST_CHECK(loaded_node != nullptr, "Type is StringInterpolationNode");
  TEST_CHECK(loaded_node && loaded_node->parts.size() == 2, "Has 2 parts");
  TEST_CHECK(loaded_node && loaded_node->parts[0].type == StringPart::Type::LITERAL,
             "First part is LITERAL");
  TEST_CHECK(loaded_node && loaded_node->parts[0].literal == "hello ", "First part is 'hello '");
  TEST_CHECK(loaded_node && loaded_node->parts[1].type == StringPart::Type::EXPR,
             "Second part is EXPR");
  TEST_CHECK(loaded_node && loaded_node->parts[1].expr != nullptr, "Second part has expression");
}
void test_string_interpolation_multiple() {
  std::cout << "> StringInterpolationNode multiple (\"${a} and ${b}\")..." << std::endl;

  // "${a} and ${b}" = expr(a) + literal " and " + expr(b)
  std::vector<StringPart> parts;
  parts.push_back(StringPart::make_expr(std::make_shared<Node>(VarNode(0, "a"))));
  parts.push_back(StringPart::make_literal(" and "));
  parts.push_back(StringPart::make_expr(std::make_shared<Node>(VarNode(0, "b"))));
  auto node = std::make_shared<Node>(StringInterpolationNode(std::move(parts)));

  // Serialize
  IRModule module;
  module.entry = node;
  Serializer ser;
  auto bytes = ser.serialize_to_bytes(module);

  // Deserialize
  Deserializer deser;
  auto loaded = deser.deserialize(bytes);

  // Verify
  auto* loaded_node = loaded.entry->get_if<StringInterpolationNode>();
  TEST_CHECK(loaded_node != nullptr, "Type is StringInterpolationNode");
  TEST_CHECK(loaded_node && loaded_node->parts.size() == 3, "Has 3 parts");
  TEST_CHECK(loaded_node && loaded_node->parts[0].type == StringPart::Type::EXPR, "Part 0 is EXPR");
  TEST_CHECK(loaded_node && loaded_node->parts[1].type == StringPart::Type::LITERAL,
             "Part 1 is LITERAL");
  TEST_CHECK(loaded_node && loaded_node->parts[1].literal == " and ", "Part 1 is ' and '");
  TEST_CHECK(loaded_node && loaded_node->parts[2].type == StringPart::Type::EXPR, "Part 2 is EXPR");
}
void test_string_interpolation_complex() {
  std::cout << "> StringInterpolationNode complex (\"result: ${a + b}\")..." << std::endl;

  // "result: ${a + b}" = literal "result: " + expr(a + b)
  auto expr_a = std::make_shared<Node>(VarNode(0, "a"));
  auto expr_b = std::make_shared<Node>(VarNode(0, "b"));
  auto add_expr = std::make_shared<Node>(BinaryOpNode(BinaryOp::ADD, expr_a, expr_b));
  std::vector<StringPart> parts;
  parts.push_back(StringPart::make_literal("result: "));
  parts.push_back(StringPart::make_expr(add_expr));
  auto node = std::make_shared<Node>(StringInterpolationNode(std::move(parts)));

  // Serialize
  IRModule module;
  module.entry = node;
  Serializer ser;
  auto bytes = ser.serialize_to_bytes(module);

  // Deserialize
  Deserializer deser;
  auto loaded = deser.deserialize(bytes);

  // Verify
  auto* loaded_node = loaded.entry->get_if<StringInterpolationNode>();
  TEST_CHECK(loaded_node != nullptr, "Type is StringInterpolationNode");
  TEST_CHECK(loaded_node && loaded_node->parts.size() == 2, "Has 2 parts");
  TEST_CHECK(loaded_node && loaded_node->parts[1].type == StringPart::Type::EXPR, "Part 1 is EXPR");

  // Verify the expression is a BinaryOpNode
  if (loaded_node && loaded_node->parts[1].expr) {
    auto* bin_op = loaded_node->parts[1].expr->get_if<BinaryOpNode>();
    TEST_CHECK(bin_op != nullptr, "Expression is BinaryOpNode");
    TEST_CHECK(bin_op && bin_op->op == BinaryOp::ADD, "Operation is ADD");
  }
}
void test_string_interpolation_nested() {
  std::cout << "> StringInterpolationNode nested (\"${prefix}/${path}\")..." << std::endl;

  // "${prefix}/${path}" = expr(prefix) + literal "/" + expr(path)
  std::vector<StringPart> parts;
  parts.push_back(StringPart::make_expr(std::make_shared<Node>(VarNode(0, "prefix"))));
  parts.push_back(StringPart::make_literal("/"));
  parts.push_back(StringPart::make_expr(std::make_shared<Node>(VarNode(0, "path"))));
  auto node = std::make_shared<Node>(StringInterpolationNode(std::move(parts)));

  // Serialize
  IRModule module;
  module.entry = node;
  Serializer ser;
  auto bytes = ser.serialize_to_bytes(module);

  // Deserialize
  Deserializer deser;
  auto loaded = deser.deserialize(bytes);

  // Verify
  auto* loaded_node = loaded.entry->get_if<StringInterpolationNode>();
  TEST_CHECK(loaded_node != nullptr, "Type is StringInterpolationNode");
  TEST_CHECK(loaded_node && loaded_node->parts.size() == 3, "Has 3 parts");
  TEST_CHECK(loaded_node && loaded_node->parts[1].type == StringPart::Type::LITERAL,
             "Middle part is LITERAL");
  TEST_CHECK(loaded_node && loaded_node->parts[1].literal == "/", "Middle part is '/'");
}
int main() {
  std::cout << "=== Regression Tests ===" << std::endl << std::endl;
  test_enum_compatibility();
  std::cout << std::endl;
@@ -178,6 +748,69 @@ int main() {
  test_parser_expect_in_speculative_parsing();
  std::cout << std::endl;
  test_implication_right_associativity();
  std::cout << std::endl;
  test_lookup_path_lexer_position();
  std::cout << std::endl;
  test_unterminated_block_comment_rejected();
  std::cout << std::endl;
  test_unknown_character_rejected();
  std::cout << std::endl;
  test_lookup_path_node();
  std::cout << std::endl;
  test_import_node();
  std::cout << std::endl;
  test_import_with_lookup_path();
  std::cout << std::endl;
  test_relative_path_import_parsing();
  std::cout << std::endl;
  test_builtin_call_node();
  std::cout << std::endl;
  test_uri_node();
  std::cout << std::endl;
  test_float_node();
  std::cout << std::endl;
  test_lambda_pattern_simple();
  std::cout << std::endl;
  test_lambda_pattern_with_defaults();
  std::cout << std::endl;
  test_lambda_pattern_at_binding();
  std::cout << std::endl;
  test_lambda_pattern_ellipsis();
  std::cout << std::endl;
  test_lambda_pattern_complete();
  std::cout << std::endl;
  test_lambda_pattern_empty();
  std::cout << std::endl;
  test_string_interpolation_simple();
  std::cout << std::endl;
  test_string_interpolation_multiple();
  std::cout << std::endl;
  test_string_interpolation_complex();
  std::cout << std::endl;
  test_string_interpolation_nested();
  std::cout << std::endl;
  std::cout << "=== Tests Complete ===" << std::endl;
  std::cout << "Failures: " << failures << std::endl;
  return failures > 0 ? 1 : 0;

@@ -1,11 +0,0 @@
# Test short-circuit evaluation
let
  alwaysFalse = false;
  alwaysTrue = true;
  x = 10;
in
{
  and_false = alwaysFalse && alwaysTrue;
  or_true = alwaysTrue || alwaysFalse;
  impl_false = alwaysFalse -> alwaysFalse;
}
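
The removed fixture above depends on `&&`, `||`, and `->` skipping their right operand once the left operand already decides the result. A minimal C++ sketch of that behaviour follows. `Thunk`, `lazy_and`, `lazy_or`, `lazy_impl`, and `evaluates_rhs` are hypothetical names introduced for illustration only; they are not taken from the project's evaluator.

```cpp
#include <cassert>     // for the checks shown in the usage note
#include <functional>

// Hypothetical: a delayed right-hand operand. It only runs when called.
using Thunk = std::function<bool()>;

// a && b: the right operand is needed only when the left one is true.
bool lazy_and(bool lhs, const Thunk& rhs) { return lhs ? rhs() : false; }

// a || b: the right operand is needed only when the left one is false.
bool lazy_or(bool lhs, const Thunk& rhs) { return lhs ? true : rhs(); }

// a -> b, read as !a || b: a false left operand decides the result alone.
bool lazy_impl(bool lhs, const Thunk& rhs) { return lhs ? rhs() : true; }

// Reports whether `op` evaluated its right operand for the given left value.
template <typename Op>
bool evaluates_rhs(Op op, bool lhs) {
  bool touched = false;
  op(lhs, [&touched] {
    touched = true;
    return true;
  });
  return touched;
}
```

Under these assumptions, `assert(!evaluates_rhs(lazy_impl, false))` holds, and `lazy_impl(false, [] { return false; })` yields `true`, which is the value the fixture's `impl_false` attribute describes. Nesting a second `lazy_impl` call inside the thunk models the grouping `a -> (b -> c)`, the property that `test_implication_right_associativity` is named after.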

Binary file not shown.

@@ -1 +0,0 @@
1 + 2

Binary file not shown.

@@ -1,19 +0,0 @@
# Test string interpolation
let
  name = "world";
  x = 42;
  bool_val = true;
in
{
  # Simple interpolation
  greeting = "Hello ${name}!";

  # Multiple interpolations
  multi = "x is ${x} and name is ${name}";

  # Nested expression
  nested = "Result: ${if bool_val then "yes" else "no"}";

  # Just a string (no interpolation)
  plain = "plain text";
}
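
The regression tests in this diff describe interpolated strings as an ordered list of LITERAL and EXPR parts, for example `"hello ${name}"` as a literal followed by an expression. As a rough illustration of that decomposition, the sketch below splits the contents of a string into such parts. `Part` and `split_interpolation` are hypothetical stand-ins, not the project's `StringPart` or its lexer, and the sketch assumes no escapes and no braces inside nested string literals.

```cpp
#include <cassert>     // for the checks shown in the usage note
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for an IR string part; not the project's type.
struct Part {
  enum class Kind { Literal, Expr };
  Kind kind;
  std::string text;  // literal text, or the raw source between "${" and "}"
};

// Splits the contents of a string (without its surrounding quotes) into parts.
// Assumption: only brace depth is tracked, so escapes and braces that appear
// inside nested string literals are not handled.
std::vector<Part> split_interpolation(const std::string& src) {
  std::vector<Part> parts;
  std::string literal;
  std::size_t i = 0;
  while (i < src.size()) {
    const bool opens = src[i] == '$' && i + 1 < src.size() && src[i + 1] == '{';
    if (!opens) {
      literal += src[i++];
      continue;
    }
    if (!literal.empty()) {  // flush the literal collected so far
      parts.push_back({Part::Kind::Literal, literal});
      literal.clear();
    }
    i += 2;  // skip "${"
    std::string expr;
    int depth = 1;
    while (i < src.size()) {
      if (src[i] == '{') {
        ++depth;
      } else if (src[i] == '}' && --depth == 0) {
        break;  // found the brace that closes this interpolation
      }
      expr += src[i++];
    }
    if (i == src.size()) throw std::runtime_error("unterminated interpolation");
    ++i;  // skip the closing '}'
    parts.push_back({Part::Kind::Expr, expr});
  }
  if (!literal.empty()) parts.push_back({Part::Kind::Literal, literal});
  return parts;
}
```

With this sketch, `split_interpolation("hello ${name}")` produces two parts (a literal `hello ` and an expression `name`), and `split_interpolation("${a} and ${b}")` produces three, mirroring the part counts asserted in `test_string_interpolation_simple` and `test_string_interpolation_multiple`. A real lexer would additionally need to handle escapes and strings nested inside the interpolated expression.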

Binary file not shown.