NIP: Formats and Core Concepts
Version: 1.2 Date: 2025-07-16
This document specifies the core data formats, storage architecture, and fundamental concepts for the Nexus Installation Program (nip). It merges the initial project plan with a content-addressed, Merkle-based architecture for maximum efficiency, verifiability, and reproducibility.
1. Core Principles
- Content-Addressable: All data is stored based on its content hash, providing automatic deduplication.
- Cryptographically Verifiable: The entire system state can be verified with a single cryptographic hash.
- Immutable & Atomic: Installations and updates are atomic operations, ensuring system consistency.
- Declarative: The system state is defined by declarative manifest files.
1.1. Trust and Authenticity
To ensure not just integrity but also authenticity, nip incorporates a trust layer based on Ed25519 signatures.
- Manifest Signatures: Each .npk manifest can be signed using Ed25519 keys (e.g., OpenSSH keys). Signatures can be detached or inline within the manifest. This allows verification of the package's origin. Multiple signatures (e.g., personal, CI, Foundation) are supported.
- Root-of-Trust for nip.lock: The nip.lock file, representing a complete system generation, can be signed. A single signature over the lockfile transforms the Merkle root into a tamper-evident release artifact.
- Key Management: Support for keyid, created, and expires metadata for keys facilitates revocation and rotation without requiring every package to be rebuilt. This lays the groundwork for future TUF-style metadata integration.
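The key metadata described above can be checked independently of any package contents, which is what makes rotation possible without rebuilds. The following is a minimal sketch, not nip's implementation: the KeyMeta record, the revoked set, and the rule that a missing expires value means "no expiry" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class KeyMeta:
    """Hypothetical record holding the keyid/created/expires metadata."""
    keyid: str
    created: datetime
    expires: Optional[datetime]  # assumption: None means the key never expires


def key_is_valid(meta: KeyMeta, at: datetime,
                 revoked: FrozenSet[str] = frozenset()) -> bool:
    """Return True if the key may be trusted at time `at`."""
    if meta.keyid in revoked:
        return False  # explicitly revoked
    if at < meta.created:
        return False  # signature claims to predate the key
    return meta.expires is None or at < meta.expires


ci_key = KeyMeta(
    keyid="ci-2025",
    created=datetime(2025, 1, 1, tzinfo=timezone.utc),
    expires=datetime(2026, 1, 1, tzinfo=timezone.utc),
)
```

Because the check depends only on key metadata and the signing time, retiring ci-2025 would invalidate new signatures without touching any manifest hash.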
2. Hashing Algorithms
- Cryptographic Hashing: The default hash algorithm is BLAKE2b-512 until BLAKE3 becomes available in Nimble. The digest is encoded as Multihash (varint <code><len><digest>) to ensure future-proofing, allowing for easy transitions to other algorithms like BLAKE3, SHA-512 or KangarooTwelve without redesigning the CAS.
- Non-Cryptographic Hashing: SipHash is recommended for internal data structures.
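The Multihash layout can be made concrete with a short sketch. It uses Python's hashlib for BLAKE2b-512 and the multicodec code 0xb240 registered for blake2b-512. The function names are illustrative and not part of nip.

```python
import hashlib

BLAKE2B_512_CODE = 0xB240  # multicodec table entry for blake2b-512


def encode_varint(n: int) -> bytes:
    """Unsigned LEB128 varint, as used by the multiformats specs."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def multihash_blake2b_512(data: bytes) -> bytes:
    """Return varint(code) + varint(len) + digest for the given content."""
    digest = hashlib.blake2b(data, digest_size=64).digest()
    return (encode_varint(BLAKE2B_512_CODE)
            + encode_varint(len(digest))
            + digest)


print(multihash_blake2b_512(b"hello").hex()[:8])  # prefix: code + length
```

A reader of such an identifier only needs the leading varints to learn which algorithm and digest length follow, so switching the default algorithm later changes the prefix but not the format.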
3. Storage Architecture
3.1. The Content-Addressable Store (CAS)
The CAS is the canonical source of all file data.
- Locations: ~/.nip/cas/ (user) and /var/lib/nip/cas/ (system).
- Compression: To conserve disk space, objects are stored compressed by default using zstd. However, the canonical hash of an object is always the hash of its uncompressed content using the configured algorithm (BLAKE2b-512 by default), so integrity is always verified against the true data. This behavior can be configured in nip.conf (e.g., cas.compress = true, cas.compression_level = 19).
- Structure: Objects are stored by their multihash (hex-encoded), sharded by the first two hex characters (e.g., cas/ab/cdef1234...). For large fleets, sharding can extend to more levels (e.g., cas/ab/cd/efgh... for 4-level fan-out after 16k objects).
- Garbage Collection: A garbage collector (nip gc) reclaims space with a mark-and-sweep pass: it scans every reachable manifest hash in all live nip.lock files (system + user cells), marks the CAS objects reachable via those manifests, and then deletes the unmarked blobs. Optionally, "pin sets" (named live roots, à la Docker) can be added to prevent specific objects from being collected.
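The storage rules above (hash the uncompressed bytes, shard by the first two hex characters, delete whatever is not marked live) can be sketched in a few functions. This is an illustration under stated assumptions, not nip's code: zlib stands in for zstd because it ships with Python, the object ID is the bare hex digest rather than a full multihash, and the live set is passed in directly instead of being derived from nip.lock files.

```python
import hashlib
import zlib
from pathlib import Path
from typing import Iterable, List


def object_id(data: bytes) -> str:
    """Hex BLAKE2b-512 digest of the *uncompressed* content."""
    return hashlib.blake2b(data, digest_size=64).hexdigest()


def object_path(root: Path, oid: str) -> Path:
    """Two-character shard directory, e.g. cas/ab/cdef1234..."""
    return root / oid[:2] / oid[2:]


def put(root: Path, data: bytes) -> str:
    """Store data compressed; identical content is written only once."""
    oid = object_id(data)
    path = object_path(root, oid)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(data))  # stand-in for zstd
    return oid


def get(root: Path, oid: str) -> bytes:
    """Decompress and verify against the hash of the true data."""
    data = zlib.decompress(object_path(root, oid).read_bytes())
    if object_id(data) != oid:
        raise ValueError("CAS object %s is corrupt" % oid[:12])
    return data


def gc(root: Path, live: Iterable[str]) -> List[str]:
    """Sweep phase: delete every object whose ID was not marked live."""
    marked = set(live)
    removed = []
    for path in list(root.glob("??/*")):
        oid = path.parent.name + path.name
        if oid not in marked:
            path.unlink()
            removed.append(oid)
    return sorted(removed)
```

Hashing before compression means the compression level, or even the compressor, can change without invalidating any object ID.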
3.2. The Manifest Store
- Locations: ~/.nip/manifests/ (user) and /var/lib/nip/manifests/ (system).
- Structure: Stores .npk manifest files, whose own BLAKE3 hashes serve as their unique IDs.
4. The .npk Manifest Format
A single, self-contained KDL document. Its BLAKE3 hash is the package's unique identifier.
4.1. KDL Schema for .npk
package "htop" {
    version "3.3.0"
    description "Interactive process viewer"
    channels { stable; testing; } // Lets one manifest live in multiple Streams without duplication.
    source "pacman" { /* ... */ }
    dependencies { /* ... */ }
    build {
        system "x86_64-linux"
        compiler "nim-2.2.4"
        env_hash "blake3-d34db33f..." // Stores the deterministic build fingerprint, needed for exact rebuilds and `nip verify --rebuild`.
    }
    snapshots {
        created "2025-07-16T20:00:00Z" // Easy human audit; ISO 8601 timestamp.
    }
    files {
        file "/Programs/Htop/3.3.0/bin/htop" "blake3-f4e5d6..." "755"
        file "/Programs/Htop/3.3.0/share/man/man1/htop.1.gz" "blake3-a9b8c7..." "644"
    }
    artifacts { /* ... */ }
    services {
        systemd "htop.service" "blake3-unit..." // For packages that ship systemd units.
    }
    signatures {
        // Ed25519 signatures on each .npk manifest (detached, or inline `signature "ed25519" "<base64>"`).
        // Supports multiple keys (personal, CI, Foundation).
        // Record `keyid`, `created`, `expires`.
    }
}
5. The System Lockfile: nip.lock
The System Generation Manifest, defining the complete state of installed packages.
5.1. KDL Schema for nip.lock
lockfile_version 1.2
generation {
    id "blake3-d34db33f..." // The hash of this file.
    created "2025-07-16T20:05:17Z" // ISO 8601 timestamp.
    previous "blake3-abcdef..." // Hash of the previous generation's lockfile, forming a hash-chained log.
}
packages {
    package "htop-3.3.0.npk" "blake3-htophash..."
    package "ncurses-6.4.npk" "blake3-ncurseshash..."
}
signature "ed25519" "<base64>" // Root-of-trust for `nip.lock` (system generation), e.g. via `nip sign lock --key ~/.ssh/nip_ed25519`.
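The previous field turns the sequence of lockfiles into a hash-chained log, which a verifier can walk from the oldest generation forward. The sketch below is a rough model only. It assumes the generation ID is computed over the previous ID and the package list with the id field itself excluded (a file cannot contain its own hash otherwise), it uses BLAKE2b from hashlib in place of BLAKE3, and it represents each generation as a plain dict rather than parsed KDL.

```python
import hashlib
from typing import Dict, List, Optional


def generation_id(previous: Optional[str], packages: Dict[str, str]) -> str:
    """Hash of the generation's content; the id field is not an input."""
    h = hashlib.blake2b(digest_size=64)
    h.update((previous or "").encode())
    for name in sorted(packages):  # stable order gives a stable hash
        h.update(b"\0" + name.encode() + b"\0" + packages[name].encode())
    return "blake2b-" + h.hexdigest()


def make_generation(previous: Optional[str], packages: Dict[str, str]) -> dict:
    """Build a generation record with its ID filled in."""
    return {
        "id": generation_id(previous, packages),
        "previous": previous,
        "packages": dict(packages),
    }


def verify_chain(generations: List[dict]) -> bool:
    """Check IDs and back-links for generations ordered oldest first."""
    expected_previous = None
    for gen in generations:
        if gen["previous"] != expected_previous:
            return False  # broken link in the log
        if gen["id"] != generation_id(gen["previous"], gen["packages"]):
            return False  # content no longer matches its ID
        expected_previous = gen["id"]
    return True


gen1 = make_generation(None, {"htop-3.3.0.npk": "hash-of-htop-manifest"})
gen2 = make_generation(gen1["id"], {"htop-3.3.0.npk": "hash-of-htop-manifest",
                                    "ncurses-6.4.npk": "hash-of-ncurses-manifest"})
print(verify_chain([gen1, gen2]))
```

Under this model, a signature over the newest lockfile transitively covers every earlier generation, because altering any older entry changes every ID after it.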
6. The Installation Filesystem
6.1. GoboLinux-style Hierarchy (/Programs)
A human-readable hierarchy of symlinks pointing to the CAS, providing a view of an immutable backend.
6.2. PATH Management via Active Index
To expose executables to the user's shell, nip uses an "Active Index" directory. This is a single, stable location the user adds to their PATH.
- System-wide: /System/Index/bin
- User-specific: ~/.nip/profile/bin
When a new generation is activated via nip switch, nip atomically repopulates this directory with symlinks to the executables of the new generation. This provides fast shell startup and race-free activation.
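One common way to obtain race-free activation on POSIX systems is to make the stable PATH entry a symlink to a per-generation directory and swap that symlink with a single rename. The sketch below shows that technique. It is an assumption about how activation could be implemented, not a description of what nip switch does, and the function and path names are hypothetical.

```python
import os
from pathlib import Path


def activate(index_link: Path, generation_bin: Path) -> None:
    """Atomically repoint index_link (a symlink) at generation_bin."""
    tmp = index_link.with_name("%s.tmp-%d" % (index_link.name, os.getpid()))
    if tmp.is_symlink():
        tmp.unlink()  # leftover from an interrupted earlier run
    os.symlink(generation_bin, tmp)
    # rename(2) replaces the old symlink in one step, so a shell resolving
    # PATH sees either the old generation or the new one, never a mix.
    os.replace(tmp, index_link)
```

The swap only works if index_link is itself a symlink (or does not exist yet). Replacing a real directory this way fails, which is why this layout keeps each generation's bin directory elsewhere.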
7. Cross-Platform Compatibility & Security
7.1. Path Separators
- Manifests must use POSIX forward slashes (/) for all paths. This is the canonical format.
- The nip client is responsible for translating paths to the native format (e.g., \ on Windows) at runtime.
- Manifests containing backslashes will be rejected by nip verify.
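The separator rules can be illustrated with Python's pure path classes, which manipulate paths without touching the filesystem. This is a sketch only: the function names and the C:\nip install root are hypothetical, and the requirement that manifest paths be absolute is inferred from the examples in this document.

```python
from pathlib import PurePosixPath, PureWindowsPath


def check_manifest_path(path: str) -> PurePosixPath:
    """Accept only canonical POSIX-style absolute paths."""
    if "\\" in path:
        raise ValueError("backslash in manifest path: %r" % path)
    posix = PurePosixPath(path)
    if not posix.is_absolute():
        raise ValueError("manifest path must be absolute: %r" % path)
    return posix


def to_windows(path: PurePosixPath, root: str = "C:\\nip") -> PureWindowsPath:
    """Translate a canonical path to a native Windows path under root."""
    return PureWindowsPath(root, *path.relative_to("/").parts)


canonical = check_manifest_path("/Programs/Htop/3.3.0/bin/htop")
print(to_windows(canonical))
```

Keeping the manifest format fixed and translating only at the client boundary means the manifest hash is the same on every platform.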
7.2. Symlink Security & Hardening
- Only relative symlinks are created, and each one is verified before writing to prevent filesystem escapes.
- Manifests attempting path traversals (e.g., ../../etc/passwd) are rejected during verification.
- Optionally, /Programs can be mounted as noexec,nodev and rely on a "programs overlay" bind mount that flips execute bits only for whitelisted directories, enhancing security.
- On older Windows versions, nip will fall back to using junctions or hard-links if developer-mode symlinks are unavailable.
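The traversal check and the relative-symlink rule can be sketched with posixpath, which performs purely lexical path arithmetic. The single root directory containing both the link and its target is an assumption made to keep the example small, and the function names are illustrative.

```python
import posixpath


def reject_traversal(path: str) -> str:
    """Refuse any manifest path containing a '..' segment."""
    if ".." in path.split("/"):
        raise ValueError("path traversal in manifest: %r" % path)
    return path


def relative_link_target(link_path: str, target_path: str, root: str) -> str:
    """Compute a relative symlink target, verifying nothing escapes root.

    root must be an absolute, normalized path without a trailing slash.
    """
    for candidate in (link_path, target_path):
        normalized = posixpath.normpath(candidate)
        if posixpath.commonpath([root, normalized]) != root:
            raise ValueError("%r escapes %r" % (candidate, root))
    return posixpath.relpath(target_path,
                             start=posixpath.dirname(link_path))


print(relative_link_target("/nip/Programs/Htop/3.3.0/bin/htop",
                           "/nip/cas/ab/cdef", "/nip"))
```

Because the check runs on normalized paths before anything is written, a hostile manifest is rejected without ever creating a link.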
8. Remote Operations & Caching
nip supports fetching missing objects from remote binary caches (e.g., a static HTTP server or S3 bucket). Since objects are content-addressed, a remote cache is a simple key-value store, mirroring Nix's binary cache feature.
- nip remote add <name> <url>: Adds a remote cache (e.g., nip remote add origin https://cache.nexushub.io). Missing objects/manifests can then be fetched via HTTP range GET.
- nip remote push <remote> <package.npk>: Uploads missing CAS blobs + manifest to the remote cache, returning content URIs.
- nip remote serve --path /var/lib/nip: Starts a read-only cache server, trivial for air-gapped labs.
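Because objects are content-addressed, a client does not need to trust the cache itself. It can recompute the hash of whatever it receives. The sketch below illustrates that check with the transport abstracted behind a callable. The fetch_verified name, the dict-backed fake remote, and the assumption that the remote returns uncompressed bytes keyed by a bare hex digest are all illustrative.

```python
import hashlib
from typing import Callable, Dict


def fetch_verified(fetch: Callable[[str], bytes], oid: str) -> bytes:
    """Fetch an object by ID and refuse it unless its hash matches."""
    data = fetch(oid)
    actual = hashlib.blake2b(data, digest_size=64).hexdigest()
    if actual != oid:
        raise ValueError("remote returned wrong content for %s" % oid[:12])
    return data


# A dict stands in for the remote key-value store; a real client would
# issue an HTTP GET against the cache URL here instead.
payload = b"#!/bin/sh\necho htop\n"
good_id = hashlib.blake2b(payload, digest_size=64).hexdigest()
remote: Dict[str, bytes] = {good_id: payload}

print(fetch_verified(remote.__getitem__, good_id) == payload)
```

The same check makes plain static hosting sufficient, since a tampered or truncated response simply fails verification.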
9. Advanced CAS Concepts: Delta & Chunk-level Deduplication (Phase 2)
To further optimize bandwidth and storage, nip is designed to support a future phase of delta and chunk-level deduplication.
- Fixed-size "Merkle-chunk" layer: Large binaries often compress poorly across versions, but chunk hashes can deduplicate a significant portion (e.g., ~90% of same-version-family kernels).
- Implementation: This would involve a files node in the .npk manifest referencing a list of chunk hashes instead of a single file hash, allowing for efficient storage and transfer of only changed chunks.
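The effect of a fixed-size chunk layer can be shown with a small sketch in which a file is represented as a list of chunk hashes and chunks are stored once. The 4-byte chunk size is deliberately tiny so the example is readable, and the in-memory dict stands in for the CAS. Neither reflects a value chosen by nip.

```python
import hashlib
from typing import Dict, List


def chunk_hashes(data: bytes, store: Dict[str, bytes], size: int) -> List[str]:
    """Split data into fixed-size chunks, store each once, return the hashes."""
    hashes = []
    for offset in range(0, len(data), size):
        chunk = data[offset:offset + size]
        digest = hashlib.blake2b(chunk, digest_size=64).hexdigest()
        store.setdefault(digest, chunk)  # already-known chunks cost nothing
        hashes.append(digest)
    return hashes


def reassemble(hashes: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original file from its ordered chunk list."""
    return b"".join(store[h] for h in hashes)


store: Dict[str, bytes] = {}
v1 = chunk_hashes(b"AAAABBBBCCCCDDDD", store, size=4)
v2 = chunk_hashes(b"AAAABBBBXXXXDDDD", store, size=4)
print(len(store))  # 5 unique chunks stored for 8 chunk references
```

Only the chunk that differs between the two versions adds storage, and only that chunk would need to be transferred. Note that fixed-size chunking loses this benefit when bytes are inserted or removed, since every later chunk boundary shifts.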
10. Proposed CLI Tooling
- nip cat <hash>: Dumps a CAS blob to stdout. Great for debugging. Use --raw flag for uncompressed stream.
- nip fsck: Verifies that every symlink in /Programs targets a valid CAS object referenced in some manifest; repairs stray links.
- nip doctor: Runs fsck, gc --dry-run, and prints actionable suggestions for system health.
- nip diff <genA> <genB>: Compares two lockfiles; outputs added/removed/changed manifests (with semantic version bump hints).
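The comparison performed by nip diff reduces to set arithmetic over the two package lists. The sketch below assumes each lockfile has already been parsed into a mapping from package name to manifest hash. That representation, and the LockDiff type, are hypothetical, and version bump hints are left out.

```python
from typing import Dict, List, NamedTuple


class LockDiff(NamedTuple):
    added: List[str]
    removed: List[str]
    changed: List[str]


def diff_lockfiles(gen_a: Dict[str, str], gen_b: Dict[str, str]) -> LockDiff:
    """Compare two generations keyed by package name."""
    names_a, names_b = set(gen_a), set(gen_b)
    return LockDiff(
        added=sorted(names_b - names_a),
        removed=sorted(names_a - names_b),
        # Same name in both, but a different manifest hash.
        changed=sorted(n for n in names_a & names_b if gen_a[n] != gen_b[n]),
    )


gen_a = {"htop": "hash-1", "ncurses": "hash-2", "vim": "hash-3"}
gen_b = {"htop": "hash-9", "ncurses": "hash-2", "nano": "hash-4"}
print(diff_lockfiles(gen_a, gen_b))
```

Because manifest hashes identify content exactly, equal hashes prove a package is unchanged without inspecting any files.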