The Nix language is a purely functional, lazy, statically scoped configuration language that is commonly used to specify software package configurations, OS system configurations, distributed system configurations, and cloud system configurations.
Static scoping means that variables are statically bound; all variable references are resolved based on their scope at declaration time. For example, if we declare a set with recursive bindings,
❄> let a = rec { x = "abc"; x2 = x + "123"; }; in a { x = "abc"; x2 = "abc123"; }then the use of
x
in the definition of x2
is resolved to "abc".
Even if we later update the definition of x
, the definition of x2
will not change.
❄> let a = rec { x = "abc"; x2 = x + "123"; }; in a // { x = "def"; } { x = "def"; x2 = "abc123"; }
Generally speaking, static binding is a good thing.
I find that languages with static scoping are easy to understand because variable references can be followed in the source code.
Static scoping lets Nix be referentially transparent, so, modulo α-renaming, you can always substitute expressions by their (partially) normalized values.
In the previous example, we can replace the expression rec { x = "abc"; x2 = x + "123"; }
with { x = "abc"; x2 = "abc123"; }
and the result of the program does not change.
❄> let a = { x = "abc"; x2 = "abc123"; }; in a // { x = "def"; } { x = "def"; x2 = "abc123"; }
That said, on some occasions it would be nice to have some dynamic binding.
Dynamic binding is used in Nixpkgs to allow users to override libraries with alternate versions in such a way that all other software packages that depend on it pick up the replacement version.
For example, we might have the following in our nixpkgs
declaration
… boost149 = callPackage ../development/libraries/boost/1.49.nix { }; boost155 = callPackage ../development/libraries/boost/1.55.nix { }; boost = boost155; …and perhaps we want to make
boost149
the default boost version to work around some regression.
If we write nixpkgs // { boost = boost149; }
then we only update the boost
field of the nix package collection and none of the packages depending on boost will change.
Instead we need to use the config.packageOverrides
to change boost
in such a way that all expressions depending on boost
are also updated.
Our goal is to understand the technique that packageOverrides
and other similar overrides employ to achieve this sort of dynamic binding in a statically scoped language such as Nix.
This same technique is also used to give semantics to object-oriented languages.
First we need to review normal recursive bindings.
The rec
operator is can almost be defined as a function in Nix itself by taking a fixed point.
Recall that in Nix lambda expressions are written as x: expr
.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; in let a = fix (self: { x = "abc"; x2 = self.x + "123"; }); in a { x = "abc"; x2 = "abc123"; }
The function self: { x = "abc"; x2 = self.x + "123"; }
is an object written in the open recursive style.
By taking the fixpoint of this function, the recursion is closed yielding the desired value.
Written this way, we had to prefix the recursive references to x
with self.
.
However using Nix’s with
operator, we can bring the values of self
into scope allowing us to drop the prefix.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; in let a = fix (self: with self; { x = "abc"; x2 = x + "123"; }); in a { x = "abc"; x2 = "abc123"; }
This is pretty close to a definition of rec
.
We can almost think of rec { bindings }
as syntactic sugar for fix (self: with self; { bindings })
.
In order to override the definition of x
instead up updating it, we need to intercept the definition of x
before the open recursion is closed.
To this end, we write a fixWithOverride
function that takes a set of overriding bindings and an open recursive object and applies the override bindings before closing the recursion.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; withOverride = overrides: f: self: f self // overrides; fixWithOverride = overrides: f: fix (withOverride overrides f); in let open_a = self: with self; { x = "abc"; x2 = x + "123"; }; in fixWithOverride { x = "def"; } open_a { x = "def"; x2 = "def123"; }
Success!
We have manage to override the definition of x
and get the definition of x2
updated automatically to reflect the new value of x
.
Let us step through this code to see how it works.
First we defined open_a
to be the same as our previous definition of a
, but written as an open recursive object.
The expression fixWithOverride { x = "def"; } open_a
reduces to fix (withOverride { x = "def"; } open_a)
.
What the withOverride
function does is takes an open recursive object and returns a new open recursive object with updated bindings.
In particular, withOverride { x = "def"; } open_a
reduces to
self: (with self; { x = "abc"; x2 = x + "123"; }) // { x = "def"; }which in turn reduces to
self: { x = "def"; x2 = self.x + "123"; }
.
Finally, fix
takes the fixpoint of this updated open recursive object to get the closed value { x = "def"; x2 = "def123"; }
.
This is great; however, we do not want to have to work with open recursive objects everywhere.
Instead, what we can do is build a closed recursive value, but tack on an extra field named _override
that provides access to withOverride
applied to the open recursive object.
This will allow us to perform both updates and overrides at our discretion.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; withOverride = overrides: f: self: f self // overrides; virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in rec { example1 = a; example2 = a // { x = "def"; }; example3 = a._override { x = "def"; }; example4 = example3._override { y = true; }; example5 = example4._override { x = "ghi"; }; } { example1 = { _override = <LAMBDA>; x = "abc"; x2 = "abc123"; }; example2 = { _override = <LAMBDA>; x = "def"; x2 = "abc123"; }; example3 = { _override = <LAMBDA>; x = "def"; x2 = "def123"; }; example4 = { _override = <LAMBDA>; x = "def"; x2 = "def123"; y = true; }; example5 = { _override = <LAMBDA>; x = "ghi"; x2 = "ghi123"; y = true; }; }
One remaining problem with the above definition of virtual
is that we cannot override the method x2
and still get access to x
.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; withOverride = overrides: f: self: f self // overrides; virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in a._override { x2 = x + "456"; } error: undefined variable `x' at `(string):5:23'
Remembering that Nix is statically scoped, we see that the variable x
in a._override { x2 = x + "456"; }
is a dangling reference and does not refer to anything in lexical scope.
To remedy this, we allow the overrides
parameter to withOverride
to optionally take a open recursive object rather than necessarily a set of bindings.
❄> let fix = f: let fixpoint = f fixpoint; in fixpoint; withOverride = overrides: f: self: f self // (if builtins.isFunction overrides then overrides self else overrides); virtual = f: fix f // { _override = overrides: virtual (withOverride overrides f); }; in let a = virtual (self: with self; { x = "abc"; x2 = x + "123"; }); in rec { example6 = a._override (self: with self; { x2 = x + "456"; }); example7 = example6._override { x = "def"; }; } { example6 = { _override = <LAMBDA>; x = "abc"; x2 = "abc456"; }; example7 = { _override = <LAMBDA>; x = "def"; x2 = "def456"; }; }
This illustrates is the basic technique that packageOverrides
and other similar overrides use.
The packageOverrides
code is a bit more complex because there are multiple points of entry into the package override system, but the above is the essential idea behind it.
The makeOverridable
function from customisation.nix creates an override
field similar to my _override
field above, but overrides function arguments rather than overriding recursive bindings.
The syntax virtual (self: with self; { bindings })
is a little heavy.
One could imagine adding a virtual
keyword to Nix, similar to rec
, so that virtual { bindings }
would denote this expression.
After writing all this I am not certain my title is correct. I called this faking dynamic binding, but I think one could argue that this is actually real dynamic binding.