Macro Tutorial
Lisp is a higher-level language than Python, in the same sense that Python is a higher-level language than C, and C is a higher-level language than assembly.
In C, abstractions like for-loops and the function call stack are primitives: features built into the language. But in assembly, those are design patterns built with lower-level jumps/GOTOs that have to be repeated each time they're needed. Things like call stacks had to be discovered and developed and learned as best practice in the more primitive assembly languages. Before the development of the structured programming paradigm, the industry standard was GOTO spaghetti.
Similarly, in Python, abstractions like iterators, classes, higher-order functions, hash tables, and garbage collection are primitives, but in C, those are design patterns, discovered and developed over time as best practice, and built with lower-level parts like arrays, structs, and pointers, which have to be repeated each time they're needed.
To someone who started out in assembly or BASIC, or C, or even Java, Python seems marvelously high-level, once mastered. Python makes everything that was so tedious before seem so easy.
But the advanced Python developer eventually starts to notice the cracks. You can get a lot further in Python, but like the old GOTO spaghetti code, large enough projects start to collapse under their own weight. Python seemed so easy before, but certain repeated patterns can't be abstracted away. You're stuck with a certain amount of boilerplate and ceremony.
Programmers comfortable with C, but unfamiliar with Python, will tend to write C idioms in Python, like using explicit indexes into lists in for-loops over a range, instead of using the list's iterator directly. Their code is said to be unpythonic. They forgo much of Python's power, because they don't know the right idioms.
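As a small illustration of that contrast (the list contents and variable names here are made up for the example), both loops below do the same work, but only the second uses the list's iterator:

```python
names = ["assembly", "C", "Python"]

# C idiom: explicit indexes into the list, in a for-loop over a range.
by_index = []
for i in range(len(names)):
    by_index.append(names[i].upper())

# Pythonic: let the for-loop use the list's iterator directly.
by_iterator = []
for name in names:
    by_iterator.append(name.upper())

print(by_index == by_iterator)  # True
```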
"Design patterns" and "idioms" in low-level languages are language-level built-in features of higher-level ones. Lisp is even higher-level than that. In Lisp, you don't have "design patterns" for long, because they are a thing you can abstract to avoid repeating. You can create your own language-level features, because macros give you hooks into the compiler itself.
Lisp can do things you might not have realized were possible. Until you understand what Lisp can do, you're forgoing much of its power. This is a tutorial, not a reference, and I'll be explaining not just how to write macros, but why you need them.
If you're new to Lisp, go back and read the style guide if you haven't already. Understanding how Lisp is formatted helps you to read it, not just write it. And you will need to read it. Learning to read a new programming language can be difficult, because you're spending working memory on the syntax itself that would otherwise be helping with the meaning of the code. This does get better with familiarity, because you can offload that part to your long-term memory. That also means that it's more difficult the more different the new language is from those you already know.
Fortunately, Lissp's syntax is very minimal, so there's not that much to remember, and most of the vocabulary you know from Python already. You can skim over the Python in this tutorial, but resist the urge to skim the Lissp. S-expressions are a very direct representation of the same kind of syntax trees that you mentally generate when reading any other programming language. Take your time and comprehend each subexpression instead of taking it in all at once.
The Hissp Primer was mostly about learning how to program with a subset of Python in a new skin. This one is about using that knowledge to reprogram the skin itself.
If you don't know the basics from the Primer, go back and read that now, or at least read the Lissp Whirlwind Tour.
In the Primer we mostly used the REPL, but it can become tedious to type long forms into the REPL, and it doesn't save your work. S-expressions are awkward to edit without editor support for them, and the included Lissp REPL is layered on Python's interactive console, which has only basic line editing support.
The usual workflow when developing Lissp is to create a .lissp file and work in there. Then you can save as you go and send fragments of it to the REPL for evaluation and experimentation. You might already develop Python this way. A good editor can be configured to send selected text to the REPL with a simple keyboard command, but copy-and-paste into a terminal window will do.
Setting up your editor for Lissp is beyond the scope of this tutorial. If you're not already comfortable with Emacs and Paredit, give Parinfer a try.
Shorter Lambdas
The defect rate in computer programs seems to be a near-constant fraction of the number of kilobytes of source code. For reasonable line length, it doesn't seem to matter how much those lines are doing, or what language it's written in. Code is a liability. It's that much more space for bugs to hide, and that much more you have to read to understand the system. The less code you have, the better, as long as it still gets the job done.
Perhaps this can be taken too far. Code golf is good exercise, not good practice. Eventually, there are diminishing returns, and other costs to consider. But as a rule of thumb, one of the best things you can do to improve a codebase is to make it shorter, almost any way you can. Fewer slightly less-readable lines are much more readable than too many slightly more-readable lines.
Consider Python's humble lambda. It's important to programming in the functional style, and central to the way Hissp works, as a compilation target for one of its two special forms. It's actually really powerful.
But the overhead of typing out a six-letter word might make you a little too reluctant to use it, unlike in Smalltalk where it's just square brackets, and it's used all the time in control flow methods.
Wouldn't it be nice if we could give lambda a shorter name?
L = lambda
Could we then use L in place of lambda? Maybe like this?
squares = map(L x: x * x, range(10))
Alas, this doesn't work. The L = lambda is a syntax error.
To be fair to Python, I'd use a generator expression here, which is the same length:
squares = map(L x: x * x, range(10))
squares = (x * x for x in range(10))
But I need a simple example, and lambdas are a lot more general:
product = reduce(L a, x: a * x, range(1, 7))
A genexpr doesn't really help us in a reduce.
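For reference, here is that reduce written out in standard Python with the full keyword, as a minimal runnable sketch. The fold multiplies the running result by each item in turn, which is not something a generator expression does by itself.

```python
from functools import reduce

# Fold multiplication over 1..6: ((((1 * 2) * 3) * 4) * 5) * 6
product = reduce(lambda a, x: a * x, range(1, 7))
print(product)  # 720
```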
They say that in Python everything is an object. But it's not quite true, is it? lambda isn't an object in Python. It's a reserved word, but at run time, that's not an object. It's not anything.
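You can check this from Python itself. The sketch below uses only the standard keyword module and the built-in compile; the "<example>" filename is just a placeholder label.

```python
import keyword

# lambda is a reserved word, not a name bound to some object.
print(keyword.iskeyword("lambda"))  # True

# Trying to use it as a value fails before anything runs.
try:
    compile("L = lambda", "<example>", "exec")
except SyntaxError:
    outcome = "SyntaxError"
else:
    outcome = "compiled"
print(outcome)  # SyntaxError
```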
If you're rolling your eyes and thinking, "Why would I even expect this to work?" then you're still thinking inside the Python box.
You can store class and function objects in variables and pass them as arguments to functions in Python. To someone who came from a language without higher-order functions, this feels like breaking the rules. Using it effectively feels like amazing out-of-the-box thinking.
Let's begin.
Warm-Up
Create a Lissp file (perhaps macros.lissp), and open it in your Lisp editor of choice.
Fire up the Lissp REPL in a terminal, or in your editor if it does that, in the same directory as your Lissp file.
Add the prelude to the top of the file:
(hissp.._macro_.prelude)
And push it to the REPL as well:
#> (hissp.._macro_.prelude)
>>> # hissp.._macro_.prelude
... __import__('builtins').exec(
... ('from functools import partial,reduce\n'
... 'from itertools import *;from operator import *\n'
... 'def engarde(xs,h,f,/,*a,**kw):\n'
... ' try:return f(*a,**kw)\n'
... ' except xs as e:return h(e)\n'
... 'def enter(c,f,/,*a):\n'
... ' with c as C:return f(*a,C)\n'
... "class Ensue(__import__('collections.abc').abc.Generator):\n"
... ' send=lambda s,v:s.g.send(v);throw=lambda s,*x:s.g.throw(*x);F=0;X=();Y=[]\n'
... ' def __init__(s,p):s.p,s.g,s.n=p,s._(s),s.Y\n'
... ' def _(s,k,v=None):\n'
... " while isinstance(s:=k,__class__) and not setattr(s,'sent',v):\n"
... ' try:k,y=s.p(s),s.Y;v=(yield from y)if s.F or y is s.n else(yield y)\n'
... ' except s.X as e:v=e\n'
... ' return k\n'
... "_macro_=__import__('types').SimpleNamespace()\n"
... "try:exec('from hissp.macros._macro_ import *',vars(_macro_))\n"
... 'except ModuleNotFoundError:pass'),
... __import__('builtins').globals())
Caution
The prelude macro overwrites your _macro_ namespace with a copy of the bundled one. Any macros you've defined in there are lost. In Lissp files, the prelude is meant to be used before any definitions, when it is used at all. Likewise, in the REPL, enter it first, or be prepared to re-enter your definitions. The REPL already comes with the bundled macros loaded, but not the en- group or imports.
I'll mostly be showing the REPL from here on. Remember, compose non-trivial forms in your Lissp file first, then push to the REPL, not the other way around. Your editor is for editing. The REPL isn't good at that. We'll be modifying these definitions through several iterations.
You can compile your Lissp file to Python using the REPL with a command like
#> (hissp..transpile __package__ 'foo)
where foo is the name of your module (so, macros if your Lissp file was named that). We're not actually in a package, so the __package__ argument is just going to resolve to None (empty string also works), but it's important that you do include it when you are, or the compiler might not be able to resolve names correctly, so it doesn't hurt to always add it.
You should see a Python file with the same name appear. If you open it in your editor, you should see the compiled prelude, like you saw in the REPL.
Start a subREPL in the new Python module. The command is like
#> (hissp..interact (vars foo.))
And confirm that __name__ resolves to your foo.
If you need to, you can quit the subREPL and return to main by entering an EOF. (That's Ctrl+D, if you didn't know, or Ctrl+Z Enter, for the Windows terminal.) It's just a subREPL, so this doesn't exit Python. Any globals you defined in the module will still be there.
Now, let's try that same idea in Lissp:
#> (define L lambda)
>>> # define
... __import__('builtins').globals().update(
... L=lambda)
Traceback (most recent call last):
...
File "<console>", line 5
lambda)
^
SyntaxError: invalid syntax
Still a syntax error. The problem is that we tried to evaluate the lambda before the assignment. You can use Hissp's other special form, quote, to prevent evaluation.
#> (define L 'lambda)
>>> # define
... __import__('builtins').globals().update(
... L='lambda')
OK, but that just turned it into a string. We could have done that much in Python:
>>> L = 'lambda'
That worked, but can we use it?
>>> squares = map(L x: x * x, range(10))
Traceback (most recent call last):
...
squares = map(L x: x * x, range(10))
^
SyntaxError: invalid syntax
Another syntax error. No surprise.
Write the equivalent example in your Lissp file and push it to the REPL:
#> (define squares (map (L (x)
#.. (mul x x))
#.. (range 10)))
>>> # define
... __import__('builtins').globals().update(
... squares=map(
... L(
... x(),
... mul(
... x,
... x)),
... range(
... (10))))
Traceback (most recent call last):
File "<console>", line 7, in <module>
NameError: name 'x' is not defined
Not a syntax error, but it's not working either. Why not? Quote the whole thing to see the Hissp code.
#> '(define squares (map (L (x)
#.. (mul x x))
#.. (range 10)))
>>> ('define',
... 'squares',
... ('map',
... ('L',
... ('x',),
... ('mul',
... 'x',
... 'x',),),
... ('range',
... (10),),),)
('define', 'squares', ('map', ('L', ('x',), ('mul', 'x', 'x')), ('range', 10)))
We don't want that 'L' string in the Hissp, but 'lambda'.
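Because quoted Hissp is ordinary Python data, the mismatch is easy to inspect from the Python side. This is only an illustrative sketch: the form, head, and wanted names are made up for the example, and the tuple is copied from the REPL output above.

```python
# The quoted form, as plain nested tuples.
form = ('define', 'squares',
        ('map', ('L', ('x',), ('mul', 'x', 'x')), ('range', 10)))

# The head of the inner form is the string 'L', an ordinary identifier,
# so it compiles to a call of whatever L names at run time.
head = form[2][1][0]
print(head)  # L

# What the compiler needs to see there instead: the special form.
wanted = ('lambda', ('x',), ('mul', 'x', 'x'))
print(('lambda',) + form[2][1][1:] == wanted)  # True
```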
Hissp isn't compiling it like a special form. Is that possible?
It is with one more step. We want to dereference this at read time. Inject:
#> (define squares (map (.#L (x)
#.. (mul x x))
#.. (range 10)))
>>> # define
... __import__('builtins').globals().update(
... squares=map(
... (lambda x:
... mul(
... x,
... x)),
... range(
... (10))))
#> (list squares)
>>> list(
... squares)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Amazing.
Those of you who started with Python might be a little impressed, but you C people are thinking, "Yeah, that's just a macro. We can do that much in C with the preprocessor. I bet we could preprocess Python too somehow." To which I'd reply, What do you think Lissp is?
Lissp is a transpiler. It's much more powerful than the C preprocessor, but despite this it is also less error prone, because it mostly operates on the more structured AST rather than text.
Since Python is supposed to be such a marvelously high-level language compared to C that it doesn't need a preprocessor, can't it do that too?
No, it really can't:
>>> squares = map(eval(f"{L} x: x * x"), range(10))
>>> list(squares)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Sometimes higher-level tools cut you off from the lower level. You can get pretty close to the same idea, but that's about the best Python can do. Compare:
eval(f"{L} x: x * x")
lambda x: x * x
It didn't help, did it? It got longer. Can we do better?
>>> e = eval
e(f"{L} x:x*x")
lambda x:x*x
Nope.
And there are good reasons to avoid eval in Python: We have to compile code at run time, and put more than we wanted to in a string, and deal with separate namespaces. Ick.
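The namespace problem is worth seeing once. In this sketch (the function names naive and explicit are made up for the example), the lambda built by eval can't see the enclosing function's local unless we hand it over in a namespace ourselves:

```python
def naive(factor):
    # The string is compiled as top-level code at run time, so the lambda
    # looks up `factor` as a global when called, not as this local.
    return eval("lambda x: factor * x * x")


def explicit(factor):
    # Workaround: hand eval a namespace holding what the string needs.
    return eval("lambda x: factor * x * x", {"factor": factor})


try:
    naive(2)(3)
    naive_outcome = "worked"
except NameError:
    naive_outcome = "NameError"

print(naive_outcome)   # NameError
print(explicit(2)(3))  # 18
```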
Lissp had none of those problems.
This simple substitution metaprogramming task that was so easy in Lissp was so awkward in Python.
But Lissp does more than substitutions.
Simple Compiler Macros
Despite my recent boasting, our Lissp version is not actually shorter than Python's yet:
(.#L (x)
(mul x x))
lambda x: x * x
If you like, we can give mul a shorter name:
#> (define * mul)
>>> # define
... __import__('builtins').globals().update(
... QzSTAR_=mul)
And the params tuple doesn't technically have to be a tuple:
(.#L x (* x x))
lambda x: x * x
Symbols become strings at the Hissp level, and strings are iterables of single-character strings. This only works because the variable name is a single character. Now we're at the same length as Python.
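The iteration behavior is plain Python, so it's easy to confirm. In this sketch, a one-character string yields the same single name a one-element tuple would, while a longer name falls apart into its characters:

```python
single = "x"
longer = "xy"

# A one-character string iterates like a one-element tuple of names.
print(tuple(single) == ("x",))  # True

# A longer name splits into separate one-character "names".
print(tuple(longer))  # ('x', 'y')
```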
Let's make it even shorter.
Given a tuple containing the minimum amount of information, we want to expand that into the necessary code using a macro.
Isn't there something extra here we could get rid of? With a compiler macro, we won't need the inject.
The template needs to look something like (lambda <params> <body>). Try this definition.
(defmacro L (params : :* body)
`(lambda ,params ,@body))
#> (list (map (L x (* x x))
#.. (range 10)))
>>> list(
... map(
... # L
... (lambda x:
... QzSTAR_(
... x,
... x)),
... range(
... (10))))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Success. Now compare:
(L x (* x x))
lambda x: x * x
Are we doing better? Barely. If we remove the spaces that aren't required:
(L x(* x x))
lambda x:x*x
We've caught up to where Python started. But is this really the minimum amount of information required? It depends on how general you need to be, but wouldn't this be enough?
(L * X X)
We need to expand that into this:
(lambda (X)
(* X X))
So the template would look something like this:
(lambda (X)
(<expr>))
Remember this is basically the same as that anaphoric macro we did in the Hissp Primer.
(defmacro L (: :* expr)
`(lambda (,'X) ; Interpolate anaphors to prevent qualification!
,expr))
#> (list (map (L * X X) (range 10)))
>>> list(
... map(
... # L
... (lambda X:
... QzSTAR_(
... X,
... X)),
... range(
... (10))))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
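It may help to picture what the macro does in plain Python terms: a macro is a function that takes code and returns code. The function below is a hypothetical stand-in for illustration, not what defmacro actually generates, and the 'QzSTAR_' string is the munged name of * as seen in the compiled output above.

```python
def L(*expr):
    """Hypothetical stand-in for the L macro: code in, code out."""
    # The single parameter X is fixed; the caller's forms become the body.
    return ('lambda', ('X',), expr)


print(L('QzSTAR_', 'X', 'X'))
# ('lambda', ('X',), ('QzSTAR_', 'X', 'X'))
```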
Now we're shorter than Python:
(L * X X)
lambda x:x*x
But we're also less general. We can change the expression, but we've hardcoded the parameters to it. The fixed parameter name is fine unless it shadows a nonlocal we need, but what if we needed two parameters? Could we make a macro for that?
Think about it.
Seriously, close your eyes and think about it for at least fifteen seconds before moving on.
Don't generalize before we have examples to work with.
I'll wait.
...
...
...
...
...
...
...
...
...
...
...
...
...
...
...
Ready?
(defmacro L2 (: :* expr)
`(lambda (,'X ,'Y)
,expr))
#> (L2 * X Y)
>>> # L2
... (lambda X,Y:
... QzSTAR_(
... X,
... Y))
<function <lambda> at ...>
That's another easy template. Between L and L2, we've probably covered the Pareto 80% majority of short-lambda use cases. But you can see the pattern now. We could continue to an L3 with a Z parameter, and then we've run out of alphabet.
When you see a "design pattern" in Lissp, you don't keep repeating it.
Nothing Is Above Abstraction
Are you ready for this? You've seen all these pieces before, even if you haven't realized they could be used this way.
Don't panic.
#> .#`(progn ,@(map (lambda (i)
#.. `(defmacro ,(.format "L{}" i)
#.. (: :* $#expr)
#.. `(lambda ,',(getitem "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (slice i))
#.. ,$#expr)))
#.. (range 27)))
>>> # __main__.._macro_.progn
... (lambda :(
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... '',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L0',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L0',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'A',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L1',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L1',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'AB',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L2',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L2',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABC',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L3',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L3',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCD',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L4',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L4',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDE',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L5',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L5',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEF',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L6',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L6',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFG',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L7',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L7',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGH',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L8',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L8',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHI',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L9',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L9',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJ',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L10',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L10',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJK',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L11',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L11',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKL',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L12',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L12',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLM',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L13',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L13',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMN',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L14',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L14',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNO',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L15',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L15',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOP',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L16',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L16',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQ',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L17',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L17',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQR',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L18',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L18',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRS',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L19',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L19',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRST',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L20',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L20',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRSTU',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L21',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L21',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRSTUV',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L22',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L22',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRSTUVW',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L23',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L23',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRSTUVWX',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L24',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L24',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRSTUVWXY',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L25',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L25',
... _QzAW22OE5Kz_fn))[-1])(),
... # __main__.._macro_.defmacro
... # hissp.macros.._macro_.let
... (lambda _QzAW22OE5Kz_fn=(lambda *_QzQ46NYXTBz_expr:
... (lambda * _: _)(
... 'lambda',
... 'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
... _QzQ46NYXTBz_expr)):(
... __import__('builtins').setattr(
... _QzAW22OE5Kz_fn,
... '__qualname__',
... ('.').join(
... ('_macro_',
... 'L26',))),
... __import__('builtins').setattr(
... __import__('operator').getitem(
... __import__('builtins').globals(),
... '_macro_'),
... 'L26',
... _QzAW22OE5Kz_fn))[-1])())[-1])()
Whoa.
That little bit of Lissp expanded into that much Python. It totally works too.
#> ((L3 add C (add A B))
#.. "A" "B" "C")
>>> # L3
... (lambda A,B,C:
... add(
... C,
... add(
... A,
... B)))(
... ('A'),
... ('B'),
... ('C'))
'CAB'
#> (L26)
>>> # L26
... (lambda A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z:())
<function <lambda> at ...>
#> (L13)
>>> # L13
... (lambda A,B,C,D,E,F,G,H,I,J,K,L,M:())
<function <lambda> at ...>
#> ((L0 print "Hello, World!"))
>>> # L0
... (lambda :
... print(
... ('Hello, World!')))()
Hello, World!
How does this work? I don't blame you for glossing over the Python output. It's pretty big this time. I mostly ignore it when it gets longer than a few lines, unless there's something in particular I'm looking for.
But let's look at this Lissp snippet again, more carefully.
.#`(progn ,@(map (lambda (i)
`(defmacro ,(.format "L{}" i)
(: :* $#expr)
`(lambda ,',(getitem "ABCDEFGHIJKLMNOPQRSTUVWXYZ" (slice i))
,$#expr)))
(range 27)))
It's injecting some Hissp we generated with a template. Those are the first two reader macros .# (inject) and ` (template quote).
The progn sequences multiple expressions for their side effects. It's like having multiple "statements" in a single expression.
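The compiled output above shows the shape progn takes in Python: a tuple of expressions inside an immediately called lambda, indexed with [-1]. A minimal sketch of the same trick, using a made-up log list to make the side effects visible:

```python
log = []

# Evaluate each element of the tuple in order, then keep the last value.
result = (lambda: (
    log.append("first"),
    log.append("second"),
    "done",
)[-1])()

print(log)     # ['first', 'second']
print(result)  # done
```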
We splice in multiple expressions generated with a map. The map generates a code tuple for each integer from the range. The lambda takes the int i from the range and produces a defmacro form, (not a macro, the code for defining one) which, when run in the progn by our inject, will define a macro.
Nothing is above abstraction in Lissp. defmacro forms are still code, and Hissp code is made of data structures we can manipulate programmatically. We can make them with templates like anything else.
We need to give each one a different name, so we combine the i with "L".
The parameters tuple for defmacro contains a gensym, $#expr, since it shouldn't be qualified and it doesn't need to be an anaphor.
The next part is tricky. We've directly nested a template inside another one, without unquoting it first, because the defmacro also needed a template to work. Note that you can unquote through nested templates. This is an important capability, but it's a little mind-bending.
Finally, we slice the params string to the appropriate number of characters.
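For readers who think more easily in Python, the two values each generated defmacro form depends on can be sketched as below. This only illustrates the name and parameter-string computation, not how Hissp builds or runs the forms; `ALPHABET` and `macro_specs` are hypothetical names introduced for the example.

```python
# Hypothetical illustration: the macro name and the parameter string
# computed for each i in range(27), as in the Lissp snippet.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

macro_specs = {
    "L{}".format(i): ALPHABET[slice(i)]  # same as ALPHABET[:i]
    for i in range(27)
}

print(repr(macro_specs["L0"]))   # '' (no parameters)
print(macro_specs["L3"])         # ABC
print(len(macro_specs["L26"]))   # 26
```

Slicing with `slice(i)` is the same operation the Lissp code performs with `getitem`, so L0 gets an empty parameter list and L26 gets the whole alphabet.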
Take a breath. We're not done.
Macros Can Read Code Too.#
We're still providing more information than is required. You have to change the name of your macro based on the number of arguments you expect. But can't the macro infer this based on which parameters your expression contains?
Also, we're kind of running out of alphabet when we start on X.
You often see 4-D vectors labeled (x, y, z, w),
but beyond that, mathematicians just number them with subscripts.
We got around this by starting at A instead,
but then we're using up all of the uppercase ASCII one-character names.
We might want to save those for other things.
We're also limited to 26 parameters this way.
It's rare that we'd need more than three or four,
but 26 seems kind of arbitrary.
So a better approach might be with numbered parameters, like X1, X2, X3, etc.
Then, if your macro is smart enough,
it can look for the highest X-number in your expression
and automatically provide that many parameters for you.
We can create numbered X's the same way we created the numbered L's.
(defmacro L (number : :* expr)
`(lambda ,(map (lambda (i)
(.format "X{}" i))
(range 1 (add 1 number)))
,expr))
Tip
Oh, by the way, we've been pushing individual forms to the subREPL up till now,
but it's sometimes more convenient to save, recompile,
and reload the whole module.
Comment out anything you don't want loaded.
You can still push them later.
A _# can discard a tuple and everything in it.
(Although it still gets read.)
You already know how to compile.
No, you don't have to restart the REPL!
importlib.reload. See also, defonce, The del statement.
#> (L 10)
>>> # L
... (lambda X1,X2,X3,X4,X5,X6,X7,X8,X9,X10:())
<function <lambda> at ...>
#> ((L 2 add X1 X2) "A" "B")
>>> # L
... (lambda X1,X2:
... add(
... X1,
... X2))(
... ('A'),
... ('B'))
'AB'
This version uses a number as the first argument instead of baking it into the macro name. We're using numbered parameters now, so there's no limit. That takes care of generating the parameters, but we're still providing a redundant expected number for them.
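The parameter generation in this version of L is easy to check in isolation. Here is a rough Python equivalent of the map over the range, assuming we want a tuple rather than a lazy map object; `numbered_params` is a hypothetical helper name.

```python
def numbered_params(number):
    """Return the parameter names X1..Xnumber as a tuple of strings."""
    return tuple("X{}".format(i) for i in range(1, 1 + number))


print(numbered_params(3))  # ('X1', 'X2', 'X3')
print(numbered_params(0))  # ()
```

The range starts at 1 and ends at `1 + number`, so the count and the highest suffix always agree.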
Let's make a slight tweak.
(defmacro L (: :* expr)
`(lambda ,(map (lambda (i)
(.format "X{}" i))
(range 1 (add 1 (max-X expr))))
,expr))
What is this max-X?
It's a venerable design technique known as wishful thinking.
We haven't implemented it yet.
This doesn't work.
But we wish it would find the maximum X number in the expression.
Can we just iterate through the expression and check?
(define max-X
(lambda (expr)
(max (map (lambda (x)
(|| (when (is_ str (type x))
(let (match (re..fullmatch "X([1-9][0-9]*)" x))
(when match
(int (.group match 1)))))
0))
expr))))
Does that make sense?
Read the definition carefully.
You can view the docs for any bundled macro
you don't recognize in the REPL like (help hissp.._macro_.foo),
but you might prefer searching the rendered version in the API docs.
Most have documented usage examples you can experiment with in the REPL.
We're using them to coalesce Python's awkward regex matches,
which can return None, into a 0,
unless it's a string with a match.
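If the bundled macros obscure the logic, it may help to see roughly the same function in plain Python. This is a sketch of the idea, not the compiled output; it assumes the expression is a tuple whose symbols are strings, and `x_number` is a hypothetical helper name.

```python
import re


def x_number(x):
    """Return N for a string like 'XN', else 0."""
    if type(x) is str:
        match = re.fullmatch("X([1-9][0-9]*)", x)
        if match:
            return int(match.group(1))
    return 0  # Non-strings and non-matching strings coalesce to 0.


def max_X(expr):
    """Highest X-number among the top-level elements of expr."""
    return max(map(x_number, expr))


print(max_X(("add", "X2", "X1")))  # 2
```

Note that only the top-level elements are examined; anything that isn't a string counts as 0.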
It gets the parameters right:
#> ((L add X2 X1) : :* "AB")
>>> # L
... (lambda X1,X2:
... add(
... X2,
... X1))(
... *('AB'))
'BA'
Pretty cool.
#> ((L add X1 (add X2 X3))
#.. : :* "BAR")
>>> # L
... (lambda X1:
... add(
... X1,
... add(
... X2,
... X3)))(
... *('BAR'))
Traceback (most recent call last):
File "<console>", line 2, in <module>
TypeError: <lambda>() takes 1 positional argument but 3 were given
Oh. Not that easy.
What happened?
The error message says that lambda only took one parameter,
even though the expression contained an X3.
We need to be able to check for symbols nested in tuples. This sounds like a job for recursion.
(define flatten
(lambda (form)
chain#(map (lambda x
(if-else (is_ (type x) tuple)
(flatten x)
`(,x)))
form)))
More bundled macros here. Search Hissp's docs if you can't figure out what they do.
Flatten is a good utility to have for macros that have to read code.
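The recursion may be easier to follow in Python. This sketch assumes that chain# corresponds to itertools.chain.from_iterable and that code is represented as nested tuples; it is an approximation of the Lissp definition, not its compiled form.

```python
from itertools import chain


def flatten(form):
    """Iterate over the atoms of a nested tuple, depth first."""
    return chain.from_iterable(
        flatten(x) if type(x) is tuple else (x,)  # Wrap atoms so they chain.
        for x in form
    )


nested = ("add", "X1", ("add", "X2", "X3"))
print(list(flatten(nested)))  # ['add', 'X1', 'add', 'X2', 'X3']
```

Each atom is wrapped in a one-element tuple so that every element, nested or not, contributes an iterable to the chain.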
Now we can fix max-X.
(define max-X
(lambda (expr)
(max (map (lambda (x)
(|| (when (is_ str (type x))
(let (match (re..fullmatch "X([1-9][0-9]*)" x))
(when match
(int (.group match 1)))))
0))
(flatten expr)))))
Let's try again.
#> ((L add X1 (add X2 X3))
#.. : :* "BAR")
>>> # L
... (lambda X1,X2,X3:
... add(
... X1,
... add(
... X2,
... X3)))(
... *('BAR'))
'BAR'
Try doing that with the C preprocessor!
Function Literals#
Let's review. The code you need to make the version we have so far is
(hissp.._macro_.prelude)
(defmacro L (: :* expr)
`(lambda ,(map (lambda (i)
(.format "X{}" i))
(range 1 (add 1 (max-X expr))))
,expr))
(define max-X
(lambda (expr)
(max (map (lambda (x)
(|| (when (is_ str (type x))
(let (match (re..fullmatch "X([1-9][0-9]*)" x))
(when match
(int (.group match 1)))))
0))
(flatten expr)))))
(define flatten
(lambda (form)
chain#(map (lambda x
(if-else (is_ (type x) tuple)
(flatten x)
`(,x)))
form)))
Tip
Is there more than that in your file?
If you've been composing in your editor (rather than directly in the REPL)
like you're supposed to,
you've probably accumulated some junk from experiments.
Don't delete it yet!
Experiments often make excellent test cases.
Wrap them in top-level assure forms.
In a larger project, you might move them to unittest modules.
Additionally, the Lissp REPL was designed for compatibility with doctest,
although that won't test the compilation from Lissp to Python
(making it less useful for testing macros).
Given all of this in a file named macros.lissp,
you can start a subREPL with these already loaded using the shell command
$ lissp -ic "(hissp..interact (vars macros.))"
rather than pasting them all in again.
To use your macros from other Lissp modules,
use their fully-qualified names,
abbreviate the qualifier with alias,
or (if you must) attach them to your current module's _macro_ object.
That last one would require that your macros also be available at run time,
although there are ways to avoid that if you need to.
See the prelude expansion for a hint.
You can use the resulting macro as a shorter lambda for higher-order functions:
#> (list (map (L add X1 X1) (range 10)))
>>> list(
... map(
... # L
... (lambda X1:
... add(
... X1,
... X1)),
... range(
... (10))))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
It's still a little awkward.
It feels like the add should be in the first position,
but that's taken by the L.
We can fix that with a reader macro.
Reader Syntax#
To use reader macros unqualified,
you must define them in _macro_ with a name ending in a #.
(defmacro X\# (expr)
`(L ,@expr))
We have to escape the # with a backslash
or the reader will recognize the name as a macro rather than a symbol
and immediately try to apply it to (expr), which is not what we want.
Notice that we still used a defmacro,
like we do for compiler macros.
It's the way you invoke it (with a reader tag#) that makes it happen at read time:
#> (list (map X#(add X1 X1) ; Read-time expansion.
#.. (range 10)))
>>> list(
... map(
... # __main__.._macro_.L
... (lambda X1:
... add(
... X1,
... X1)),
... range(
... (10))))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
#> (list (map (X\# (add X1 X1)) ; Compile-time expansion.
#.. (range 10)))
>>> list(
... map(
... # XQzHASH_
... # __main__.._macro_.L
... (lambda X1:
... add(
... X1,
... X1)),
... range(
... (10))))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Caution
Avoid side effects in reader macros.
Well-written reader macros should not have side effects at read time,
or at least make them idempotent.
Tooling that reads Lissp may have to backtrack
or restart reading of an invalid form.
E.g. before compiling a form,
the bundled LisspREPL attempts to read it to see if it is complete.
If it isn't, it will ask for another line and attempt to read it again.
Thus, a reader macro on the first line will get evaluated again for each line input after,
until the form is completed or aborted.
Reader macros like this effectively create new reader syntax by reinterpreting existing reader syntax.
So now we have function literals.
These are very similar to the function literals in Clojure, and we implemented them from scratch in about a page of Lissp code. That's the power of metaprogramming. You can copy features from other languages, tweak them, and experiment with your own.
Clojure's version still has a couple more features. Let's add them.
Catch-All Parameter#
(defmacro L (: :* expr)
`(lambda (,@(map (lambda (i)
(.format "X{}" i))
(range 1 (add 1 (max-X expr))))
:
,@(when (contains (flatten expr)
'Xi)
`(:* ,'Xi)))
,expr))
#> (X#(print X1 X2 Xi) 1 2 3 4 5)
>>> # __main__.._macro_.L
... (lambda X1,X2,*Xi:
... print(
... X1,
... X2,
... Xi))(
... (1),
... (2),
... (3),
... (4),
... (5))
1 2 (3, 4, 5)
How does it work? Look at what's changed. Here are both versions again.
;; old version
(defmacro L (: :* expr)
`(lambda ,(map (lambda (i)
(.format "X{}" i))
(range 1 (add 1 (max-X expr))))
,expr))
;; new version
(defmacro L (: :* expr)
`(lambda (,@(map (lambda (i)
(.format "X{}" i))
(range 1 (add 1 (max-X expr))))
:
,@(when (contains (flatten expr)
'Xi)
`(:* ,'Xi)))
,expr))
We splice the result of the logic that made the numbered parameters from the old version into the new parameters tuple. Following that is the colon separator. Remember that it's always allowed in Hissp's lambda forms, even if you don't need it, which makes this kind of metaprogramming easier.
Following that is the code for a star arg.
The Xi is an anaphor,
so it must be interpolated into the template to prevent automatic qualification.
The when macro will return an empty tuple when its condition is false.
Attempting to splice in an empty tuple conveniently doesn't do anything
(like "nil punning" in other Lisps),
so the Xi anaphor is only present in the parameters tuple when the
(flattened) expr contains it.
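The splicing behavior has a close analogue in Python's tuple unpacking, where unpacking an empty tuple also contributes nothing. A minimal sketch of how the parameters tuple comes out, with `lambda_params` as a hypothetical helper and the two arguments standing in for the results of max-X and the contains check:

```python
def lambda_params(max_x, has_xi):
    """Build a parameters tuple shaped like the one the new L macro makes."""
    numbered = tuple("X{}".format(i) for i in range(1, 1 + max_x))
    star = (":*", "Xi") if has_xi else ()  # An empty tuple splices to nothing.
    return (*numbered, ":", *star)


print(lambda_params(2, True))   # ('X1', 'X2', ':', ':*', 'Xi')
print(lambda_params(2, False))  # ('X1', 'X2', ':')
```

The colon is present in both cases, which is why it matters that the lambda form tolerates a separator with nothing after it.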
Implied Number 1#
Clojure's version has one more feature:
the name of the first parameter doesn't need the 1,
but it's allowed.
The more special cases you have to add, the more complex the macro might get.
Here you go:
(defmacro L (: :* expr)
`(lambda (,@(map (lambda (i)
(.format "X{}" i))
(range 1 (add 1 (|| (max-X expr)
(contains (flatten expr)
'X)))))
:
,@(when (contains (flatten expr)
'Xi)
`(:* ,'Xi)))
,(if-else (contains (flatten expr)
'X)
`(let (,'X ,'X1)
,expr)
expr)))
#> (list (map X#(add X X1) (range 10)))
>>> list(
... map(
... # __main__.._macro_.L
... (lambda X1:
... # __main__.._macro_.let
... (lambda X=X1:
... add(
... X,
... X1))()),
... range(
... (10))))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Now both X and X1 refer to the same value,
even if you mix them.
Read the macro and its outputs carefully.
This version uses a bool pun.
Recall that False is a special case of 0
and True is a special case of 1 in Python.
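The pun is worth trying directly in Python. In this sketch, `param_count` is a hypothetical stand-in for the `||` expression in the macro: it returns the max-X result when that is nonzero, and otherwise the boolean from the contains check.

```python
# bool is a subclass of int, so True and False work in arithmetic.
print(issubclass(bool, int))  # True
print(1 + True)               # 2


def param_count(max_x, has_bare_x):
    """Return max_x if it is nonzero, else the bool, which acts as 0 or 1."""
    return max_x or has_bare_x


print(list(range(1, 1 + param_count(0, True))))   # [1]
print(list(range(1, 1 + param_count(0, False))))  # []
print(list(range(1, 1 + param_count(3, True))))   # [1, 2, 3]
```

So a bare X with no numbered parameters still yields exactly one parameter, X1, for the let to bind X to.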
Results#
Are we shorter than Python now?
lambda x:x*x
%#(* % %)
Did we lose generality? Yes, but not much. You can't really nest these. The parameters get generated even if the only occurrence in the expression is quoted. This is the kind of thing to be aware of. If you're not sure about something, try it in the REPL. But Clojure's version has the same problems, and it gets used quite a lot.
Why You Should Be Reluctant to Use Python Injections#
Suppose we wanted to use Python infix notation for a complex formula.
Do you see the problem with this?
%#(.#"(-%2 + (%2**2 - 4*%1*%3)**0.5)/(2*%1)")
This was supposed to be the quadratic formula.
The % is an operator in Python,
and it can't be unary.
In an injection you would have to spell it using the munged name QzPCENT_.
But what if we had kept the X?
#> X#(.#"(-X2 + (X2**2 - 4*X1*X3)**0.5)/(2*X1)")
>>> # __main__.._macro_.L
... (lambda :(-X2 + (X2**2 - 4*X1*X3)**0.5)/(2*X1)())
<function <lambda> at ...>
Look at the Python compilation. It looks like we're trying to invoke the formula itself, which would evaluate to a number, not a callable, so this doesn't really make sense.
The macro is expecting at least one function in prefix notation. Sure, the macro could be modified, but maybe we can do the divide in prefix and keep the others infix? This doesn't look too bad if you think of it like a fraction bar.
#> X#(truediv .#"(-X2 + (X2**2 - 4*X1*X3)**0.5)"
#.. .#"(2*X1)")
>>> # __main__.._macro_.L
... (lambda :
... truediv(
... (-X2 + (X2**2 - 4*X1*X3)**0.5),
... (2*X1)))
<function <lambda> at ...>
Now the formula looks right,
but look at the compiled Python output.
This lambda takes no parameters!
Python injections hide information that code-reading macros need to work.
A macro that doesn't have to read the code,
like our L3 (or the bundled XYZ#), would have worked fine.
The code-reading macro was unable to detect any matching symbols because it doesn't look inside the injected strings. In principle, it could have, but it might be a lot more work if you want it to be reliable. It could function if the highest parameter also appeared outside the string, but at that point, you might as well use a normal lambda.
Regex might be good enough for a simple case like this, but even if you write it very carefully, are you sure you're catching all the edge cases? To really do it right, you'd have to parse the Python to AST, understand the structure (not exactly trivial), search it, and then keep it up to date with new versions of Python, since it's not an especially stable API.
The whole point of using Hissp instead is so you don't have to do all this. Hissp is a kind of AST with lower complexity. It's just tuples. Stay out of parsing text.
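To give a sense of what the AST route involves, here is a rough sketch using Python's standard ast module on the injected formula. `max_x_in_python` is a hypothetical helper, and this is deliberately naive: it ignores scoping (nested lambdas, comprehensions) and everything else a reliable version would have to handle.

```python
import ast
import re


def max_x_in_python(source):
    """Highest N among names like XN used in a Python expression string."""
    tree = ast.parse(source, mode="eval")
    numbers = [0]
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            match = re.fullmatch("X([1-9][0-9]*)", node.id)
            if match:
                numbers.append(int(match.group(1)))
    return max(numbers)


print(max_x_in_python("(-X2 + (X2**2 - 4*X1*X3)**0.5)/(2*X1)"))  # 3
print(max_x_in_python("'X9' + X1"))  # 1, since the X9 is in a string literal
```

Even this small version already needs a parser, a tree walk, and a node-type check, compared to a flatten over tuples on the Hissp side.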
Arguably, our final %# or X# macro didn't do it right either,
since it still detects the anaphors even if they're quoted,
but this level of correctness is good enough for Clojure's function literals,
which have the same issue.
A simple basic syntax means there are relatively few edge cases you have to be aware of.
Hissp is so simple that a full code-walking macro would only have to pre-expand all macros,
and handle atoms, calls, quote, and lambda.
If you add injections to the list, then you also have to handle the entirety of all Python expressions. Don't expect Hissp macros to do this. Be reluctant to use Python injections, and be aware of where they might break things. They're mainly useful as performance optimizations (but can be convenient when used judiciously). In principle, you should be able to do everything else without them.
More Literals#
While other data types in code must be built up from the primitive notation, Python has built-in notation for certain common ones. (And Lissp inherits most of these.)
This can be very convenient compared to the alternative. Imagine if you had to represent text as lists of numbers. That's closer to what the machine uses in memory. Many common programming tasks would become very tedious that way. Thus, the need for string literal notation.
But the available notations are somewhat arbitrary. Many languages in common use lack Python's notation for complex numbers, for example. Python, on the other hand, currently lacks built-in notation for exact fractions, which many Lisps include. Other languages made other selections, which may make them more or less convenient for certain problem domains.
What notations would an ideal language have? Every conceivable "primitive"? Such a language would be more difficult to learn. It's much easier to familiarize oneself with a small set of primitive notations, and the means of combination. And in any case, many desirable notations would collide and then be ambiguous.
Hissp has a better way: extensibility through simplicity.
With Lissp's reader macros, we can create new notation as-needed, with an overhead of just a few characters for a tag to disambiguate from the built-ins (and each other). You only have to learn a new notation when it's worth your while.
Hexadecimal#
You can use Python's int builtin to convert a string containing a hexadecimal
number to the corresponding integer value.
>>> int("FF", 16)
255
Of course, Python already has a built-in notation for this,
disambiguated from normal base-ten ints using the 0x tag.
>>> 0xFF
255
But what if it didn't?
About the best Python could do would be something like this.
>>> def b16(x):
... return int(x, 16)
...
>>> b16("FF")
255
Lissp gives us a better option.
(defmacro \16\# (x)
(int x 16))
We've defined a tag that turns hexadecimal strings into ints. And it does so at read time. There's no run-time overhead for the conversion; the result is compiled in.
This works,
#> 16#FF
>>> (255)
255
however, this doesn't.
#> 16#12
Traceback (most recent call last):
...
TypeError: int() can't convert non-string with explicit base
What's going on?
Well, FF is a valid identifier,
so it reads as a string containing that identifier,
but 12 is a valid base-ten int,
so it's read as an int.
Python's int builtin doesn't do base conversions for those.
>>> int(12, 16)
Traceback (most recent call last):
...
TypeError: int() can't convert non-string with explicit base
No matter, this is an easy fix. Convert it to a string, and it works regardless of which type you start with.
>>> int(str(12), 16)
18
>>> int(str("FF"), 16)
255
New version.
(defmacro \16\# (x)
(int (str x) 16))
And now it works as well as the built-in notation.
#> '(16#ff 0xff 16#12 0x12 16#FEED_FACE 0xFEED_FACE)
>>> ((255),
... (255),
... (18),
... (18),
... (4277009102),
... (4277009102),)
(255, 255, 18, 18, 4277009102, 4277009102)
Or does it?
#> -16#1
File "<console>", line 1
-16#1
^
SyntaxError: Unknown reader macro Qz_16
The minus sign changed the tag!
If we don't want to define a new -16# tag
(which is one option),
we'd have to put the sign after.
#> 16#-1
>>> (-1)
-1
That worked. Not.
#> 16#-FF
Traceback (most recent call last):
...
ValueError: invalid literal for int() with base 16: 'Qz_FF'
But this is fine.
#> 16#.#"-FF"
>>> (-255)
-255
What's going on? Symbols do read as strings, but special characters get munged!
Remember, Lissp's reader macros are applied to the next parsed object, not to the next token from the lexer, and certainly not to the raw character stream. This makes them more like Clojure's tagged literals than like Common Lisp's reader macros.
The 16# reader macro was very easy to implement when you only applied it to strings,
but since it can take multiple types,
you have to be sure to handle each of them.
Fortunately, we can fix this too, because munging is (mostly) reversible.
(defmacro \16\# (x)
"hexadecimal"
(int (hissp..demunge (str x))
16))
#> 16#-FF
>>> (-255)
-255
But what's the point of all of this when we already have hexadecimal notation built in? Well, with reader macros, you can implement any base you want.
(defmacro \6\# (x)
"seximal"
(int (str x) 6))
#> '(6#5 6#10 6#11 6#12)
>>> ((5),
... (6),
... (7),
... (8),)
(5, 6, 7, 8)
#> 6#543210
>>> (44790)
44790
Or you can add floating-point. Python's notation can't do that.
(defmacro \16\# (x)
(let (x (hissp..demunge (str x)))
(if-else (re..search "[.Pp]" x)
(float.fromhex x)
(int x 16))))
#> '(16#FEED_FACE 16#-FEED.FACE 16#0.1 16#-.2 16#.4 16#-.8)
>>> ((4277009102),
... (-65261.97970581055),
... (0.0625),
... (-0.125),
... (0.25),
... (-0.5),)
(4277009102, -65261.97970581055, 0.0625, -0.125, 0.25, -0.5)
#> 16#Cp-2 ; 12.*2**-2
>>> (3.0)
3.0
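The dispatch inside this last version can be tried on its own in Python. The sketch below covers only the part after demunging, so it takes an already-demunged string; `hex16` is a hypothetical function name.

```python
import re


def hex16(text):
    """Parse text as a hexadecimal float if it has a point or exponent, else an int."""
    if re.search("[.Pp]", text):
        return float.fromhex(text)
    return int(text, 16)


print(hex16("FEED_FACE"))  # 4277009102
print(hex16("0.1"))        # 0.0625
print(hex16("Cp-2"))       # 3.0
```

The regex check is what routes inputs with a `.` or a `p` exponent to float.fromhex, while everything else still goes through int with base 16.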
Decimal#
Floating-point numbers are very useful, but they have some important limitations.
>>> 0.2 * 3
0.6000000000000001
Not quite what you expected? Binary floating-point can't represent exact fifths like decimal can. For exact decimals, you need decimal floating-point.
#> (mul (decimal..Decimal "0.2") 3)
>>> mul(
... __import__('decimal').Decimal(
... ('0.2')),
... (3))
Decimal('0.6')
Because it takes a single string argument,
you can already use decimal.Decimal as a reader macro:
#> (mul decimal..Decimal#.#".2" 3)
>>> mul(
... __import__('pickle').loads( # Decimal('0.2')
... b'cdecimal\n'
... b'Decimal\n'
... b'(V0.2\n'
... b'tR.'
... ),
... (3))
Decimal('0.6')
It's kind of long though.
Notice that Hissp had to use a pickle here, because it had to emit code for the object, but Python has no literal notation for Decimal objects.
The reader macro didn't inject the code for making a Decimal, but an actual Decimal object, at read time. The pickling isn't done by the reader. It doesn't happen until the compiler has to emit something that it doesn't have a round-tripping representation for.
Something like this never goes through a pickle.
#> 'builtins..repr#decimal..Decimal#.#".2"
>>> "Decimal('0.2')"
"Decimal('0.2')"
It changed to a string before the compiler had to emit it.
Decimal can also take float objects, but this isn't always a good idea.
#> decimal..Decimal#.2
>>> __import__('pickle').loads( # Decimal('0.200000000000000011102230246251565404236316680908203125')
... b'cdecimal\n'
... b'Decimal\n'
... b'(V0.200000000000000011102230246251565404236316680908203125\n'
... b'tR.'
... )
Decimal('0.200000000000000011102230246251565404236316680908203125')
There's no bug in Decimal. That's just the exact binary fraction closest to one-fifth, given the available precision in a float, when represented as a decimal.
Maybe we could work around this if we converted to a string first? We can improve this a lot with a custom defmacro.
(defmacro \10\# (x)
`(decimal..Decimal ',(str x)))
#> 10#.2
>>> __import__('decimal').Decimal(
... '0.2')
Decimal('0.2')
This is better. It's a much shorter notation; there are no extra digits after the 2; and (because we used a template) it compiled to the straightforward code for a Decimal, rather than a pickle. This makes the compiled output a bit easier to read, but using code like this, rather than the Decimal object itself, may make it less useful as input to other macros. Which approach is better depends on your needs.
But there's still a subtle problem:
#> 10#.1234567890_1234567890_000
>>> __import__('decimal').Decimal(
... '0.12345678901234568')
Decimal('0.12345678901234568')
#> 10#.#".1234567890_1234567890_000"
>>> __import__('decimal').Decimal(
... '.1234567890_1234567890_000')
Decimal('0.12345678901234567890000')
We have limited precision when tagging a float instead of a string. If you don't need the precision, it's fine. If you do, you can still use a string, but you have to be aware of this. Decimal also keeps trailing zeros to represent significant figures. But floats never do this, even when the precision is available.
It would be nice if the macro could deal with it for us, but there's just no getting around these issues when using a float. Lissp reader macros get the parsed object, and by then, some information has been lost. One could argue that a float literal written with more precision than is available should be a syntax error, but Python doesn't care.
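The information loss happens in the float itself, before any macro sees it, which can be confirmed in plain Python. The variable names here are only for illustration.

```python
from decimal import Decimal

as_float = .1234567890_1234567890_000   # Parsed to a float; extra digits are gone.
as_text = ".1234567890_1234567890_000"  # The digits survive in a string.

print(str(as_float))           # 0.12345678901234568
print(Decimal(str(as_float)))  # 0.12345678901234568
print(Decimal(as_text))        # 0.12345678901234567890000
```

Converting the float to a string afterwards can only recover what the float kept, so both the extra digits and the trailing zeros are lost on that path.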
In cases like this, it's best to not use a float at all, but a string is not the only alternative available:
(defmacro \10\# (x)
`(decimal..Decimal ',(getitem x (slice 1 None))))
#> 10#:.1234567890_1234567890_000
>>> __import__('decimal').Decimal(
... '.1234567890_1234567890_000')
Decimal('0.12345678901234567890000')
With a control word like this, you get full precision and don't need a trailing double quote.
A Slice of Python#
Python has a powerful and compact notation for operating on slices of sequences.
It has three arguments: start, stop, and step.
Each one is optional, and defaults to None.
>>> "abcdefg"[-1::-2]
'geca'
However, this notation is only valid in the context of a subscription operator [].
It is possible to separate the operands using the slice builtin,
but it comes at a cost.
(We'll be reusing this simple "geca" test case as we iterate. Feel free to try others.)
>>> a = "abcdefg"
>>> b = slice(-1, None, -2)
>>> a[b]
'geca'
There's the cost: this separated approach is much less concise compared to the slice notation.
Even without macros, Hissp can slice this way.
#> (operator..getitem "abcdefg" (slice -1 None -2))
>>> __import__('operator').getitem(
... ('abcdefg'),
... slice(
... (-1),
... None,
... (-2)))
'geca'
This is so much longer that one would be tempted to inject the Python version.
Unfortunately, the rest of the expression is often easier to write in Lissp.
You can usually work around this by using let
to give an easily-injectable name to a complex operand,
but that adds significant overhead.
#> (let (x "abcdefg") .#"x[-1::-2]")
>>> # let
... (lambda x=('abcdefg'):x[-1::-2])()
'geca'
We need a better abstraction.
Typically, in a Python function call, optional arguments would be skipped, and the remainder passed by keyword.
slice(-1, None, -2)
slice(-1, step=-2) # This doesn't work, but a new def could do it!
The slice builtin doesn't support this, and, as you can see,
it wouldn't help much anyway,
saving only one character (or perhaps a few more with shorter names).
In the more compact slice notation, the omitted stop argument is implied by the colons, and the final argument is still passed positionally, without the overhead of an explicit name.
foo[-1::-2]
Doubling commas to imply omission like this would be a syntax error in Python.
slice(-1,,-2)
It's not an option for Lissp either, even with macros, because arguments are separated with whitespace. We could add delimiters, but they'd need spaces around them as well.
Slice Notation as Object#
Slice notation really is hard to beat here, even in Python.
It would be nice if we could take just that part of it,
but it only works in the operator context.
Python does have a way to convert it: the __getitem__ method.
(More bundled macros incoming. Search Hissp's docs if you can't figure out what they do.)
#> (define slicer ((type 'Slicer () (% '__getitem__ XY#Y))))
>>> # define
... __import__('builtins').globals().update(
... slicer=type(
... 'Slicer',
... (),
... # QzPCENT_
... (lambda x0,x1:{x0:x1})(
... '__getitem__',
... (lambda X,Y:Y)))())
#> .#"slicer[-1::-2]"
>>> slicer[-1::-2]
slice(-1, None, -2)
#> (getitem "abcdefg" .#"slicer[-1::-2]")
>>> getitem(
... ('abcdefg'),
... slicer[-1::-2])
'geca'
Getting better, but not actually shorter yet.
slice(-1, None, -2)
slice(-1, step=-2)
.#"slicer[-1::-2]"
With shorter names, we see there's a ways to go yet.
S(-1,N,-2) # S, N = slice, None
S(-1,c=-2) # S = lambda a=None, b=None, c=None: slice(a, b, c)
.#"S[-1::-2]" ; (define S slicer)
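For reference, the Slicer helper defined earlier in this section can be written in plain Python roughly as follows. This mirrors the compiled output shown above, but with a dict literal and named lambda parameters chosen for readability.

```python
# A class whose __getitem__ simply returns whatever key it was given.
slicer = type("Slicer", (), {"__getitem__": lambda self, key: key})()

print(slicer[-1::-2])             # slice(-1, None, -2)
print("abcdefg"[slicer[-1::-2]])  # geca
print(slicer[3])                  # 3
```

Because the subscription operator builds the slice object before calling `__getitem__`, the instance hands back exactly what the concise notation produced.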
Time for Macros#
We can remove the getitem overhead by using the bundled
get# macro to make an itemgetter.
#> (get#.#"slicer[-1::-2]" "abcdefg")
>>> __import__('operator').itemgetter(
... slicer[-1::-2])(
... ('abcdefg'))
'geca'
Notice we have two reader macros in a row now: get# and .#.
We could consolidate these with a single reader macro.
Macros can expand to any Python object at the Hissp level, including code strings.
The slicer part never changes,
so we could include that in the expansion.
And, as we learned earlier, we can often demunge a symbol instead of using a string,
although you have to be careful.
(defmacro S\# e
`(op#itemgetter ,(.format "slicer{}" (hissp..demunge e))))
#> (S#[-1::-2] "abcdefg")
>>> __import__('operator').itemgetter(
... slicer[-1::-2])(
... ('abcdefg'))
'geca'
Compare.
(operator..getitem "abcdefg" (slice -1 None -2)) ; No macros.
(S#[-1::-2] "abcdefg") ; Slice-getter literal.
"abcdefg"[-1::-2] # Python slice notation.
We have made a lot of progress. This is pretty good. Python is better. There's still room for improvement. Check this out.
#> '[ ; This is a symbol too!
>>> 'QzLSQB_'
'QzLSQB_'
Lissp doesn't care, but Parinfer likes to keep [] and {} balanced.
They're literal notation in Clojure, and sometimes used paired in other Lisps.
Currently, best practice is to keep them balanced, even in symbols,
but they're OK individually if you escape them.
Also, the reader macro is a bit sloppy.
Best practice is (usually) to surround string injections with ().
Sometimes it matters, and macros don't know their expansion context.
#> (.bit_length .#"7")
>>> 7.bit_length()
Traceback (most recent call last):
...
SyntaxError: invalid syntax
#> (.bit_length .#"(7)")
>>> (7).bit_length()
3
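The same failure can be reproduced without Lissp by evaluating the two code strings directly. `try_eval` is a hypothetical helper that reports a syntax error instead of raising it.

```python
def try_eval(code):
    """Evaluate a Python expression string, or name the error if it won't parse."""
    try:
        return eval(code)
    except SyntaxError:
        return "SyntaxError"


print(try_eval("7.bit_length()"))    # SyntaxError
print(try_eval("(7).bit_length()"))  # 3
```

Without the parentheses, Python reads `7.` as the start of a float literal, so the attribute access never parses. A macro that emits a code string can't know whether its expansion will land in such a context, which is the argument for always adding the parentheses.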
And slicer is only valid where that's a global and hasn't been shadowed by a local.
This means the macro wouldn't work in another module,
and might subtly break if someone uses the wrong word.
Normally templates qualify symbols to avoid these problems,
but since the slicer identifier was part of the string,
that never had a chance to happen.
Yet another reason to be cautious with string injections.
You usually want to run Hissp objects through readerless
before embedding them in a code string.
This lets the compiler do the conversion to Python.
When run in a macro, the compiler will use the appropriate namespace:
its expansion context, not (necessarily) its definition context.
#> `slicer ; qualified
>>> '__main__..slicer'
'__main__..slicer'
#> (hissp..readerless `slicer) ; Qualified and compiled to code string.
>>> __import__('hissp').readerless(
... '__main__..slicer')
"__import__('builtins').globals()['slicer']"
Notice that even though the symbol was a string already, compiling did some extra processing in this context.
Putting that all together we get
(defmacro \[\# e
`(op#itemgetter ,(.format "({}[{})" (hissp..readerless `slicer)
(hissp..demunge e))))
Notice that this requires the ] in the symbol it's applied to.
This keeps it balanced. It also pretty well ensures the argument is a symbol
or at least a control word. Numbers don't contain a ].
Now look at what we can do.
#> ([#-1::-2]"abcdefg")
>>> __import__('operator').itemgetter(
... (__import__('builtins').globals()['slicer'][-1::-2]))(
... ('abcdefg'))
'geca'
#> ([#3]"abcdefg") ; Not restricted to slices.
>>> __import__('operator').itemgetter(
... (__import__('builtins').globals()['slicer'][3]))(
... ('abcdefg'))
'd'
#> (-> (@ "abc") ([#0]) ([#::-1]))
>>> # Qz_QzGT_
... __import__('operator').itemgetter(
... (__import__('builtins').globals()['slicer'][::-1]))(
... __import__('operator').itemgetter(
... (__import__('builtins').globals()['slicer'][0]))(
... # QzAT_
... (lambda *xs:[*xs])(
... ('abc'))))
'cba'
Amazing. Not quite as concise as Python, but really close. To within a few characters.
([#-1::-2]"abcdefg")
"abcdefg"[-1::-2]
But our version is more powerful.
It's a function object even when detached from the lookup context.
And, as a macro we programmed ourselves, it's entirely customizable.
It is possible to do a little better with ! by eliminating the () and [].
That gets us to within one character, but it's probably not worth it.
This is good enough.
Let's review. This section covered a number of advanced techniques.
Brackets in symbols. A code string macro leveraging partial Python syntax.
The need for parentheses in injections.
Demunging.
Calls to runtime helpers, even in other modules.
Qualifying symbols with template quote.
Compiling in macros using readerless.
This macro produced a code injection. We already talked about why you should be reluctant to use those. This one is probably worth it. Python's slice notation is that good. The alternative was injecting both operands, or using a far more verbose notation. This macro lets us use a concise notation from Python while injecting a minimal amount.
But what if we had nested a [# usage inside our X# function literals?
This would usually not be a problem since the slice arguments are numeric literals.
But what if one of the slice arguments was X?
That's still valid Python.
Normally, that would work in an injection.
But if that's the only X, X# won't be able to find it.
Injections are somewhat opaque. Sometimes this is OK.
The [# macro works best on simple literal arguments,
and works OK on local variables and their attributes:
the kind of things you wouldn't bother putting spaces around in Python.
These cases are very common.
For more complex expressions, it's probably a bad idea.
You lose out on munging, module handles, and any macros.
For those cases, the extra overhead for using slice is hardly noticeable.
Use the right tool for the job.
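For reference, a small Python sketch of the equivalence this advice relies on: slice notation is sugar for subscripting with a slice object, so an explicit slice call loses nothing except brevity. The `offsets` dictionary is a hypothetical stand-in for a more complex expression.

```python
from operator import itemgetter

data = "abcdefg"
offsets = {"start": -1, "step": -2}  # hypothetical stand-in for a complex expression

# Slice notation and an explicit slice object select the same items.
same = data[-1::-2] == data[slice(-1, None, -2)]

# With an explicit slice() call, every argument is an ordinary expression.
getter = itemgetter(slice(offsets["start"], None, offsets["step"]))
result = getter(data)
print(same, result)
```

In Lissp, the arguments to an explicit slice call are ordinary forms, so munging, module handles, and macros all still apply to them.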
A Simpler Solution
The itemgetter function is a function factory;
it's a function to make functions, at run time.
Using run time helpers like this is an important technique for writing macros,
but sometimes more work can be done at compile time.
If we write a lambda form ourselves,
it's not necessary to separate the operands for the subscription operator,
which means we don't need the Slicer helper class either.
Compare.
#> ((op#itemgetter (slice -1 None -2)) "abcdefg")
>>> __import__('operator').itemgetter(
... slice(
... (-1),
... None,
... (-2)))(
... ('abcdefg'))
'geca'
#> ((lambda a .#"a[-1::-2]") "abcdefg")
>>> (lambda a:a[-1::-2])(
... ('abcdefg'))
'geca'
We'd be giving up a little transparency.
Notice the nice repr provided by operator.itemgetter.
#> (op#itemgetter (slice -1 None -2))
>>> __import__('operator').itemgetter(
... slice(
... (-1),
... None,
... (-2)))
operator.itemgetter(slice(-1, None, -2))
The lambda object, on the other hand, is opaque.
#> (lambda a .#"a[-1::-2]")
>>> (lambda a:a[-1::-2])
<function <lambda> at 0x...>
But if we can eliminate the Slicer, this is probably worth it.
We can pretty easily expand to this form. Our previous macro was almost there.
(defmacro \[\# e
`(lambda ,'a ,(.format "({}[{})" 'a (hissp..demunge e))))
It works.
#> (.\[\# _macro_ '-1::-2]) ; shows Hissp expansion for [#-1::-2]
>>> _macro_.QzLSQB_QzHASH_(
... 'Qz_1QzCOLON_QzCOLON_Qz_2QzRSQB_')
('lambda', 'a', '(a[-1::-2])')
#> ([#-1::-2] "abcdefg")
>>> (lambda a:(a[-1::-2]))(
... ('abcdefg'))
'geca'
Maybe even better than expected.
#> ([#1][1] '(foo bar))
>>> (lambda a:(a[1][1]))(
... ('foo',
... 'bar',))
'a'
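The string-building half of this expansion can be reproduced in plain Python. This is a sketch only: `expand` is a hypothetical helper that mirrors the format call in the macro body, and the tail strings are assumed to be already demunged.

```python
template = "({}[{})"

def expand(param, tail):
    """Build the lambda body the way the macro's format call does."""
    return template.format(param, tail)

body = expand("a", "-1::-2]")
print(body)  # (a[-1::-2])

# Anything after the closing bracket rides along as more Python.
chained = expand("a", "1][1]")
print(chained)  # (a[1][1])

func = eval("lambda a:" + chained)
print(func(("foo", "bar")))  # a
```

The template only supplies the opening bracket, so the tail is responsible for closing it.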
Everything in the atom after the # is Python code.
The initial [ does have to be closed,
but after that,
other Python expressions work too.
The format string could even be simplified to "(a[{})",
but there's a subtle flaw which is reason enough not to follow through with that.
#> (let (a -1)
#.. ([#a::-2] "abcdefg"))
>>> # let
... (lambda a=(-1):
... (lambda a:(a[a::-2]))(
... ('abcdefg')))()
Traceback (most recent call last):
...
TypeError: slice indices must be integers or None or have an __index__ method
Yet it works fine with b.
#> (let (b -1)
#.. ([#b::-2] "abcdefg"))
>>> # let
... (lambda b=(-1):
... (lambda a:(a[b::-2]))(
... ('abcdefg')))()
'geca'
See the problem?
Look at the Python compilation.
Our slicer version didn't have this flaw.
Auto-qualification isn't compatible with local variables, but since we didn't want this accidental anaphor, we should suppress the qualification with a gensym instead of a symbol interpolation.
(defmacro \[\# e
`(lambda ($#G) ,(.format "({}[{})" '$#G (hissp..demunge e))))
Read this carefully.
$# only works inside of templates,
but it's still allowed in an unquote context inside of a template;
the template context hasn't completely turned off.
unquote_context and gensym_context are both tracked by the reader.
Since we want the symbol itself,
not its value,
we need to quote it with '.
Remember, reader macros apply inside-out,
like functions,
so the $# macro applies before the ' does.
It works.
#> ([#-1::-2] "abcdefg")
>>> (lambda _QzAVTK4YRWz_G:(_QzAVTK4YRWz_G[-1::-2]))(
... ('abcdefg'))
'geca'
Notice the gensym in the expansion (your template hash may be different than mine),
which would prevent the kind of accidental name collision we saw in our let a example.
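The collision and its cure can also be sketched in plain Python. Everything here is illustrative: `toy_gensym` uses a hypothetical counter scheme rather than Hissp's template-hash naming, and `slice_lambda` is a made-up helper that compiles the same kind of string the macro produces.

```python
import itertools

_counter = itertools.count(1)

def toy_gensym(prefix="_G"):
    """Return a fresh name (hypothetical counter scheme, not Hissp's hash-based one)."""
    return f"{prefix}{next(_counter)}"

def slice_lambda(tail, param, env):
    """Compile "(lambda param:(param[tail))" with `env` as its globals."""
    return eval(f"(lambda {param}:({param}[{tail}))", env)

env = {"a": -1}  # the user's variable, as in the let example

captured = slice_lambda("a::-2]", "a", env)
try:
    captured("abcdefg")
    outcome = "no error"
except TypeError:
    outcome = "TypeError"  # the parameter `a` shadowed the user's `a`

hygienic = slice_lambda("a::-2]", toy_gensym(), dict(env))
print(outcome)
print(hygienic("abcdefg"))
```

With a generated parameter name, the user's `a` resolves to the outer variable as intended.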
And this [# definition is the bundled version, sans docstring.
It has no dependencies; no helpers.
Now you understand how it works, know its limitations,
its tricks,
and how to implement it yourself.
Superpower stolen.