\chapter{{\tt syntax-case}}
\label{syntaxcasechapter}
% should include algebra for marks and substitutions
% ---see Waddell's dissertation or POPL '99 module paper
% but don't want to rule out van Tonder's shallow-binding approach
The \defrsixlibrary{syntax-case} library
provides
support for writing low-level macros
in a high-level style, with automatic syntax checking, input
destructuring, output restructuring, maintenance of lexical scoping
and referential transparency (hygiene), and support for controlled
identifier capture.
\section{Hygiene}
\label{hygienesection}
% hygiene condition for macro expansion
% (Kohlbecker, E.E., Friedman, D.P., Felleisen, M., Duba, B. 'Hygienic macro expansion' (1986))
% "Generated identifiers that become binding instances in the completely
% expanded program must only bind variables that are generated at the same
% transcription step."
Barendregt's \emph{hygiene condition}~\cite{barendregt} for the
lambda calculus is an informal notion that requires the free variables of
an expression $N$ that is to be substituted into another expression $M$ not to
be captured by bindings in $M$ when such capture is not intended.
Kohlbecker et al.~\cite{hygienic} propose a corresponding
\emph{hygiene condition for macro expansion} that applies in all situations
where capturing is not explicit:
``Generated identifiers that become binding instances in
the completely expanded program must only bind variables that
are generated at the same transcription step''.
In the terminology of this document, the ``generated identifiers'' are
those introduced by a transformer rather than those present in the form
passed to the transformer, and a ``macro transcription step'' corresponds
to a single call by the expander to a transformer.
Also, the hygiene condition applies to all introduced bindings rather than
to introduced variable bindings alone.
This leaves open what happens to an introduced identifier that appears
outside the scope of a binding introduced by the same call.
Such an identifier refers to the lexical binding in effect where it
appears (within a {\cf syntax} \hyper{template};
see section~\ref{syntaxcasesection}) inside the transformer body or one of
the helpers it calls.
This is essentially the referential transparency property described
by Clinger and Rees~\cite{macrosthatwork}.
Thus, the hygiene condition can be restated as follows:
\begin{quotation}
\noindent
A binding for an identifier introduced into the output of a transformer
call from the expander must capture only references to the identifier
introduced into the output of the same transformer call.
A reference to an identifier introduced into the output of a transformer
refers to the closest enclosing binding for the introduced identifier or,
if it appears outside of any enclosing binding for the introduced
identifier, the closest enclosing lexical binding where the identifier
appears (within a {\cf syntax} \hyper{template})
inside the transformer body or one of the helpers it calls.
\end{quotation}
Explicit captures are handled via {\cf datum\coerce{}syntax}; see
section~\ref{conversionssection}.
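As an informal illustration of the restated condition, consider the
following sketch.
The macro name {\cf my-or2} is hypothetical and is used only for this
example.
In the first use, the binding for {\cf t} introduced by the transformer
does not capture the {\cf t} supplied in the input.
In the second, the introduced reference to {\cf if} refers to the binding
in effect where the transformer was written, not to the local variable
named {\cf if}.
\begin{scheme}
(define-syntax my-or2
  (lambda (x)
    (syntax-case x ()
      [(\_ e1 e2)
       \#'(let ([t e1]) (if t t e2))])))

(let ([t 5])
  (my-or2 \schfalse{} t))          \ev 5
(let ([if list])
  (my-or2 \schfalse{} 2))          \ev 2%
\end{scheme}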
Operationally, the expander can maintain hygiene with the help of
\emph{marks\mainindex{mark}} and \emph{substitutions\mainindex{substitution}}.
Marks are applied selectively by the expander to the output of each
transformer it invokes, and substitutions are applied to the portions
of each binding form that are supposed to be within the scope of the bound
identifiers.
Marks are used to distinguish like-named identifiers that are
introduced at different times (either present in the source or introduced
into the output of a particular transformer call), and substitutions are
used to map identifiers to their expand-time values.
Each time the expander encounters a macro use, it applies an
\defining{antimark} to the input form, invokes the associated transformer,
then applies a fresh mark to the output.
Marks and antimarks cancel, so the portions of the input that appear in
the output are effectively left unmarked, while the portions of the output
that are introduced are marked with the fresh mark.
Each time the expander encounters a binding form it creates a set of
substitutions, each mapping one of the (possibly marked) bound identifiers
to information about the binding.
(For a {\cf lambda} expression, the expander might map each bound
identifier to a representation of the formal parameter in the output of
the expander.
For a {\cf let-syntax} form, the expander might map each bound
identifier to the associated transformer.)
These substitutions are applied to the portions of the input form in
which the binding is supposed to be visible.
Marks and substitutions together form a \defining{wrap} that is layered on the
form being processed by the expander and pushed down toward the leaves as
necessary.
A wrapped form is referred to as a \defining{wrapped syntax object}.
Ultimately, the wrap may rest on a leaf that represents an identifier, in
which case the wrapped syntax object is also referred to
as an \emph{identifier}.
An identifier contains a name along with the wrap.
(Names are typically represented by symbols.)
When a substitution is created to map an identifier to an expand-time
value, the substitution records the name of the identifier and
the set of marks that have been applied to that identifier, along
with the associated expand-time value.
The expander resolves identifier references by looking for the latest
matching substitution to be applied to the identifier, i.e., the outermost
substitution in the wrap whose name and marks match the name and
marks recorded in the substitution.
The name matches if it is the same name (if using symbols, then by
{\cf eq?}), and the marks match if the marks recorded with the
substitution are the same as those that appear \emph{below} the
substitution in the wrap, i.e., those that were applied \emph{before} the
substitution.
Marks applied after a substitution, i.e., those that appear over the
substitution in the wrap, are not relevant and are ignored.
An algebra that defines how marks and substitutions work more precisely is
given in section~2.4 of Oscar Waddell's PhD thesis~\cite{Waddellphd}.
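The following trace is an informal sketch of how marks and substitutions
might resolve the identifiers in a small expansion.
It assumes a hypothetical macro {\cf my-or2} whose transformer rewrites
{\cf (my-or2 a b)} to {\cf (let ([t a]) (if t t b))}.
The mark name {\cf m}, the substitution names {\cf S1} and {\cf S2}, and
the renamed variables {\cf t1} and {\cf t2} are likewise illustrative; an
actual expander may represent them differently.
\begin{scheme}
; input
(let ([t 5]) (my-or2 \schfalse{} t))

; 1. let binds t: substitution S1 = (t, no marks) -> t1
;    is applied to the body.
; 2. my-or2 is a macro use: an antimark is applied to the
;    input, the transformer runs, and the fresh mark m is
;    applied to the output.  The t from the input ends up
;    unmarked; the introduced let, t, and if carry m.
; 3. The introduced let binds t: substitution
;    S2 = (t, marks m) -> t2 is applied to its body.
; 4. Introduced t: the wrap is S2 over m over S1.  S2
;    records m, and m lies below S2, so S2 matches and
;    the identifier resolves to t2.
; 5. Input t: the wrap is S2 over S1 with no marks.  S2
;    records m but no marks lie below it, so S2 does not
;    match; S1 matches, and the identifier resolves to t1.

; output
(let ([t1 5]) (let ([t2 \schfalse{}]) (if t2 t2 t1)))%
\end{scheme}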
\section{Syntax objects}
\label{syntaxobjectssection}
A \defining{syntax object} is a representation of a Scheme form that contains
contextual information about the form in addition to its structure.
This contextual information is used by the expander to maintain
lexical scoping and may also be used by an implementation to maintain
source-object correlation~\cite{syntacticabstraction}.
A syntax object may be wrapped, as described in section~\ref{hygienesection}.
It may also be unwrapped, fully or partially, i.e., consist of list and
vector structure with wrapped syntax objects or nonsymbol values at the
leaves.
More formally, a syntax object is:
\begin{itemize}
\item a pair of syntax objects,
\item a vector of syntax objects,
\item a nonpair, nonvector, nonsymbol value, or
\item a wrapped syntax object.
\end{itemize}
The distinction between the terms ``syntax object'' and ``wrapped syntax
object'' is important.
For example, when invoked by the expander, a transformer
(section~\ref{transformerssection}) must accept a wrapped syntax object but
may return any syntax object, including an unwrapped syntax object.
Syntax objects representing identifiers are always wrapped and are distinct
from other types of values.
Wrapped syntax objects that are not identifiers may or may not be distinct
from other types of values.
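The distinction can be seen in a transformer that builds its result with
ordinary list operations.
In the following sketch (the macro name {\cf swap-args} is hypothetical;
{\cf syntax-case} and {\cf syntax} are described in
section~\ref{syntaxcasesection}), the transformer receives a wrapped syntax
object and returns an unwrapped list whose elements are wrapped syntax
objects or input subforms.
Such a list is itself a syntax object by the definition above.
\begin{scheme}
(define-syntax swap-args
  (lambda (x)
    (syntax-case x ()
      [(\_ f a b)
       (list \#'f \#'b \#'a)])))

(swap-args - 1 10)          \ev 9%
\end{scheme}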
\section{Transformers}
\label{transformerssection}
In {\cf define-syntax} (report
section~\extref{report:define-syntax}{Syntax definitions}), {\cf
let-syntax}, and {\cf letrec-syntax} forms (report
section~\extref{report:let-syntax}{Binding constructs for syntactic
keywords}), a binding for a syntactic keyword is an expression
that evaluates to a \defining{transformer}\index{macro transformer}.
A transformer is a \defining{transformation procedure} or a
\defining{variable transformer}.
A transformation procedure is a procedure that must accept one
argument, a wrapped syntax object (section~\ref{syntaxobjectssection})
representing the input, and return a syntax object
(section~\ref{syntaxobjectssection}) representing the output.
The transformer is called by the expander whenever a reference to
a keyword with which it has been associated is found.
If the keyword appears in the car of a list-structured
input form, the transformer receives the entire list-structured
form, and its output replaces the entire form.
Except with variable transformers (see below),
if the keyword is found in any other definition or expression
context, the transformer receives a wrapped syntax object representing
just the keyword reference, and its output replaces just the reference.
Except with variable transformers, an exception with condition
type {\cf\&syntax} is raised if the keyword appears on the left-hand side
of a {\cf set!} expression.
\begin{entry}{%
\proto{make-variable-transformer}{ proc}{procedure}}
\domain{\var{Proc} should accept one argument,
a wrapped syntax object, and return a syntax object.}
The {\cf make-variable-transformer} procedure creates a
\defining{variable transformer}.
A variable transformer is like an ordinary transformer except
that, if a keyword associated with a variable transformer appears on
the left-hand side of a {\cf set!} expression, an exception is
not raised.
Instead, \var{proc} is called with a
wrapped syntax object representing the entire {\cf set!} expression as
its argument, and its return value replaces the entire {\cf set!}
expression.
\implresp The implementation must check the restrictions on \var{proc}
only to the extent performed by applying it as described.
An
implementation may check whether \var{proc} is an appropriate argument
before applying it.
\end{entry}
\section{Parsing input and producing output}
\label{syntaxcasesection}
Transformers can destructure their input with {\cf syntax-case} and rebuild
their output with {\cf syntax}.
\begin{entry}{%
\pproto{(syntax-case \hyper{expression} (\hyper{literal} \dotsfoo)}{\exprtype}
\mainschindex{syntax-case}{\tt\obeyspaces%
\hspace{2em}\hyper{syntax-case clause} \dotsfoo)}\\
\litprotonoindex{\_}
\litprotonoindex{...}}\schindex{\_}\schindex{...}
\syntax Each \hyper{literal} must be an identifier.
Each \hyper{syntax-case clause} must take one of the following two forms.
\begin{scheme}
(\hyper{pattern} \hyper{output expression})
(\hyper{pattern} \hyper{fender} \hyper{output expression})%
\end{scheme}
\hyper{Fender} and \hyper{output expression} must be
\hyper{expression}s.
A \hyper{pattern} is an identifier, constant, or one of the following.
\begin{schemenoindent}
(\hyper{pattern} \ldots)
(\hyper{pattern} \hyper{pattern} \ldots . \hyper{pattern})
(\hyper{pattern} \ldots \hyper{pattern} \hyper{ellipsis} \hyper{pattern} \ldots)
(\hyper{pattern} \ldots \hyper{pattern} \hyper{ellipsis} \hyper{pattern} \ldots . \hyper{pattern})
\#(\hyper{pattern} \ldots)
\#(\hyper{pattern} \ldots \hyper{pattern} \hyper{ellipsis} \hyper{pattern} \ldots)%
\end{schemenoindent}
An \hyper{ellipsis} is the identifier ``{\cf ...}'' (three periods).\schindex{...}
An identifier appearing within a \hyper{pattern} may be an underscore
(~{\cf \_}~), a literal identifier listed in the list of literals
{\cf (\hyper{literal} \dotsfoo)}, or an ellipsis (~{\cf ...}~).
All other identifiers appearing within a \hyper{pattern} are
\textit{pattern variables\mainindex{pattern variable}}.
It is a syntax violation if an ellipsis or underscore appears in {\cf (\hyper{literal} \dotsfoo)}.
{\cf \_} and {\cf ...} are the same as in the \rsixlibrary{base} library.
Pattern variables match arbitrary input subforms and
are used to refer to elements of the input.
It is a syntax violation if the same pattern variable appears more than once in a
\hyper{pattern}.
Underscores also match arbitrary input subforms but are not pattern variables
and so cannot be used to refer to those elements.
Multiple underscores may appear in a \hyper{pattern}.
A literal identifier matches an input subform if and only if the input
subform is an identifier and either both its occurrence in the input
expression and its occurrence in the list of literals have the same
lexical binding, or the two identifiers have the same name and both have
no lexical binding.
A subpattern followed by an ellipsis can match zero or more elements of
the input.
More formally, an input form $F$ matches a pattern $P$ if and only if
one of the following holds:
\begin{itemize}
\item $P$ is an underscore (~{\cf \_}~).
\item $P$ is a pattern variable.
\item $P$ is a literal identifier
and $F$ is an equivalent identifier in the
sense of {\cf free-identifier=?}
(section~\ref{identifierpredicatessection}).
\item $P$ is of the form
{\cf ($P_1$ \dotsfoo{} $P_n$)}
and $F$ is a list of $n$ elements that match $P_1$ through
$P_n$.
\item $P$ is of the form
{\cf ($P_1$ \dotsfoo{} $P_n$ . $P_x$)}
and $F$ is a list or improper list of $n$ or more elements
whose first $n$ elements match $P_1$ through $P_n$
and
whose $n$th cdr matches $P_x$.
\item $P$ is of the form
{\cf ($P_1$ \dotsfoo{} $P_k$ $P_e$ \hyper{ellipsis} $P_{m+1}$ \dotsfoo{} $P_n$)},
where \hyper{ellipsis} is the identifier {\cf ...}
and $F$ is a proper list of $n$
elements whose first $k$ elements match $P_1$ through $P_k$,
whose next $m-k$ elements each match $P_e$,
and
whose remaining $n-m$ elements match $P_{m+1}$ through $P_n$.
\item $P$ is of the form
{\cf ($P_1$ \dotsfoo{} $P_k$ $P_e$ \hyper{ellipsis} $P_{m+1}$ \dotsfoo{} $P_n$ . $P_x$)},
where \hyper{ellipsis} is the identifier {\cf ...}
and $F$ is a list or improper list of $n$
elements whose first $k$ elements match $P_1$ through $P_k$,
whose next $m-k$ elements each match $P_e$,
whose next $n-m$ elements match $P_{m+1}$ through $P_n$,
and
whose $n$th and final cdr matches $P_x$.
\item $P$ is of the form
{\cf \#($P_1$ \dotsfoo{} $P_n$)}
and $F$ is a vector of $n$ elements that match $P_1$ through
$P_n$.
\item $P$ is of the form
{\cf \#($P_1$ \dotsfoo{} $P_k$ $P_e$ \hyper{ellipsis} $P_{m+1}$ \dotsfoo{} $P_n$)},
where \hyper{ellipsis} is the identifier {\cf ...}
and $F$ is a vector of $n$ or more elements
whose first $k$ elements match $P_1$ through $P_k$,
whose next $m-k$ elements each match $P_e$,
and
whose remaining $n-m$ elements match $P_{m+1}$ through $P_n$.
\item $P$ is a pattern datum (any nonlist, nonvector, nonsymbol
datum) and $F$ is equal to $P$ in the sense of the
{\cf equal?} procedure.
\end{itemize}
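The rules above can be exercised directly, since {\cf syntax-case} is an
ordinary expression.
The following sketch matches a few quoted syntax objects against patterns
and converts the results back to data with {\cf syntax->datum}
(section~\ref{conversionssection}) so that they can be displayed.
The inputs are arbitrary.
\begin{scheme}
(syntax-case \#'(1 2 3 4) ()
  [(a b ... c)
   (syntax->datum \#'(c a (b ...)))]) \lev (4 1 (2 3))

(syntax-case \#'(1 2 3) ()
  [(a . rest)
   (syntax->datum \#'rest)]) \lev (2 3)

(syntax-case \#'\#(1 2 3) ()
  [\#(a b ...)
   (syntax->datum \#'(b ...))]) \lev (2 3)

(syntax-case \#'(1 2) ()
  [(a b c) 'three]
  [(a b) 'two]) \lev two%
\end{scheme}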
\semantics
A {\cf syntax-case} expression first evaluates \hyper{expression}.
It then attempts to match
the \hyper{pattern} from the first \hyper{syntax-case clause} against the resulting value,
which is unwrapped as necessary to perform the match.
If the pattern matches the value and no
\hyper{fender} is present,
\hyper{output expression} is evaluated and its value returned as the
value of the {\cf syntax-case} expression.
If the pattern does not match the value, {\cf syntax-case} tries
the second \hyper{syntax-case clause}, then the third, and so on.
It is a syntax violation if the value does not match any of the patterns.
If the optional \hyper{fender} is present, it serves as an additional
constraint on acceptance of a clause.
If the \hyper{pattern} of a given \hyper{syntax-case clause} matches the input value,
the corresponding \hyper{fender} is evaluated.
If \hyper{fender} evaluates to a true value, the clause is accepted;
otherwise, the clause is rejected as if the pattern had failed to match
the value.
Fenders are logically a part of the matching process, i.e., they
specify additional matching constraints beyond the basic structure of
the input.
Pattern variables contained within a clause's
\hyper{pattern} are bound to the corresponding pieces of the input
value within the clause's \hyper{fender} (if present) and
\hyper{output expression}.
Pattern variables can be referenced only within {\cf syntax}
expressions (see below).
Pattern variables occupy the same name space as program variables and
keywords.
If the {\cf syntax-case} form is in tail context, the \hyper{output
expression}s are also in tail position.
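As a small sketch of how fenders interact with clause order, the
hypothetical macro {\cf classify} below uses {\cf identifier?}
(section~\ref{identifierpredicatessection}) as a fender.
When the fender returns \schfalse{}, the clause is rejected and matching
continues with the next clause.
Note that the pattern variable {\cf e} is referenced in the fender only
within a {\cf syntax} expression.
\begin{scheme}
(define-syntax classify
  (lambda (x)
    (syntax-case x ()
      [(\_ e)
       (identifier? \#'e)
       \#'(quote identifier)]
      [(\_ e)
       \#'(quote other)])))

(classify foo)              \ev identifier
(classify (+ 1 2))          \ev other%
\end{scheme}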
\end{entry}
\begin{entry}{%
\proto{syntax}{ \hyper{template}}{\exprtype}}
\begin{note}
{\cf \#'\hyper{template}} is equivalent to {\cf (syntax
\hyper{template})}.
\end{note}
A {\cf syntax} expression is similar to a {\cf quote} expression
except that (1) the values of pattern variables appearing within
\hyper{template} are inserted into \hyper{template}, (2) contextual
information associated both with the input and with the template is
retained in the output to support lexical scoping, and (3) the value
of a {\cf syntax} expression is a syntax object.
A \hyper{template} is a pattern variable, an identifier that
is not a pattern
variable, a pattern datum, or one of the following.
\begin{scheme}
(\hyper{subtemplate} \ldots)
(\hyper{subtemplate} \ldots . \hyper{template})
\#(\hyper{subtemplate} \ldots)%
\end{scheme}
A \hyper{subtemplate} is a \hyper{template} followed by zero or more ellipses.
The value of a {\cf syntax} form is a copy of \hyper{template} in which
the pattern variables appearing within the template are replaced with
the input subforms to which they are bound.
Pattern data and identifiers that are not pattern variables
or ellipses are copied directly into the output.
A subtemplate followed by an ellipsis expands
into zero or more occurrences of the subtemplate.
Pattern variables that occur in subpatterns followed by one or more
ellipses may occur only in subtemplates that are
followed by (at least) as many ellipses.
These pattern variables are replaced in the output by the input
subforms to which they are bound, distributed as specified.
If a pattern variable is followed by more ellipses in the subtemplate
than in the associated subpattern, the input form is replicated as
necessary.
The subtemplate must contain at least one pattern variable from a
subpattern followed by an ellipsis, and for at least one such pattern
variable, the subtemplate must be followed by exactly as many ellipses as
the subpattern in which the pattern variable appears.
(Otherwise, the expander would not be able to determine how many times the
subform should be repeated in the output.)
It is a syntax violation if the constraints of this paragraph are not met.
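The ellipsis rules can be illustrated with a pattern variable {\cf k} at
ellipsis depth one and a pattern variable {\cf v} at depth two.
The following is a sketch with arbitrary input, using {\cf syntax->datum}
(section~\ref{conversionssection}) only to display the results.
\begin{scheme}
(syntax-case \#'((a 1 2) (b 3)) ()
  [((k v ...) ...)
   (syntax->datum
     \#'((k ...)
        ((v ...) ...)
        (v ... ...)))])
  \lev ((a b) ((1 2) (3)) (1 2 3))%
\end{scheme}
Here {\cf (v ... ...)} follows {\cf v} with two ellipses directly, so the
elements from both levels are spliced into a single list.
A template such as {\cf (v ...)} would be a syntax violation in this
clause, since {\cf v} must be followed by at least two ellipses.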
A template of the form
{\cf (\hyper{ellipsis} \hyper{template})} is identical to \hyper{template}, except that
ellipses within the template have no special meaning.
That is, any ellipses contained within \hyper{template} are
treated as ordinary identifiers.
In particular, the template {\cf (... ...)} produces a single
ellipsis.
This allows macro uses to expand into forms containing
ellipses.
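A minimal sketch of the escape follows.
The names {\cf define-list-maker} and {\cf mk} are hypothetical, and the
example assumes that the bindings used by the generated transformer are
available at the phase where it runs.
The outer template uses {\cf (... ...)} so that the generated
{\cf syntax-case} clause contains an ordinary ellipsis.
\begin{scheme}
(syntax->datum \#'(a (... ...)))  \ev (a ...)

(define-syntax define-list-maker
  (lambda (x)
    (syntax-case x ()
      [(\_ name)
       \#'(define-syntax name
           (lambda (y)
             (syntax-case y ()
               [(\_ arg (... ...))
                \#'(list arg (... ...))])))])))

(define-list-maker mk)
(mk 1 2 3)                  \ev (1 2 3)%
\end{scheme}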
\label{wrappingrules}
The output produced by {\cf syntax} is wrapped or unwrapped according to
the following rules.
\begin{itemize}
\item the copy of {\cf (\hyperi{t} . \hyperii{t})} is a pair if \hyperi{t}
or \hyperii{t} contain any pattern variables,
\item the copy of {\cf (\hyper{t} \hyper{ellipsis})} is a list if \hyper{t}
contains any pattern variables,
\item the copy of {\cf \#(\hyperi{t} ... \hypern{t})} is a vector if any of
\hyperi{t},~\dots,~\hypern{t} contain any pattern variables, and
\item the copy of any portion of \hyper{t} not containing any pattern variables
is a wrapped syntax object.
\end{itemize}
The input subforms inserted in place of the pattern variables are wrapped
if and only if the corresponding input subforms are wrapped.
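One practical consequence of these rules is that list operations can be
applied directly to the value of a template such as {\cf (arg ...)} when
it contains pattern variables.
A brief sketch, with arbitrary input:
\begin{scheme}
(syntax-case \#'(f 1 2 3) ()
  [(\_ arg ...)
   (length \#'(arg ...))]) \lev 3

(syntax-case \#'(f 1 2) ()
  [(\_ a b)
   (pair? \#'(a . b))]) \lev \schtrue{}%
\end{scheme}
By contrast, a template that contains no pattern variables yields a wrapped
syntax object, which may or may not be distinct from a pair, so portable
code should not apply list operations to it.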
\end{entry}
The following definitions of {\cf or} illustrate {\cf syntax-case}
and {\cf syntax}.
The second is equivalent to the first but uses the {\cf \#'}
prefix instead of the full {\cf syntax} form.
\begin{schemenoindent}
(define-syntax or
  (lambda (x)
    (syntax-case x ()
      [(\_) (syntax \schfalse{})]
      [(\_ e) (syntax e)]
      [(\_ e1 e2 e3 ...)
       (syntax (let ([t e1])
                 (if t t (or e2 e3 ...))))])))

(define-syntax or
  (lambda (x)
    (syntax-case x ()
      [(\_) \#'\schfalse{}]
      [(\_ e) \#'e]
      [(\_ e1 e2 e3 ...)
       \#'(let ([t e1])
           (if t t (or e2 e3 ...)))])))%
\end{schemenoindent}
The examples below define \emph{identifier macros\mainindex{identifier
macro}}, macro uses
supporting keyword references that do not necessarily appear in the first
position of a list-structured form.
The second example uses {\cf make-variable-transformer} to handle the case
where the keyword appears on the left-hand side of a
{\cf set!} expression.
\begin{scheme}
(define p (cons 4 5))
(define-syntax p.car
  (lambda (x)
    (syntax-case x ()
      [(\_ . rest) \#'((car p) . rest)]
      [\_  \#'(car p)])))
p.car \ev 4
(set! p.car 15) \ev \exception{\&syntax}

(define p (cons 4 5))
(define-syntax p.car
  (make-variable-transformer
    (lambda (x)
      (syntax-case x (set!)
        [(set! \_ e) \#'(set-car! p e)]
        [(\_ . rest) \#'((car p) . rest)]
        [\_  \#'(car p)]))))
(set! p.car 15)
p.car \ev 15
p \ev (15 . 5)%
\end{scheme}
\section{Identifier predicates}
\label{identifierpredicatessection}
\begin{entry}{%
\proto{identifier?}{ obj}{procedure}}
Returns \schtrue{} if \var{obj} is an identifier, i.e., a
syntax object representing an identifier, and \schfalse{} otherwise.
The {\cf identifier?} procedure is often used within a fender to verify
that certain subforms of an input form are identifiers, as in the
definition of {\cf rec}, which creates self-contained
recursive objects, below.
\begin{scheme}
(define-syntax rec
  (lambda (x)
    (syntax-case x ()
      [(\_ x e)
       (identifier? \#'x)
       \#'(letrec ([x e]) x)])))

(map (rec fact
       (lambda (n)
         (if (= n 0)
             1
             (* n (fact (- n 1))))))
     '(1 2 3 4 5)) \lev (1 2 6 24 120)

(rec 5 (lambda (x) x)) \ev \exception{\&syntax}%
\end{scheme}
\end{entry}
The procedures {\cf bound-identifier=?} and {\cf free-\hp{}identifier=?}
each take two identifier arguments and return \schtrue{} if their
arguments are equivalent and \schfalse{} otherwise.
These predicates are used to compare identifiers according to their
\emph{intended use} as free references or bound identifiers in a given
context.
\begin{entry}{%
\proto{bound-identifier=?}{ \vari{id} \varii{id}}{procedure}}
\domain{\vari{Id} and \varii{id} must be identifiers.}
The procedure {\cf bound-\hp{}identifier=?} returns \schtrue{} if a
binding for one would capture a reference to the other in the output of
the transformer, assuming that the reference appears within the scope of
the binding, and \schfalse{} otherwise.
In general, two identifiers are {\cf bound-identifier=?} only if
both are present in the original program or both are introduced by the
same transformer application
(perhaps implicitly---see {\cf datum\coerce{}syntax}).
Operationally, two identifiers are
considered equivalent by {\cf bound-identifier=?} if and only if they
have the same name and same marks (section~\ref{hygienesection}).
The {\cf bound-identifier=?} procedure can be used for detecting
duplicate identifiers in a binding construct or for other
preprocessing of a binding construct that requires detecting instances
of the bound identifiers.
\end{entry}
\begin{entry}{%
\proto{free-identifier=?}{ \vari{id} \varii{id}}{procedure}}
\domain{\vari{Id} and \varii{id} must be identifiers.}
The {\cf free-identifier=?} procedure returns \schtrue{} if and
only if the two identifiers would resolve to the same binding if both were
to appear in the output of a transformer outside of any bindings inserted
by the transformer.
(If neither of two like-named identifiers resolves to a binding, i.e., both
are unbound, they are considered to resolve to the same binding.)
Operationally, two identifiers are considered equivalent by
{\cf free-identifier=?} if and only if the topmost matching
substitution for each maps to the same binding (section~\ref{hygienesection})
or the identifiers have the same name and no matching substitution.
The {\cf syntax-case} and {\cf syntax-rules} forms internally use
{\cf free-identifier=?} to compare identifiers listed in the literals
list against input identifiers.
\begin{scheme}
(let ([fred 17])
  (define-syntax a
    (lambda (x)
      (syntax-case x ()
        [(\_ id) \sharpsign{}'(b id fred)])))
  (define-syntax b
    (lambda (x)
      (syntax-case x ()
        [(\_ id1 id2)
         \sharpsign{}`(list
             \sharpsign{},(free-identifier=? \sharpsign{}'id1 \sharpsign{}'id2)
             \sharpsign{},(bound-identifier=? \sharpsign{}'id1 \sharpsign{}'id2))])))
  (a fred)) \ev (\schtrue{} \schfalse{})%
\end{scheme}
The following definition of unnamed {\cf let}
uses {\cf bound-identifier=?} to detect duplicate identifiers.
\begin{schemenoindent}
(define-syntax let
  (lambda (x)
    (define unique-ids?
      (lambda (ls)
        (or (null? ls)
            (and (let notmem?
                        ([x (car ls)] [ls (cdr ls)])
                   (or (null? ls)
                       (and (not (bound-identifier=?
                                   x (car ls)))
                            (notmem? x (cdr ls)))))
                 (unique-ids? (cdr ls))))))
    (syntax-case x ()
      [(\_ ((i v) ...) e1 e2 ...)
       (unique-ids? \#'(i ...))
       \#'((lambda (i ...) e1 e2 ...) v ...)])))%
\end{schemenoindent}
The argument {\cf \#'(i ...)} to {\cf unique-ids?} is guaranteed
to be a list by the rules given in the description of {\cf syntax}
above.
With this definition of {\cf let}:
\begin{scheme}
(let ([a 3] [a 4]) (+ a a)) \lev \exception{\&syntax}%
\end{scheme}
However,
\begin{scheme}
(let-syntax
  ([dolet (lambda (x)
            (syntax-case x ()
              [(\_ b)
               \#'(let ([a 3] [b 4]) (+ a b))]))])
  (dolet a)) \lev 7%
\end{scheme}
since the identifier {\cf a} introduced by {\cf dolet}
and the identifier {\cf a} extracted from the input form are not
{\cf bound-identifier=?}.
The following definition of {\cf case} is equivalent to the one in
section~\ref{syntaxcasesection}.
Rather than including {\cf else} in the literals list as before,
this version explicitly tests for {\cf else} using
{\cf free-identifier=?}.
\begin{schemenoindent}
(define-syntax case
  (lambda (x)
    (syntax-case x ()
      [(\_ e0 [(k ...) e1 e2 ...] ...
              [else-key else-e1 else-e2 ...])
       (and (identifier? \#'else-key)
            (free-identifier=? \#'else-key \#'else))
       \#'(let ([t e0])
           (cond
             [(memv t '(k ...)) e1 e2 ...]
             ...
             [else else-e1 else-e2 ...]))]
      [(\_ e0 [(ka ...) e1a e2a ...]
              [(kb ...) e1b e2b ...] ...)
       \#'(let ([t e0])
           (cond
             [(memv t '(ka ...)) e1a e2a ...]
             [(memv t '(kb ...)) e1b e2b ...]
             ...))])))%
\end{schemenoindent}
With either definition of {\cf case}, {\cf else} is not
recognized as an auxiliary
keyword if an enclosing lexical binding for {\cf else} exists.
For example,
\begin{scheme}
(let ([else \schfalse{}])
  (case 0 [else (write "oops")])) \lev \exception{\&syntax}%
\end{scheme}
since {\cf else} is bound
lexically and is
therefore not the same {\cf else} that appears in the definition of
{\cf case}.
\end{entry}
\section{Syntax-object and datum conversions}
\label{conversionssection}
\begin{entry}{%
\proto{syntax->datum}{ syntax-object}{procedure}}
Strips all syntactic information from a syntax
object and returns the corresponding Scheme datum.
\end{entry}
Identifiers stripped in this manner are converted to their symbolic
names, which can then be compared with {\cf eq?}.
Thus, a predicate {\cf symbolic-identifier=?} might be defined as follows.
\begin{scheme}
(define symbolic-identifier=?
  (lambda (x y)
    (eq? (syntax->datum x)
         (syntax->datum y))))%
\end{scheme}
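For illustration, the sketch below strips a compound form and then compares
two like-named identifiers that have different bindings.
The variable names {\cf outer} and {\cf inner} are hypothetical, and
{\cf symbolic-identifier=?} is assumed to be defined with {\cf define} as
sketched in this section.
\begin{scheme}
(syntax->datum \#'(f x "s" \#(1 2))) \lev (f x "s" \#(1 2))

(define outer \#'x)
(define inner (let ([x 1]) \#'x))
(symbolic-identifier=? outer inner) \ev \schtrue{}
(free-identifier=? outer inner)     \ev \schfalse{}%
\end{scheme}
The two identifiers have the same name, so they are equal once stripped,
but only one of them refers to the {\cf let}-bound {\cf x}.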
% not be true with import alias and rename
%Two identifiers that are {\cf bound-identifier=?} or
%{\cf free-identifier=?} are {\cf symbolic-identifier=?}; in order to
%refer to the same binding, two identifiers must have the same name.
%The converse is not always true, since two identifiers may have
%the same name but different bindings.
\begin{entry}{%
\proto{datum->syntax}{ template-id datum}{procedure}}
\end{entry}
\domain{\var{Template-id} must be a
template identifier and \var{datum} should be a datum value.}
The {\cf datum->syntax} procedure returns a syntax-object representation of \var{datum} that
contains the same contextual information as
\var{template-id}, with the effect that the
syntax object behaves
as if it were introduced into the code when
\var{template-id} was introduced.
The {\cf datum\coerce{}syntax} procedure allows a transformer to ``bend'' lexical
scoping rules by creating \textit{implicit
identifiers\mainindex{implicit identifier}}
that behave as if they were present in the input form,
thus permitting the definition of macros
that introduce visible bindings for or references to
identifiers that do not appear explicitly in the input form.
For example, the following defines a {\cf loop} expression that
uses this controlled form of identifier capture to
bind the variable {\cf break} to an escape procedure
within the loop body.
(The derived {\cf with-syntax} form is like {\cf let} but binds
pattern variables---see section~\ref{derivedsection}.)
\begin{scheme}
(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      [(k e ...)
       (with-syntax
           ([break (datum->syntax \#'k 'break)])
         \#'(call-with-current-continuation
             (lambda (break)
               (let f () e ... (f)))))])))

(let ((n 3) (ls '()))
  (loop
    (if (= n 0) (break ls))
    (set! ls (cons 'a ls))
    (set! n (- n 1)))) \lev (a a a)%
\end{scheme}
Were {\cf loop} to be defined as
\begin{scheme}
(define-syntax loop
  (lambda (x)
    (syntax-case x ()
      [(\_ e ...)
       \#'(call-with-current-continuation
           (lambda (break)
             (let f () e ... (f))))])))%
\end{scheme}
the variable {\cf break} would not be visible in {\cf e \dots}.
The datum argument \var{datum} may also represent an arbitrary
Scheme form, as demonstrated by the following definition of
{\cf include}.
\begin{scheme}
(define-syntax include
  (lambda (x)
    (define read-file
      (lambda (fn k)
        ; get-datum requires a textual input port
        (let ([p (open-input-file fn)])
          (let f ([x (get-datum p)])
            (if (eof-object? x)
                (begin (close-port p) '())
                (cons (datum->syntax k x)
                      (f (get-datum p))))))))
    (syntax-case x ()
      [(k filename)
       (let ([fn (syntax->datum \#'filename)])
         (with-syntax ([(exp ...)
                        (read-file fn \#'k)])
           \#'(begin exp ...)))])))%
\end{scheme}
{\cf (include "filename")} expands into a {\cf begin} expression
containing the forms found in the file named by
{\cf "filename"}.
For example, if the file {\cf flib.ss} contains
{\cf (define f (lambda (x) (g (* x x))))}, and the file
{\cf glib.ss} contains
{\cf (define g (lambda (x) (+ x x)))},
the expression
\begin{scheme}
(let ()
  (include "flib.ss")
  (include "glib.ss")
  (f 5))%
\end{scheme}
evaluates to {\cf 50}.
The definition of {\cf include} uses {\cf datum\coerce{}syntax} to convert
the objects read from the file into syntax objects in the proper
lexical context, so that identifier references and definitions within
those expressions are scoped where the {\cf include} form appears.
Using {\cf datum\coerce{}syntax}, it is even possible to break hygiene
entirely and write macros in the style of old Lisp macros.
The {\cf lisp-transformer} procedure defined below creates a transformer
that converts its input into a datum, calls the programmer's procedure on
this datum, and converts the result back into a syntax object scoped
where the original macro use appeared.
\begin{scheme}
(define lisp-transformer
  (lambda (p)
    (lambda (x)
      (syntax-case x ()
        [(kwd . rest)
         (datum\coerce{}syntax \#'kwd
           (p (syntax\coerce{}datum x)))]))))%
\end{scheme}
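The following sketch suggests how {\cf lisp-transformer} might be used
and what is lost when hygiene is abandoned.
The names {\cf swap!} and {\cf tmp} are hypothetical and chosen only for
illustration, and {\cf lisp-transformer} is assumed to be available at
the phase in which the transformer expression is evaluated.
Because the introduced {\cf tmp} is scoped where the macro use appears,
it captures the user's variable of the same name, and the swap
silently fails.
\begin{scheme}
(define-syntax swap!
  (lisp-transformer
    (lambda (form)
      (let ([a (cadr form)] [b (caddr form)])
        `(let ([tmp ,a])
           (set! ,a ,b)
           (set! ,b tmp))))))

(let ([tmp 1] [y 2])
  (swap! tmp y)
  (list tmp y)) \lev (1 2)%
\end{scheme}
A hygienic definition of the same macro would instead yield
{\cf (2 1)}.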
\section{Generating lists of temporaries}
\label{generatingtemporariessection}
Transformers can introduce a fixed number of identifiers into their
output simply by naming each identifier.
In some cases, however, the number of identifiers to be introduced depends
upon some characteristic of the input expression.
A straightforward definition of {\cf letrec}, for example,
requires as many
temporary identifiers as there are binding pairs in the
input expression.
The procedure {\cf generate-temporaries} is used to construct
lists of temporary identifiers.
\begin{entry}{%
\proto{generate-temporaries}{ l}{procedure}}
\domain{\var{L} must be a list or syntax object representing a list-structured
form; its contents are not important.}
The number of temporaries generated is the number of elements in \var{l}.
Each temporary is guaranteed to be unique, i.e., different from all other
identifiers.
A definition of {\cf letrec} equivalent to the one using
{\cf syntax-rules} given in report
appendix~\extref{report:derivedformsappendix}{Sample definitions for
derived forms} is shown below.
\begin{schemenoindent}
(define-syntax letrec
  (lambda (x)
    (syntax-case x ()
      ((\_ ((i e) ...) b1 b2 ...)
       (with-syntax
           (((t ...) (generate-temporaries \#'(i ...))))
         \#'(let ((i <undefined>) ...)
             (let ((t e) ...)
               (set! i t) ...
               (let () b1 b2 ...))))))))%
\end{schemenoindent}
This version uses {\cf generate-temporaries} instead of a recursively defined
helper to generate the necessary temporaries.
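
As a further, smaller sketch of the same technique, a parallel
assignment form needs one temporary per variable being assigned.
The name {\cf pset!} is hypothetical and is used here only for
illustration.
\begin{scheme}
(define-syntax pset!
  (lambda (x)
    (syntax-case x ()
      [(\_ (v e) ...)
       (with-syntax
           ([(t ...) (generate-temporaries \#'(v ...))])
         \#'(let ([t e] ...)
             (set! v t) ...))])))

(let ([a 1] [b 2])
  (pset! (a b) (b a))
  (list a b)) \lev (2 1)%
\end{scheme}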
\end{entry}
\section{Derived forms and procedures}
\label{derivedsection}
The forms and procedures described in this section can be defined in
terms of the forms and procedures described in earlier sections of
this chapter.
\begin{entry}{%
\pproto{(with-syntax ((\hyper{pattern} \hyper{expression}) \dotsfoo) \hyper{body})}{\exprtype}}
\mainschindex{with-syntax}
The {\cf with-syntax} form is used to bind pattern variables,
just as {\cf let} is used to bind variables.
This allows a transformer to construct its output in separate
pieces, then put the pieces together.
Each \hyper{pattern} is identical in form to a {\cf syntax-case} pattern.
The value of each \hyper{expression} is computed and destructured according
to the corresponding \hyper{pattern}, and pattern variables within
the \hyper{pattern} are bound as with {\cf syntax-case} to the
corresponding portions of the value within \hyper{body}.
The {\cf with-syntax} form may be defined in terms of {\cf syntax-case} as
follows.
\begin{scheme}
(define-syntax with-syntax
  (lambda (x)
    (syntax-case x ()
      ((\_ ((p e0) ...) e1 e2 ...)
       (syntax (syntax-case (list e0 ...) ()
                 ((p ...) (let () e1 e2 ...))))))))%
\end{scheme}
The following definition of {\cf cond} demonstrates the use of
{\cf with-syntax} to support transformers that employ recursion
internally to construct their output.
It handles all {\cf cond} clause variations and takes care to produce
one-armed {\cf if} expressions where appropriate.
\begin{schemenoindent}
(define-syntax cond
  (lambda (x)
    (syntax-case x ()
      [(\_ c1 c2 ...)
       (let f ([c1 \#'c1] [c2* \#'(c2 ...)])
         (syntax-case c2* ()
           [()
            (syntax-case c1 (else =>)
              [(else e1 e2 ...) \#'(begin e1 e2 ...)]
              [(e0) \#'e0]
              [(e0 => e1)
               \#'(let ([t e0]) (if t (e1 t)))]
              [(e0 e1 e2 ...)
               \#'(if e0 (begin e1 e2 ...))])]
           [(c2 c3 ...)
            (with-syntax ([rest (f \#'c2 \#'(c3 ...))])
              (syntax-case c1 (=>)
                [(e0) \#'(let ([t e0]) (if t t rest))]
                [(e0 => e1)
                 \#'(let ([t e0]) (if t (e1 t) rest))]
                [(e0 e1 e2 ...)
                 \#'(if e0
                      (begin e1 e2 ...)
                      rest)]))]))])))%
\end{schemenoindent}
\end{entry}
\begin{entry}{%
\proto{quasisyntax}{ \hyper{template}}{\exprtype}
\litproto{unsyntax}
\litproto{unsyntax-splicing}}
The {\cf quasisyntax} form is similar to {\cf syntax}, but it allows parts
of the quoted text to be evaluated, in a manner similar to the operation
of {\cf quasiquote} (report section~\extref{report:quasiquotesection}{Quasiquotation}).
Within a {\cf quasisyntax} \var{template}, subforms of
{\cf unsyntax} and {\cf unsyntax-splicing} forms are evaluated,
and everything else is treated as ordinary template material, as
with {\cf syntax}.
The value of each {\cf unsyntax} subform is inserted into the output
in place of the {\cf unsyntax} form, while the value of each
{\cf unsyntax-splicing} subform is spliced into the surrounding list
or vector structure.
Uses of {\cf unsyntax} and {\cf unsyntax-splicing} are valid only within
{\cf quasisyntax} expressions.
A {\cf quasisyntax} expression may be nested, with each {\cf quasisyntax}
introducing a new level of syntax quotation and each {\cf unsyntax} or
{\cf unsyntax-splicing} taking away a level of quotation.
An expression nested within $n$ {\cf quasisyntax} expressions must
be within $n$ {\cf unsyntax} or {\cf unsyntax-splicing} expressions to
be evaluated.
As noted in report section~\extref{report:abbreviationsection}{Abbreviations},
{\cf \#`\hyper{template}} is equivalent to {\cf (quasisyntax
\hyper{template})}, {\cf \#,\hyper{template}} is equivalent to {\cf (unsyntax
\hyper{template})}, and {\cf \#,@\hyper{template}} is equivalent to {\cf (unsyntax-splicing
\hyper{template})}.
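
The following sketch illustrates {\cf unsyntax} and
{\cf unsyntax-splicing} in the abbreviated notation.
The names {\cf count-args} and {\cf reversed-list} are hypothetical and
are used here only for illustration.
\begin{scheme}
(define-syntax count-args
  (lambda (x)
    (syntax-case x ()
      [(\_ e ...)
       \#`(quote \#,(length \#'(e ...)))])))

(count-args a b c) \lev 3

(define-syntax reversed-list
  (lambda (x)
    (syntax-case x ()
      [(\_ e ...)
       \#`(list \#,@(reverse \#'(e ...)))])))

(reversed-list 1 2 3) \lev (3 2 1)%
\end{scheme}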
The {\cf quasisyntax} keyword can be used in place of {\cf with-syntax} in many
cases.
For example, a definition of {\cf case} that builds its output
recursively, in the style of the {\cf cond} definition shown under the
description of {\cf with-syntax} above, can be written using
{\cf quasisyntax} as follows.