-
Notifications
You must be signed in to change notification settings - Fork 15
/
Copy pathsemantics.tex
628 lines (521 loc) · 41.1 KB
/
semantics.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
%!TEX root = r6rs.tex
本附录为Scheme提供了一个非正式的,形式的,操作语义,其基于一个早期的语义\cite{mf:scheme-op-sem}。它没有覆盖整个语言。明显没有包含的特性是宏系统,I/O,和数值塔。包含特性的精确列表在第\ref{sec:semantics:grammar}小节给出。
规范的核心是一个单步术语重写的关系,其指出一个(抽象的)机器行为。通常,报告不是完整的规范,已给予实现不同行为的自由,特别是在可以优化的时候。这种不指明在语义中以两种方式展示。
第一个是减少规则,其减少到特定的“\textbf{未知:}\textit{字符串}”状态(其中字符串提供未知状态的一个描述)。意图是减少到这种状态的规则可以用任意的减少规则取代。怎样代替这些规则的精确规范在第\ref{sec:semantics:underspecification}小节被给出。
另一个是单步关系涉及到一个程序到多个程序,每一个对应于一个抽象机器可以做的合法的转换。相应地,我们使用单步关系$\rightarrow^*$的传递闭包来定义语义,\calS{},作为一个从程序(\calP{})到观察结果集合(\calR)的函数。
\begin{center}
\begin{tabular}{l}
$\calS : \calP \longrightarrow 2^{\calR}$ \\
$\calS(\calP) = \{ \scrO(\calA) \mid \calP \rightarrow^* \calA \}$
\end{tabular}
\end{center}
其中函数$\scrO$将一个答案($\calA${})从语义转换到一个观察结果。大概的说,$\scrO$在简单的基本值上是恒等函数,对于更加复杂的值,像过程和点对,则返回特定的标签。
所以,一个实现符合这样的语义如果,对所有的程序$\calP$,实现产生$\calS(\calP)$中结果的一个或,如果实现无限循环,那么有一个无限的减少序列以$\calP$开始,假设减少关系$\rightarrow$已经被调整去取代\textbf{未知:}状态。
$\calP$,$\calA$,$\calR$和$\scrO$的精确定义也在第\ref{sec:semantics:grammar}小节被给出。
为帮助理解语义和它的行为,我们在PLT Redex中实现了它。我们可以在这篇报告的官方网站发现这个实现:\url{http://www.r6rs.org/}。在本语义图表中展示的所有的减少规则和元方程都是从源代码自动生成的。
\section{背景}
我们假设读者对上下文敏感的减少语义有基础的了解。对此系统不了解的读者可能想要查阅Felleisen和Flatt的专题著作\cite{ff:monograph}或Wright和Felleisen\cite{wf:type-soundness}的一个全面的介绍,其包括相关的技术背景,或者轻松一点的PLT Redex\cite{mfff:plt-redex}的一个介绍。
作为一个简单的指南,我们通过程序术语的关系定义一个语言的操作语义,其中这个关系对应于一个抽象机器的单步。这个关系使用求值上下文定义,也就是它们当中有区别地方的术语,叫做\emph{孔(holes)}\index{hole,孔},求值的下一步在这里进行。我们说一个术语$e$分解成一个表达式上下文$E$和宁一个术语$e'$,如果$e$和$E$是一样的但孔被$e'$代替。我们写成$E[e']$来指示术语在$E$中用$e'$来代替孔。
比如,我们为表达式上下文($E$),表达式($e$),变量($x$),和值($v$)定义了一个包含了非终结符的语法。我们可以这样写:
%
\begin{displaymath}
\begin{array}{l}
E_1[\texttt{((}\sy{lambda}~\texttt{(}x_1 \cdots{}\texttt{)}~e_1\texttt{)}~v_1~\cdots\texttt{)}] \rightarrow
\\
E_1[\{ x_1 \cdots \mapsto v_1 \cdots \} e_1] ~~~~~ (\#x_1 = \#v_1)
\end{array}
\end{displaymath}
%
去定义$\beta_v$重写规则(作为$\rightarrow$单步关系的一部分)。我们在重写规则中使用非终结符的名字(可能和下标一起使用)来限制规则的应用,所以只有当一些语法产生的条款出现在条款中对应的位置时,它才起作用。如果相同的非终结符和相同的下标出现多次,那么规则只有当相应的条款在结构上是一样的时候才起作用(没有下标的非终结符不强迫相互匹配。因此,上面规则中,$E_1$同时出现在左手边和右手边意味着应用表达式的上下文在使用这条规则的时候没有改变。省略号是Kleene星号的一个形式,意味着零个或多个省略号之前%<!-- TODO: proceeding应该是preceding。 -->
的匹配模式条款的事件可以出现在省略号和它之前模式的位置。我们使用符号$\{ x_1 \cdots \mapsto v_1 \cdots \} e_1$表示捕获避免的替代;在这种情况下它意味着每一个$x_1$在$e_1$中被替代成相应的$v_1$。最后,除了规则,我们在小括号中写附加条件(side-conditions);上面规则的附加条件指示$x_1$的数量必须和$v_1$的数量相等。我们有时在附加条件中使用等式;当我们这样做时仅仅意味着简单的条目相等,也就是说,这两个条目必须有相同的句法形状。
\addtocounter{figure}{1} % get the figure counter in sync with the section counter
\subfigurestart{}
\beginfig
\input{r6-fig-grammar-parti.tex}
\caption{程序和观测(observables)的语法}\label{fig:grammar}
\endfig
在规则中指明求值上下文$E$允许我们定义操作它们上下文的关系。作为一个简单的例子,我们可以添加另一条规则,其在程序应用错误数量参数的时候通过丢弃规则右手边求值上下文的方式产生错误:
%
\begin{displaymath}
\begin{array}{l}
E[\texttt{((}\sy{lambda}~\texttt{(}x_1 \cdots\texttt{)}~e\texttt{)}~v_1~\cdots\texttt{)}] \rightarrow
\\
\textrm{\textbf{violation:} 错误的参数数量} ~~~~~ (\#x_1 \neq \#v_1)
\end{array}
\end{displaymath}
%
以后,我们会以更复杂的方式利用显示的求值上下文。
\section{语法}\label{sec:semantics:grammar}
\beginfig
\subfigureadjust{}
\input{r6-fig-grammar-partii.tex}
\caption{求值上下文(evaluation contexts)的语法}\label{fig:ec-grammar}
\endfig
\subfigurestop{}
图\ref{fig:grammar}展示了本报告语义模型的子集。非终结符被写成\textit{斜体}或一种艺术字体($\calP$,$\calA$,$\calR$和$\calRv$)且字面量被写成\texttt{等宽}}字体。
非终结符$\calP$表示可能的程序状态。第一个选择是有一个存储和一个表达式的程序。第二个选择是一个未捕获的异常,及第三个被用作指示它建模的模型没有完全指定原语行为的地方(那些情况具体的细节见第\ref{sec:semantics:underspecification}小节)。非终结符$\calA$表示程序的最终结果。它和$\calP$是一样的除了表达式已经被减少到一些值得序列之外。
非终结符$\calR$和$\calRv$表示程序的可观测结果。每个$\calR$或者是一个对应于正常终止的程序产生的值的值的序列,或者是一个指示未捕获异常被抛出的标签,或者是\sy{unknown}如果程序到达一个本语义没有指明的情况的话。非终结符$\calRv$指明对于一个特定的值来说可观测的结果是什么:一个点对,空表,一个符号,一个自引用的值(\schtrue{},\allowbreak{}\schfalse{}\allowbreak{}和数字),一个条件,或一个过程。
非终结符\nt{sf}产生存储的一个单独元素。存储保持一个程序所有的可变状态。它,连同操作它的规则一起,被更加详细地解释了。
表达式($\mathit{es}$)包括被引用的数据,\sy{begin}表达式,\sy{begin0}表达式%
\footnote{ \sy{begin0}不是标准的一部分,但为了\va{dynamic-wind}和\va{letrec}的规则更加易读,我们包含了它。尽管我们直接给它建模,但是它可以根据我们在此定义的来自标准的其它形式被定义:
\begin{displaymath}
\begin{array}{rcl}\tt
\texttt{(}\sy{begin0}~e_1~e_2~\cdots\texttt{)} &=&
\begin{array}{l}
\texttt{(}\va{call\mbox{-}with\mbox{-}values}\\
~\texttt{(}\sy{lambda}~\texttt{()}~e_1\texttt{)}\\
~\texttt{(}\sy{lambda}~x\\
~~~e_2~\cdots\\
~~~\texttt{(}\va{apply}~\va{values}~x\texttt{)))}
\end{array}
\end{array}
\end{displaymath}
},应用表达式,\sy{if}表达式,\sy{set!}表达式,变量,非过程值(\nt{nonproc}),基本过程(\nt{pproc}),lambda表达式,\sy{letrec}和\sy{letrec*}表达式。
最后几个表达式形式仅仅是为了中间状态生成的(\sy{dw}为了\sy{dynamic-wind},\sy{throw}为了继续,\sy{unspecified}为了赋值操作的规则,\sy{handlers}为了异常处理,及\sy{l!}和\sy{reinit}为了\sy{letrec}),且在初始程序中不应该出现。它们的使用在本附录的相关章节被描述。
非终结符\nt{f}描述\sy{lambda}表达式的形式。(\sy{dot}代替(西文)句号来描述接受任意数量参数的过程,这是为了避免和我们PLT Redex模型中的元圆(meta-circular)相混淆。)
非终结符\nt{s}覆盖所有的数据,其可以是非空序列(\nt{seq}),空序列,自引用值(\nt{sqv}),或符号。非空序列或者仅是一个数据的序列,或者以一个点终结,这个点或者跟着一个符号,或者跟着一个自引用值。最后,自引用值是数字和布尔\semtrue{}与\semfalse{}。
非终结符\nt{p}表示没有引用数据的程序。大部分消去规则重写\nt{p}到\nt{p},而不是$\calP$到$\calP$,这是因为在平常的求值之前被引用的数据第一次被重写成表构造函数的调用。平行于\nt{es},\nt{e}表示没有被引用表达式的表达式。
\beginfig
\begin{center}
\input{r6-fig-Quote.tex}
\input{r6-fig-QtocQtoic.tex}
\end{center}
\caption{Quote}\label{fig:quote}
\endfig
值($v$)被分成四类:
%
\begin{itemize}
\item 非过程(\nt{nonproc})包括点对指针(\va{pp}),空表(\va{null}),符号,自引用值(\nt{sqv}),和条件。条件表示报告的条件值,但这儿只包含一条信息且其它情况无效。
\item 用户过程(\texttt{(}\sy{lambda} \nt{f} \nt{e} \nt{e} $\cdots$\texttt{)})包括多参数的lambda表达式和伴随点参数列表的表达式。
\item 基本函数(\nt{pproc})包括
\begin{itemize}
\item
算术过程(\nt{aproc}):\va{+}, \va{-}, \va{/}, 和\va{*},
\item
一个参数的过程(\nt{proc1}):\va{null?}, \va{pair?}, \va{car}, \va{cdr}, \va{call/cc}, \va{procedure?}, \va{condition?}, \va{unspecified?}, \va{raise}, 和\va{raise-continuable},
\item
两个参数的过程(\nt{proc2}):\va{cons}, \va{set-car!}, \va{set-cdr!}, \va{eqv?}, 和\va{call-with-values},
\item
以及\va{list}, \va{dynamic-wind}, \va{apply}, \va{values}, 和\va{with-exception-handler}。
\end{itemize}
\item 最后,继续被表示为\sy{throw}表达式,其内部由捕获继续的上下文组成。
\end{itemize}
%
The next three set of non-terminals in figure~\ref{fig:grammar} represent pairs (\nt{pp}), which are divided into immutable pairs (\nt{ip}) and mutable pairs (\nt{mp}). The final set of non-terminals in figure~\ref{fig:grammar}, \nt{sym},
\nt{x}, and $n$ represent symbols, variables, and
numbers respectively. The non-terminals \nt{ip}, \nt{mp}, and \nt{sym} are all assumed to all be disjoint. Additionally, the variables $x$ are assumed not to include any keywords or primitive operations, so any program variables whose names coincide with them must be renamed before the semantics can give the meaning of that program.
\beginfig
\begin{center}
\input{r6-fig-Multiple--values--and--call-with-values.tex}
\end{center}
\caption{Multiple values and call-with-values}\label{fig:Multiple--values--and--call-with-values}
\endfig
The set of non-terminals for evaluation contexts is shown in
figure~\ref{fig:ec-grammar}. The \nt{P} non-terminal controls where
evaluation happens in a program that does not contain any quoted data.
The $E$ and $F$ evaluation contexts are for expressions. They are factored in
that manner so that the \nt{PG}, \nt{G}, and \nt{H} evaluation contexts can
re-use \nt{F} and have fine-grained control over the context to support
exceptions and \va{dynamic-wind}. The starred and circled variants,
\Estar{}, \Eo{}, \Fstar{}, and \Fo{} dictate where a single value is
promoted to multiple values and where multiple values are demoted to a
single value. The \nt{U} context is used to manage the report's underspecification of the results of \sy{set!}, \va{set-car!}, and \va{set-cdr!} (see section~\ref{sec:semantics:underspecification} for details). Finally, the \nt{S} context is where quoted expressions can be simplified. The precise use of the evaluation contexts is explained along with the relevant rules.
To convert the answers ($\calA$) of the semantics into observable results, we use these two functions:
\input{r6-fig-observable}
\input{r6-fig-observable-value}
They eliminate the store, and replace complex values with simple tags that indicate only the kind of value that was produced or, if no values were produced, indicates that either an uncaught exception was raised, or that the program reached a state that is not specified by the semantics.
\section{Quote}\label{sec:semantics:quote}
The first reduction rules that apply to any program is the
rules in figure~\ref{fig:quote} that eliminate quoted expressions.
The first two rules erase the quote for quoted expressions that do not introduce any pairs.
The last two rules lift quoted datums to the top of the expression so
they are evaluated only once, and turn the datums into calls to either \va{cons} or \va{consi}, via the metafunctions $\mathscr{Q}_i$ and $\mathscr{Q}_m$.
Note that the left-hand side of the \rulename{6qcons} and \rulename{6qconsi} rules are identical, meaning that if one rule applies to a term, so does the other rule.
Accordingly, a quoted expression may be lifted out into a sequence of \va{cons} expressions, which create mutable pairs, or into a sequence of \va{consi} expressions, which create immutable pairs (see section~\ref{sec:semantics:lists} for the rules on how that happens).
These rules apply before any other because of the contexts in which they, and all of the other rules, apply. In particular, these rule applies in the
\nt{S} context. Figure~\ref{fig:ec-grammar} shows that the
\nt{S} context allows this reduction to apply in
any subexpression of an \nt{e}, as long as all of the
subexpressions to the left have no quoted expressions in them,
although expressions to the right may have quoted expressions.
Accordingly, this rule applies once for each quoted expression in the
program, moving out to the beginning of the program.
The rest of the rules apply in contexts that do not contain any quoted
expressions, ensuring that these rules convert all quoted data
into lists before those rules apply.
Although the identifier \nt{qp} does not have a subscript, the semantics of PLT Redex's ``fresh'' declaration takes special care to ensures that the \nt{qp} on the right-hand side of the rule is indeed the same as the one in the side-condition.
\beginfig
\begin{center}
\input{r6-fig-Exceptions}
\end{center}
\caption{Exceptions}\label{fig:Exceptions}
\endfig
\section{Multiple values}
The basic strategy for multiple values is to add a rule that demotes
$(\va{values}~v)$ to $v$ and another rule that promotes
$v$ to $(\va{values}~v)$. If we allowed these rules to apply
in an arbitrary evaluation context, however, we would get infinite
reduction sequences of endless alternation between promotion and
demotion. So, the semantics allows demotion only in a context
expecting a single value and allows promotion only in a context
expecting multiple values. We obtain this behavior with a small
extension to the Felleisen-Hieb framework (also present in the
operational model for R$^5$RS~\cite{mf:op-r5rs}).
We extend the notation so that
holes have names (written with a subscript), and the context-matching
syntax may also demand a hole of a particular name (also written with
a subscript, for instance $E[e]_{\star}$). The extension
allows us to give different names to the holes in which multiple
values are expected and those in which single values are expected, and
structure the grammar of contexts accordingly.
To exploit this extension, we use three kinds of holes in the
evaluation context grammar in figure~\ref{fig:ec-grammar}. The
ordinary hole \hole{} appears where the usual kinds of
evaluation can occur. The hole \holes{} appears in contexts that
allow multiple values and \holeone{} appears in
contexts that expect a single value. Accordingly, the rule
\rulename{6promote} only applies in \holes{} contexts, and
\rulename{6demote} only applies in \holeone{} contexts.
To see how the evaluation contexts are organized to ensure that
promotion and demotion occur in the right places, consider the \nt{F},
\Fstar{} and \Fo{} evaluation contexts. The \Fstar{} and \Fo{}
evaluation contexts are just the same as \nt{F}, except that they allow
promotion to multiple values and demotion to a single value,
respectively. So, the \nt{F} evaluation context, rather than being
defined in terms of itself, exploits \Fstar{} and \Fo{} to dictate
where promotion and demotion can occur. For example, \nt{F} can be
$\texttt{(}\sy{if}~\Fo{}~e~e\texttt{)}$ meaning that demotion from
$\texttt{(}\va{values}~v\texttt{)}$ to
$v$ can occur in the test of an \sy{if} expression.
Similarly, $F$ can be $\texttt{(}\sy{begin}~\Fstar{}~e~e~\cdots\texttt{)}$ meaning that
$v$ can be promoted to $\texttt{(}\va{values}~v\texttt{)}$ in the first subexpression of a \sy{begin}.
In general, the promotion and demotion rules simplify the definitions
of the other rules. For instance, the rule for \sy{if} does not
need to consider multiple values in its first subexpression.
Similarly, the rule for \sy{begin} does not need to consider the
case of a single value as its first subexpression.
\beginfig
\begin{center}
\input{r6-fig-Arithmetic.tex}
\input{r6-fig-Basic--syntactic--forms.tex}
\end{center}
\caption{Arithmetic and basic forms}\label{fig:Arithmetic}
\endfig
The other two rules in
figure~\ref{fig:Multiple--values--and--call-with-values} handle
\va{call-\hp{}with-\hp{}values}. The evaluation contexts for
\va{call-with-values} (in the $F$ non-terminal) allow
evaluation in the body of a procedure that has been passed as the first
argument to \va{call-with-values}, as long as the second argument
has been reduced to a value. Once evaluation inside that procedure
completes, it will produce multiple values (since it is an \Fstar{}
position), and the entire \va{call-with-values} expression reduces
to an application of its second argument to those values, via the rule
\rulename{6cwvd}. Finally, in the
case that the first argument to \va{call-with-values} is a value,
but is not of the form $\texttt{(}\sy{lambda}~\texttt{()}~e\texttt{)}$, the rule
\rulename{6cwvw} wraps it in a thunk to trigger evaluation.
\beginfig
\begin{center}
\input{r6-fig-Cons.tex}
\end{center}
\caption{Lists}\label{fig:Cons}
\endfig
\section{Exceptions}
The workhorses for the exception system are $$\texttt{(}\sy{handlers}~\nt{proc}~\cdots{}~\nt{e}\texttt{)}$$ expressions and the \nt{G} and \nt{PG} evaluation contexts (shown in figure~\ref{fig:ec-grammar}).
The \sy{handlers} expression records the
active exception handlers (\nt{proc} $\cdots$) in some expression (\nt{e}). The
intention is that only the nearest enclosing \sy{handlers} expression
is relevant to raised exceptions, and the $G$ and \nt{PG} evaluation
contexts help achieve that goal. They are just like their counterparts
\nt{E} and \nt{P}, except that \sy{handlers} expressions cannot occur on the
path to the hole, and the exception system rules take advantage of
that context to find the closest enclosing handler.
To see how the contexts work together with \sy{handler}
expressions, consider the left-hand side of the \rulename{6xunee}
rule in figure~\ref{fig:Exceptions}.
It matches expressions that have a call to \va{raise} or
\va{raise-continuable} (the non-terminal \nt{raise*} matches
both exception-raising procedures) in a \nt{PG}
evaluation context. Since the \nt{PG} context does not contain any
\sy{handlers} expressions, this exception cannot be caught, so
this expression reduces to a final state indicating the uncaught
exception. The rule \rulename{6xuneh} also signals an uncaught
exception, but it covers the case where a \sy{handlers} expression
has exhausted all of the handlers available to it. The rule applies to
expressions that have a \sy{handlers} expression (with no
exception handlers) in an arbitrary evaluation context where a call to
one of the exception-raising functions is nested in the
\sy{handlers} expression. The use of the \nt{G} evaluation
context ensures that there are no other \sy{handler} expressions
between this one and the raise.
The next two rules cover call to the procedure \va{with-exception-handler}.
The \rulename{6xwh1} rule applies when there are no \sy{handler}
expressions. It constructs a new one and applies $\nt{v}_2$ as a
thunk in the \sy{handler} body. If there already is a handler
expression, the \rulename{6xwhn} applies. It collects the current
handlers and adds the new one into a new \sy{handlers} expression
and, as with the previous rule, invokes the second argument to
\va{with-exception-handlers}.
The next two rules cover exceptions that are raised in the context of
a \sy{handlers} expression. If a continuable exception is raised,
\rulename{6xrc} applies. It takes the most recently installed
handler from the nearest enclosing \sy{handlers} expression and
applies it to the argument to \va{raise-continuable}, but in a
context where the exception handlers do not include that latest
handler. The \rulename{6xr} rule behaves similarly, except it
raises a new exception if the handler returns. The new exception is
created with the \sy{make-cond} special form.
\beginfig
\begin{center}
\input{r6-fig-Eqv.tex}
\end{center}
\caption{Eqv}\label{fig:Eqv}
\endfig
The \sy{make-cond} special form is a stand-in for the report's
conditions. It does not evaluate its argument (note its absence from
the $E$ grammar in figure~\ref{fig:ec-grammar}). That argument
is just a literal string describing the context in which the exception
was raised. The only operation on conditions is \va{condition?},
whose semantics are given by the two rules \rulename{6ct} and
\rulename{6cf}.
Finally, the rule \rulename{6xdone} drops a \sy{handlers} expression
when its body is fully evaluated, and the rule \rulename{6weherr}
raises an exception when \va{with-exception-handler} is supplied with
incorrect arguments.
\section{Arithmetic and basic forms}
This model does not include the report's arithmetic, but does include
an idealized form in order to make experimentation with other features
and writing test suites for the model simpler.
Figure~\ref{fig:Arithmetic} shows the reduction rules for the
primitive procedures that implement addition, subtraction,
multiplication, and division. They defer to their mathematical
analogues. In addition, when the subtraction or divison operator are
applied to no arguments, or when division receives a zero as a
divisor, or when any of the arithmetic operations receive a
non-number, an exception is raised.
The bottom half of figure~\ref{fig:Arithmetic} shows the rules for
\sy{if}, \sy{begin}, and \sy{begin0}. The relevant
evaluation contexts are given by the $F$ non-terminal.
The evaluation contexts for \sy{if} only allow evaluation in its
test expression. Once that is a value, the rules reduce
an \sy{if} expression to its consequent if the test is not
\semfalse{}, and to its alternative if it is \semfalse{}.
The \sy{begin} evaluation contexts allow evaluation in the first
subexpression of a begin, but only if there are two or more
subexpressions. In that case, once the first expression has been fully
simplified, the reduction rules drop its value. If there is only a
single subexpression, the \sy{begin} itself is dropped.
\subfigurestart{}
\beginfig
\begin{center}
\input{r6-fig-Procedure--application.tex}
\end{center}
\caption{Procedures \& application}\label{fig:Procedure--application}
\endfig
Like the \sy{begin} evaluation contexts, the \sy{begin0}
evaluation contexts allow evaluation of the first subexpression of a
\sy{begin0} expression when there are two or more subexpressions.
The \sy{begin0} evaluation contexts also allow evaluation in the
second subexpression of a \sy{begin0} expression, as long as the first
subexpression has been fully simplified. The \rulename{6begin0n} rule for
\sy{begin0} then drops a fully simplified second subexpression.
Eventually, there is only a single expression in the \sy{begin0},
at which point the \rulename{begin01} rule fires, and removes the
\sy{begin0} expression.
\section{Lists}\label{sec:semantics:lists}
The rules in figure~\ref{fig:Cons} handle lists. The first two rules handle \va{list} by reducing it to a succession of calls to \va{cons}, followed by \va{null}.
The next two rules, \rulename{6cons} and \rulename{6consi}, allocate new \va{cons} cells.
They both move $\texttt{(}\va{cons}~v_1~v_2\texttt{)}$ into the store, bound to a fresh
pair pointer (see also section~\ref{sec:semantics:quote} for a description of ``fresh'').
The \rulename{6cons} uses a \nt{mp} variable, to indicate the pair is mutable, and the \rulename{6consi} uses a \nt{ip} variable to indicate the pair is immutable.
The rules \rulename{6car} and \rulename{6cdr} extract the components of a pair from the store when presented with a pair pointer (the \nt{pp} can be either \nt{mp} or \nt{ip}, as shown in figure~\ref{fig:grammar}).
The rules \rulename{6setcar} and \rulename{6setcdr} handle assignment of mutable pairs.
They replace the contents of the appropriate location in the store with the new value, and reduce to \va{unspecified}. See section~\ref{sec:semantics:underspecification} for an explanation of how \va{unspecified} reduces.
\beginfig
\subfigureadjust{}
\begin{center}
\input{r6-fig-Var-set!d_.tex}
\end{center}
\caption{Variable-assignment relation}\label{fig:varsetd}
\endfig
The next four rules handle the \va{null?} predicate and the \va{pair?} predicate, and the final four rules raise exceptions when \va{car}, \va{cdr}, \va{set-car!} or \va{set-cdr!} receive non pairs.
\section{Eqv}
The rules for \va{eqv?} are shown in figure~\ref{fig:Eqv}. The first two rules cover most of the behavior of \va{eqv?}.
The first says that when the two arguments to \va{eqv?} are syntactically identical, then \va{eqv?} produces \semtrue{} and the second says that when the arguments are not syntactically identical, then \va{eqv?} produces \semfalse{}.
The structure of \nt{v} has been carefully designed so that simple term equality corresponds closely to \va{eqv?}'s behavior.
For example, pairs are represented as pointers into the store and \va{eqv?} only compares those pointers.
The side-conditions on those first two rules ensure that they do not apply when simple term equality does not match the behavior of \va{eqv?}. There are two situations where it does not match: comparing two conditions and comparing two procedures. For the first, the report does not specify \va{eqv?}'s behavior, except to say that it must return a boolean, so the remaining two rules (\rulename{6eqct}, and \rulename{6eqcf}) allow such comparisons to return \semtrue{} or \semfalse{}. Comparing two procedures is covered in section~\ref{sec:semantics:underspecification}.
\section{Procedures and application}
In evaluating a procedure call, the report leaves
unspecified the order in which arguments are evaluated. So, our reduction system allows multiple, different reductions to occur, one for each possible order of evaluation.
To capture unspecified evaluation order but allow only evaluation that
is consistent with some sequential ordering of the evaluation of an
application's subexpressions, we use non-deterministic choice to first pick
a subexpression to reduce only when we have not already committed to
reducing some other subexpression. To achieve that effect, we limit
the evaluation of application expressions to only those that have a
single expression that is not fully reduced, as shown in the
non-terminal $F$, in figure~\ref{fig:ec-grammar}. To evaluate
application expressions that have more than two arguments to evaluate,
the rule \rulename{6mark} picks one of the subexpressions of an
application that is not fully simplified and lifts it out in its own
application, allowing it to be evaluated. Once one of the lifted
expressions is evaluated, the \rulename{6appN} substitutes its value
back into the original application.
The \rulename{6appN} rule also handles other applications whose
arguments are finished by substituting the first argument for
the first formal parameter in the expression. Its side-condition uses
the relation in figure~\ref{fig:varsetd} to ensure that there are no
\sy{set!} expressions with the parameter $x_1$ as a target.
If there is such an assignment, the \rulename{6appN!} rule applies (see also section~\ref{sec:semantics:quote} for a description of ``fresh'').
Instead of directly substituting the actual parameter for the formal
parameter, it creates a new location in the store, initially bound the
actual parameter, and substitutes a variable standing for that
location in place of the formal parameter. The store, then, handles
any eventual assignment to the parameter. Once all of the parameters
have been substituted away, the rule \rulename{6app0} applies and
evaluation of the body of the procedure begins.
At first glance, the rule \rulename{6appN} appears superfluous, since it seems like the rules could just reduce first by \rulename{6appN!} and then look up the variable when it is evaluated.
There are two reasons why we keep the \rulename{6appN}, however.
The first is purely conventional: reducing applications via substitution is taught to us at an early age and is commonly used in rewriting systems in the literature.
The second reason is more technical: the
\rulename{6mark} rule requires that \rulename{6appN} be applied once $\nt{e}_i$ has been reduced to a value. \rulename{6appN!} would
lift the value into the store and put a variable reference into the application, leading to another use of \rulename{6mark}, and another use of \rulename{6appN!}, which continues forever.
The rule \rulename{6$\mu$app} handles a well-formed application of a function with a dotted parameter lists.
It such an application into an application of an
ordinary procedure by constructing a list of the extra arguments. Similarly, the rule \rulename{6$\mu$app1} handles an application of a procedure that has a single variable as its parameter list.
The rule \rulename{6var} handles variable lookup in the store and \rulename{6set} handles variable assignment.
The next two rules \rulename{6proct} and \rulename{6procf} handle applications of \va{procedure?}, and the remaining rules cover applications of non-procedures and arity violations.
\beginfig
\subfigureadjust{}
\begin{center}
\input{r6-fig-Apply.tex}
\input{r6-fig-circular_.tex}
\end{center}
\caption{Apply}\label{fig:Apply}
\endfig
\subfigurestop{}
The rules in figure~\ref{fig:Apply}
cover \va{apply}.
The first rule, \rulename{6applyf}, covers the case where the last argument to
\va{apply} is the empty list, and simply reduces by erasing the
empty list and the \va{apply}. The second rule, \rulename{6applyc}
covers a well-formed application of \va{apply} where \va{apply}'s final argument is a pair. It
reduces by extracting the components of the pair from the store and
putting them into the application of \va{apply}. Repeated
application of this rule thus extracts all of the list elements passed
to \va{apply} out of the store.
The remaining five rules cover the
various violations that can occur when using \va{apply}. The first one covers the case where \va{apply} is supplied with a cyclic list. The next four cover applying a
non-procedure, passing a non-list as the last argument, and supplying
too few arguments to \va{apply}.
\section{Call/cc and dynamic wind}
\beginfig
\begin{center}
\input{r6-fig-Call-cc--and--dynamic-wind.tex} \\
\input{r6-fig-TrimpRepoSt.tex}
\end{center}
\caption{Call/cc and dynamic wind}\label{fig:Call-cc--and--dynamic-wind}
\endfig
The specification of \va{dynamic-wind} uses
$\texttt{(}\sy{dw}~x~e~e~e\texttt{)}$
expressions to record which dynamic-wind \var{thunk}s are active at
each point in the computation. Its first argument is an identifier
that is globally unique and serves to identify invocations of
\va{dynamic-wind}, in order to avoid exiting and re-entering the
same dynamic context during a continuation switch. The second, third,
and fourth arguments are calls to some \var{before}, \var{thunk}, and
\var{after} procedures from a call to \va{dynamic-wind}. Evaluation only
occurs in the middle expression; the \sy{dw} expression only
serves to record which \var{before} and \var{after} procedures need to be run during a
continuation switch. Accordingly, the reduction rule for an
application of \va{dynamic-wind} reduces to a call to the
\var{before} procedure, a \sy{dw} expression and a call to the
\var{after} procedure, as
shown in rule \rulename{6wind} in
figure~\ref{fig:Call-cc--and--dynamic-wind}. The next two rules cover
abuses of the \va{dynamic-wind} procedure: calling it with
non-procedures, and calling it with the wrong number of arguments. The
\rulename{6dwdone} rule erases a \sy{dw} expression when its second
argument has finished evaluating.
The next two rules cover \va{call/cc}. The rule
\rulename{6call/cc} creates a new continuation. It takes the context
of the \va{call/cc} expression and packages it up into a
\sy{throw} expression that represents the continuation. The
\sy{throw} expression uses the fresh variable $x$ to record
where the application of \va{call/cc} occurred in the context for
use in the \rulename{6throw} rule when the continuation is applied.
That rule takes the arguments of the continuation, wraps them with a
call to \va{values}, and puts them back into the place where the
original call to \va{call/cc} occurred, replacing the current
context with the context returned by the $\mathscr{T}$ metafunction.
The $\mathscr{T}$ (for ``trim'') metafunction accepts two $D$ contexts and
builds a context that matches its second argument, the destination
context, except that additional calls to the \var{before} and
\var{after} procedures
from \sy{dw} expressions in the context have been added.
The first clause of the $\mathscr{T}$ metafunction exploits the
$H$ context, a context that contains everything except
\sy{dw} expressions. It ensures that shared parts of the
\va{dynamic-wind} context are ignored, recurring deeper into the
two expression contexts as long as the first \sy{dw} expression in
each have matching identifiers ($x_1$). The final rule is a
catchall; it only applies when all the others fail and thus applies
either when there are no \sy{dw}s in the context, or when the
\sy{dw} expressions do not match. It calls the two other
metafunctions defined in figure~\ref{fig:Call-cc--and--dynamic-wind} and
puts their results together into a \sy{begin} expression.
The $\mathscr{R}$ metafunction extracts all of the \var{before}
procedures from its argument and the $\mathscr{S}$ metafunction extracts all of the \var{after} procedures from its argument. They each construct new contexts and exploit
$H$ to work through their arguments, one \sy{dw} at a time.
In each case, the metafunctions are careful to keep the right
\sy{dw} context around each of the procedures in case a continuation
jump occurs during one of their evaluations.
Since $\mathscr{R}$,
receives the destination context, it keeps the intermediate
parts of the context in its result.
In contrast
$\mathscr{S}$ discards all of the context except the \sy{dw}s,
since that was the context where the call to the
continuation occurred.
\section{Letrec}
\beginfig
\begin{center}
\input{r6-fig-Letrec.tex}
\end{center}
\caption{Letrec and letrec*}
\label{fig:Letrec}
\endfig
Figre~\ref{fig:Letrec} shows the rules that handle \sy{letrec} and \sy{letrec*} and the supplementary expressions that they produce, \sy{l!} and \sy{reinit}. As a first approximation, both \va{letrec} and \va{letrec*} reduce by allocating locations in the store to hold the values of the init expressions, initializing those locations to \sy{bh} (for ``black hole''), evaluating the init expressions, and then using \va{l!} to update the locations in the store with the value of the init expressions. They also use \va{reinit} to detect when an init expression in a letrec is reentered via a continuation.
Before considering how \sy{letrec} and \sy{letrec*} use \sy{l!} and \sy{reinit}, first consider how \sy{l!} and \sy{reinit} behave. The first two rules in figure~\ref{fig:Letrec} cover \sy{l!}. It behaves very much like \sy{set!}, but it initializes both ordinary variables, and variables that are current bound to the black hole (\sy{bh}).
The next two rules cover ordinary \sy{set!} when applied to a variable
that is currently bound to a black hole. This situation can arise when
the program assigns to a variable before letrec initializes it, eg
\verb|(letrec ((x (set! x 5))) x)|. The report specifies that either
an implementation should perform the assignment, as reflected in the
\rulename{6setdt} rule or it raise an exception, as reflected in the \rulename{6setdte} rule.
The \rulename{6dt} rule covers the case where a variable is referred
to before the value of a init expression is filled in, which must
always raise an exception.
A \va{reinit} expression is used to detect a program that captures a continuation in an initialization expression and returns to it, as shown in the three rules \rulename{6init}, \rulename{6reinit}, and \rulename{6reinite}. The \va{reinit} form accepts an identifier that is bound in the store to a boolean as its argument. Those are identifiers are initially \semfalse{}. When \va{reinit} is evaluated, it checks the value of the variable and, if it is still \semfalse{}, it changes it to \semtrue{}. If it is already \semtrue{}, then \va{reinit} either just does nothing, or it raises an exception, in keeping with the two legal behaviors of \va{letrec} and \va{letrec*}.
The last two rules in figure~\ref{fig:Letrec} put together \sy{l!} and \sy{reinit}. The \rulename{6letrec} rule reduces a \sy{letrec} expression to an application expression, in order to capture the unspecified order of evaluation of the init expressions. Each init expression is wrapped in a \sy{begin0} that records the value of the init and then uses \sy{reinit} to detect continuations that return to the init expression. Once all of the init expressions have been evaluated, the procedure on the right-hand side of the rule is invoked, causing the value of the init expression to be filled in the store, and evaluation continues with the body of the original \sy{letrec} expression.
The \rulename{6letrec*} rule behaves similarly, but uses a \sy{begin} expression rather than an application, since the init expressions are evaluated from left to right. Moreover, each init expression is filled into the store as it is evaluated, so that subsequent init expressions can refer to its value.
\section{Underspecification}\label{sec:semantics:underspecification}
\beginfig
\begin{center}
\input{r6-fig-Underspecification.tex}
\end{center}
\caption{Explicitly unspecified behavior}\label{fig:Underspecification}
\endfig
The rules in figure~\ref{fig:Underspecification} cover aspects of the
semantics that are explicitly unspecified. Implementations can replace
the rules \rulename{6ueqv}, \rulename{6uval} and with different rules that cover the left-hand sides and, as long as they follow the informal specification, any replacement is valid. Those three situations correspond to the case when \va{eqv?} applied to two procedures and when multiple values are used in a single-value context.
The remaining rules in figure~\ref{fig:Underspecification} cover the results from the assignment operations, \sy{set!}, \va{set-car!}, and \va{set-cdr!}. An implementation does not adjust those rules, but instead renders them useless by adjusting the rules that insert \va{unspecified}: \rulename{6setcar}, \rulename{6setcdr}, \rulename{6set}, and \rulename{6setd}. Those rules can be adjusted by replacing \va{unspecified} with any number of values in those rules.
So, the remaining rules just specify the minimal behavior that we know that a value or values must have and otherwise reduce to an \textbf{unknown:} state. The rule \rulename{6udemand} drops \va{unspecified} in the \sy{U} context. See figure~\ref{fig:ec-grammar} for the precise definition of \sy{U}, but intuitively it is a context that is only a single expression layer deep that contains expressions whose value depends on the value of their subexpressions, like the first subexpression of a \sy{if}. Following that are rules that discard \va{unspecified} in expressions that discard the results of some of their subexpressions. The \rulename{6ubegin} shows how \sy{begin} discards its first expression when there are more expressions to evaluate. The next two rules, \rulename{6uhandlers} and \rulename{6udw} propagate \va{unspecified} to their context, since they also return any number of values to their context. Finally, the two \va{begin0} rules preserve \va{unspecified} until the rule \rulename{6begin01} can return it to its context.
%\section*{Acknowledgments}
%Thanks to Michael Sperber for many helpful discussions of specific points in the semantics, for spotting many mistakes and places where the formal semantics diverged from the informal semantics, and for generally making it possible for us to keep up with changes to the informal semantics as it developed. Thanks also to Will Clinger for a careful reading of the semantics and its explanation.
%%% Local Variables:
%%% mode: latex
%%% TeX-master: "r6rs"
%%% End: