Man or boy test
The man or boy test was proposed by computer scientist Donald Knuth as a means of evaluating implementations of the ALGOL 60 programming language. The aim of the test was to distinguish compilers that correctly implemented "recursion and non-local references" from those that did not.[1]
In Knuth's words:[2]

"There are quite a few ALGOL60 translators in existence which have been designed to handle recursion and non-local references properly, and I thought perhaps a little test-program may be of value. Hence I have written the following simple routine, which may separate the man-compilers from the boy-compilers."
Knuth's example
In ALGOL 60:
begin
  real procedure A(k, x1, x2, x3, x4, x5);
  value k; integer k;
  real x1, x2, x3, x4, x5;
  begin
    real procedure B;
    begin k := k - 1;
          B := A := A(k, B, x1, x2, x3, x4)
    end;
    if k ≤ 0 then A := x4 + x5 else B
  end;
  outreal(1, A(10, 1, -1, -1, 1, 0))
end
This creates a tree of B call frames that refer to each other and to the containing A call frames, each of which has its own copy of k that changes every time the associated B is called. Trying to work it through on paper is probably fruitless, but for k = 10, the correct answer is −67, despite the fact that in the original article Knuth conjectured it to be −121. Even modern machines quickly run out of stack space for larger values of k, which are tabulated below (OEIS: A132343).
k | A |
---|---|
0 | 1 |
1 | 0 |
2 | −2 |
3 | 0 |
4 | 1 |
5 | 0 |
6 | 1 |
7 | −1 |
8 | −10 |
9 | −30 |
10 | −67 |
11 | −138 |
12 | −291 |
13 | −642 |
14 | −1446 |
15 | −3250 |
16 | −7244 |
17 | −16065 |
18 | −35601 |
19 | −78985 |
20 | −175416 |
21 | −389695 |
22 | −865609 |
23 | −1922362 |
24 | −4268854 |
25 | −9479595 |
26 | −21051458 |
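The small entries in the table can be reproduced with a closure-based translation. The following is a minimal Python sketch, not Knuth's program: the names `a`, `b`, `const`, and `man_or_boy` are illustrative, call by name is approximated by passing zero-argument functions, and the raised recursion limit is an assumption that may need adjusting for a given interpreter.

```python
import sys

# Assumption: the default recursion limit may be too low for k = 10.
sys.setrecursionlimit(10_000)


def const(value):
    """Wrap a constant as a zero-argument function (a thunk)."""
    return lambda: value


def a(k, x1, x2, x3, x4, x5):
    """Sketch of Knuth's A; each x parameter is a thunk, mimicking call by name."""

    def b():
        nonlocal k  # b modifies the k of the a() call that created it
        k -= 1
        return a(k, b, x1, x2, x3, x4)  # pass b itself, not the result of b()

    return x4() + x5() if k <= 0 else b()


def man_or_boy(k):
    """Evaluate A(k, 1, -1, -1, 1, 0) with the constants wrapped as thunks."""
    return a(k, const(1), const(-1), const(-1), const(1), const(0))


if __name__ == "__main__":
    print(man_or_boy(10))  # prints -67
```

Because every call to `a` creates a fresh `b` bound to its own `k`, the references passed down the recursion keep pointing at the frame that created them, which is the behaviour the test checks for.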
Explanation
There are three Algol features used in this program that can be difficult to implement properly in a compiler:
- Nested function definitions: Since B is defined in the local context of A, the body of B has access to symbols that are local to A: most notably k, which it modifies, but also x1, x2, x3, x4, and x5. This is straightforward in the Algol descendant Pascal, but not possible in the other major Algol descendant C, except by simulating the mechanism manually with C's address-of operator and passing pointers to local variables between the functions.
- Function references: The B in the recursive call A(k, B, x1, x2, x3, x4) is not a call to B, but a reference to B, which will be called only when k is greater than zero. This is straightforward in standard Pascal (ISO 7185), and also in C. Some variants of Pascal (e.g. older versions of Turbo Pascal) do not support procedure references, but when the set of functions that may be referenced is known beforehand (in this program it is only B), this can be worked around.
- Constant/function dualism: The x1 through x5 parameters of A may be numeric constants or references to the function B. The x4 + x5 expression must be prepared to handle both cases as if the formal parameters x4 and x5 had been replaced by the corresponding actual parameter (call by name).[3] This is probably more of a problem in statically typed languages than in dynamically typed languages, but the standard workaround is to reinterpret the constants 1, 0, and −1 in the main call to A as functions without arguments that return these values.
These features are, however, not what the test is about; they are merely prerequisites for the test to be meaningful at all. What the test is about is whether the different references to B resolve to the correct instance of B, that is, one that has access to the same A-local symbols as the B that created the reference. A "boy" compiler might, for example, instead compile the program so that B always accesses the topmost A call frame.
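One way to picture the "correct instance of B" is as a pair consisting of the code of B and the particular A frame it belongs to. The sketch below makes that pairing explicit in Python, in the spirit of the manual simulation mentioned for C. The class `AFrame` and the functions `a_explicit` and `man_or_boy_explicit` are hypothetical names chosen for illustration, and the raised recursion limit is again an assumption.

```python
import sys
from dataclasses import dataclass
from typing import Callable

# Assumption: the default recursion limit may be too low for k = 10.
sys.setrecursionlimit(10_000)

Thunk = Callable[[], int]


@dataclass
class AFrame:
    """Hypothetical stand-in for the activation record of one call to A."""
    k: int
    x1: Thunk
    x2: Thunk
    x3: Thunk
    x4: Thunk
    x5: Thunk

    def b(self) -> int:
        """B operating on this frame only; self plays the role of the frame pointer."""
        self.k -= 1
        # self.b is a bound method: a (code, frame) pair naming this instance of B.
        return a_explicit(self.k, self.b, self.x1, self.x2, self.x3, self.x4)


def a_explicit(k: int, x1: Thunk, x2: Thunk, x3: Thunk, x4: Thunk, x5: Thunk) -> int:
    """A with its locals stored in an explicit frame object instead of a closure."""
    frame = AFrame(k, x1, x2, x3, x4, x5)
    if frame.k <= 0:
        return frame.x4() + frame.x5()
    return frame.b()


def man_or_boy_explicit(k: int) -> int:
    """Evaluate A(k, 1, -1, -1, 1, 0) with the constants wrapped as thunks."""
    return a_explicit(k, lambda: 1, lambda: -1, lambda: -1, lambda: 1, lambda: 0)


if __name__ == "__main__":
    print(man_or_boy_explicit(10))  # prints -67
```

Each reference handed down the recursion carries its own frame with it, so a later call through x4 or x5 updates the k of the frame that created the reference rather than that of whichever A call happens to be on top of the stack.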
References
1. Ardö, Anders; Philipson, Lars (March 1984). "A simple Ada compiler invalidation test". ACM SIGAda Ada Letters. III (5): 69–74. doi:10.1145/998382.998385. ISSN 1094-3641.
2. Knuth, Donald (July 1964). "Man or boy?". ALGOL Bulletin. 17: 7. "AB17.2.4 Donald Knuth: Man or boy?, page 7". archive.computerhistory.org. See also: "Algol Bulletin". Computing at Chilton: 1961–2000. Retrieved Dec 25, 2009.
3. Wichmann, B. A. (1972-02-01). "Five ALGOL Compilers". The Computer Journal. 15 (1): 8. doi:10.1093/comjnl/15.1.8. ISSN 0010-4620.
External links
- Man or boy test examples in many programming languages