Standard ML
Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers.
SML is a modern descendant of the ML programming language used in the LCF theorem-proving project. It is unique among widely used languages in that it has a formal specification, given as typing rules and an operational semantics in The Definition of Standard ML (1990, revised and simplified as The Definition of Standard ML (Revised) in 1997).
Language
Standard ML is a mostly functional programming language. Programs written in Standard ML mostly consist of expressions whose values are to be calculated.
Like all functional programming languages, a key feature of Standard ML is the function which is used for abstraction. For instance, the factorial function can be expressed as:
fun factorial x = if x = 0 then 1 else x * factorial (x-1)
A Standard ML compiler is required to infer the static type int -> int of this function without user-supplied type annotations. I.e., it has to deduce that x is only used with integer expressions, and must therefore itself be an integer, and that all value-producing expressions within the function return integers.
The same function can be expressed with clausal function definitions where the if-then-else conditional is replaced by a sequence of templates of the factorial function evaluated for specific values, separated by '|', which are tried one by one in the order written until a match is found:
fun factorial 0 = 1 | factorial n = n * factorial (n - 1)
Using a local function, this function can be rewritten to use tail recursion:
fun factorial x = let fun tail_fact p 0 = p | tail_fact p n = tail_fact (p * n) (n - 1) in tail_fact 1 x end
The value of a let-expression is the expression between in and end.
Code examples
Snippets of SML code are most easily studied by entering them into a "top-level". This is an interactive session that prints the inferred types of resulting or defined expressions. Many SML implementations provide an interactive top-level, including SML/NJ:
$ sml Standard ML of New Jersey v110.52 [built: Fri Jan 21 16:42:10 2005] -
Code can then be entered at the "-" prompt. For example, to calculate 1+2*3:
- 1 + 2 * 3; val it = 7 : int
The top-level infers the type of the expression to be "int" (a machine-precision integer) and gives the result "7".
Hello World
The following program "hello.sml":
print "Hello world!\n";
can be compiled with MLton:
$ mlton hello.sml
and executed:
$ ./hello Hello world! $
Arbitrary-precision factorial function (libraries)
In SML, the IntInf module provides arbitrary-precision integer arithmetic. Moreover, integer literals may be used as arbitrary-precision integers without the programmer having to do anything.
The following program "fact.sml" implements an arbitrary-precision factorial function and prints 120!:
fun fact n : IntInf.int = if n=0 then 1 else n * fact(n - 1) val () = print (IntInf.toString (fact 120)^"\n")
and can be compiled and run with:
$ mlton fact.sml $ ./fact 66895029134491270575881180540903725867527463331380298102956713523016335 57244962989366874165271984981308157637893214090552534408589408121859898 481114389650005964960521256960000000000000000000000000000
Numerical derivative (higher-order functions)
As a functional programming language, it is easy to create and pass around functions in SML programs. This capability has an enormous number of applications. Calculating the numerical derivative of a function is one such application. The following SML function "d" computes the numerical derivative of a given function "f" at a given point "x":
- fun d delta f x = (f (x + delta) - f (x - delta)) / (2.0 * delta); val d = fn : real -> (real -> real) -> real -> real
This function requires a small value "delta". A good choice for delta is the square root of the machine epsilon.
The type of the function "d" indicates that it maps a "float" onto another function with the type "(real -> real) -> real -> real". This allows us to partially apply arguments. This functional style is known as currying. In this case, it is useful to partially apply the first argument "delta" to "d", to obtain a more specialised function:
- val d = d 1E~8; val d = fn : (real -> real) -> real -> real
Note that the inferred type indicates that the replacement "d" is expecting a function with the type "real -> real" as its first argument. We can compute a numerical approximation to the derivative of x^3-x-1 at x=3 with:
- d (fn x => x * x * x - x - 1.0) 3.0; val it = 25.9999996644 : real
The correct answer is f'(x) = 3x^2-1 => f'(3) = 27-1 = 26.
The function "d" is called a "higher-order function" because it accepts another function ("f") as an argument.
The concepts of curried and higher-order functions are clearly useful in mathematical programs. In fact, these concepts are equally applicable to most other forms of programming and can be used to factor code much more aggresively, resulting in shorter programs and fewer bugs.
Discrete Wavelet Transform (pattern matching)
The 1D Haar wavelet transform of an integer-power-of-two-length list of numbers can be implemented very succinctly in SML and is an excellent example of the use of pattern matching over lists, taking pairs of elements ("h1" and "h2") off the front and storing their sums and differences on the lists "s" and "d", respectively:
- fun haar l = let fun aux [s] [] d = s :: d | aux [] s d = aux s [] d | aux (h1::h2::t) s d = aux t (h1 + h2 :: s) (h1 - h2 :: d) | aux _ _ _ = raise Empty in aux l [] [] end; val haar = fn : int list -> int list
For example:
- haar [1, 2, 3, 4, ~4, ~3, ~2, ~1]; val it = [0,20,4,4,~1,~1,~1,~1] : int list
Pattern matching is an incredibly useful construct that allows complicated transformations to be represented clearly and succintly. Moreover, SML compilers turn pattern matches into very efficient code, resulting in programs that are not only much shorter but also much faster.
Implementations
Some SML implementations include:
- MLton is a whole-program optimizing compiler that produces very fast code compared to other ML implementations. [1]
- Standard ML of New Jersey (abbreviated SML/NJ) is a full compiler, with associated libraries, tools, an interactive shell, and documentation. [2]
- Moscow ML is a light-weight implementation, based on the CAML Light runtime engine. It implements the full SML language, including SML Modules, and much of the SML Basis Library. [3]
- Poly/ML is a full implementation of Standard ML.
- TILT is a fully fledged certifying compiler for SML. It uses typed intermediate languages to optimize code and ensure correctness, and can compile to typed assembly language.
- HaMLet is an SML interpreter that aims to be an accurate and accessible reference implementation of the standard.
- The ML Kit is meant as a starting point for language extensions and tinkering. Its implementation is based very closely on the Definition.
- SML.NET allows compilation to the Microsoft CLR and has extensions for linking with other .NET code.
- Alice ML is an extension of SML for concurrency, distributed programming, and constraint programming. In particular, it supports type-safe, platform-independent higher-order serialisation.
- SML2c is a batch compiler and compiles only module-level declarations (i.e. signatures, structures and functors) into C. It is based on SML/NJ version 0.67 and shares the front end and most of its run-time system, but does not support SML/NJ style debugging and profiling. Module-level programs that run on SML/NJ can be compiled by sml2c without any changes.
All of the implementations above are open-source and freely available. Most are implemented themselves in SML. There are no longer any commercial SML implementations; although Harlequin once produced a commercial IDE and compiler for SML called MLWorks, the company is now defunct. MLWorks is believed to have passed to Xanalys.