Reference (computer science): Difference between revisions

From Wikipedia, the free encyclopedia
Address analogy: Added a bt more clarity to the example
 
(265 intermediate revisions by more than 100 users not shown)
Line 1: Line 1:
{{Short description|Data type which allows a program to indirectly access a particular value in memory}}
:''This article discusses a general notion of reference in computing. See also the more specific notion of [[reference (C++)|reference]] used in [[C++]].''
{{About|the general concept in computing|the more specific concept in C++|Reference (C++)}}
----
{{More citations needed|date=November 2010}}


In [[computer science]], a '''reference''' is a small [[object (computer science)|object]] containing information which refers to data elsewhere, as opposed to containing the data itself. Accessing the [[value (computer science)|value]] that a reference refers to is called '''dereferencing''' it. References are fundamental in constructing many data structures and in exchanging information between different parts of a program.
In [[computer programming]], a '''reference''' is a value that enables a program to indirectly access a particular [[datum]], such as a [[variable (computer science)|variable]]'s value or a [[record (computer science)|record]], in the [[computer]]'s [[memory (computing)|memory]] or in some other [[Data storage device|storage device]]. The reference is said to '''refer''' to the datum, and accessing the datum is called '''[[Dereference operator|dereferencing]]''' the reference. A reference is distinct from the datum itself.


A reference is an [[abstract data type]] and may be implemented in many ways. Typically, a reference refers to data stored in memory on a given system, and its internal value is the [[memory address]] of the data, i.e. a reference is implemented as a [[Pointer (computer programming)|pointer]]. For this reason a reference is often said to "point to" the data. Other implementations include an offset (difference) between the datum's address and some fixed "base" address, an [[array index|index]], or [[identifier]] used in a [[lookup]] operation into an [[Array data structure|array]] or [[table (database)|table]], an operating system [[Handle (computing)|handle]], a [[physical address]] on a storage device, or a network address such as a [[URL]].
== Address analogy ==


==Formal representation==
A reference may be compared to the address of a house. It is a small identifier which helps you to find a potentially much larger object with more information in it. For example, although looking at a house will tell you its color, the address will not; it only enables you to find it. However, if you want to find out the color of a house, the address is enough information, because you can use it to find the house, then look at the house itself. Finding a house based on its address is analogous to dereferencing a reference.
A reference ''R'' is a value that admits one operation, <kbd>dereference</kbd>(''R''), which yields a value. Usually the reference is typed so that it returns values of a specific type, e.g.:<ref name=Sherman>{{cite book |last1=Sherman |first1=Mark S. |title=Paragon: A Language Using Type Hierarchies for the Specification, Implementation, and Selection of Abstract Data Types |date=April 1985 |publisher=Springer Science & Business Media |isbn=978-3-540-15212-5 |page=175 |url=https://books.google.com/books?id=tXzuooE8EVsC |language=en}}</ref><ref>{{cite web |title=Reference (Java Platform SE 7) |url=https://docs.oracle.com/javase/7/docs/enwiki/api/java/lang/ref/Reference.html |website=docs.oracle.com |access-date=10 May 2022}}</ref>
<syntaxhighlight lang="java">
interface Reference<T> {
T value();
}
</syntaxhighlight>


Often the reference also admits an assignment operation <kbd>store</kbd>(''R'', ''x''), meaning it is an [[Abstract data type#Abstract variable|abstract variable]].<ref name=Sherman/>
In a more complicated example, suppose you leave a forwarding address in your old house each time you move. A person could visit your first house, then follow the forwarding address to the next house, and so on until they finally find your current house. This is analogous to how references are used in singly [[linked list]]s.


==Use==
Another benefit of addresses is that they're much easier to deal with than actual houses. Say you want to be able to easily locate people on your street based on their last name. One way to do this is to use a large crane to physically pick up and rearrange all the houses based on the last names of the residents. A much easier solution is to make a list of addresses of people on your street and sort it by their last names. References have the same benefit: you can manipulate references to data without actually having to modify the data itself, which in some cases can be much more efficient.
References are widely used in [[computer programming|programming]], especially to efficiently pass large or mutable data as [[argument (computer science)|arguments]] to [[subroutine|procedures]], or to share such data among various uses. In particular, a reference may point to a variable or record that contains references to other data. This idea is the basis of [[indirect addressing]] and of many [[linked data structure]]s, such as [[linked list]]s. References increase flexibility in where objects can be stored, how they are allocated, and how they are passed between areas of [[code]]. As long as one can access a reference to the data, one can access the data through it, and the data itself need not be moved. They also make sharing of data between different code areas easier; each keeps a reference to it.


References can cause significant complexity in a program, partially due to the possibility of [[dangling reference|dangling]] and [[wild reference]]s and partially because the [[topology]] of data with references is a [[directed graph]], whose analysis can be quite complicated. Nonetheless, references are still simpler to analyze than [[Pointer (computer programming)|pointer]]s due to the absence of [[pointer arithmetic]].
Other examples of references abound in everyday life: telephone numbers, e-mail addresses, [[Uniform Resource Locator|URL]]s, and so on. Each refers to and facilitates access to a remote resource.

== Benefits of references ==

References increase flexibility in where objects can be stored, how they are allocated, and how they are passed between areas of code. As long as we can access a reference to the data, we can access the data through it, and the data itself need not be moved. They also make sharing of data between different code areas easier; each keeps a reference to it.


The mechanism of references, if varying in implementation, is a fundamental programming language feature common to nearly all modern programming languages. Even some languages that support no direct use of references have some internal or implicit use. For example, the [[Evaluation strategy|call by reference]] calling convention can be implemented with either explicit or implicit use of references.
The mechanism of references, if varying in implementation, is a fundamental programming language feature common to nearly all modern programming languages. Even some languages that support no direct use of references have some internal or implicit use. For example, the [[Evaluation strategy|call by reference]] calling convention can be implemented with either explicit or implicit use of references.


==Examples==
[[Pointer]]s are the most primitive and error-prone but also one of the most powerful and efficient types of references, storing only the address of an object in memory. [[Smart pointer]]s are [[opaque pointer|opaque data structures]] that act like pointers but can only be accessed through particular methods.
[[Pointer (computer programming)|Pointer]]s are the most primitive type of reference. Due to their intimate relationship with the underlying hardware, they are one of the most powerful and efficient types of references. However, also due to this relationship, pointers require a strong understanding by the programmer of the details of memory architecture. Because pointers store a memory location's address, instead of a value directly, inappropriate use of pointers can lead to [[undefined behavior]] in a program, particularly due to [[dangling pointer]]s or [[wild pointer]]s. [[Smart pointer]]s are [[opaque pointer|opaque data structures]] that act like pointers but can only be accessed through particular methods.


A '''file handle''', or '''handle''' is a type of reference used to abstract file content. It usually represents both the file itself, as when requesting a [[lock (computer science)|lock]] on the file, and a specific position within the file's content, as when reading a file.
A [[Handle (computing)|handle]] is an abstract reference, and may be represented in various ways. A common example are [[file handle]]s (the FILE data structure in the [[stdio|C standard I/O library]]), used to abstract file content. It usually represents both the file itself, as when requesting a [[lock (computer science)|lock]] on the file, and a specific position within the file's content, as when reading a file.


In [[distributed computing]], the reference may contain more than an address or identifier; it may also include an embedded specification of the network protocols used to locate and access the referenced object, the way information is encoded or serialized. Thus, for example, a [[web services description language|WSDL]] description of a remote web service can be viewed as a form of reference; it includes a complete specification of how to locate and bind to a particular [[web service]]. A reference to a [[live distributed object]] is another example: it is a complete specification for how to construct a small software component called a ''proxy'' that will subsequently engage in a peer-to-peer interaction, and through which the local machine may gain access to data that is replicated or exists only as a weakly consistent message stream. In all these cases, the reference includes the full set of instructions, or a recipe, for how to access the data; in this sense, it serves the same purpose as an identifier or address in memory.
== Formal representation ==


More generally, a reference can be considered as a piece of data that allows unique retrieval of another piece of data. This includes [[primary key]]s in [[database]]s and keys in an [[associative array]]. If we have a set of data ''D'', any well-defined (single-valued) function from ''D'' onto ''D'' &cup; {[[Null (computer)|null]]} defines a type of reference, where ''null'' is the image of a piece of data not referring to anything meaningful.
If we have a set of keys ''K'' and a set of data objects ''D'', any well-defined (single-valued) function from ''K'' to ''D'' ∪ {[[Nullable type|null]]} defines a type of reference, where ''null'' is the image of a key not referring to anything meaningful.


An alternative representation of such a function is a directed graph called a [[reachability graph]]. Here, each datum is represented by a vertex and there is an edge from ''u'' to ''v'' if the datum in ''u'' refers to the datum in ''v''. The maximum [[out-degree]] is one. These graphs are valuable in [[garbage collection (computer science)|garbage collection]], where they can be used to separate accessible from [[Unreachable object|inaccessible objects]].
An alternative representation of such a function is a directed graph called a [[reachability graph]]. Here, each datum is represented by a vertex and there is an edge from ''u'' to ''v'' if the datum in ''u'' refers to the datum in ''v''. The maximum [[out-degree]] is one. These graphs are valuable in [[garbage collection (computer science)|garbage collection]], where they can be used to separate accessible from [[Unreachable object|inaccessible objects]].


== External and internal storage ==
==External and internal storage==

In many data structures, large, complex objects are composed of smaller objects. These objects are typically stored in one of two ways:
In many data structures, large, complex objects are composed of smaller objects. These objects are typically stored in one of two ways:


Line 39: Line 43:
Internal storage is usually more efficient, because there is a space cost for the references and [[dynamic memory allocation|dynamic allocation]] metadata, and a time cost associated with dereferencing a reference and with allocating the memory for the smaller objects. Internal storage also enhances [[locality of reference]] by keeping different parts of the same large object close together in memory. However, there are a variety of situations in which external storage is preferred:
Internal storage is usually more efficient, because there is a space cost for the references and [[dynamic memory allocation|dynamic allocation]] metadata, and a time cost associated with dereferencing a reference and with allocating the memory for the smaller objects. Internal storage also enhances [[locality of reference]] by keeping different parts of the same large object close together in memory. However, there are a variety of situations in which external storage is preferred:


* If the data structure is recursive, meaning it may contain itself. This cannot be represented in the internal way.
* If the [[Recursive data type|data structure is recursive]], meaning it may contain itself. This cannot be represented in the internal way.
* If the larger object is being stored in an area with limited space, such as the stack, then we can prevent running out of storage by storing large component objects in another memory region and referring to them using references.
* If the larger object is being stored in an area with limited space, such as the stack, then we can prevent running out of storage by storing large component objects in another memory region and referring to them using references.
* If the smaller objects may vary in size, it's often inconvenient or expensive to resize the larger object so that it can still contain them.
* If the smaller objects may vary in size, it is often inconvenient or expensive to resize the larger object so that it can still contain them.
* References are often easier to work with and adapt better to new requirements.
* References are often easier to work with and adapt better to new requirements.


Some languages, such as [[Java programming language|Java]], do not support internal storage. In these languages, all objects are uniformly accessed through references.
Some languages, such as [[Java (programming language)|Java]], [[Smalltalk]], [[Python (programming language)|Python]], and [[Scheme (programming language)|Scheme]], do not support internal storage. In these languages, all objects are uniformly accessed through references.


== Language support ==
==Language support==


=== Assembly ===
In [[assembly language]]s, the first languages used, it is typical to express references using either raw memory addresses or indexes into tables. These work, but are somewhat tricky to use, because an address tells you nothing about the value it points to, not even how large it is or how to interpret it; such information is encoded in the program logic. The result is that misinterpretations can occur in incorrect programs, causing bewildering errors.
In [[assembly language]], it is typical to express references using either raw memory addresses or indexes into tables. These work, but are somewhat tricky to use, because an address tells you nothing about the value it points to, not even how large it is or how to interpret it; such information is encoded in the program logic. The result is that misinterpretations can occur in incorrect programs, causing bewildering errors.


=== Lisp ===
One of the earliest opaque references was that of the [[Lisp programming language]] [[cons|cons cell]], which is simply a [[object composition|record]] containing two references to other Lisp objects, including possibly other cons cells. This simple structure is most commonly used to build singly [[linked list]]s, but can also be used to build simple [[binary tree]]s and so-called "dotted lists", which terminate not with a null reference but a value.


One of the earliest opaque references was that of the [[Lisp (programming language)|Lisp]] language [[cons|cons cell]], which is simply a [[object composition|record]] containing two references to other Lisp objects, including possibly other cons cells. This simple structure is most commonly used to build singly [[linked list]]s, but can also be used to build simple [[binary tree]]s and so-called "dotted lists", which terminate not with a null reference but a value.
Another early language, Fortran, does not have an explicit representation of references, but does use them implicitly in its [[call-by-reference]] calling semantics.


=== C/C++ ===
The [[C programming language]] introduced the [[pointer]], still one of the most popular types of references today. It is similar to the assembly representation of a raw address, except that it carries a static [[datatype]] which can be used at compile-time to ensure that the data it refers to is not misinterpreted. However, because C has a [[weak typing|weak type system]] which can be violated using [[Cast (computer science)|casts]] (explicit conversions between various pointer types and between pointer types and integers), misinterpretation is still possible, if more difficult. Its successor [[C++]] tried to increase [[type safety]] of pointers with new cast operators and smart pointers in its standard [[library (computer science)|library]], but still retained the ability to circumvent these safety mechanisms for compatibility.
{{Further|Reference (C++)}}


The [[Pointer (computer programming)|pointer]] is still one of the most popular types of references today. It is similar to the assembly representation of a raw address, except that it carries a static [[datatype]] which can be used at compile-time to ensure that the data it refers to is not misinterpreted. However, because C has a [[weak typing|weak type system]] which can be violated using [[Cast (computer science)|casts]] (explicit conversions between various pointer types and between pointer types and integers), misinterpretation is still possible, if more difficult. Its successor [[C++]] tried to increase [[type safety]] of pointers with new cast operators, a [[reference type]] <code>&</code>, and smart pointers in [[C++ standard library|its standard library]], but still retained the ability to circumvent these safety mechanisms for compatibility.
A number of popular mainstream languages today such as Java, [[C sharp|C#]], and [[Visual Basic]] have adopted a much more opaque type of reference, usually referred to as simply a ''reference''. These references have types like C pointers indicating how to interpret the data they reference, but they are typesafe in that they cannot be interpreted as a raw address and unsafe conversions are not permitted. In those ''managed'' languages, the references are actually pointers of pointers of the referred data. In C/C++, the reference concept of managed languages means two-step pointing. The [[Garbage Collector]] is the sole actor that can directly access the mid-step pointers, which cause the opacity. Typically pointer arithmetic is also not supported.


=== Fortran ===
===Fortran===
Fortran does not have an explicit representation of references, but does use them implicitly in its [[call-by-reference]] calling semantics. A [[Fortran]] reference is best thought of as an ''alias'' of another object, such as a scalar variable or a row or column of an array. There is no syntax to dereference the reference or manipulate the contents of the referent directly. Fortran references can be null. As in other languages, these references facilitate the processing of dynamic structures, such as linked lists, queues, and trees.


===Object-oriented languages===
A [[Fortran]] reference is best thought of as an ''alias'' of another object, such as a scalar variable or a row or column of an array. There is no syntax to dereference the reference or manipulate the contents of the referent directly. Fortran references can be null. As in other languages, these references facilitate the processing of dynamic structures, such as linked lists, queues, and trees.
A number of object-oriented languages such as [[Eiffel (programming language)|Eiffel]], [[Java (programming language)|Java]], [[C Sharp (programming language)|C#]], and [[Visual Basic]] have adopted a much more opaque type of reference, usually referred to as simply a ''reference''. These references have types like C pointers indicating how to interpret the data they reference, but they are typesafe in that they cannot be interpreted as a raw address and unsafe conversions are not permitted. References are extensively used to access and [[Assignment (computer science)#Assignment in object oriented languages|assign]] objects. References are also used in function/[[Method (computer programming)|method]] calls or message passing, and [[Reference counting|reference counts]] are frequently used to perform [[Garbage collection (computer science)|garbage collection]] of unused objects.


=== Functional languages ===
===Functional languages===
In [[Standard ML]], [[OCaml]], and many other functional languages, most values are persistent: they cannot be modified by assignment. Assignable "reference cells" provide [[mutable|mutable variables]], data that can be modified. Such reference cells can hold any value, and so are given the [[polymorphism (computer science)|polymorphic]] type <code>α ref</code>, where <code>α</code> is to be replaced with the type of value pointed to. These mutable references can be pointed to different objects over their lifetime. For example, this permits building of circular data structures. The reference cell is functionally equivalent to a mutable array of length 1.


To preserve safety and efficient implementations, references cannot be [[Type conversion|type-cast]] in ML, nor can pointer arithmetic be performed. In the functional paradigm, many structures that would be represented using pointers in a language like C are represented using other facilities, such as the powerful [[algebraic datatype]] mechanism. The programmer is then able to enjoy certain properties (such as the guarantee of immutability) while programming, even though the compiler often uses machine pointers "under the hood".
In all of the above settings, the concept of [[mutable variable]]s, data that can be modified, often makes implicit use of references. In [[Standard ML]], [[O'Caml]], and many other functional languages, most values are persistent: they cannot be modified by assignment. Assignable "reference cells" serve the unavoidable purposes of mutable references in imperative languages, and make the capability to be modified explicit. Such reference cells can hold any value, and so are given the [[polymorphism (computer science)|polymorphic]] type <code>&alpha; ref</code>, where <code>&alpha;</code> is to be replaced with the type of value pointed to. These mutable references can be pointed to different objects over their lifetime. For example, this permits building of circular data structures.


=== Perl/PHP ===
To preserve safety and efficient implementations, references cannot be type-cast in ML, nor can pointer arithmetic be performed. It is important to note that in the functional paradigm, many structures that would be represented using pointers in a language like C are represented using other facilities, such as the powerful [[algebraic datatype]] mechanism. The programmer is then able to enjoy certain properties (such as the guarantee of immutability) while programming, even though the compiler often uses machine pointers "under the hood".
[[Perl]] supports hard references, which function similarly to those in other languages, and '''symbolic references''', which are just string values that contain the names of variables. When a value that is not a hard reference is dereferenced, Perl considers it to be a symbolic reference and gives the variable with the name given by the value.<ref>{{cite web|url=http://perldoc.perl.org/perlref.html#Symbolic-references |title=perlref |publisher=perldoc.perl.org |access-date=2013-08-19}}</ref> [[PHP]] has a similar feature in the form of its <code>$$var</code> syntax.<ref>{{cite web|url=http://www.php.net/manual/en/language.variables.variable.php |title=Variable variables - Manual |publisher=PHP |access-date=2013-08-19}}</ref>

== See also ==


==See also==
* [[Abstraction (computer science)]]
* [[Abstraction (computer science)]]
* [[Autovivification]]
* [[Bounded pointer]]
* [[Linked data]]
* [[Magic cookie]]
* [[Magic cookie]]
* [[Weak reference]]
* [[Weak reference]]


==References==
== External links ==
{{Reflist}}


==External links==
* [http://cslibrary.stanford.edu/104/ Pointer Fun With Binky] Introduction to pointers in a 3 minute educational video - Stanford Computer Science Education Library
{{Wiktionary|dereference}}
* [http://cslibrary.stanford.edu/104/ Pointer Fun With Binky] Introduction to pointers in a 3-minute educational video - Stanford Computer Science Education Library


[[Category:Data types]]
{{Data types}}
{{Semantic Web}}
[[Category:Programming constructs]]
{{Web syndication}}


[[pl:Referencja]]
[[Category:Data types]]
[[Category:Programming language concepts]]
[[Category:Primitive types]]

Latest revision as of 13:12, 26 November 2024

In computer programming, a reference is a value that enables a program to indirectly access a particular datum, such as a variable's value or a record, in the computer's memory or in some other storage device. The reference is said to refer to the datum, and accessing the datum is called dereferencing the reference. A reference is distinct from the datum itself.

A reference is an abstract data type and may be implemented in many ways. Typically, a reference refers to data stored in memory on a given system, and its internal value is the memory address of the data; that is, the reference is implemented as a pointer. For this reason a reference is often said to "point to" the data. Other implementations include an offset (difference) between the datum's address and some fixed "base" address, an index or identifier used in a lookup operation into an array or table, an operating system handle, a physical address on a storage device, or a network address such as a URL.

Formal representation

A reference R is a value that admits one operation, dereference(R), which yields a value. Usually the reference is typed so that it returns values of a specific type, e.g.:[1][2]

interface Reference<T> {
    T value();
}

Often the reference also admits an assignment operation store(R, x), meaning it is an abstract variable.[1]
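
As a minimal sketch of the two operations, the following restates the interface above and adds a store operation. The names MutableReference, Cell, and CellDemo are hypothetical and chosen only for illustration; a heap-allocated cell is just one of many possible implementations.

```java
// Restates the interface from the text so the sketch compiles on its own.
interface Reference<T> {
    T value();                 // dereference(R)
}

// Hypothetical extension: a reference that also admits store(R, x).
interface MutableReference<T> extends Reference<T> {
    void store(T x);
}

// One possible implementation: an object that holds the referent in a field.
final class Cell<T> implements MutableReference<T> {
    private T contents;

    Cell(T initial) {
        this.contents = initial;
    }

    @Override
    public T value() {
        return contents;
    }

    @Override
    public void store(T x) {
        contents = x;
    }
}

class CellDemo {
    public static void main(String[] args) {
        Cell<String> r = new Cell<>("old");
        MutableReference<String> alias = r;   // copies the reference, not the cell
        alias.store("new");
        System.out.println(r.value());        // prints "new"
    }
}
```

Because r and alias hold the same reference, a store through one is visible when dereferencing the other, which is the behavior expected of an abstract variable.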

Use

References are widely used in programming, especially to efficiently pass large or mutable data as arguments to procedures, or to share such data among various uses. In particular, a reference may point to a variable or record that contains references to other data. This idea is the basis of indirect addressing and of many linked data structures, such as linked lists. References increase flexibility in where objects can be stored, how they are allocated, and how they are passed between areas of code. As long as one can access a reference to the data, one can access the data through it, and the data itself need not be moved. They also make sharing of data between different code areas easier; each keeps a reference to it.
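
The following Java sketch illustrates passing and sharing mutable data through references. The class and method names (SharingDemo, increment) are assumptions made for this example only.

```java
class SharingDemo {
    // The parameter receives a copy of the reference, not a copy of the array,
    // so the update is visible to every holder of that reference.
    static void increment(int[] counters, int index) {
        counters[index]++;
    }

    public static void main(String[] args) {
        int[] counters = {0, 0, 0};
        int[] shared = counters;          // a second reference to the same array
        increment(shared, 1);
        System.out.println(counters[1]);  // prints 1
    }
}
```

Only the reference is copied at the call and at the assignment; the array itself is never moved or duplicated.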

References can cause significant complexity in a program, partially due to the possibility of dangling and wild references and partially because the topology of data with references is a directed graph, whose analysis can be quite complicated. Nonetheless, references are still simpler to analyze than pointers due to the absence of pointer arithmetic.

The mechanism of references, if varying in implementation, is a fundamental programming language feature common to nearly all modern programming languages. Even some languages that support no direct use of references have some internal or implicit use. For example, the call by reference calling convention can be implemented with either explicit or implicit use of references.

Examples

Pointers are the most primitive type of reference. Due to their intimate relationship with the underlying hardware, they are one of the most powerful and efficient types of references. However, also due to this relationship, pointers require a strong understanding by the programmer of the details of memory architecture. Because pointers store a memory location's address, instead of a value directly, inappropriate use of pointers can lead to undefined behavior in a program, particularly due to dangling pointers or wild pointers. Smart pointers are opaque data structures that act like pointers but can only be accessed through particular methods.

A handle is an abstract reference, and may be represented in various ways. A common example is the file handle (the FILE data structure in the C standard I/O library), used to abstract file content. A file handle usually represents both the file itself, as when requesting a lock on the file, and a specific position within the file's content, as when reading a file.

In distributed computing, the reference may contain more than an address or identifier; it may also include an embedded specification of the network protocols used to locate and access the referenced object, and of the way information is encoded or serialized. Thus, for example, a WSDL description of a remote web service can be viewed as a form of reference; it includes a complete specification of how to locate and bind to a particular web service. A reference to a live distributed object is another example: it is a complete specification for how to construct a small software component called a proxy that will subsequently engage in a peer-to-peer interaction, and through which the local machine may gain access to data that is replicated or exists only as a weakly consistent message stream. In all these cases, the reference includes the full set of instructions, or a recipe, for how to access the data; in this sense, it serves the same purpose as an identifier or address in memory.

If we have a set of keys K and a set of data objects D, any well-defined (single-valued) function from K to D ∪ {null} defines a type of reference, where null is the image of a key not referring to anything meaningful.
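
One way to picture such a function is a table lookup in which the keys are array indexes. The sketch below assumes integer keys and string data; the class name TableRef and its method are hypothetical.

```java
final class TableRef {
    private final String[] table;   // the data objects D, indexed by key

    TableRef(String... table) {
        this.table = table;
    }

    // A single-valued function from int keys to D ∪ {null}:
    // every key yields exactly one result, and keys outside the table yield null.
    String dereference(int key) {
        if (key < 0 || key >= table.length) {
            return null;            // the key refers to nothing meaningful
        }
        return table[key];
    }
}
```

For example, with new TableRef("a", "b"), dereferencing key 1 yields "b", while key 5 yields null.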

An alternative representation of such a function is a directed graph called a reachability graph. Here, each datum is represented by a vertex and there is an edge from u to v if the datum in u refers to the datum in v. Because the function is single-valued, the maximum out-degree is one. These graphs are valuable in garbage collection, where they can be used to separate accessible from inaccessible objects.
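
A rough sketch of how such a graph can separate accessible from inaccessible data is given below. It assumes that data are numbered from 0, that the graph is stored as an array next in which next[u] is the datum that u refers to, and that -1 encodes null; these encodings and the names Reachability and mark are illustrative assumptions, not a description of any particular garbage collector.

```java
class Reachability {
    static final int NULL = -1;   // hypothetical encoding of the null reference

    // next[u] is the single datum that u refers to, or NULL.
    // Returns an array in which marks[v] is true exactly when v is
    // reachable from at least one of the given roots.
    static boolean[] mark(int[] next, int... roots) {
        boolean[] marks = new boolean[next.length];
        for (int root : roots) {
            int u = root;
            // Follow the unique outgoing edge; stop at null or at a vertex
            // that was already marked (which also terminates cycles).
            while (u != NULL && !marks[u]) {
                marks[u] = true;
                u = next[u];
            }
        }
        return marks;
    }
}
```

With next = {1, 2, -1, 4, 3} and the single root 0, the data 0, 1, and 2 are marked as accessible, while 3 and 4 refer only to each other and remain unmarked.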

External and internal storage

In many data structures, large, complex objects are composed of smaller objects. These objects are typically stored in one of two ways:

  1. With internal storage, the contents of the smaller object are stored inside the larger object.
  2. With external storage, the smaller objects are allocated in their own location, and the larger object only stores references to them.

Internal storage is usually more efficient, because there is a space cost for the references and dynamic allocation metadata, and a time cost associated with dereferencing a reference and with allocating the memory for the smaller objects. Internal storage also enhances locality of reference by keeping different parts of the same large object close together in memory. However, there are a variety of situations in which external storage is preferred:

  • If the data structure is recursive, meaning it may contain itself, it cannot be represented with internal storage.
  • If the larger object is being stored in an area with limited space, such as the stack, then we can prevent running out of storage by storing large component objects in another memory region and referring to them using references.
  • If the smaller objects may vary in size, it is often inconvenient or expensive to resize the larger object so that it can still contain them.
  • References are often easier to work with and adapt better to new requirements.

Some languages, such as Java, Smalltalk, Python, and Scheme, do not support internal storage. In these languages, all objects are uniformly accessed through references.
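
A short Python sketch can illustrate uniform access through references. The `Node` class and its fields are hypothetical names chosen for this example.

```python
class Node:
    """A tree node that stores references to its children, not copies of them."""
    def __init__(self, label):
        self.label = label
        self.children = []   # external storage: a list of references to other nodes

leaf = Node("leaf")
left = Node("left")
right = Node("right")
left.children.append(leaf)
right.children.append(leaf)    # both parents refer to the same leaf object

leaf.label = "shared leaf"     # one change, visible through both references
same_object = left.children[0] is right.children[0]

# A recursive structure: a node may even refer to itself.
loop = Node("loop")
loop.children.append(loop)
```

Since each node holds only references, the self-referential `loop` node poses no problem, whereas it could not be laid out with internal storage.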

Language support

Assembly

In assembly language, it is typical to express references using either raw memory addresses or indexes into tables. These work, but are somewhat tricky to use, because an address tells you nothing about the value it points to, not even how large it is or how to interpret it; such information is encoded in the program logic. As a result, an incorrect program can misinterpret the data at an address, causing errors that are hard to diagnose.
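
The point that an address carries no size or type information can be sketched by modelling memory as a byte array. This is a simulation in Python rather than assembly, and `memory`, `load`, and the stored bytes are assumptions made for illustration.

```python
# Hypothetical 16-byte memory; an "address" is just an offset into it.
memory = bytearray(16)
memory[4:8] = bytes([0x78, 0x56, 0x34, 0x12])

def load(address, size):
    """Read `size` bytes at `address` as a little-endian unsigned integer."""
    return int.from_bytes(memory[address:address + size], "little")

address = 4
as_byte = load(address, 1)    # 0x78
as_half = load(address, 2)    # 0x5678
as_word = load(address, 4)    # 0x12345678
```

The same address yields three different values depending on the size the program assumes, which is exactly the knowledge that lives only in the program logic.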

Lisp

One of the earliest opaque references was the cons cell of the Lisp language, which is simply a record containing two references to other Lisp objects, including possibly other cons cells. This simple structure is most commonly used to build singly linked lists, but can also be used to build simple binary trees and so-called "dotted lists", which terminate not with a null reference but with a value.
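
The structures mentioned above can be modelled roughly as follows. This is a Python approximation rather than Lisp; the class `Cons` and the helper `to_python_list` are hypothetical names, and `None` stands in for the null reference.

```python
class Cons:
    """A cons cell: a record holding two references, car and cdr."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

def to_python_list(cell):
    """Collect the elements of a proper list that ends in None (the null reference)."""
    items = []
    while cell is not None:
        items.append(cell.car)
        cell = cell.cdr
    return items

proper = Cons(1, Cons(2, Cons(3, None)))   # (1 2 3)
dotted = Cons(1, Cons(2, 3))               # (1 2 . 3): the last cdr is a value, not null
tree = Cons(Cons(1, 2), Cons(3, 4))        # a small binary tree with four leaves

items = to_python_list(proper)
```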

C/C++

The pointer is still one of the most popular types of references today. It is similar to the assembly representation of a raw address, except that it carries a static datatype which can be used at compile-time to ensure that the data it refers to is not misinterpreted. However, because C has a weak type system which can be violated using casts (explicit conversions between various pointer types and between pointer types and integers), misinterpretation is still possible, if more difficult. Its successor C++ tried to increase type safety of pointers with new cast operators, a reference type &, and smart pointers in its standard library, but still retained the ability to circumvent these safety mechanisms for compatibility.
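
Typed pointers and casts can be tried out from Python through the standard `ctypes` module, which exposes C-style pointer types. The sketch below is an illustration, not C code, and it assumes the platform stores `c_float` as a 32-bit IEEE 754 value.

```python
import ctypes

value = ctypes.c_float(1.0)
float_ptr = ctypes.pointer(value)        # typed pointer, of type POINTER(c_float)
as_float = float_ptr.contents.value      # dereference: read through the pointer

# A cast reinterprets the same address as a pointer to a 32-bit unsigned integer.
int_ptr = ctypes.cast(float_ptr, ctypes.POINTER(ctypes.c_uint32))
as_int = int_ptr.contents.value          # the float's bit pattern read as an integer

int_ptr.contents.value = 0x40000000      # write an integer through the cast pointer
changed = value.value                    # the original float now reads as 2.0
```

The pointer type alone decides how the four bytes are interpreted, so a cast is enough to make the same memory mean something else.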

Fortran

Fortran does not have an explicit representation of references, but does use them implicitly in its call-by-reference calling semantics. A Fortran reference is best thought of as an alias of another object, such as a scalar variable or a row or column of an array. There is no syntax to dereference the reference or manipulate the contents of the referent directly. Fortran references can be null. As in other languages, these references facilitate the processing of dynamic structures, such as linked lists, queues, and trees.

Object-oriented languages

A number of object-oriented languages such as Eiffel, Java, C#, and Visual Basic have adopted a much more opaque type of reference, usually referred to as simply a reference. These references have types, like C pointers, indicating how to interpret the data they reference, but they are type-safe in that they cannot be interpreted as a raw address and unsafe conversions are not permitted. References are extensively used to access and assign objects. References are also used in function/method calls or message passing, and reference counts are frequently used to perform garbage collection of unused objects.
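
Reclamation by reference counting can be observed in Python, used here as an illustration rather than one of the languages listed above. The sketch assumes the CPython implementation, which frees an object as soon as its last strong reference disappears; the `Account` class is a hypothetical example.

```python
import weakref

class Account:
    def __init__(self, owner):
        self.owner = owner

first = Account("Ada")
second = first                 # a second reference to the same object
probe = weakref.ref(first)     # a weak reference does not keep the object alive

del first
alive_after_first_del = probe() is not None    # one strong reference remains

del second
alive_after_second_del = probe() is not None   # no strong references remain
```

Under the stated CPython assumption, the object survives the first `del` and is reclaimed after the second, at which point the weak reference returns `None`.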

Functional languages

In Standard ML, OCaml, and many other functional languages, most values are persistent: they cannot be modified by assignment. Assignable "reference cells" provide mutable variables, that is, data that can be modified. Such reference cells can hold any value, and so are given the polymorphic type α ref, where α is to be replaced with the type of value pointed to. A mutable reference can be made to point to different objects over its lifetime, which permits, for example, the construction of circular data structures. The reference cell is functionally equivalent to a mutable array of length 1.
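
A reference cell can be approximated in Python as follows. This is a sketch, not ML: the class `Ref` and the helpers `deref` and `assign` are hypothetical names that mirror ML's `!` and `:=` operators.

```python
class Ref:
    """A reference cell holding one mutable slot."""
    def __init__(self, contents):
        self.contents = contents

def deref(cell):            # plays the role of ML's `!cell`
    return cell.contents

def assign(cell, value):    # plays the role of ML's `cell := value`
    cell.contents = value

counter = Ref(0)
assign(counter, deref(counter) + 1)
count = deref(counter)

# The same behaviour with a mutable list of length 1.
boxed = [0]
boxed[0] = boxed[0] + 1

# A circular structure: an immutable tuple that reaches itself through a cell.
link = Ref(None)
node = ("node", link)
assign(link, node)
```

The tuple `node` cannot be changed after creation, yet the cell inside it can be pointed back at the tuple, which is how a cycle arises among otherwise immutable values.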

To preserve safety and efficient implementations, references cannot be type-cast in ML, nor can pointer arithmetic be performed. In the functional paradigm, many structures that would be represented using pointers in a language like C are represented using other facilities, such as the powerful algebraic datatype mechanism. The programmer is then able to enjoy certain properties (such as the guarantee of immutability) while programming, even though the compiler often uses machine pointers "under the hood".

Perl/PHP

Perl supports hard references, which function similarly to those in other languages, and symbolic references, which are just string values that contain the names of variables. When a value that is not a hard reference is dereferenced, Perl treats it as a symbolic reference and returns the variable whose name is given by the value.[3] PHP has a similar feature in the form of its $$var syntax.[4]
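
The idea of a symbolic reference can be imitated in Python with a lookup by name. This is only an analogy to the Perl and PHP features; the `variables` table and the function `deref_symbolic` are assumptions made for this sketch.

```python
# Hypothetical symbol table standing in for a program's named variables.
variables = {"price": 10}

def deref_symbolic(name):
    """Resolve a symbolic reference: look the variable up by its name."""
    return variables[name]

symbolic = "price"                  # just a string holding a variable's name
before = deref_symbolic(symbolic)   # 10

variables["price"] = 12             # rebind the variable
after = deref_symbolic(symbolic)    # 12: the name is resolved at the time of use
```

Because the reference is only a name, it is resolved anew on each dereference and follows whatever the variable is currently bound to.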

References

  1. Sherman, Mark S. (April 1985). Paragon: A Language Using Type Hierarchies for the Specification, Implementation, and Selection of Abstract Data Types. Springer Science & Business Media. p. 175. ISBN 978-3-540-15212-5.
  2. "Reference (Java Platform SE 7)". docs.oracle.com. Retrieved 10 May 2022.
  3. "perlref". perldoc.perl.org. Retrieved 19 August 2013.
  4. "Variable variables - Manual". PHP. Retrieved 19 August 2013.

External links

  • Pointer Fun With Binky: introduction to pointers in a 3-minute educational video (Stanford Computer Science Education Library)