Indentation style
In computer programming, an indent style is a convention governing the indentation of blocks of code to convey the program's structure. This article largely addresses free-form languages, such as the C programming language and its descendants, but the styles it describes can be (and frequently are) applied to most other programming languages (especially those in the curly bracket family) where whitespace is otherwise insignificant. Indent style is just one aspect of programming style.
Indentation is not a requirement of most programming languages, where it is used as secondary notation. Rather, programmers indent to better convey the structure of their programs to human readers. In particular, indentation is used to show the relationship between control flow constructs such as conditions or loops and code contained within and outside them. However, some programming languages (such as Python and Occam) use the indentation to determine the structure instead of using braces or keywords; this is known as the off-side rule, and in these languages indentation is meaningful to the compiler, not just a matter of style.
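The off-side rule can be made concrete with a small sketch. The C function below is an illustration only, not the algorithm of any particular compiler: given the indentation width of each successive line, it reports where blocks open and close by keeping a stack of the indentation levels currently in effect. The names offside_deltas and MAX_DEPTH are hypothetical, and tabs, blank lines and continuation lines are ignored.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 64    /* arbitrary nesting limit for this sketch */

/*
 * For each line i, store in deltas[i] how the block depth changes:
 * +1 when the line opens a block, -k when it closes k blocks, 0 otherwise.
 * Returns 0 on success, or -1 on inconsistent or overly deep indentation.
 */
int offside_deltas(const size_t *widths, size_t n, int *deltas)
{
    size_t stack[MAX_DEPTH];
    size_t top = 0;

    assert(widths != NULL && deltas != NULL);
    stack[0] = 0;                       /* the top level starts at column 0 */
    for (size_t i = 0; i < n; i++) {
        deltas[i] = 0;
        if (widths[i] > stack[top]) {
            if (top + 1 >= MAX_DEPTH)
                return -1;              /* nested too deeply for this sketch */
            stack[++top] = widths[i];   /* a deeper indent opens one block */
            deltas[i] = 1;
        } else {
            while (widths[i] < stack[top]) {
                top--;                  /* each popped level closes one block */
                deltas[i]--;
            }
            if (widths[i] != stack[top])
                return -1;              /* dedent to a column never opened */
        }
    }
    return 0;
}
```

For lines indented by 0, 4, 8, 8 and 0 columns, the sketch reports one block opened on the second line, another on the third, and two blocks closed on the last line.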
Note that this article uses "brackets" to refer to what are known as "parentheses" in American English, and "braces" to refer to what are known as "curly brackets" in American English.
Tabs, spaces, and size of indent
The size of the indent is usually independent of the style. Many early programs used tab characters for indentation, for simplicity and to save on source file size. Unix editors generally view tabs as equivalent to eight characters, while Macintosh and Microsoft Windows environments would set them to four, creating confusion when code was transferred back and forth. Modern programming editors are now often able to set arbitrary indentation sizes, and will insert the appropriate combination of tabs and spaces. For Ruby, many shell programming languages, and some forms of HTML formatting, two spaces per indent level is generally used.[citation needed]
The issue of using hard tabs or spaces is an ongoing debate in the programming community. Some programmers such as Jamie Zawinski[1] feel that spaces instead of tabs increase cross-platform functionality. Others, such as the writers of the WordPress coding standards,[2] believe the opposite, that hard tabs increase cross-platform functionality.
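The confusion caused by differing tab widths can be illustrated with a short sketch. The hypothetical helper indent_columns below computes the display column at which the first non-blank character of a line appears, assuming tab stops at every multiple of tab_width; it is an illustration, not the behaviour of any specific editor.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the display column reached after the leading whitespace of `line`,
 * assuming tab stops every `tab_width` columns.
 */
size_t indent_columns(const char *line, size_t tab_width)
{
    size_t col = 0;

    assert(line != NULL && tab_width > 0);
    for (; *line == ' ' || *line == '\t'; line++) {
        if (*line == '\t')
            col += tab_width - (col % tab_width);   /* jump to the next tab stop */
        else
            col++;                                  /* a space advances one column */
    }
    return col;
}
```

Under these assumptions, a line indented with two tabs is reported at column 16 for a tab width of 8 and at column 8 for a tab width of 4, while a line indented with spaces is reported at the same column for either setting.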
Tools
There are a number of computer programs that automatically correct indent styles (according to the preferences of the program author) as well as the length of indents associated with tabs. A famous one among them is indent, a program included with many Unix-like operating systems.
In Emacs, various commands are available to automatically fix indentation problems, including just hitting Tab on a given line (in the default configuration). "M-x indent-region" can be used to properly indent large sections of code. Depending on the mode, Emacs can also replace leading indentation spaces with the appropriate number of tabs followed by spaces, which results in the minimal number of characters for indenting each source line.
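The arithmetic behind "tabs followed by spaces" can be sketched as follows. This is not Emacs code; minimal_indent and struct indent_mix are hypothetical names, and the sketch simply assumes that an indentation of column columns is filled with as many tabs as fit, followed by spaces for the remainder.

```c
#include <assert.h>
#include <stddef.h>

/* How many tabs and spaces to emit for a given indentation column. */
struct indent_mix {
    size_t tabs;
    size_t spaces;
};

struct indent_mix minimal_indent(size_t column, size_t tab_width)
{
    struct indent_mix mix;

    assert(tab_width > 0);
    mix.tabs = column / tab_width;      /* whole tab stops covered by tabs */
    mix.spaces = column % tab_width;    /* remainder filled with spaces */
    return mix;
}
```

With a tab width of 8, an indentation of 20 columns becomes two tabs and four spaces: six characters instead of twenty.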
Elastic tabstops is a tabulation style which requires support from the text editor, where entire blocks of text are kept automatically aligned when the length of one line in the block changes.
Styles
Kernel style
The kernel style is known for its extensive usage in the source tree of the Linux kernel. Linus Torvalds strongly advises all contributors to follow it. A detailed description of the style (which not only considers indentation, but naming conventions, comments and various other aspects as well) can be found on kernel.org. The style borrows some elements from K&R, below.
The kernel style utilizes tabs (with tab stops set at 8 characters) for indentation. Opening curly braces of a function go to the beginning of the line following the function header. Any other opening curly braces are to be put on the same line as the corresponding statement, separated by a space. Labels in a "switch" statement are aligned with the enclosing block (there's only one level of indentation). A single-statement body of a compound statement (such as if, while and do-while) need not be surrounded by curly braces. If, however, at least one of the sub-statements in an "if-else" statement requires braces, then both sub-statements should be wrapped inside curly braces. Line length is limited to 80 characters.
int power(int x, int y)
{
	int result;

	if (y < 0) {
		result = 0;
	} else {
		for (result = 1; y; y--)
			result *= x;
	}
	return result;
}
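The example above does not show a switch statement. The following sketch, written with tabs in the same style, illustrates the rule that case labels are aligned with the enclosing switch, and that a single-statement body may go without braces. The functions depth_change and max_depth are hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>

/* How a single character changes the brace nesting depth. */
int depth_change(int c)
{
	switch (c) {
	case '{':
		return 1;
	case '}':
		return -1;
	default:
		return 0;
	}
}

/* Deepest brace nesting reached anywhere in a NUL-terminated string. */
int max_depth(const char *s)
{
	int depth = 0;
	int max = 0;

	assert(s != NULL);
	for (; *s; s++) {
		depth += depth_change(*s);
		if (depth > max)
			max = depth;	/* single statement: no braces */
	}
	return max;
}
```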
K&R style
The K&R style, so named because it was used in Kernighan and Ritchie's book The C Programming Language, is commonly used in C. It is also used for C++ and other curly brace programming languages.
When adhering to K&R, each function has its opening brace on the next line at the same indentation level as its header, the statements within the braces are indented, and the closing brace is on a line of its own at the same indentation level as the header. The blocks inside a function, however, have their opening braces on the same line as their respective control statements; closing braces remain on a line of their own, unless followed by an else or while keyword.
int main(int argc, char *argv[])
{
    ...
    while (x == y) {
        something();
        somethingelse();
        if (some_error) {
            do_correct();
        } else {
            continue_as_usual();
        }
    }
    finalthing();
    ...
}
In old versions of the C programming language, functions were braced differently. The opening brace of a function was placed on the line following the declaration section, at the same indentation level as the function header. This is because in the original C language, argument types had to be declared on the lines immediately after the function header; even when a function took no arguments, the opening brace did not appear on the same line as the header. The opening brace of a function was thus an exception to the now-basic rule that the statements and blocks of a function are all enclosed in the function braces.[citation needed]
/* Original pre-ISO C style without function prototypes */
int main(argc, argv)
    int   argc;
    char  *argv[];
{
    ...
}
Variant: 1TBS
Advocates of this style sometimes refer to it as "the one true brace style"[3] (abbreviated as 1TBS or OTBS) because of the precedent set by C (although advocates of other styles have been known to use similarly strong language). The source code of both the Unix[4] and Linux[5] kernels is written in this style. The main difference from the K&R style is that the braces are not omitted for a control statement with only a single statement in its scope.
In this style, the constructs that allow insertions of new code lines are on separate lines, and constructs that prohibit insertions are on a single line. This principle is amplified by bracing every if, else, while, etc.—including single-line conditionals—so that insertion of a new line of code anywhere is always "safe" (i.e., such an insertion will not make the flow of execution disagree with the source code indentation).
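The "safe insertion" argument can be illustrated with a short sketch. The function sum_abs is a hypothetical example: the comment shows how an unbraced conditional would silently change meaning when a line is added, and the compiled code shows the fully braced form, in which a line added to either branch stays in that branch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sum the absolute values of v[0..n-1] and count the negative entries.
 *
 * Without braces, a second line added under the `if` would look
 * conditional but would run on every iteration:
 *
 *     if (v[i] < 0)
 *         (*negatives)++;
 *         sum -= v[i];        <-- not part of the if
 */
long sum_abs(const int *v, size_t n, size_t *negatives)
{
    long sum = 0;

    assert(v != NULL && negatives != NULL);
    *negatives = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0) {
            (*negatives)++;     /* lines can be added here safely */
            sum -= v[i];
        } else {
            sum += v[i];
        }
    }
    return sum;
}
```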
Suggested advantages of this style are that the beginning brace does not require an extra line by itself; and the ending brace lines up with the statement it conceptually belongs to. One cost of this style is that the ending brace of a block takes up an entire line by itself, which can be partially resolved in if/else blocks and do/while blocks:
//...
if (x < 0) {
    puts("Negative");
    negative(x);
} else {
    puts("Non-negative");
    nonnegative(x);
}
It is not usually the opening brace itself that is interesting, but rather the controlling statement that introduced the block, and as such a suggested advantage of this style is that it makes controlling statements easier to find.
While Java is sometimes written in other styles, a significant body of Java code uses a minor variant of the K&R style in which the opening brace is on the same line as the class or method declaration, largely because Sun's original style guides[6][7][8] used this K&R variant, and as a result most of the standard source code for the Java API is written in this style. It is also a popular indent style for ActionScript and JavaScript, along with the Allman style.
It should be noted that The C Programming Language does not explicitly specify this style, though it is followed consistently throughout the book. Of note from the book:
The position of braces is less important, although people hold passionate beliefs. We have chosen one of several popular styles. Pick a style that suits you, then use it consistently.
Variant: Stroustrup
Stroustrup style is Bjarne Stroustrup's adaptation of K&R style for C++, as used in his books, such as Programming: Principles and Practice using C++ and The C++ Programming Language.[9]
Unlike the variants above, Stroustrup does not use a “cuddled else”. Thus, Stroustrup would write[9]
if (x < 0) {
    puts("Negative");
    negative(x);
}
else {
    puts("Non-negative");
    nonnegative(x);
}
Stroustrup extends K&R style for classes, writing them as follows:
class Vector {
public:
    Vector(int s) :elem(new double[s]), sz(s) { }   // construct a Vector
    double& operator[](int i) { return elem[i]; }   // element access: subscripting
    int size() { return sz; }
private:
    double* elem;    // pointer to the elements
    int sz;          // the number of elements
};
Note that Stroustrup does not indent the labels public: and private:. Also note that in Stroustrup style, even though the opening brace of a function starts on a new line, the opening brace of a class is on the same line as the class name.
Also note that Stroustrup is okay with writing short functions all on one line. Stroustrup style is a named indentation style available in the editor Emacs.
Allman style
The Allman style is named after Eric Allman. It has been incorrectly referred to as "ANSI style"[10] supposedly for its use in the documents describing the ANSI C standard (later adopted as the ISO C international standard), though in fact those documents use K&R style.[11] It is also sometimes known as "BSD style" since Allman wrote many of the utilities for BSD Unix (although this should not be confused with the different "BSD KNF style"; see below).
This style puts the brace associated with a control statement on the next line, indented to the same level as the control statement. Statements within the braces are indented to the next level.
while (x == y)
{
    something();
    somethingelse();
}
finalthing();
This style is similar to the standard indentation used by the Pascal programming language and Transact-SQL, where the braces are equivalent to the begin and end keywords.
(* Example Allman code indentation style in Pascal *)
procedure dosomething(x: integer; y: integer);
begin
    while x = y do
    begin
        something;
        somethingelse
    end
end;
Suggested advantages of this style are that the indented code is clearly set apart from the containing statement by lines that are almost completely whitespace and the closing brace lines up in the same column as the opening brace. Some people feel this makes it easy to find matching braces. Additionally, the blocking style delineates the actual block of code from the associated control statement itself. Commenting out the control statement, removing the control statement entirely, refactoring, or removing of the block of code is less likely to introduce syntax errors because of dangling or missing braces. Furthermore, it's consistent with brace placement for the outer/function block.
For example, the following is still syntactically correct:
//while (x == y)
{
    something();
    somethingelse();
}
As is this:
//for (int i=0; i < x; i++)
//while (x == y)
if (x == y)
{
    something();
    somethingelse();
}
Even like this, with conditional compilation:
    int c;
#ifdef HAS_GETCH
    while ((c = getch()) != EOF)
#else
    while ((c = getchar()) != EOF)
#endif
    {
        do_something(c);
    }
BSD KNF style
Also known as Kernel Normal Form, this is currently the form of most of the code used in the Berkeley Software Distribution operating systems. Although mostly intended for kernel code, it is widely used as well in userland code. It is essentially a thoroughly-documented variant of K&R style as used in the Bell Labs Version 6 & 7 UNIX source code.[citation needed]
The hard tabulator (ts in vi) is kept at eight columns, while a soft tabulator is often defined as a helper as well (sw in vi), and set at four. The hard tabulators are used to indent code blocks, while a soft tabulator (four spaces) of additional indent is used for all continuing lines that must be split over multiple lines.
Moreover, function calls do not use a space before the parenthesis, although C language native statements such as if, while, do, switch and return do (in the case where return is used with parens). Functions that declare no local variables in their top-level block should also leave an empty line after their opening block brace.
Here follow a few samples:
while (x == y) {
	something();
	somethingelse();
}
finalthing();

if (data != NULL && res > 0) {
	if (JS_DefineProperty(cx, o, "data",
	    STRING_TO_JSVAL(JS_NewStringCopyN(cx, data, res)),
	    NULL, NULL, JSPROP_ENUMERATE) != 0) {
		QUEUE_EXCEPTION("Internal error!");
		goto err;
	}
	PQfreemem(data);
} else {
	if (JS_DefineProperty(cx, o, "data", OBJECT_TO_JSVAL(NULL),
	    NULL, NULL, JSPROP_ENUMERATE) != 0) {
		QUEUE_EXCEPTION("Internal error!");
		goto err;
	}
}

static JSBool
pgresult_constructor(JSContext *cx, JSObject *obj, uintN argc,
    jsval *argv, jsval *rval)
{

	QUEUE_EXCEPTION("PGresult class not user-instantiable");
	return (JS_FALSE);
}
Whitesmiths style
The Whitesmiths style, also called Wishart style to a lesser extent, was originally used in the documentation for the first commercial C compiler, the Whitesmiths Compiler. It was also popular in the early days of Windows, since it was used in three influential Windows programming books: Programmer's Guide to Windows by Durant, Carlson & Yao, Programming Windows by Petzold, and Windows 3.0 Power Programming Techniques by Norton & Yao.
Whitesmiths along with Allman have been the most common bracing styles with equal mind shares according to the Jargon File.[12]
This style puts the brace associated with a control statement on the next line, indented. Statements within the braces are indented to the same level as the braces.
while (x == y)
    {
    something();
    somethingelse();
    }
finalthing();
The suggested advantages of this style are similar to those of the Allman style in that blocks are clearly set apart from control statements. Another suggested advantage is the alignment of the braces with the block that some people feel emphasizes the fact that the entire block is conceptually (as well as programmatically) a single compound statement. Furthermore, indenting the braces emphasizes that they are subordinate to the control statement. A suggested disadvantage of this style is that the ending brace no longer lines up with the statement it conceptually belongs to.
An example:
if (data != NULL && res > 0)
    {
    if (!JS_DefineProperty(cx, o, "data", STRING_TO_JSVAL(JS_NewStringCopyN(cx, data, res)),
                           NULL, NULL, JSPROP_ENUMERATE))
        {
        QUEUE_EXCEPTION("Internal error!");
        goto err;
        }
    PQfreemem(data);
    }
else if (!JS_DefineProperty(cx, o, "data", OBJECT_TO_JSVAL(NULL),
                            NULL, NULL, JSPROP_ENUMERATE))
    {
    QUEUE_EXCEPTION("Internal error!");
    goto err;
    }
However, if one adopts the styling rule that braces will be provided to every level of "scope", then the above code could be written to replace the "else if" with a separated "if" in the scope of a clearly roped-off "else" portion of the statement.
if (data != NULL && res > 0)
    {
    if (!JS_DefineProperty(cx, o, "data", STRING_TO_JSVAL(JS_NewStringCopyN(cx, data, res)),
                           NULL, NULL, JSPROP_ENUMERATE))
        {
        QUEUE_EXCEPTION("Internal error!");
        goto err;
        }
    PQfreemem(data);
    }
else
    {
    if (!JS_DefineProperty(cx, o, "data", OBJECT_TO_JSVAL(NULL),
                           NULL, NULL, JSPROP_ENUMERATE))
        {
        QUEUE_EXCEPTION("Internal error!");
        goto err;
        }
    }
Following the strategy shown above, some would argue the code is inherently more readable. However, readability issues arise as more conditions are added, as shown in this pseudo-code.
else
    {
    if (stuff is true)
        {
        Do stuff
        }
    else
        {
        if (other stuff is true)
            {
            Do other stuff
            }
        else
            {
            if (stuff is still not true)
                {
                Do even more other stuff
                }
            }
        }
    }
GNU style
Like the Allman and Whitesmiths styles, GNU style puts braces on a line by themselves, indented by two spaces, except when opening a function definition, where they are not indented.[13] In either case, the contained code is indented by two spaces from the braces.
Popularised by Richard Stallman, the layout may be influenced by his background of writing Lisp code.[13] In Lisp the equivalent to a block (a progn) is a first-class data entity, and giving it its own indent level helps to emphasize that, whereas in C a block is just syntax. Although not directly related to indentation, GNU coding style also includes a space before the bracketed list of arguments to a function.
static char *
concat (char *s1, char *s2)
{
  while (x == y)
    {
      something ();
      somethingelse ();
    }
  finalthing ();
}
This style combines the advantages of Allman and Whitesmiths, thereby removing the possible Whitesmiths disadvantage of braces not standing out from the block. One disadvantage is that the ending brace no longer lines up with the statement it conceptually belongs to. Another disadvantage is that the style wastes space resources by using two visual levels of indentation for one conceptual level of indentation.
The GNU Coding Standards recommend this style, and nearly all maintainers of GNU project software use it.[citation needed]
The GNU Emacs text editor and the GNU systems' indent command will reformat code according to this style by default. Those who do not use GNU Emacs, or similarly extensible/customisable editors, may find that the automatic indenting settings of their editor are unhelpful for this style. However, many editors defaulting to KNF style cope well with the GNU style when the tab width is set to two spaces; likewise, GNU Emacs adapts well to KNF style just by setting the tab width to eight spaces. In both cases, automatic reformatting will destroy the original spacing, but automatic line indentation will work correctly.
Steve McConnell, in his book Code Complete, advises against using this style: he marks a code sample which uses it with a "Coding Horror" icon, symbolizing especially dangerous code, and states that it impedes readability.[14]
Horstmann style
The 1997 edition of Computing Concepts with C++ Essentials by Cay S. Horstmann adapts Allman by placing the first statement of a block on the same line as the opening brace.
while (x == y)
{   something();
    somethingelse();
    //...
    if (x < 0)
    {   printf("Negative");
        negative(x);
    }
    else
    {   printf("Non-negative");
        nonnegative(x);
    }
}
finalthing();
This style combines the advantages of Allman, by keeping the vertical alignment of the braces for readability and easy identification of blocks, with the saving of a line of the K&R style. However, the 2003 edition uses Allman style throughout.
Pico style
The style used most commonly in the Pico programming language by its designers is different from the aforementioned styles. The lack of return statements and the fact that semicolons are used in Pico as statement separators, instead of terminators, leads to the following syntax:
stuff(n):
{ x: 3 * n;
  y: doStuff(x);
  y + x }
The advantages and disadvantages are similar to those of saving screen real estate with K&R style. One additional advantage is that the beginning and closing braces are consistent in application (both share space with a line of code), as opposed to K&R style where one brace shares space with a line of code and one brace has a line to itself.
Banner style
The banner style[citation needed] can make visual scanning easier for some, since the "headers" of any block are the only thing extended at that level (the theory being that the closing control of the previous block interferes with the header of the next block in the K&R and Allman styles). In this style, which is to Whitesmiths as K&R is to Allman, the closing control is indented as the last item in the list (and thus appropriately loses salience).
function1 () {
  do stuff
  do more stuff
  }

function2 () {
  etc
  }
or, in a markup language...
<table>
  <tr>
    <td> lots of stuff...
      more stuff
      </td>
    <td> alternative for short lines </td>
    <td> etc. </td>
    </tr>
  </table>

<table>
  <tr> ... etc
  </table>
Lisp style
A programmer may even go as far as to insert closing brackets in the last line of a block. This style makes indentation the only way of distinguishing blocks of code, but has the advantage of containing no uninformative lines. This could easily be called the Lisp style (because this style is very common in Lisp code) or the Python style (Python has no brackets, but the layout looks very similar, as evidenced by the following two code blocks).[citation needed]
// In C
for (i = 0; i < 10; i++) {
    if (i % 2 == 0) {
        doSomething(i); }
    else {
        doSomethingElse(i); } }

# In Python
for i in range(10):
    if i % 2 == 0:
        do_something(i)
    else:
        do_something_else(i)

;; In Lisp
(dotimes (i 10)
  (if (evenp i)
      (do-something i)
      (do-something-else i)))
Ratliff style
In the book "Programmers at Work",[15] C. Wayne Ratliff discussed using the style below. The style begins much like 1TBS but then the closing brace lines up with the indentation of the nested block. Ratliff was the original programmer behind the popular dBase-II and dBase-III fourth-generation languages. He indicated that it was originally documented in material from Digital Research Inc.
// In C
for (i = 0; i < 10; i++) {
    if (i % 2 == 0) {
        doSomething(i);
        }
    else {
        doSomethingElse(i);
        }
    }
Other considerations
Losing track of blocks
In certain situations, there is a risk of losing track of block boundaries. This is often seen in large sections of code containing many compound statements nested to many levels of indentation. By the time the programmer scrolls to the bottom of a huge set of nested statements, they may have lost track of which control statements go where. However, overly long code could have other issues such as being too complex, and the programmer should consider whether refactoring the code would help in the longer term.
Programmers who rely on counting the opening braces may have difficulty with indentation styles such as K&R, where the beginning brace is not visually separated from its control statement. Programmers who rely more on indentation will gain more from styles that are vertically compact, such as K&R, because the blocks are shorter.
To avoid losing track of control statements such as for, one can use a large indent, such as an 8-unit wide hard tab, along with breaking up large functions into smaller and more readable functions. The Linux kernel is written this way, and it also uses the K&R style.
In text editors of the vi family, one method for tracking block boundaries is to position the text cursor over one of the braces and press the "%" key. Vi or vim will then bounce the cursor to the opposing brace. Since the text cursor's "next" key (viz., the "n" key) retained directional positioning information (whether the "up" or "down" key was previously pressed), the dot macro (the "." key) could then be used to place the text cursor on the next brace,[16] given an appropriate coding style. Alternatively, inspection of the block boundaries using the "%" key can be used to enforce a coding standard.
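The idea behind jumping to the opposing brace can be sketched by counting nesting depth. The function match_brace below is a hypothetical illustration, not the implementation used by vi or vim, and as a simplification it does not skip braces that appear inside string literals or comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of the brace matching the one at text[pos],
 * or -1 if text[pos] is not a brace or has no partner.
 */
long match_brace(const char *text, size_t pos)
{
    size_t len;
    int depth = 0;

    assert(text != NULL);
    len = strlen(text);
    if (pos >= len)
        return -1;

    if (text[pos] == '{') {
        /* scan forward until the depth returns to zero */
        for (size_t i = pos; i < len; i++) {
            if (text[i] == '{')
                depth++;
            else if (text[i] == '}' && --depth == 0)
                return (long)i;
        }
    } else if (text[pos] == '}') {
        /* scan backward, mirroring the forward case */
        for (size_t i = pos + 1; i-- > 0; ) {
            if (text[i] == '}')
                depth++;
            else if (text[i] == '{' && --depth == 0)
                return (long)i;
        }
    }
    return -1;
}
```

In the string "a{b{c}d}e", the outer opening brace at index 1 is matched with the closing brace at index 7, and the inner pair at indices 3 and 5 is matched with each other.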
Another way is to use inline comments added after the closing brace:
for (int i = 0; i < total; i++) {
    foo(bar);
} //for (i)

if (x < 0) {
    bar(foo);
} //if (x < 0)
However, maintaining duplicate code in multiple locations is the major disadvantage of this method.
Another solution is implemented in a folding editor, which lets the developer hide or reveal blocks of code by their indentation level or by their compound statement structure. Many editors will also highlight matching brackets or braces when the caret is positioned next to one.
Statement insertion
K&R style prevents another common error suffered when using the standard UNIX line editor, ed. A statement mistakenly inserted between the control statement and the opening brace of the loop block turns the body of the loop into a single trip.
for (int i = 0; i < 10; i++)
    whoops(bar);   /* repeated 10 times, with i from 0 to 9 */
{
    only_once();   /* Programmer intended this to be done 10 times */
} //for (i) <-- This comment is no longer valid, and is very misleading!
K&R style avoids this problem by keeping the control statement and the opening brace on the same line.
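The effect of such a misplaced statement can be checked with a small counting sketch. The names misplaced_insert and struct counts are hypothetical; the function reproduces the faulty layout and reports how often each part actually runs.

```c
#include <assert.h>

struct counts {
    int inserted;   /* times the mistakenly inserted statement ran */
    int body;       /* times the intended loop body ran */
};

/* A statement has slipped in between the `for` and its opening brace. */
struct counts misplaced_insert(int n)
{
    struct counts c = {0, 0};

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        c.inserted++;       /* this is now the whole loop body */
    {
        c.body++;           /* plain block: runs once, after the loop */
    }
    return c;
}
```

For n equal to 10, the inserted statement runs ten times while the braced block runs exactly once, regardless of the loop bound.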
References
- ^ "Tabs versus Spaces: An Eternal Holy War" by Jamie Zawinski, 2000
- ^ "WordPress Coding Standards"
- ^ "The Jargon File". Retrieved 18 August 2014.
- ^ J. Lions (June 1977). "Unix Operating System Source Code Level Six" (PDF). University of New South Wales.
- ^ https://www.kernel.org/doc/Documentation/CodingStyle
- ^ Reddy, Achut (2000-03-30). "Java Coding Style Guide" (PDF). Sun Microsystems. Retrieved 2008-05-30.[dead link ]
- ^ "Java Code Conventions" (PDF). Sun Microsystems. 1997-09-12. Retrieved 2008-05-30.[dead link ]
- ^ "Code Conventions for the Java Programming Language". Sun Microsystems. 1997-03-20. Retrieved 2008-05-30.
- ^ a b Bjarne Stroustrup (September 2010). "PPP Style Guide" (PDF).
- ^ "Artistic Style". Retrieved 2008-05-21.
- ^ "Rationale for International Standard Programming Languages C (Revision 2)" (PDF). Retrieved 2010-11-06.
- ^ "The Jargon File 4.4.8: indent style". Retrieved 2014-03-31.
- ^ a b c "Formatting Your Source Code". GNU Coding Standards.
- ^ McConnell, Steve (2004). Code Complete: A practical handbook of software construction. Redmond, WA: Microsoft Press. pp. 746–747. ISBN 0-7356-1967-0.
- ^ Lammers, Susan (1986). Programmers at Work. Microsoft Press. ISBN 0-914845-71-3.
- ^ Linda Lamb, Learning the vi editor. O'Reilly
External links
- C Style: Standards and Guidelines: Defining Programming Standards for Professional C Programmers, Prentice Hall, ISBN 0-13-116898-3 / ISBN 978-0-13-116898-5 (complete text is also on-line). Straker, David (1992).
- Contextual Indent
- GNU Coding Standards
- Jargon File article on indent style