Leaky abstraction: Difference between revisions

From Wikipedia, the free encyclopedia
{{Short description|Faulty software abstraction}}
A '''leaky abstraction''' is an unsatisfactory [[implementation]] of an [[abstraction]]. Unsatisfactory means any case when specific implementation details manifest themselves in some obstructive or counter-productive way, thus interfering with the abstraction. The implementation details are said to "leak through" and interfere with the simplifying assumptions supposedly enabled by the abstraction.
{{Refimprove|date=March 2011}}


A '''leaky abstraction''' in [[software development]] refers to a design flaw where an [[Abstraction (computer science)|abstraction]], intended to simplify and hide the underlying complexity of a system, fails to completely do so. This results in some of the implementation details becoming exposed or 'leaking' through the abstraction, forcing users to have knowledge of these underlying complexities to effectively use or troubleshoot the system.<ref>{{cite book|author=Seibel, Peter |title=Practical Common Lisp|url=https://books.google.com/books?id=dVJvr8PtCMUC&pg=PA96|date=1 November 2006|publisher=Apress|isbn=978-1-4302-0017-8|page=96}}</ref>
Within the [[software industry]], leaky abstractions are a common source of [[software bug]]s.


The concept was popularized by [[Joel Spolsky]], who coined the '''Law of Leaky Abstractions''', which states:<ref name="joel" />
==Overview==
The term is widely attributed to software commentator [[Joel Spolsky]], who published the concept as
'''The Law of Leaky Abstractions''':


"All non-trivial abstractions, to some degree, are leaky."
{{Cquote|All non-trivial abstractions, to some degree, are leaky.
}}


This means that even well-designed abstractions may not fully conceal their inner workings, and as computer systems grow more complex, the likelihood of such leaks increases. These leaks can lead to performance issues, unexpected behavior, and increased cognitive load on [[software developers]], who are forced to understand both the abstraction and the underlying details it was meant to hide. This highlights a cause of software defects: the reliance of the software developer on an abstraction's infallibility. Despite their imperfections, abstractions remain crucial for managing complexity in software development.
In the publication, Spolsky expresses this viewpoint, and supports it with reference to examples in [[software engineering]], which is why this term is customarily used in the context of computer software and hardware.


===Basic notion===
==History==
The term "leaky abstraction" was popularized in 2002 by [[Joel Spolsky]].<ref name="joel">{{cite web|last=Spolsky|first=Joel|url=https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/|year=2002|accessdate=2010-09-22|title=The Law of Leaky Abstractions}}</ref><ref>{{Cite web|last=arvindpdmn|date=2019-08-23|title=Leaky Abstractions|url=https://devopedia.org/leaky-abstractions|access-date=2020-07-07|website=Devopedia|language=en-gb}}</ref> A 1992 paper by [[Gregor Kiczales|Kiczales]] describes some of the issues with imperfect abstractions and presents a potential solution to the problem by allowing for the customization of the abstraction itself.<ref>{{cite web|last=Kiczales|first=Gregor|url=http://www2.parc.com/csl/groups/sda/publications/papers/Kiczales-IMSA92/for-web.pdf|year=1992|accessdate=2010-02-03|title=Towards a New Model of Abstraction in the Engineering of Software|archive-url=https://web.archive.org/web/20110604013045/http://www2.parc.com/csl/groups/sda/publications/papers/Kiczales-IMSA92/for-web.pdf|archive-date=2011-06-04|url-status=dead}}</ref>
In a basic sense, the term is a [[misnomer]] because it is the implementation details that "leak" through, not the abstraction itself.


==Effect on software development==
===Extended notion===
As systems become more complex, software developers must rely upon more abstractions. Each abstraction tries to hide complexity, letting a developer write software that "handles" the many variations of modern computing.
In an extended sense, the term is not a misnomer, because, according to Spolsky, ''all'' non-trivial abstractions resist complete implementation ''by their very nature''. Consequently, implementation details will always "leak through", regardless of how well-conceived the implementations are and no matter how rigorously they attempt to represent the abstraction faithfully. Sometimes the "leaks" are minor, other times they are significant, but they will always be present, because there is no such thing as a "perfect" implementation of an abstraction.


However, the Law of Leaky Abstractions holds that developers of ''reliable'' software must learn the abstraction's underlying details anyway.
It is most likely this extended notion Spolsky intended to convey by expressing the viewpoint as a "law" and attributing the "leakiness" to abstractions, rather than to imprecise implementations.
<!--Not sure wh/ this needs to be in here
===Philosophical implications===
In an [[epistemology|epistemologic]]al sense the statement is a [[tautology (logic)]], since the human concept of abstraction is ''itself'' an "implementation," (represented in mental concepts and verbal statements used to convey them).
-->


==Examples==
Spolsky's [https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/ article] cites many examples of leaky abstractions that create problems for software development:
===Monetary Value===
Paper currency implements the abstract [[economic]] concept of monetary value. Conceptually, monetary value cannot be destroyed; yet a paper dollar can be. The physical nature of the implementation (the paper dollar) corrupts the conceptual nature of the abstraction (its monetary value). If a dollar is burned, the value is lost, even though such a result shouldn't be possible from the perspective of monetary ownership.


* The [[TCP/IP]] protocol stack is the combination of [[Transmission Control Protocol|TCP]], which tries to provide reliable delivery of information, running on top of [[Internet Protocol|IP]], which provides only 'best-effort' service. When IP loses a packet, TCP has to retransmit it, which takes additional time. Thus TCP provides the abstraction of a reliable connection, but the implementation details leak through in the form of potentially variable performance (throughput and latency both suffer when data has to be retransmitted), and the connection can still break entirely.
===Computers===
* [[Iterator|Iterating]] over a large two-dimensional [[Array data structure|array]] can have radically different performance if done horizontally rather than vertically, depending on the order in which elements are stored in memory. One direction may vastly increase [[CPU cache#Cache_miss|cache misses]] and [[page fault]]s, both of which greatly delay access to memory.
Computer [[hardware]] and [[software]] are heavily reliant on abstraction, and therefore subject to the consequences of leaky abstraction.
* The [[SQL]] language abstracts away the procedural steps for querying a [[database]], allowing one to merely define what one wants. But certain SQL queries are thousands of times slower than other logically equivalent queries. On an even higher level of abstraction, [[Object-relational mapping|ORM]] systems, which isolate object-oriented code from the implementation of object persistence using a relational database, still force the programmer to think in terms of databases, tables, and native SQL queries as soon as performance of ORM-generated queries becomes a concern.
* Although network file systems like [[Network File System|NFS]] and [[Server Message Block|SMB]] let one treat files on remote machines as if they were local, the connection to the remote machine may slow down or break, and the file stops acting as if it were local.
* The [[ASP.NET]] web forms programming platform, not to be confused with ASP.NET MVC, abstracts away the difference between compiled back-end code to handle clicking on a hyperlink (<code><a></code>) and code to handle clicking on a button. However, ASP.NET needs to hide the fact that in HTML there is no way to submit a form from a hyperlink. It does this by generating a few lines of JavaScript and attaching an [[DOM_events#Common.2FW3C_events|onclick]] handler to the hyperlink. However, if the end user has JavaScript disabled, the ASP.NET application malfunctions. Furthermore, one cannot naively think of event handlers in ASP.NET in the same way as in a desktop GUI framework such as [[Windows Forms]]; due to the asynchronous nature of the Web, processing event handlers in ASP.NET requires exchanging data with the server and reloading the form.
<!-- For this entire section, the citation is Spolsky's article. Recall the header is "Spolsky's article cites many examples of leaky abstractions that create problems for software development:". No need to add a cite needed template. If it isn't in his article it should be removed! -->


In 2020, [[Massachusetts Institute of Technology]] computing science teaching staff Anish, Jose, and Jon argued that the command line interface for [[git]] is a leaky abstraction, in which the underlying "beautiful design" of the git data model needs to be understood for effective usage of git.<ref>{{Cite web|title=Version Control (Git)|url=https://missing.csail.mit.edu/2020/version-control/|access-date=2020-07-31|website=the missing semester of your cs education|language=en}}</ref>
For example, many different brands of [[sound card]] exist, and each has different capabilities and methods of operation. It is the role of the [[computer]]'s [[operating system]] to implement a sound card abstraction for programs, so that a program need not have knowledge about every possible sound card that may be present. The program tells the operating system's sound card abstraction what sound to make, and the operating system then tells the sound card to make the sound using the mechanisms and capabilities unique to that sound card. If the abstraction isn't leaky, then a program can take advantage of any sound card with no problems. If the abstraction is leaky, however, the program may run into trouble - the sound may come out differently depending on the sound card, requiring the program to compensate by incorporating logic specific to the sound card installed in the computer.


==See also==
Any [[API]] which behaves differently depending on the underlying implementation is considered leaky. If a program fails to compensate for leaky APIs, bugs can result.
* [[Abstraction inversion]]
* [[Dependency inversion principle]]
* [[Essential complexity]]
* [[Modular programming]]
* [[Separation of concerns]]


==References==
Obscure [[error message]]s are a common observable effect of leaky abstractions in software. A newfangled sound card, for example, may be able to reproduce the same sounds as old sound cards, but it also may have new ways of failing. If the operating system's sound card abstraction didn't anticipate the new [[failure mode]]s, the program will encounter a failure it couldn't anticipate and can't understand. This can result in an "unknown error" or similar error message being presented to the user.
{{Reflist}}

== External links ==
* [http://www.joelonsoftware.com/articles/LeakyAbstractions.html The Law of Leaky Abstractions] - the original article by Joel Spolsky, who coined the term


{{DEFAULTSORT:Leaky Abstraction}}
[[Category:Abstraction]]
[[Category:Programming bugs]]

Latest revision as of 19:28, 1 October 2024

A leaky abstraction in software development refers to a design flaw where an abstraction, intended to simplify and hide the underlying complexity of a system, fails to completely do so. This results in some of the implementation details becoming exposed or 'leaking' through the abstraction, forcing users to have knowledge of these underlying complexities to effectively use or troubleshoot the system.[1]

The concept was popularized by Joel Spolsky, who coined the "Law of Leaky Abstractions", which states:[2]

All non-trivial abstractions, to some degree, are leaky.

This means that even well-designed abstractions may not fully conceal their inner workings, and as computer systems grow more complex, the likelihood of such leaks increases. These leaks can lead to performance issues, unexpected behavior, and increased cognitive load on software developers, who are forced to understand both the abstraction and the underlying details it was meant to hide. This highlights a cause of software defects: the reliance of the software developer on an abstraction's infallibility. Despite their imperfections, abstractions remain crucial for managing complexity in software development.

History

The term "leaky abstraction" was popularized in 2002 by Joel Spolsky.[2][3] A 1992 paper by Kiczales describes some of the issues with imperfect abstractions and presents a potential solution to the problem by allowing for the customization of the abstraction itself.[4]

Effect on software development

As systems become more complex, software developers must rely upon more abstractions. Each abstraction tries to hide complexity, letting a developer write software that "handles" the many variations of modern computing.

However, the Law of Leaky Abstractions holds that developers of reliable software must learn the abstraction's underlying details anyway.

Examples

Spolsky's article cites many examples of leaky abstractions that create problems for software development:

  • The TCP/IP protocol stack is the combination of TCP, which tries to provide reliable delivery of information, running on top of IP, which provides only 'best-effort' service. When IP loses a packet, TCP has to retransmit it, which takes additional time. Thus TCP provides the abstraction of a reliable connection, but the implementation details leak through in the form of potentially variable performance (throughput and latency both suffer when data has to be retransmitted), and the connection can still break entirely.
  • Iterating over a large two-dimensional array can have radically different performance if done horizontally rather than vertically, depending on the order in which elements are stored in memory. One direction may vastly increase cache misses and page faults, both of which greatly delay access to memory.
  • The SQL language abstracts away the procedural steps for querying a database, allowing one to merely define what one wants. But certain SQL queries are thousands of times slower than other logically equivalent queries. On an even higher level of abstraction, ORM systems, which isolate object-oriented code from the implementation of object persistence using a relational database, still force the programmer to think in terms of databases, tables, and native SQL queries as soon as performance of ORM-generated queries becomes a concern.
  • Although network file systems like NFS and SMB let one treat files on remote machines as if they were local, the connection to the remote machine may slow down or break, and the file stops acting as if it were local.
  • The ASP.NET web forms programming platform, not to be confused with ASP.NET MVC, abstracts away the difference between compiled back-end code to handle clicking on a hyperlink (<a>) and code to handle clicking on a button. However, ASP.NET needs to hide the fact that in HTML there is no way to submit a form from a hyperlink. It does this by generating a few lines of JavaScript and attaching an onclick handler to the hyperlink. However, if the end user has JavaScript disabled, the ASP.NET application malfunctions. Furthermore, one cannot naively think of event handlers in ASP.NET in the same way as in a desktop GUI framework such as Windows Forms; due to the asynchronous nature of the Web, processing event handlers in ASP.NET requires exchanging data with the server and reloading the form.
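The array-traversal example above can be sketched in a few lines. This is a minimal illustration, not taken from Spolsky's article: the function names (`make_grid`, `sum_row_major`, `sum_column_major`) are hypothetical, and the grid is a plain Python list of row lists. Both functions compute the same sum, so the abstraction "a two-dimensional array" treats them as equivalent; only the memory access order differs. No timings are claimed here, because in interpreted Python the gap is muted by interpreter overhead, whereas in languages with contiguous arrays the difference is typically far more pronounced.

```python
import time


def make_grid(rows, cols):
    """Build a rows x cols grid stored as a list of row lists."""
    return [[r * cols + c for c in range(cols)] for r in range(rows)]


def sum_row_major(grid):
    """Visit elements in storage order: finish one row before the next."""
    total = 0
    for row in grid:
        for value in row:
            total += value
    return total


def sum_column_major(grid):
    """Visit elements column by column, jumping between rows on every step."""
    total = 0
    rows, cols = len(grid), len(grid[0])
    for c in range(cols):
        for r in range(rows):
            total += grid[r][c]
    return total


def timed(fn, grid):
    """Return (result, elapsed seconds) for one call of fn(grid)."""
    start = time.perf_counter()
    result = fn(grid)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    grid = make_grid(2000, 2000)
    for fn in (sum_row_major, sum_column_major):
        result, seconds = timed(fn, grid)
        print(f"{fn.__name__}: sum={result} in {seconds:.3f}s")
```

Running the script prints the elapsed time of each traversal; the results are identical, and any difference in time is the storage layout leaking through the array abstraction.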

In 2020, Massachusetts Institute of Technology computing science teaching staff Anish, Jose, and Jon argued that the command-line interface for Git is a leaky abstraction: using Git effectively requires understanding the underlying "beautiful design" of its data model.[5]
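As one small illustration of the data model that shows through Git's command line, the sketch below computes the identifier Git assigns to a file's contents (a "blob"). It assumes the default SHA-1 object format; repositories created with the SHA-256 object format use a different hash. The function name `git_blob_id` is hypothetical and is not part of Git or of the cited course material.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git gives a blob with this content.

    Git hashes a short header ("blob <size>" followed by a NUL byte)
    together with the raw content, so identical content always maps
    to the same ID regardless of file name.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The ID of an empty file; it should match `git hash-object` on an empty file.
    print(git_blob_id(b""))
```

Because objects are addressed by content rather than by name, commands that appear to operate on files or branches are in fact manipulating hashes and the references that point to them, which is the kind of detail the authors argue a user eventually has to learn.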

See also

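  • Abstraction inversion
  • Dependency inversion principle
  • Essential complexity
  • Modular programming
  • Separation of concerns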

References

  1. ^ Seibel, Peter (1 November 2006). Practical Common Lisp. Apress. p. 96. ISBN 978-1-4302-0017-8.
  2. ^ a b Spolsky, Joel (2002). "The Law of Leaky Abstractions". Retrieved 2010-09-22.
  3. ^ arvindpdmn (2019-08-23). "Leaky Abstractions". Devopedia. Retrieved 2020-07-07.
  4. ^ Kiczales, Gregor (1992). "Towards a New Model of Abstraction in the Engineering of Software" (PDF). Archived from the original (PDF) on 2011-06-04. Retrieved 2010-02-03.
  5. ^ "Version Control (Git)". the missing semester of your cs education. Retrieved 2020-07-31.