Leaky abstraction: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 14:20, 4 March 2014

In software development, a leaky abstraction is an implemented abstraction where details and limitations of the implementation leak through.

History

The term "leaky abstraction" was popularized in 2002 by Joel Spolsky.[1] An earlier paper by Kiczales describes some of the issues with imperfect abstractions and presents a potential solution to the problem by allowing for the customization of the abstraction itself.[2]

The Law of Leaky Abstractions

As coined by Spolsky, the Law of Leaky Abstractions states:

All non-trivial abstractions, to some degree, are leaky.

This statement highlights a particularly problematic cause of software defects: the reliance of the software developer on an abstraction's infallibility.

Spolsky's article gives examples of an abstraction that works most of the time, but where a detail of the underlying complexity cannot be ignored, thus leaking complexity out of the abstraction back into the software that uses the abstraction.

Effect on software development

As systems become more complex, software developers must rely upon more abstractions. Each abstraction tries to hide complexity, letting a developer write software that "handles" the many variations of modern computing.

However, the Law of Leaky Abstractions claims that developers of reliable software must learn the abstraction's underlying details anyway, because those details resurface whenever the abstraction fails.
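As a minimal sketch of what this can mean in practice, consider reading a file that may sit on a remote mount. The helper name `read_text_with_retry` and its parameters `attempts` and `delay_s` are hypothetical, chosen only for illustration. The file interface is the same for local and remote files, yet the caller still ends up writing code that anticipates transient failures of the underlying connection.

```python
import time


def read_text_with_retry(path, attempts=3, delay_s=0.1):
    """Read a text file, retrying when the read fails with an OSError.

    Hypothetical helper: open() looks the same for local and remotely
    mounted files, but a remote file can fail transiently, so the caller
    has to know what lies beneath the abstraction.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with open(path, encoding="utf-8") as f:
                return f.read()
        except FileNotFoundError:
            # A missing file is not a transient condition; do not retry.
            raise
        except OSError as exc:
            # Assumed to be transient (for example, a stalled connection).
            last_error = exc
            if attempt < attempts - 1:
                time.sleep(delay_s)
    raise last_error
```

Whether retrying is appropriate depends on the application; the point of the sketch is only that the decision cannot be made without knowing where the file actually lives.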

Examples

Spolsky's article cites many examples of leaky abstractions that create problems for software development:

  • The TCP/IP protocol stack is the combination of TCP, which tries to provide reliable delivery of information, running on top of IP, which provides only 'best-effort' service. When IP loses a packet, TCP has to retransmit it, which takes additional time. Thus TCP provides the abstraction of a reliable connection, but the implementation details leak through in the form of potentially variable performance (throughput and latency both suffer when data has to be retransmitted).
  • Iterating over a large two-dimensional array can have radically different performance if done horizontally rather than vertically, depending on the order in which elements are stored in memory. One direction may vastly increase cache misses and page faults, both of which greatly delay access to memory.
  • The SQL language abstracts away the procedural steps for querying a database, allowing one to merely define what one wants. But certain SQL queries are thousands of times slower than other logically equivalent queries. On an even higher level of abstraction, ORM systems, which isolate object-oriented code from the implementation of object persistence using a relational database, still force the programmer to think in terms of databases, tables, and native SQL queries as soon as performance of ORM-generated queries becomes a concern.
  • Although network file systems like NFS and SMB let one treat files on remote machines as if they were local, the connection to the remote machine may slow down or break, and the file then stops acting as if it were local.
  • The ASP.NET web programming platform abstracts away the difference between HTML code to handle clicking on a hyperlink (<a>) and code to handle clicking on a button. To do so, ASP.NET needs to hide the fact that in HTML there is no way to submit a form from a hyperlink. It does this by generating a few lines of JavaScript and attaching an onclick handler to the hyperlink. If the end user has JavaScript disabled, however, the ASP.NET application malfunctions. Furthermore, one cannot naively think of event handlers in ASP.NET in the same way as in a desktop GUI framework such as Windows Forms; due to the fundamental limitations of the Web, processing event handlers in ASP.NET requires exchanging data with the server and reloading the form.
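The array-traversal example above can be sketched as follows. This is an illustrative sketch, not code from Spolsky's article: the function names, the flat row-major layout, and the 512 x 512 size are all assumptions. Both functions compute the same sum and differ only in the order in which they visit elements. The first walks memory sequentially, while the second jumps `cols` elements between consecutive accesses.

```python
import time
from array import array


def sum_row_major(data, rows, cols):
    """Sum a flat row-major array, visiting elements in storage order."""
    total = 0
    for r in range(rows):
        for c in range(cols):
            total += data[r * cols + c]
    return total


def sum_col_major(data, rows, cols):
    """Sum the same array column by column, striding by `cols` elements."""
    total = 0
    for c in range(cols):
        for r in range(rows):
            total += data[r * cols + c]
    return total


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    rows = cols = 512  # assumed size; larger arrays make the effect clearer
    data = array("q", range(rows * cols))  # contiguous 64-bit integers
    for func in (sum_row_major, sum_col_major):
        result, elapsed = time_call(func, data, rows, cols)
        print(f"{func.__name__}: sum={result} time={elapsed:.4f}s")
```

In CPython, interpreter overhead is likely to mask much of the difference, so the measured gap may be small or inconsistent. The same two loops written in a compiled language, over an array larger than the CPU cache, typically show the effect far more clearly.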

References

  1. ^ Spolsky, Joel (2002). "The Law of Leaky Abstractions". Retrieved 2010-09-22. A blog post by Spolsky that asserts that all non-trivial abstractions are 'leaky' and therefore problematic.
  2. ^ Kiczales, Gregor (1992). "Towards a New Model of Abstraction in the Engineering of Software" (PDF). Retrieved 2010-02-03. A paper by Gregor Kiczales that describes the problem of imperfect abstractions and suggests a programming model for coping with them.