Scalability: Difference between revisions

From Wikipedia, the free encyclopedia
BG19bot (talk | contribs)
m Weak versus strong scaling: WP:CHECKWIKI error fix for #61. Punctuation goes before References. Do general fixes if a problem exists. - using AWB (9985)
Xseany (talk | contribs)
 
(189 intermediate revisions by more than 100 users not shown)
{{short description|Ability of a system to handle an increasing amount of work}}
{{refimprove|date=March 2012}}
{{Complex systems}}


'''Scalability''' is the ability of a system, network, or process to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth.<ref>{{Cite journal|doi=10.1145/350391.350432|chapter=Characteristics of scalability and their impact on performance|title=Proceedings of the second international workshop on Software and performance - WOSP '00|year=2000|last1=Bondi|first1=André B.|isbn=158113195X|page=195}}</ref> For example, it can refer to the capability of a system to increase its total output under an increased load when resources (typically hardware) are added. An analogous meaning is implied when the word is used in an [[economics|economic]] context, where scalability of a company implies that the underlying [[business model]] offers the potential for [[economic growth]] within the company.
'''Scalability''' is the property of a system to handle a growing amount of work. One definition for software systems<!-- by Andre Bondi --> specifies that this may be done by adding resources to the system.<ref>{{Cite conference|doi=10.1145/350391.350432|title=Characteristics of scalability and their impact on performance|conference=Proceedings of the second international workshop on Software and performance WOSP '00|year=2000|last1=Bondi|first1=André B.|isbn=158113195X|page=195}}</ref>


Scalability, as a property of systems, is generally difficult to define<ref>See for instance, {{Cite journal|doi=10.1145/121973.121975|title=What is scalability?|year=1990|last1=Hill|first1=Mark D.|journal=ACM SIGARCH Computer Architecture News|volume=18|issue=4|page=18}} and {{Cite journal|doi=10.1145/1134285.1134460|chapter=A framework for modelling and analysis of software systems scalability|title=Proceeding of the 28th international conference on Software engineering - ICSE '06|year=2006|last1=Duboc|first1=Leticia|last2=Rosenblum|first2=David S.|last3=Wicks|first3=Tony|isbn=1595933751|page=949}}</ref> and in any particular case it is necessary to define the specific requirements for scalability on those dimensions that are deemed important. It is a highly significant issue in electronics systems, databases, routers, and networking. A system whose performance improves after adding hardware, proportionally to the capacity added, is said to be a '''scalable system'''.
In an [[economics|economic]] context, a scalable [[business model]] implies that a company can increase sales given increased resources. For example, a package delivery system is scalable because more packages can be delivered by adding more delivery vehicles. However, if all packages had to first pass through a single warehouse for sorting, the system would not be as scalable, because one warehouse can handle only a limited number of packages.<ref>{{cite journal|doi=10.1145/121973.121975|title=What is scalability?|year=1990|last1=Hill|first1=Mark D.|journal=ACM SIGARCH Computer Architecture News|volume=18|issue=4|page=18|s2cid=1232925|url=https://minds.wisconsin.edu/bitstream/1793/9676/1/file_1.pdf}} and <br />{{cite conference|doi=10.1145/1134285.1134460|title=A framework for modelling and analysis of software systems scalability|conference=Proceedings of the 28th international conference on Software engineering ICSE '06|year=2006|last1=Duboc|first1=Leticia|last2=Rosenblum|first2=David S.|last3=Wicks|first3=Tony|isbn=1595933751|page=949|url=http://discovery.ucl.ac.uk/4990/1/4990.pdf}}</ref>


In computing, scalability is a characteristic of computers, networks, [[algorithm]]s, [[Protocol (computing)|networking protocols]], [[Computer program|programs]] and applications. An example is a [[search engine]], which must support increasing numbers of users, and the number of topics it [[Web indexing|indexes]].<ref>{{cite book|url={{google books |plainurl=y |id=n4bUGAAACAAJ}}|title=E-commerce: Business, Technology, Society|first1=Kenneth Craig|last1=Laudon|first2=Carol Guercio|last2=Traver|publisher=Pearson Prentice Hall/Pearson Education|year=2008|isbn=9780136006459}}</ref> '''Webscale''' is a computer architectural approach that brings the capabilities of large-scale cloud computing companies into enterprise data centers.<ref>{{Cite news|url=https://www.networkworld.com/article/3199205/why-web-scale-is-the-future.html|title=Why web-scale is the future|work=Network World |access-date=2017-06-01|date=2020-02-13|language=en-US}}</ref>
An [[algorithm]], design, [[Protocol (computing)|networking protocol]], [[Computer program|program]], or other system is said to ''scale'' if it is suitably [[Algorithmic efficiency|efficient]] and practical when applied to large situations (e.g. a large input data set, a large number of outputs or users, or a large number of participating nodes in the case of a distributed system). If the design or system fails when a quantity increases, it ''does not scale''. In practice, if there are a large number of things ''n'' that affect scaling, then resource requirements (for example, algorithmic time complexity) must grow less than ''n''<sup>2</sup> as ''n'' increases. An example is a search engine, which must scale not only for the number of users, but also for the number of objects it indexes.
Scalability refers to the ability of a site to increase in size as demand warrants.<ref>{{Cite book|url=http://books.google.com/books/about/E_commerce.html?id=n4bUGAAACAAJ|
title=E-commerce: Business, Technology, Society
|first1=Kenneth Craig|last1= Laudon|first2=Carol Guercio |last2=Traver
|publisher=Pearson Prentice Hall/Pearson Education|year=2008|isbn=9780136006459 }}</ref>
The concept of scalability is desirable in technology as well as [[business]] settings. The base concept is consistent – the ability for a business or technology to accept increased volume without impacting the [[contribution margin]] (= [[revenue]] &minus; [[variable cost]]s). For example, a given piece of equipment may have capacity from 1–1000 users, and beyond 1000 users, additional equipment is needed or performance will decline (variable costs will increase and reduce contribution margin).


In [[distributed system|distributed systems]], definitions vary by author: some consider scalability a sub-part of [[Elasticity (system resource)|elasticity]], while others treat the two as distinct. According to Marc Brooker: "a system is scalable in the range where [[marginal cost]] of additional workload is nearly constant." [[Serverless computing|Serverless]] technologies fit this definition, but the total cost of ownership must be considered, not just the infrastructure cost.<ref>{{Cite book |title=Building Serverless Applications on Knative |publisher=O'Reilly Media |isbn=9781098142049}}</ref>

In mathematics, scalability mostly refers to [[closure (mathematics)|closure]] under [[scalar multiplication]].

In [[industrial engineering]] and manufacturing, scalability refers to the capacity of a process, system, or organization to handle a growing workload, adapt to increasing demands, and maintain operational efficiency. A scalable system can effectively manage increased production volumes, new product lines, or expanding markets without compromising quality or performance. In this context, scalability is a vital consideration for businesses aiming to meet customer expectations, remain competitive, and achieve sustainable growth. Factors influencing scalability include the flexibility of the production process, the adaptability of the workforce, and the integration of advanced technologies. By implementing scalable solutions, companies can optimize resource utilization, reduce costs, and streamline their operations. Scalability in industrial engineering and manufacturing enables businesses to respond to fluctuating market conditions, capitalize on emerging opportunities, and thrive in an ever-evolving global landscape.{{cn|date=April 2023}}

==Measures==
Scalability can be measured in various dimensions, such as:
* ''Administrative scalability'': The ability for an increasing number of organizations or users to easily share a single distributed system.
* ''Functional scalability'': The ability to enhance the system by adding new functionality at minimal effort.
* ''Geographic scalability'': The ability to maintain performance, usefulness, or usability regardless of expansion from concentration in a local area to a more distributed geographic pattern.
* ''Load scalability'': The ability for a [[distributed system]] to easily expand and contract its resource pool to accommodate heavier or lighter loads or number of inputs. Alternatively, the ease with which a system or component can be modified, added, or removed, to accommodate changing load.
* ''Generation scalability'' refers to the ability of a system to scale up by using new generations of components. Similarly, [[Open architecture|''heterogeneous scalability'']] is the ability to use the components from different vendors.<ref name="parallel_arch">{{cite book |author=Hesham El-Rewini and Mostafa Abd-El-Barr |title=Advanced Computer Architecture and Parallel Processing |url=http://books.google.ee/books?id=7JB-u6D5Q7kC&pg=PA63&dq=parallel+architectures+scalability&hl=et&sa=X&ei=bQZtUtTKC6SO4gT27oC4Ag&ved=0CC4Q6AEwAA#v=onepage&q=parallel%20architectures%20scalability&f=false |location= |publisher=John Wiley & Sons |date=Apr 2005 |isbn=978-0-471-47839-3|page=66 |accessdate=Oct 2013 }}</ref>


==Examples==
The [[Incident Command System]] (ICS) is used by [[emergency response]] agencies in the United States. ICS can scale resource coordination from a single-engine roadside brushfire to an interstate wildfire. The first resource on scene establishes command, with authority to order resources and delegate responsibility (managing five to seven officers, who will again delegate to up to seven, and on as the incident grows). As an incident expands, more senior officers assume command.<ref>{{Cite journal|last1=Bigley|first1=Gregory A.|last2=Roberts|first2=Karlene H.|date=2001-12-01|title=The Incident Command System: High-Reliability Organizing for Complex and Volatile Task Environments|journal=Academy of Management Journal|volume=44|issue=6|pages=1281–1299|doi=10.5465/3069401|doi-broken-date=1 November 2024 |issn=0001-4273}}</ref>
* A [[routing protocol]] is considered scalable with respect to network size, if the size of the necessary [[routing table]] on each node grows as [[Big O notation|O]](log ''N''), where ''N'' is the number of nodes in the network.
* A scalable [[online transaction processing]] system or [[database management system]] is one that can be upgraded to process more transactions by adding new processors, devices and storage, and which can be upgraded easily and transparently without shutting it down.
* Some early [[peer-to-peer]] (P2P) implementations of [[Gnutella]] had scaling issues. Each node query [[Query flooding|flooded]] its requests to all peers. The demand on each peer would increase in proportion to the total number of peers, quickly overrunning the peers' limited capacity. Other P2P systems like [[BitTorrent (protocol)|BitTorrent]] scale well because the demand on each peer is independent of the total number of peers. There is no centralized bottleneck, so the system may expand indefinitely without the addition of supporting resources (other than the peers themselves).
* The distributed nature of the [[Domain Name System]] allows it to work efficiently even when all [[server (computing)|hosts]] on the worldwide [[Internet]] are served, so it is said to "scale well".


==Dimensions==
Scalability can be measured over multiple dimensions, such as:<ref name="parallel_arch">{{cite book|author=Hesham El-Rewini and Mostafa Abd-El-Barr|title=Advanced Computer Architecture and Parallel Processing|url=https://books.google.com/books?id=7JB-u6D5Q7kC&q=parallel+architectures+scalability&pg=PA63|publisher=[[John Wiley & Sons]]|date=April 2005|isbn=978-0-471-47839-3|page=66}}</ref>

*''Administrative scalability'': The ability for an increasing number of organizations or users to access a system.
*''Functional scalability'': The ability to enhance the system by adding new functionality without disrupting existing activities.
*''Geographic scalability'': The ability to maintain effectiveness during expansion from a local area to a larger region.
*''Load scalability'': The ability for a [[distributed system]] to expand and contract to accommodate heavier or lighter loads, including the ease with which a system or component can be modified, added, or removed to accommodate changing loads.
*''Generation scalability'': The ability of a system to scale by adopting new generations of components.
*[[Open architecture|''Heterogeneous scalability'']] is the ability to adopt components from different vendors.

== {{Anchor|HORIZONTAL-SCALING|VERTICAL-SCALING}}Horizontal and vertical scaling ==
Methods of adding more resources for a particular application fall into two broad categories: horizontal and vertical scaling.<ref>{{ cite journal | url = http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4228359 | title = 2007 IEEE International Parallel and Distributed Processing Symposium | date = March 26, 2007 |doi=10.1109/IPDPS.2007.370631 | chapter = Scale-up x Scale-out: A Case Study using Nutch/Lucene | last1 = Michael | first1 = Maged | last2 = Moreira | first2 = Jose E. | last3 = Shiloach | first3 = Doron | last4 = Wisniewski | first4 = Robert W. | isbn = 1-4244-0909-8 | page = 1}}</ref>

To ''scale horizontally'' (or ''scale out'') means to add more nodes to a system, such as adding a new computer to a distributed software application. An example might be scaling out from one Web server system to three. As computer prices have dropped and performance continues to increase, low cost "[[commodity server|commodity]]" systems have been used for high performance computing applications such as seismic analysis and biotechnology workloads that could in the past only be handled by [[supercomputer]]s. Hundreds of small computers may be configured in a [[computer cluster|cluster]] to obtain aggregate computing power that often exceeds that of computers based on a single traditional processor. This model was further fueled by the availability of high performance interconnects such as [[Gigabit Ethernet]], [[InfiniBand]] and [[Myrinet]]. Its growth has also led to demand for software that allows efficient management and maintenance of multiple nodes, as well as hardware such as shared data storage with much higher I/O performance. ''Size scalability'' is the maximum number of processors that a system can accommodate.<ref name="parallel_arch"/>


==Domains==
To ''scale vertically'' (or ''scale up'') means to add resources to a single node in a system, typically involving the addition of CPUs or memory to a single computer. Such vertical scaling of existing systems also enables them to use [[platform virtualization|virtualization]] technology more effectively, as it provides more resources for the hosted set of [[operating system]] and [[application software|application]] modules to share. Taking advantage of such resources can also be called "scaling up", such as expanding the number of [[Apache HTTP Server|Apache]] daemon processes currently running. ''Application scalability'' refers to the improved performance of running applications on a scaled-up version of the system.<ref name="parallel_arch"/>
* A [[routing protocol]] is considered scalable with respect to network size, if the size of the necessary [[routing table]] on each node grows as [[Big O notation|O]](log ''N''), where ''N'' is the number of nodes in the network. Some early [[peer-to-peer]] (P2P) implementations of [[Gnutella]] had scaling issues. Each node query [[Query flooding|flooded]] its requests to all nodes. The demand on each peer increased in proportion to the total number of peers, quickly overrunning their capacity. Other P2P systems like [[BitTorrent (protocol)|BitTorrent]] scale well because the demand on each peer is independent of the number of peers. Nothing is centralized, so the system can expand indefinitely without any resources other than the peers themselves.
* A scalable [[online transaction processing]] system or [[database management system]] is one that can be upgraded to process more transactions by adding new processors, devices and storage, and which can be upgraded easily and transparently without shutting it down.
* The distributed nature of the [[Domain Name System]] (DNS) allows it to work efficiently, serving billions of [[server (computing)|hosts]] on the worldwide [[Internet]].


==Horizontal (scale out) and vertical scaling (scale up){{Anchor|HORIZONTAL-SCALING|VERTICAL-SCALING|Horizontal and vertical scaling}}==
There are tradeoffs between the two models. A larger number of computers means increased management complexity, as well as a more complex programming model and issues such as throughput and latency between nodes; also, [[Amdahl's Law|some applications do not lend themselves to a distributed computing model]]. In the past, the price difference between the two models has favored "scale up" computing for those applications that fit its paradigm, but recent advances in virtualization technology have blurred that advantage, since deploying a new virtual system over a [[hypervisor]] (where possible) is almost always less expensive than actually buying and installing a real one.{{Dubious|date=October 2011}} Configuring an existing idle system has always been less expensive than buying, installing, and configuring a new one, regardless of the model.
Methods of adding resources fall into two broad categories: horizontal and vertical scaling.<ref>{{cite conference|conference=2007 IEEE International Parallel and Distributed Processing Symposium|date=March 26, 2007|doi=10.1109/IPDPS.2007.370631|title=Scale-up x Scale-out: A Case Study using Nutch/Lucene|last1=Michael|first1=Maged|last2=Moreira|first2=Jose E.|last3=Shiloach|first3=Doron|last4=Wisniewski|first4=Robert W.|isbn=978-1-4244-0909-9|page=1}}</ref>


===Horizontal or scale out{{anchor|Horizontal}}===
==Database scalability==
Scaling horizontally (out/in) means adding or removing nodes, such as adding a new computer to a distributed software application. An example might involve scaling out from one web server to three. [[High-performance computing]] applications, such as [[seismic analysis]] and [[biotechnology]], scale workloads horizontally to support tasks that once would have required expensive [[supercomputer]]s. Other workloads, such as large social networks, exceed the capacity of the largest supercomputer and can only be handled by scalable systems. Exploiting this scalability requires software for efficient resource management and maintenance.<ref name="parallel_arch" />
A number of different approaches enable [[database]]s to grow to very large size while supporting an ever-increasing rate of [[Transactions Per Second|transactions per second]]. Not to be discounted, of course, is the rapid pace of hardware advances in both the speed and capacity of [[mass storage]] devices, as well as similar advances in CPU and networking speed.


===Vertical or scale up{{anchor|Vertical}}===
One technique supported by most of the major [[Database management system|database management system (DBMS)]] products is the [[Partition (database)|partitioning]] of large tables, based on ranges of values in a key field. In this manner, the database can be ''scaled out'' across a cluster of separate [[database server]]s. Also, with the advent of 64-bit [[microprocessor]]s, [[Multi-core (computing)|multi-core]] CPUs, and large [[Symmetric multiprocessing|SMP multiprocessors]], DBMS vendors have been at the forefront of supporting [[Thread (computer science)|multi-threaded]] implementations that substantially ''scale up'' [[transaction processing]] capacity.
Scaling vertically (up/down) means adding resources to (or removing resources from) a single node, typically involving the addition of CPUs, memory or storage to a single computer.<ref name="parallel_arch" />


Benefits of scaling up include avoiding the increased management complexity, the more sophisticated programming needed to allocate tasks among resources, and the handling of issues such as throughput, latency, and synchronization across nodes. Moreover, some [[Amdahl's law|applications do not scale horizontally]].
[[Network-attached storage|Network-attached storage (NAS)]] and [[Storage area network|Storage area networks (SANs)]] coupled with fast local area networks and [[Fibre Channel]] technology enable still larger, more loosely coupled configurations of databases and distributed computing power. The widely supported [[X/Open XA]] standard employs a global transaction monitor to coordinate [[distributed transaction]]s among semi-autonomous XA-compliant database resources. [[Oracle RAC]] uses a different model to achieve scalability, based on a "shared-everything" architecture that relies upon high-speed connections between servers.


==Network scalability==
While DBMS vendors debate the relative merits of their favored designs, some companies and researchers question the inherent limitations of [[relational database management system]]s. [[GigaSpaces]], for example, contends that an entirely different model of distributed data access and transaction processing, [[Space based architecture]], is required to achieve the highest performance and scalability. On the other hand, [[Base One]] makes the case for extreme scalability without departing from mainstream relational database technology.<ref>{{cite web|author=Base One|url=http://www.boic.com/scalability.htm|title= Database Scalability - Dispelling myths about the limits of database-centric architecture|year=2007|accessdate=May 23, 2007}}</ref> For specialized applications, [[NoSQL]] architectures such as Google's [[BigTable]] can further enhance scalability. Google's massively distributed [[Spanner (distributed database technology)|Spanner]] technology, positioned as a successor to BigTable, supports general-purpose [[database transaction]]s and provides a more conventional [[SQL]]-based query language.<ref>{{Cite journal|url=http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/spanner-osdi2012.pdf |title=Spanner: Google's Globally-Distributed Database|year= 2012|accessdate= September 30, 2012|isbn=978-1-931971-96-6|series=OSDI'12 Proceedings of the 10th USENIX conference on Operating Systems Design and Implementation|pages= 251–264 }}</ref>
[[Network function virtualization]] defines these terms differently: scaling out/in is the ability to scale by adding/removing resource instances (e.g., virtual machine), whereas scaling up/down is the ability to scale by changing allocated resources (e.g., memory/CPU/storage capacity).<ref>{{cite web|title=Network Functions Virtualisation (NFV); Terminology for Main Concepts in NFV|url=http://www.etsi.org/deliver/etsi_gs/NFV/001_099/001/01.01.01_60/gs_NFV003v010201p.pdf|access-date=2016-01-12|archive-date=2020-05-11|archive-url=https://web.archive.org/web/20200511090646/https://www.etsi.org/|url-status=dead}}</ref>

==Database scalability==
{{Main|Database scalability}}
Scalability for databases requires that the database system be able to perform additional work given greater hardware resources, such as additional servers, processors, memory and storage. Workloads have continued to grow and demands on databases have followed suit.

Algorithmic innovations include row-level locking and table and index partitioning. Architectural innovations include [[Shared-nothing architecture|shared-nothing]] and shared-everything architectures for managing multi-server configurations.
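
Table partitioning by ranges of a key field can be illustrated with a minimal sketch. The class name, the split points and the server names below are hypothetical and chosen only for illustration; real database products implement partitioning internally and expose it through their own configuration.

```python
from bisect import bisect_right


class RangePartitioner:
    """Route a key to a server based on sorted range boundaries (illustrative only)."""

    def __init__(self, boundaries, servers):
        # boundaries are sorted split points; a key equal to a boundary
        # belongs to the partition that starts at that boundary.
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted")
        if len(servers) != len(boundaries) + 1:
            raise ValueError("need exactly one more server than boundaries")
        self.boundaries = list(boundaries)
        self.servers = list(servers)

    def server_for(self, key):
        # bisect_right counts how many boundaries are <= key,
        # which is the index of the partition holding the key.
        return self.servers[bisect_right(self.boundaries, key)]


# Hypothetical layout: keys below 1000 on db0, 1000 to 1999 on db1, the rest on db2.
partitioner = RangePartitioner([1000, 2000], ["db0", "db1", "db2"])
```

Under this layout, adding a server means adding a split point and moving only the rows in the affected range, which is one reason range partitioning is used to scale a database out across a cluster.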


==Strong versus eventual consistency (storage)==
In the context of scale-out [[Computer data storage|data storage]], scalability is defined as the maximum storage cluster size which guarantees full data consistency, meaning there is only ever one valid version of stored data in the whole cluster, independently from the number of redundant physical data copies. Clusters which provide "lazy" redundancy by updating copies in an asynchronous fashion are called [[Eventual consistency|'eventually consistent']]. This type of scale-out design is suitable when availability and responsiveness are rated higher than consistency, which is true for many web file-hosting services or web caches (''if you want the latest version, wait some seconds for it to propagate''). For all classical transaction-oriented applications, this design should be avoided.<ref>{{cite news|title=Eventual consistency by Werner Vogels|author=Sadek Drobi|url=http://www.infoq.com/news/2008/01/consistency-vs-availability|date=January 11, 2008|access-date=April 8, 2017|publisher=InfoQ}}</ref>


Many open-source and even commercial scale-out storage clusters, especially those built on top of standard PC hardware and networks, provide eventual consistency only, such as some NoSQL databases like [[CouchDB]] and others mentioned above. Write operations invalidate other copies, but often don't wait for their acknowledgements. Read operations typically don't check every redundant copy prior to answering, potentially missing the preceding write operation. The large amount of metadata signal traffic would require specialized hardware and short distances to be handled with acceptable performance (i.e., act like a non-clustered storage device or database).{{cn|date=May 2023}}


Whenever strong data consistency is expected, look for these indicators:{{cn|date=May 2023}}
* the use of InfiniBand, Fibrechannel or similar low-latency networks to avoid performance degradation with increasing cluster size and number of redundant copies.
* short cable lengths and limited physical extent, avoiding signal runtime performance degradation.
* majority / quorum mechanisms to guarantee data consistency whenever parts of the cluster become inaccessible.
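
The majority/quorum indicator can be illustrated with a minimal sketch. The function names and replica counts are hypothetical. The overlap rule used here (read quorum plus write quorum must exceed the number of replicas) is the commonly used condition and is an assumption not stated in this article.

```python
def majority_quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return replicas // 2 + 1


def can_accept_writes(reachable: int, replicas: int) -> bool:
    """A partition may keep accepting writes only if it holds a majority."""
    return reachable >= majority_quorum(replicas)


def quorums_overlap(replicas: int, write_quorum: int, read_quorum: int) -> bool:
    """True if every read quorum shares at least one replica with every write quorum."""
    return read_quorum + write_quorum > replicas
```

With five replicas split three to two by a network partition, only the three-replica side may continue to accept writes, so conflicting updates cannot arise. With four replicas split evenly, neither side holds a majority and both must stop accepting writes.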


Indicators for eventually consistent designs (not suitable for transactional applications!) are:{{cn|date=May 2023}}
* marketing buzzwords like "unlimited scalability..." and "worldwide..."
* write performance increases linearly with the number of connected devices in the cluster.
* while the storage cluster is partitioned, all parts remain responsive. There is a risk of conflicting updates.


==Performance tuning versus hardware scalability==
It is often advised to focus system design on hardware scalability rather than on capacity. It is typically cheaper to add a new node to a system in order to achieve improved performance than to partake in [[performance tuning]] to improve the capacity that each node can handle. But this approach can have diminishing returns (as discussed in [[performance engineering]]). For example: suppose 70% of a program can be sped up if parallelized and run on multiple CPUs instead of one. If <math>\alpha</math> is the fraction of a calculation that is sequential, and <math>1-\alpha</math> is the fraction that can be parallelized, the maximum [[speedup]] that can be achieved by using P processors is given according to [[Amdahl's Law]]:


: <math>\frac 1 {\alpha+\frac{1-\alpha} P}.</math>

Substituting the value for this example, using 4 processors gives

: <math>\frac 1 {0.3+\frac{1-0.3} 4} = 2.105.</math>

Doubling the computing power to 8 processors gives

: <math>\frac 1 {0.3+\frac{1-0.3} 8} = 2.581.</math>

Doubling the processing power has only sped up the process by roughly one-fifth. If the whole problem was parallelizable, the speed would also double. Therefore, throwing in more hardware is not necessarily the optimal approach.

== Universal Scalability Law ==
In [[distributed systems]], the [[Neil J. Gunther#Universal Scalability Law|Universal Scalability Law]] (USL) can be used to model and optimize the scalability of a system. The USL was formulated by [[Neil J. Gunther]] and quantifies scalability based on parameters such as contention and coherency. Contention refers to delay due to waiting or queueing for shared resources. Coherency refers to the delay for data to become consistent. For example, high contention indicates sequential processing that could be parallelized, while a high coherency delay suggests excessive dependencies among processes, which calls for minimizing their interactions. The USL also makes it possible to calculate the maximum effective capacity of a system in advance: scaling the system beyond that point is wasteful.<ref>{{Cite book |last=Gunther |first=Neil |title=Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services |year=2007 |isbn=978-3540261384}}</ref>

== Weak versus strong scaling ==
[[High performance computing]] has two common notions of scalability:
* ''Strong scaling'' is defined as how the solution time varies with the number of processors for a fixed ''total'' problem size.
* ''Weak scaling'' is defined as how the solution time varies with the number of processors for a fixed problem size ''per processor''.<ref>{{cite web|url=http://www.stfc.ac.uk/cse/25052.aspx|title=The Weak Scaling of DL_POLY 3|publisher=STFC Computational Science and Engineering Department|access-date=March 8, 2014|archive-url=https://web.archive.org/web/20140307224104/http://www.stfc.ac.uk/cse/25052.aspx|archive-date=March 7, 2014|url-status=dead}}</ref>


==See also==
{{Div col|colwidth=25em}}
* [[Asymptotic complexity]]
* [[Computational complexity theory]]
* [[Data Defined Storage]]
* [[Extensibility]]
* [[Gustafson's law]]
* [[List of system quality attributes]]
* [[Load balancing (computing)]]
* [[Lock (computer science)]]
* [[NoSQL]]
* [[Parallel computing]]
* [[Scalable Video Coding]] (SVC)
* [[Scale (analytical tool)]]
* [[Similitude (model)]]
{{Div col end}}


==References==
==External links==
{{Wiktionary|scalability}}
* [http://today.java.net/pub/a/today/2007/02/13/architecture-of-highly-scalable-nio-server.html Architecture of a Highly Scalable NIO-Based Server] – an article about writing scalable server in Java (java.net).
* [http://code.google.com/p/memcached/wiki/HowToLearnMoreScalability Links to diverse learning resources] – page curated by the [[memcached]] project.
* [http://www.linfo.org/scalable.html Scalable Definition] – by The Linux Information Project (LINFO)
* [http://go.nuodb.com/rs/nuodb/images/Greenbook_Final.pdf NuoDB Scale-out Emergent Architecture]
* [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.3576 Scale in Distributed Systems] B. Clifford Neuman, In: ''Readings in Distributed Computing Systems'', IEEE Computer Society Press, 1994

{{Authority control}}
{{RAID}}
{{Parallel computing}}
{{Complex systems topics}}
{{Software quality}}

{{DEFAULTSORT:Scalability}}
[[Category:Computer architecture]]
[[Category:Computational resources]]
[[Category:Computer systems]]
[[Category:Engineering concepts]]
[[Category:Software quality]]

Latest revision as of 22:25, 14 December 2024

Scalability is the property of a system to handle a growing amount of work. One definition for software systems specifies that this may be done by adding resources to the system.[1]

In an economic context, a scalable business model implies that a company can increase sales given increased resources. For example, a package delivery system is scalable because more packages can be delivered by adding more delivery vehicles. However, if all packages had to first pass through a single warehouse for sorting, the system would not be as scalable, because one warehouse can handle only a limited number of packages.[2]
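The warehouse example can be sketched numerically. The snippet below is a minimal illustration under assumed figures: the function name `daily_deliveries`, the per-vehicle capacity, and the warehouse sorting limit are all hypothetical and chosen only to show how a single shared stage caps total throughput.

```python
def daily_deliveries(vehicles: int, per_vehicle: int, warehouse_limit: int) -> int:
    """Packages delivered per day when every package passes through one warehouse.

    All parameters are illustrative assumptions, not figures from the cited source.
    """
    # Vehicles add capacity linearly, but the single warehouse caps the total.
    return min(vehicles * per_vehicle, warehouse_limit)


if __name__ == "__main__":
    for vehicles in (5, 10, 20, 40):
        delivered = daily_deliveries(vehicles, per_vehicle=100, warehouse_limit=1500)
        print(f"{vehicles:3d} vehicles -> {delivered} packages/day")
```

With these assumed numbers, adding vehicles helps only until the warehouse limit is reached; beyond that point extra vehicles deliver nothing more.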

In computing, scalability is a characteristic of computers, networks, algorithms, networking protocols, programs and applications. An example is a search engine, which must support growing numbers of both users and indexed topics.[3] Webscale is a computer architectural approach that brings the capabilities of large-scale cloud computing companies into enterprise data centers.[4]

In distributed systems, definitions vary by author: some treat scalability as a sub-part of elasticity, while others treat the two as distinct concepts. According to Marc Brooker: "a system is scalable in the range where marginal cost of additional workload is nearly constant." Serverless technologies fit this definition, but the total cost of ownership must be considered, not just the infrastructure cost.[5]

In mathematics, scalability mostly refers to closure under scalar multiplication.

In industrial engineering and manufacturing, scalability refers to the capacity of a process, system, or organization to handle a growing workload, adapt to increasing demands, and maintain operational efficiency. A scalable system can effectively manage increased production volumes, new product lines, or expanding markets without compromising quality or performance. In this context, scalability is a vital consideration for businesses aiming to meet customer expectations, remain competitive, and achieve sustainable growth. Factors influencing scalability include the flexibility of the production process, the adaptability of the workforce, and the integration of advanced technologies. By implementing scalable solutions, companies can optimize resource utilization, reduce costs, and streamline their operations. Scalability in industrial engineering and manufacturing enables businesses to respond to fluctuating market conditions, capitalize on emerging opportunities, and thrive in an ever-evolving global landscape.[citation needed]

Examples

The Incident Command System (ICS) is used by emergency response agencies in the United States. ICS can scale resource coordination from a single-engine roadside brushfire to an interstate wildfire. The first resource on scene establishes command, with authority to order resources and delegate responsibility (managing five to seven officers, who will again delegate to up to seven, and on as the incident grows). As an incident expands, more senior officers assume command.[6]

Dimensions

Scalability can be measured over multiple dimensions, such as:[7]

  • Administrative scalability: The ability for an increasing number of organizations or users to access a system.
  • Functional scalability: The ability to enhance the system by adding new functionality without disrupting existing activities.
  • Geographic scalability: The ability to maintain effectiveness during expansion from a local area to a larger region.
  • Load scalability: The ability for a distributed system to expand and contract to accommodate heavier or lighter loads, including the ease with which a system or component can be modified, added, or removed to accommodate changing loads.
  • Generation scalability: The ability of a system to scale by adopting new generations of components.
  • Heterogeneous scalability: The ability to adopt components from different vendors.

Domains

  • A routing protocol is considered scalable with respect to network size if the size of the necessary routing table on each node grows as O(log N), where N is the number of nodes in the network. Some early peer-to-peer (P2P) implementations of Gnutella had scaling issues. Each node flooded its queries to all nodes. The demand on each peer increased in proportion to the total number of peers, quickly overrunning their capacity. Other P2P systems like BitTorrent scale well because the demand on each peer is independent of the number of peers. Nothing is centralized, so the system can expand indefinitely without any resources other than the peers themselves.
  • A scalable online transaction processing system or database management system is one that can be upgraded to process more transactions by adding new processors, devices and storage, and which can be upgraded easily and transparently without shutting it down.
  • The distributed nature of the Domain Name System (DNS) allows it to work efficiently, serving billions of hosts on the worldwide Internet.
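The contrast in the first item, between demand that grows in proportion to the number of peers and a routing table that grows as O(log N), can be made concrete with a small calculation. This is an idealized sketch: the function names are hypothetical, base-2 logarithms are an assumption, and real protocols add details (such as hop limits on flooded queries) that are ignored here.

```python
import math


def flooding_messages_per_query(num_nodes: int) -> int:
    """Messages sent when one query is flooded to every other node (idealized)."""
    return num_nodes - 1


def log_routing_table_size(num_nodes: int) -> int:
    """Routing-table entries per node if the table grows as O(log N).

    Base 2 and a constant factor of 1 are assumptions made for illustration.
    """
    return max(1, math.ceil(math.log2(num_nodes)))


if __name__ == "__main__":
    for n in (1_000, 1_000_000):
        print(
            f"N={n:>9,d}  flooded messages per query={flooding_messages_per_query(n):>9,d}  "
            f"O(log N) table entries={log_routing_table_size(n)}"
        )
```

Growing the network a thousandfold multiplies the flooding cost by about a thousand, while the logarithmic table only doubles in size.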

Horizontal (scale out) and vertical scaling (scale up)

Scaling methods fall into two broad categories: horizontal and vertical.[8]

Horizontal or scale out

Scaling horizontally (out/in) means adding or removing nodes, such as adding a new computer to a distributed software application. An example might involve scaling out from one web server to three. High-performance computing applications, such as seismic analysis and biotechnology, scale workloads horizontally to support tasks that once would have required expensive supercomputers. Other workloads, such as large social networks, exceed the capacity of the largest supercomputer and can only be handled by scalable systems. Exploiting this scalability requires software for efficient resource management and maintenance.[7]

Vertical or scale up

Scaling vertically (up/down) means adding resources to (or removing resources from) a single node, typically involving the addition of CPUs, memory or storage to a single computer.[7]

Scaling up avoids the increased management complexity of scaling out, the more sophisticated programming needed to allocate tasks among resources, and issues such as throughput, latency, and synchronization across nodes. Moreover, some applications do not scale horizontally.

Network scalability

Network function virtualization defines these terms differently: scaling out/in is the ability to scale by adding/removing resource instances (e.g., virtual machine), whereas scaling up/down is the ability to scale by changing allocated resources (e.g., memory/CPU/storage capacity).[9]

Database scalability

Scalability for databases requires that the database system be able to perform additional work given greater hardware resources, such as additional servers, processors, memory and storage. Workloads have continued to grow and demands on databases have followed suit.

Algorithmic innovations include row-level locking and table and index partitioning. Architectural innovations include shared-nothing and shared-everything architectures for managing multi-server configurations.

Strong versus eventual consistency (storage)

In the context of scale-out data storage, scalability is defined as the maximum storage cluster size which guarantees full data consistency, meaning there is only ever one valid version of stored data in the whole cluster, regardless of the number of redundant physical data copies. Clusters which provide "lazy" redundancy by updating copies in an asynchronous fashion are called 'eventually consistent'. This type of scale-out design is suitable when availability and responsiveness are rated higher than consistency, which is true for many web file-hosting services or web caches (a reader who wants the latest version waits a few seconds for it to propagate). For all classical transaction-oriented applications, this design should be avoided.[10]

Many open-source and even commercial scale-out storage clusters, especially those built on top of standard PC hardware and networks, provide eventual consistency only, such as some NoSQL databases like CouchDB. Write operations invalidate other copies, but often do not wait for their acknowledgements. Read operations typically do not check every redundant copy prior to answering, potentially missing the preceding write operation. Checking every copy would generate a large amount of metadata signal traffic, which would require specialized hardware and short distances to be handled with acceptable performance (i.e., to act like a non-clustered storage device or database).[citation needed]

Whenever strong data consistency is expected, look for these indicators:[citation needed]

  • the use of InfiniBand, Fibre Channel or similar low-latency networks to avoid performance degradation with increasing cluster size and number of redundant copies.
  • short cable lengths and limited physical extent, avoiding performance degradation from signal propagation delay.
  • majority / quorum mechanisms to guarantee data consistency whenever parts of the cluster become inaccessible.
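The majority / quorum indicator in the list above is often expressed as an overlap condition from quorum-based replication: with N replicas, a read quorum of R copies and a write quorum of W copies, every read is guaranteed to touch at least one copy holding the latest write when R + W > N. The text does not state this condition, so the sketch below is an illustration of the general idea with hypothetical function names, not a description of any particular storage product.

```python
def majority(num_replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return num_replicas // 2 + 1


def quorums_overlap(num_replicas: int, read_quorum: int, write_quorum: int) -> bool:
    """True if every read quorum must share at least one replica with every write quorum."""
    return read_quorum + write_quorum > num_replicas


if __name__ == "__main__":
    n = 5
    m = majority(n)
    print(f"majority of {n} replicas: {m}")
    print("majority reads and writes overlap:", quorums_overlap(n, m, m))
    print("R=1, W=1 (lazy replication) overlap:", quorums_overlap(n, 1, 1))
```

A configuration with R = 1 and W = 1 corresponds to the eventually consistent designs described in this section: it answers quickly, but a read may miss the preceding write.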

Indicators for eventually consistent designs (not suitable for transactional applications!) are:[citation needed]

  • write performance increases linearly with the number of connected devices in the cluster.
  • while the storage cluster is partitioned, all parts remain responsive. There is a risk of conflicting updates.

Performance tuning versus hardware scalability

It is often advised to focus system design on hardware scalability rather than on capacity. It is typically cheaper to add a new node to a system in order to achieve improved performance than to partake in performance tuning to improve the capacity that each node can handle. But this approach can have diminishing returns (as discussed in performance engineering). For example: suppose 70% of a program can be sped up if parallelized and run on multiple CPUs instead of one. If <math>\alpha</math> is the fraction of a calculation that is sequential, and <math>1-\alpha</math> is the fraction that can be parallelized, the maximum speedup that can be achieved by using P processors is given according to Amdahl's Law:

<math>\frac 1 {\alpha+\frac{1-\alpha} P}.</math>

Substituting the value for this example, using 4 processors gives

<math>\frac 1 {0.3+\frac{1-0.3} 4} = 2.105.</math>

Doubling the computing power to 8 processors gives

<math>\frac 1 {0.3+\frac{1-0.3} 8} = 2.581.</math>

Doubling the processing power has only sped up the process by roughly one-fifth. If the whole problem was parallelizable, the speed would also double. Therefore, throwing in more hardware is not necessarily the optimal approach.
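The arithmetic in this example can be reproduced with a few lines of code. The sketch below evaluates Amdahl's Law, speedup = 1 / (alpha + (1 - alpha) / P), for a sequential fraction alpha = 0.3 (the 70% parallelizable program from the example); the function name `amdahl_speedup` is a hypothetical helper introduced here for illustration.

```python
def amdahl_speedup(alpha: float, processors: int) -> float:
    """Maximum speedup under Amdahl's Law.

    alpha      -- fraction of the calculation that is sequential (0..1)
    processors -- number of processors P (>= 1)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (alpha + (1.0 - alpha) / processors)


if __name__ == "__main__":
    alpha = 0.3  # 70% of the program can be parallelized
    for p in (1, 2, 4, 8, 16, 1024):
        print(f"P={p:5d}  speedup={amdahl_speedup(alpha, p):.3f}")
    # As P grows, the speedup approaches but never reaches 1/alpha.
    print(f"upper bound 1/alpha = {1 / alpha:.3f}")
```

For 4 and 8 processors this gives 2.105 and 2.581, matching the figures in the text. No number of processors can push the speedup past 1/alpha, which is about 3.33 here, and that ceiling is the diminishing return the paragraph describes.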

Universal Scalability Law

In distributed systems, the Universal Scalability Law (USL) can be used to model and optimize the scalability of a system. The USL was formulated by Neil J. Gunther and quantifies scalability based on parameters such as contention and coherency. Contention refers to delay due to waiting or queueing for shared resources. Coherency refers to the delay for data to become consistent. For example, high contention indicates sequential processing that could be parallelized, while a high coherency delay suggests excessive dependencies among processes, which calls for minimizing their interactions. The USL also makes it possible to calculate the maximum effective capacity of a system in advance: scaling the system beyond that point is wasteful.[11]
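The paragraph above does not give the formula. In the form commonly attributed to Gunther, the relative capacity at N nodes is C(N) = N / (1 + sigma (N - 1) + kappa N (N - 1)), where sigma is the contention parameter and kappa the coherency parameter, and capacity peaks near N = sqrt((1 - sigma) / kappa). The sketch below assumes that form; the function names and the parameter values are hypothetical, and in practice sigma and kappa would be fitted to measured throughput.

```python
import math


def usl_capacity(n: float, sigma: float, kappa: float) -> float:
    """Relative capacity C(N) under the Universal Scalability Law (assumed form).

    sigma -- contention parameter (serialization on shared resources)
    kappa -- coherency parameter (cost of keeping data consistent)
    """
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak(sigma: float, kappa: float) -> float:
    """Node count at which capacity is maximal; requires kappa > 0."""
    if kappa <= 0:
        raise ValueError("kappa must be positive for a finite peak")
    return math.sqrt((1.0 - sigma) / kappa)


if __name__ == "__main__":
    sigma, kappa = 0.05, 0.001  # illustrative values, not measurements
    print(f"capacity peaks near N = {usl_peak(sigma, kappa):.1f}")
    for n in (1, 10, 31, 100):
        print(f"N={n:4d}  relative capacity={usl_capacity(n, sigma, kappa):.2f}")
```

With these assumed parameters, capacity rises up to roughly 31 nodes and then falls, which is the "maximum effective capacity" the text refers to.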

Weak versus strong scaling

High performance computing has two common notions of scalability:

  • Strong scaling is defined as how the solution time varies with the number of processors for a fixed total problem size.
  • Weak scaling is defined as how the solution time varies with the number of processors for a fixed problem size per processor.[12]
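The two notions are commonly summarized as parallel efficiencies computed from measured run times. The sketch below uses the usual conventions, which are assumptions here rather than definitions taken from the cited source: strong-scaling efficiency is T(1) / (N × T(N)) for a fixed total problem, and weak-scaling efficiency is T(1) / T(N) when the problem grows with N. The function names and the timings are hypothetical.

```python
def strong_scaling_efficiency(t1: float, tn: float, n: int) -> float:
    """Efficiency for a fixed total problem size: ideal run time on n processors is t1 / n."""
    return t1 / (n * tn)


def weak_scaling_efficiency(t1: float, tn: float) -> float:
    """Efficiency for a fixed problem size per processor: ideal run time stays at t1."""
    return t1 / tn


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print(f"strong scaling, 4 processors: {strong_scaling_efficiency(100.0, 30.0, 4):.2f}")
    print(f"weak scaling, 4 processors:   {weak_scaling_efficiency(100.0, 125.0):.2f}")
```

An efficiency of 1.0 means ideal scaling in either sense; values below 1.0 indicate overhead from communication, synchronization, or sequential work.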

See also

References

  1. Bondi, André B. (2000). Characteristics of scalability and their impact on performance. Proceedings of the second international workshop on Software and performance – WOSP '00. p. 195. doi:10.1145/350391.350432. ISBN 158113195X.
  2. Hill, Mark D. (1990). "What is scalability?" (PDF). ACM SIGARCH Computer Architecture News. 18 (4): 18. doi:10.1145/121973.121975. S2CID 1232925. and Duboc, Leticia; Rosenblum, David S.; Wicks, Tony (2006). A framework for modelling and analysis of software systems scalability (PDF). Proceedings of the 28th international conference on Software engineering – ICSE '06. p. 949. doi:10.1145/1134285.1134460. ISBN 1595933751.
  3. Laudon, Kenneth Craig; Traver, Carol Guercio (2008). E-commerce: Business, Technology, Society. Pearson Prentice Hall/Pearson Education. ISBN 9780136006459.
  4. "Why web-scale is the future". Network World. 2020-02-13. Retrieved 2017-06-01.
  5. Building Serverless Applications on Knative. O'Reilly Media. ISBN 9781098142049.
  6. Bigley, Gregory A.; Roberts, Karlene H. (2001-12-01). "The Incident Command System: High-Reliability Organizing for Complex and Volatile Task Environments". Academy of Management Journal. 44 (6): 1281–1299. doi:10.5465/3069401 (inactive 1 November 2024). ISSN 0001-4273.
  7. Hesham El-Rewini and Mostafa Abd-El-Barr (April 2005). Advanced Computer Architecture and Parallel Processing. John Wiley & Sons. p. 66. ISBN 978-0-471-47839-3.
  8. Michael, Maged; Moreira, Jose E.; Shiloach, Doron; Wisniewski, Robert W. (March 26, 2007). Scale-up x Scale-out: A Case Study using Nutch/Lucene. 2007 IEEE International Parallel and Distributed Processing Symposium. p. 1. doi:10.1109/IPDPS.2007.370631. ISBN 978-1-4244-0909-9.
  9. "Network Functions Virtualisation (NFV); Terminology for Main Concepts in NFV". Archived from the original (PDF) on 2020-05-11. Retrieved 2016-01-12.
  10. Sadek Drobi (January 11, 2008). "Eventual consistency by Werner Vogels". InfoQ. Retrieved April 8, 2017.
  11. Gunther, Neil (2007). Guerrilla Capacity Planning: A Tactical Approach to Planning for Highly Scalable Applications and Services. ISBN 978-3540261384.
  12. "The Weak Scaling of DL_POLY 3". STFC Computational Science and Engineering Department. Archived from the original on March 7, 2014. Retrieved March 8, 2014.