Queueing theory: Difference between revisions

From Wikipedia, the free encyclopedia
'''Queueing theory''' (also commonly spelled '''queuing theory''') is the mathematical study of waiting lines (or [[wikt:queue|queue]]s).
{{short description|Mathematical study of waiting lines, or queues}}
{{redirect|First come, first served|the Kool Keith album|First Come, First Served}}


[[File:ServidorParalelo.jpg|thumb|right|[[#Queueing networks|Queue networks]] are systems in which single queues are connected by a routing network. In this image, servers are represented by circles, queues by a series of rectangles and the routing network by arrows. In the study of queue networks one typically tries to obtain the [[equilibrium distribution]] of the network, although in many applications the study of the [[transient state]] is fundamental.]]
The theory enables mathematical analysis of several related processes, including arriving at the (back of the) queue, waiting in the queue (essentially a storage process), and being served by the server(s) at the front of the queue. The theory permits the derivation and calculation of several performance measures including the average waiting time in the queue or the system, the expected number waiting or receiving service and the probability of encountering the system in certain states, such as empty, full, having an available server or having to wait a certain time to be served.
Queueing theory is generally considered a branch of [[operations research]] because the results are often used when making business decisions about the resources needed to provide service. It is applicable in a wide variety of situations that may be encountered in business, commerce, industry, public service and engineering. Applications are frequently encountered in [[customer service]] situations as well as [[transport]] and [[telecommunication]] (note that something called [[ride theory]] is sometimes mentioned, but it is uncertain whether it is a valid theory or a hoax). Queueing theory is directly applicable to [[intelligent transportation system]]s, [[call center]]s, [[PABX]]s, [[telecommunications network|networks]], [[telecommunication]]s, [[Server (computing)|server]] queueing, [[Mainframe computer|mainframe]] [[computer]] queueing of telecommunications terminals, advanced telecommunications systems, and [[traffic flow]].


'''Queueing theory''' is the mathematical study of [[Queue area|waiting lines]], or [[wikt:queue|queues]].<ref name="sun">{{cite book | title = Probability, Statistics and Queueing Theory | first = V. | last = Sundarapandian | publisher = PHI Learning | year = 2009 | chapter = 7. Queueing Theory | isbn = 978-81-203-3844-9 }}</ref> A queueing model is constructed so that queue lengths and waiting time can be predicted.<ref name="sun" /> Queueing theory is generally considered a branch of [[operations research]] because the results are often used when making business decisions about the resources needed to provide a service.
==Spelling==
The word ''queue'' comes, via [[French language|French]], from the [[Latin language|Latin]] ''cauda'', meaning tail. Most researchers in the field prefer the spelling 'queueing' over 'queuing',<ref name="spelling">[http://www2.uwindsor.ca/~hlynka/qfaq.html Spelling of queueing/queuing]</ref> although the latter is somewhat more common in other contexts.


Queueing theory has its origins in research by [[Agner Krarup Erlang]], who created models to describe the system of incoming calls at the Copenhagen Telephone Exchange Company.<ref name="sun" /> These ideas were seminal to the field of [[teletraffic engineering]] and have since seen applications in [[telecommunications]], [[traffic engineering (transportation)|traffic engineering]], [[computing]],<ref>{{cite web
| last = Lawrence W. Dowdy, Virgilio A.F. Almeida
| first = Daniel A. Menasce
| title = Performance by Design: Computer Capacity Planning by Example
| url = http://www.cs.gmu.edu/~menasce/perfbyd/
| access-date = 2009-07-08
| archive-date = 2016-05-06
| archive-url = https://web.archive.org/web/20160506025515/http://cs.gmu.edu/~menasce/perfbyd/
| url-status = live
}}</ref> [[project management]], and particularly [[industrial engineering]], where they are applied in the design of factories, shops, offices, and hospitals.<ref>{{Cite news
| first = Kira
| last = Schlechter
| title = Hershey Medical Center to open redesigned emergency room
| newspaper = The Patriot-News
| date = March 2, 2009
| url = http://www.pennlive.com/midstate/index.ssf/2009/03/hershey_med_to_open_redesigned.html
| access-date = March 12, 2009
| archive-date = June 29, 2016
| archive-url = https://web.archive.org/web/20160629151917/http://www.pennlive.com/midstate/index.ssf/2009/03/hershey_med_to_open_redesigned.html
| url-status = live
}}</ref><ref>{{cite book |url= https://openaccess.city.ac.uk/id/eprint/2309/ |archive-url= https://web.archive.org/web/20210907100556/https://openaccess.city.ac.uk/id/eprint/2309/ |archive-date= September 7, 2021 |access-date= 2008-05-20 |author= Mayhew, Les |author2= Smith, David |date= December 2006 |title= Using queuing theory to analyse completion times in accident and emergency departments in the light of the Government 4-hour target |publisher= [[Cass Business School]] |isbn= 978-1-905752-06-5 }}</ref>


== Spelling ==
[[David George Kendall|David G. Kendall]] introduced an '''''A/B/C''''' queueing notation in [[1953]].


The spelling "queueing" over "queuing" is typically encountered in the academic research field. In fact, one of the flagship journals of the field is ''[[Queueing Systems]]''.
==Notation==
Notation for describing the characteristics of a [[queueing model]] was first suggested by [[David George Kendall|David G. Kendall]] in [[1953]]. [[Kendall's notation]] introduced an '''''A/B/C''''' queueing notation that can be found in all standard modern works on queueing theory.<ref name="tijms">Tijms, H.C, ''Algorithmic Analysis of Queues'', Chapter 9 in A First Course in Stochastic Models, Wiley, Chichester, 2003.</ref>
The A/B/C notation designates a queueing system having A as interarrival time distribution, B as service time distribution, and C as number of servers. So, for instance, G/D/1 would indicate a General (may be anything) arrival process, a Deterministic (constant time) service process and a single server. More details on this notation are given in the article about [[queueing model]]s.


== Description ==
==Application to telephony==
Queueing theory is one of the major areas of study in the discipline of [[management science]]. Through management science, businesses are able to solve a variety of problems using different scientific and mathematical approaches. Queueing analysis is the probabilistic analysis of waiting lines, and thus the results, also referred to as the operating characteristics, are probabilistic rather than deterministic.<ref name="taylor 2019">{{Cite book |last=Taylor |first=Bernard W. |title=Introduction to management science |date=2019 |publisher=Pearson |isbn=978-0-13-473066-0 |edition=13th |location=New York}}</ref> The probability that n customers are in the queueing system, the average number of customers in the queueing system, the average number of customers in the waiting line, the average time spent by a customer in the total queuing system, the average time spent by a customer in the waiting line, and finally the probability that the server is busy or idle are all of the different operating characteristics that these queueing models compute.<ref name="taylor 2019" /> The overall goal of queueing analysis is to compute these characteristics for the current system and then test several alternatives that could lead to improvement. Computing the operating characteristics for the current system and comparing the values to the characteristics of the alternative systems allows managers to see the pros and cons of each potential option. These systems help in the final decision making process by showing ways to increase savings, reduce waiting time, improve efficiency, etc. The main queueing models that can be used are the single-server waiting line system and the multiple-server waiting line system, which are discussed further below. These models can be further differentiated depending on whether service times are constant or undefined, the queue length is finite, the calling population is finite, etc.<ref name="taylor 2019" />
The Public Switched Telephone Networks ([[PSTN]]s) are designed to accommodate the offered traffic intensity with only a small loss. The [[Network Performance|performance]] of loss systems is quantified by their [[Grade of service|Grade of Service]] (GoS), driven by the assumption that if insufficient capacity is available, the call is refused and lost.<ref name="flood">Flood, J.E. ''Telecommunications Switching, Traffic and Networks'', Chapter 4: Telecommunications Traffic, New York: Prentice-Hall, 1998.</ref> Alternatively, overflow systems make use of [[Routing in the PSTN|alternative routes]] to divert calls via different paths &mdash; even these systems have a finite or maximum traffic carrying capacity.<ref name="flood"/>


== Single queueing nodes ==
However, the use of queueing in PSTNs allows the systems to queue their customers' requests until free resources become available. This means that if traffic intensity levels exceed available capacity, customers' calls are no longer lost; they instead wait until they can be served.<ref name="bose">Bose S.J., ''Chapter 1 - An Introduction to Queueing Systems'', Kluwer/Plenum Publishers, 2002.</ref> This method is used in queueing customers for the next available operator.


A ''queue'' or ''queueing node'' can be thought of as nearly a [[black box]]. ''Jobs'' (also called ''customers'' or ''requests'', depending on the field) arrive to the queue, possibly wait some time, take some time being processed, and then depart from the queue.
A queueing discipline determines the manner in which the exchange handles calls from customers.<ref name="bose"/> It defines the way they will be served, the order in which they are served, and the way in which resources are divided between the customers.<ref name="bose"/><ref name="penttinen">Penttinen A., ''Chapter 8 &ndash; Queueing Systems'', Lecture Notes: S-38.145 - Introduction to Teletraffic Theory.</ref> Here are details of three queueing disciplines:


[[File:Black box queue diagram.png|thumb|350px|center|A black box. Jobs arrive to, and depart from, the queue.]]
*''First In First Out'' &ndash; This principle states that customers are served one at a time and that the customer that has been waiting the longest is served first.<ref name="penttinen"/>
*''Last In First Out'' &ndash; This principle also serves customers one at a time, however the customer with the shortest waiting time will be served first.<ref name="penttinen"/>
*''Processor Sharing'' &ndash; Customers are served equally. Network capacity is shared between customers and they all effectively experience the same delay.<ref name="penttinen"/>


However, the queueing node is not quite a pure black box since some information is needed about the inside of the queueing node. The queue has one or more ''servers'' which can each be paired with an arriving job. When the job is completed and departs, that server will again be free to be paired with another arriving job.
Queueing is handled by control processes within exchanges, which can be modelled using state equations.<ref name="bose"/><ref name="penttinen"/> Queueing systems use a particular form of [[state equation]]s known as [[Markov chain]]s which model the system in each state.<ref name="bose"/> Incoming traffic to these systems is modelled via a [[Poisson distribution]] and is subject to Erlang’s queueing theory assumptions viz.<ref name="flood"/>


[[File:Queueing node service digram.png|thumb|500px|center|A queueing node with 3 servers. Server '''a''' is idle, and thus an arrival is given to it to process. Server '''b''' is currently busy and will take some time before it can complete service of its job. Server '''c''' has just completed service of a job and thus will be next to receive an arriving job.]]
*''Pure-Chance Traffic'' &ndash; Call arrivals and departures are random and independent events.<ref name="flood"/>
*''Statistical Equilibrium'' &ndash; Probabilities within the system do not change.<ref name="flood"/>
*''Full Availability'' &ndash; All incoming traffic can be routed to any other customer within the network.<ref name="flood"/>
*''Congestion is cleared as soon as servers are free''.<ref name="flood"/>


An analogy often used is that of the cashier at a supermarket. (There are other models, but this one is commonly encountered in the literature.) Customers arrive, are processed by the cashier, and depart. Each cashier processes one customer at a time, and hence this is a queueing node with only one server. A setting in which a customer leaves immediately if the cashier is busy on arrival is referred to as a queue with no ''buffer'' (or no ''waiting area''). A setting with a waiting zone for up to ''n'' customers is called a queue with a buffer of size ''n''.
Classic queueing theory involves complex calculations to determine call waiting time, service time, server utilisation and many other metrics which are used to measure queueing performance.<ref name="bose"/><ref name="penttinen"/>


=== Birth-death process ===
==Queueing networks==
{{See also|Survival analysis}}
Queues can be chained to form queueing networks where the departures from one queue enter the next queue. Queueing networks can be classified into two categories: open queueing networks and closed queueing networks. Open queueing networks have an external input and an external final destination. Closed queueing networks are completely contained and the customers circulate continually never leaving the network.
The behaviour of a single queue (also called a ''queueing node'') can be described by a [[birth–death process]], which describes the arrivals and departures from the queue, along with the number of jobs currently in the system. If ''k'' denotes the number of jobs in the system (either being serviced or waiting if the queue has a buffer of waiting jobs), then an arrival increases ''k'' by 1 and a departure decreases ''k'' by 1.


The system transitions between values of ''k'' by "births" and "deaths", which occur at the arrival rates <math>\lambda_i</math> and the departure rates <math>\mu_i</math> for each job <math>i</math>. For a queue, these rates are generally considered not to vary with the number of jobs in the queue, so a single [[average]] rate of arrivals/departures per unit time is assumed. Under this assumption, this process has an arrival rate of <math>\lambda = \text{avg}(\lambda_1,\lambda_2,\dots,\lambda_k)</math> and a departure rate of <math>\mu = \text{avg}(\mu_1, \mu_2, \dots, \mu_k)</math>.
==Role of Poisson process, exponential distributions==


[[File:BD-proces.png|thumb|center|643x643px|A birth–death process. The values in the circles represent the state of the system, which evolves based on arrival rates ''λ<sub>i</sub>'' and departure rates ''μ<sub>i</sub>''.]]
A useful queueing model both (a) represents a real-life system with sufficient accuracy and (b) is analytically tractable. A queuing model based on the Poisson process and its companion exponential probability distribution often meets these two requirements. A Poisson process models random events (such as a customer arrival, a request for action from a web server, or the completion of the actions requested of a web server) as emanating from a memoryless process. That is, the length of the time interval from the current time to the occurrence of the next event does not depend upon the time of occurrence of the last event. In the Poisson probability distribution, the observer records the number of events that occur in a time interval of fixed length. In the (negative) exponential probability distribution, the observer records the length of the time interval between consecutive events. In both, the underlying physical process is memoryless.


[[File:Mm1_queue.svg|thumb|center|250px|A queue with 1 server, arrival rate ''λ'' and departure rate ''μ'']]
Models based on the Poisson process often respond to inputs from the environment in a manner that mimics the response of the system being modeled to those same inputs. The analytically tractable models that result yield both information about the system being modeled and the form of their solution. Even a queuing model based on the Poisson process that does a relatively poor job of mimicking detailed system performance can be useful. The fact that such models often give "worst-case" scenario evaluations appeals to system designers who prefer to include a safety factor in their designs. Also, the form of the solution of models based on the Poisson process often provides insight into the form of the solution to a queuing problem whose detailed behavior is poorly mimicked. As a result, [[queuing model]]s are frequently modeled as [[Poisson process]]es through the use of the [[exponential distribution]].
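
The link between the two distributions described above can be illustrated with a small simulation. The following is a minimal sketch, not a reference implementation: the function names, the rate of 5 events per unit time, the horizon of 1000 time units and the fixed seed are all assumptions chosen for illustration. It draws exponentially distributed gaps between events and then counts how many events fall into each interval of fixed length.

```python
import random


def simulate_arrival_times(rate, horizon, rng):
    """Generate event times on [0, horizon) with exponential gaps of mean 1/rate."""
    times = []
    t = rng.expovariate(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def counts_per_interval(times, horizon, width):
    """Count how many events fall into each consecutive interval of length `width`."""
    n_bins = int(horizon / width)
    counts = [0] * n_bins
    for t in times:
        idx = int(t / width)
        if idx < n_bins:
            counts[idx] += 1
    return counts


# Hypothetical parameters: 5 events per unit time, observed for 1000 time units.
rng = random.Random(0)
arrivals = simulate_arrival_times(rate=5.0, horizon=1000.0, rng=rng)
counts = counts_per_interval(arrivals, horizon=1000.0, width=1.0)

# The observer of counts sees roughly `rate` events per unit interval;
# the observer of gaps sees a mean gap of roughly 1/rate.
mean_count = sum(counts) / len(counts)
gaps = [later - earlier for earlier, later in zip(arrivals, arrivals[1:])]
mean_gap = sum(gaps) / len(gaps)
print(mean_count, mean_gap)
```

With these assumed parameters the mean count per interval should be close to 5 and the mean gap close to 0.2, although the exact values depend on the random draws.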


==== Balance equations ====
==Limitations of mathematical approach==
Classic queueing theory is often too mathematically restrictive to be able to model all real-world situations exactly. This restriction arises because the underlying assumptions of the theory do not always hold in the real world.


The [[steady state]] equations for the birth-and-death process, known as the [[balance equation]]s, are as follows. Here <math>P_n</math> denotes the steady state probability to be in state ''n''.
For example, the mathematical models often assume infinite numbers of customers, or queue capacity, or no bounds on inter-arrival or service times, when it is quite apparent that these bounds must exist in reality. Often, although the bounds do exist, they can be safely ignored because the difference between the real world and theory is not statistically significant, as the probability that such boundary situations might occur is remote compared to the expected normal situation. In other cases the theoretical solution may either prove intractable or insufficiently informative to be useful.


: <math>\mu_1 P_1 = \lambda_0 P_0</math>
Alternative means of analysis have thus been devised in order to provide some insight into problems which do not fall under the mathematical scope of queueing theory, though they are often scenario-specific since they generally consist of computer [[simulation]]s and/or of analysis of experimental data. See [[network traffic simulation]].
: <math>\lambda_0 P_0 + \mu_2 P_2 = (\lambda_1 + \mu_1) P_1</math>
: <math>\lambda_{n-1} P_{n-1} + \mu_{n+1} P_{n+1} = (\lambda_n + \mu_n) P_n</math>


The first two equations imply
: <math>P_1 = \frac{\lambda_0}{\mu_1} P_0</math>
and
: <math>P_2 = \frac{\lambda_1}{\mu_2} P_1 + \frac{1}{\mu_2} (\mu_1 P_1 - \lambda_0 P_0) = \frac{\lambda_1}{\mu_2} P_1 = \frac{\lambda_1 \lambda_0}{\mu_2 \mu_1} P_0</math>.

By mathematical induction,
: <math>P_n = \frac{\lambda_{n-1} \lambda_{n-2} \cdots \lambda_0}{\mu_n \mu_{n-1} \cdots \mu_1} P_0 = P_0 \prod_{i = 0}^{n-1} \frac{\lambda_i}{\mu_{i+1}}</math>.

The condition <math>\sum_{n = 0}^{\infty} P_n = P_0 + P_0 \sum_{n=1}^\infty \prod_{i=0}^{n-1} \frac{\lambda_i}{\mu_{i+1}} = 1</math> leads to
: <math>P_0 = \frac{1}{1 + \sum_{n=1}^{\infty}\prod_{i=0}^{n-1} \frac{\lambda_i}{\mu_{i+1}} }</math>
which, together with the equation for <math>P_n</math> <math>(n\geq1)</math>, fully describes the required steady state probabilities.
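
The product formula for <math>P_n</math> can be evaluated numerically once the state space is cut off at some finite ''N''. The sketch below makes that truncation assumption (the infinite sum is replaced by a finite one), and the function name and the example rates are hypothetical.

```python
from itertools import accumulate
from operator import mul


def birth_death_steady_state(arrival_rates, departure_rates):
    """Return [P_0, ..., P_N] for a birth-death chain truncated at state N.

    arrival_rates[i]   holds lambda_i   for i = 0 .. N-1
    departure_rates[i] holds mu_(i+1)   for i = 0 .. N-1
    """
    if len(arrival_rates) != len(departure_rates):
        raise ValueError("need exactly one departure rate per arrival rate")
    ratios = [lam / mu for lam, mu in zip(arrival_rates, departure_rates)]
    # Unnormalised weights: 1 for state 0, then the running product of the ratios.
    weights = [1.0] + list(accumulate(ratios, mul))
    total = sum(weights)  # plays the role of 1 / P_0
    return [w / total for w in weights]


# Hypothetical example: constant rates lambda = 2 and mu = 4, states 0..3.
probs = birth_death_steady_state([2.0, 2.0, 2.0], [4.0, 4.0, 4.0])
print(probs)
```

Dividing by the sum of the weights is the normalisation step that fixes <math>P_0</math>; the first balance equation <math>\mu_1 P_1 = \lambda_0 P_0</math> can be checked directly on the returned list.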

=== Kendall's notation ===
{{Main|Kendall's notation}}
Single queueing nodes are usually described using Kendall's notation in the form A/S/''c'' where ''A'' describes the distribution of durations between each arrival to the queue, ''S'' the distribution of service times for jobs, and ''c'' the number of servers at the node.<ref name="tijms">Tijms, H.C, ''Algorithmic Analysis of Queues'', Chapter 9 in A First Course in Stochastic Models, Wiley, Chichester, 2003</ref><ref>{{Cite journal | last1 = Kendall | first1 = D. G. | author-link1 = David George Kendall| title = Stochastic Processes Occurring in the Theory of Queues and their Analysis by the Method of the Imbedded Markov Chain | doi = 10.1214/aoms/1177728975 | jstor = 2236285| journal = The Annals of Mathematical Statistics | volume = 24 | issue = 3 | pages = 338–354 | year = 1953| doi-access = free }}</ref> For an example of the notation, the [[M/M/1 queue]] is a simple model where a single server serves jobs that arrive according to a [[Poisson process]] (where inter-arrival durations are [[exponentially distributed]]) and have exponentially distributed service times (the M denotes a [[Markov process]]). In an [[M/G/1 queue]], the G stands for "general" and indicates an arbitrary [[probability distribution]] for service times.
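
As an illustration of how the three fields are read, the following sketch splits a descriptor such as "M/M/1" into its parts. The names <code>KendallSpec</code>, <code>DISTRIBUTION_CODES</code> and <code>parse_kendall</code> are invented for this example, only the codes M, D and G mentioned in this article are expanded, and the number of servers is assumed to be written as a literal integer.

```python
from typing import NamedTuple


class KendallSpec(NamedTuple):
    arrival: str   # distribution of durations between arrivals (A)
    service: str   # distribution of service times (S)
    servers: int   # number of servers at the node (c)


# Only the codes discussed in the text; other codes are returned unchanged.
DISTRIBUTION_CODES = {
    "M": "Markov (memoryless)",
    "D": "deterministic",
    "G": "general",
}


def parse_kendall(descriptor):
    """Split an A/S/c descriptor into a KendallSpec (hypothetical helper)."""
    parts = descriptor.split("/")
    if len(parts) != 3:
        raise ValueError("expected exactly three fields: A/S/c")
    arrival, service, servers = parts
    return KendallSpec(
        DISTRIBUTION_CODES.get(arrival, arrival),
        DISTRIBUTION_CODES.get(service, service),
        int(servers),
    )


print(parse_kendall("M/G/1"))
```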

=== Example analysis of an M/M/1 queue ===

Consider a queue with one server and the following characteristics:
* ''<math>\lambda</math>'': the arrival rate (the reciprocal of the expected time between each customer arriving, e.g. 10 customers per second)
* ''<math>\mu</math>'': the reciprocal of the mean service time (the expected number of consecutive service completions per the same unit time, e.g. per 30 seconds)
* ''n'': the parameter characterizing the number of customers in the system
* <math>P_n</math>: the probability of there being ''n'' customers in the system in steady state

Further, let <math>E_n</math> represent the number of times the system enters state ''n'', and <math>L_n</math> represent the number of times the system leaves state ''n''. Then <math>\left\vert E_n - L_n \right\vert \in \{0, 1\}</math> for all ''n''. That is, the number of times the system leaves a state differs by at most 1 from the number of times it enters that state, since it will either return into that state at some time in the future (<math>E_n = L_n</math>) or not (<math>\left\vert E_n - L_n \right\vert = 1</math>).

When the system arrives at a steady state, the arrival rate should be equal to the departure rate.

Thus the balance equations
: <math>\mu P_1 = \lambda P_0</math>
: <math>\lambda P_0 + \mu P_2 = (\lambda + \mu) P_1</math>
: <math>\lambda P_{n-1} + \mu P_{n+1} = (\lambda + \mu) P_n</math>
imply
: <math>P_n = \frac{\lambda}{\mu} P_{n-1},\ n=1,2,\ldots</math>
The fact that <math>P_0 + P_1 + \cdots = 1</math> leads to the [[geometric distribution]] formula
: <math>P_n = (1 - \rho) \rho^n</math>
where <math>\rho = \frac{\lambda}{\mu} < 1</math>.
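
The geometric formula is easy to evaluate directly. The sketch below is an illustration under the stability assumption <math>\rho < 1</math>; the function names and the example rates are hypothetical. The mean number in the system, <math>\rho / (1 - \rho)</math>, is the mean of this geometric distribution.

```python
def mm1_state_probabilities(lam, mu, n_max):
    """Return [P_0, ..., P_n_max] with P_n = (1 - rho) * rho**n."""
    rho = lam / mu
    if not 0 <= rho < 1:
        raise ValueError("steady state requires 0 <= lambda / mu < 1")
    return [(1 - rho) * rho ** n for n in range(n_max + 1)]


def mm1_mean_in_system(lam, mu):
    """Mean of the geometric distribution above: rho / (1 - rho)."""
    rho = lam / mu
    if not 0 <= rho < 1:
        raise ValueError("steady state requires 0 <= lambda / mu < 1")
    return rho / (1 - rho)


# Hypothetical example: 2 arrivals and 4 service completions per unit time.
probs = mm1_state_probabilities(2.0, 4.0, n_max=3)
print(probs)                          # P_0 .. P_3
print(mm1_mean_in_system(2.0, 4.0))   # mean number of customers in the system
```

With <math>\lambda = 2</math> and <math>\mu = 4</math>, the server is idle half of the time (<math>P_0 = 0.5</math>) and each successive state is half as likely as the one before it.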

=== Simple two-equation queue ===

A common basic queueing system is attributed to [[Agner Krarup Erlang|Erlang]] and is a modification of [[Little's Law]]. Given an arrival rate ''λ'', a dropout rate ''σ'', and a departure rate ''μ'', length of the queue ''L'' is defined as:

: <math>L = \frac{\lambda - \sigma}{\mu}</math>.

Assuming an exponential distribution for the rates, the waiting time ''W'' can be defined as the proportion of arrivals that are served. This is equal to the exponential survival rate of those who do not drop out over the waiting period, giving:

: <math>\frac{\mu}{\lambda} = e^{-W{\mu}}</math>

The second equation is commonly rewritten as:

: <math>W = \frac{1}{\mu} \mathrm{ln}\frac{\lambda}{\mu}</math>

The two-stage one-box model is common in [[epidemiology]].<ref>{{Cite journal|last=Hernández-Suarez|first=Carlos|date=2010|title=An application of queuing theory to SIS and SEIS epidemic models|journal=Math. Biosci.|volume=7|issue=4|pages=809–823|doi=10.3934/mbe.2010.7.809|pmid=21077709|doi-access=free}}</ref>

== History==
{{anchor|Overview of the development of the theory}}<!--anchored with previous section title-->

In 1909, [[Agner Krarup Erlang]], a Danish engineer who worked for the Copenhagen Telephone Exchange, published the first paper on what would now be called queueing theory.<ref>{{cite web |url=http://pass.maths.org.uk/issue2/erlang/index.html |title=Agner Krarup Erlang (1878-1929) &#124; plus.maths.org |publisher=Pass.maths.org.uk |access-date=2013-04-22 |date=1997-04-30 |archive-date=2008-10-07 |archive-url=https://web.archive.org/web/20081007225944/http://pass.maths.org.uk/issue2/erlang/index.html |url-status=live }}</ref><ref>{{Cite journal | last1 = Asmussen | first1 = S. R. | last2 = Boxma | first2 = O. J. | author-link2 = Onno Boxma| doi = 10.1007/s11134-009-9151-8 | title = Editorial introduction | journal = [[Queueing Systems]] | volume = 63 | issue = 1–4 | pages = 1–2 | year = 2009 | s2cid = 45664707 }}</ref><ref>{{cite journal | author-link = Agner Krarup Erlang | first = Agner Krarup | last = Erlang
| title = The theory of probabilities and telephone conversations | journal = Nyt Tidsskrift for Matematik B | volume = 20 | pages = 33–39 | archive-url = https://web.archive.org/web/20111001212934/http://oldwww.com.dtu.dk/teletraffic/erlangbook/pps131-137.pdf | archive-date = 2011-10-01 | url = http://oldwww.com.dtu.dk/teletraffic/erlangbook/pps131-137.pdf | year = 1909}}</ref> He modeled the number of telephone calls arriving at an exchange by a [[Poisson process]] and solved the [[M/D/1 queue]] in 1917 and [[M/D/k queue|M/D/''k'' queue]]ing model in 1920.<ref name="century">{{Cite journal | last1 = Kingman | first1 = J. F. C. | author-link1 = John Kingman | title = The first Erlang century—and the next | journal = [[Queueing Systems]] | volume = 63 | issue = 1–4 | pages = 3–4 | year = 2009 | doi = 10.1007/s11134-009-9147-4| s2cid = 38588726 }}</ref> In Kendall's notation:

* M stands for "Markov" or "memoryless", and means arrivals occur according to a Poisson process
* D stands for "deterministic", and means jobs arriving at the queue require a fixed amount of service
* ''k'' describes the number of servers at the queueing node (''k'' = 1, 2, 3, ...)

If the node has more jobs than servers, then jobs will queue and wait for service.

The [[M/G/1 queue]] was solved by [[Felix Pollaczek]] in 1930,<ref>Pollaczek, F., Ueber eine Aufgabe der Wahrscheinlichkeitstheorie, Math. Z. 1930</ref> a solution later recast in probabilistic terms by [[Aleksandr Khinchin]] and now known as the [[Pollaczek–Khinchine formula]].<ref name="century" /><ref name="century1" />

After the 1940s, queueing theory became an area of research interest to mathematicians.<ref name="century1">{{Cite journal | last1 = Whittle | first1 = P. | author-link1 = Peter Whittle (mathematician)| doi = 10.1287/opre.50.1.227.17792 | title = Applied Probability in Great Britain | journal = [[Operations Research (journal)|Operations Research]]| volume = 50 | issue = 1 | pages = 227–239| year = 2002 | jstor = 3088474| doi-access = free }}</ref> In 1953, [[David George Kendall]] solved the GI/M/''k'' queue<ref>Kendall, D.G.:Stochastic processes occurring in the theory of queues and their analysis by the method of the imbedded Markov chain, Ann. Math. Stat. 1953</ref> and introduced the modern notation for queues, now known as [[Kendall's notation]]. In 1957, Pollaczek studied the GI/G/1 using an [[integral equation]].<ref>Pollaczek, F., Problèmes Stochastiques posés par le phénomène de formation d'une queue</ref> [[John Kingman]] gave a formula for the [[Mean sojourn time|mean waiting time]] in a [[G/G/1 queue]], now known as [[Kingman's formula]].<ref>{{Cite journal | last1 = Kingman | first1 = J. F. C. | author-link = John Kingman| doi = 10.1017/S0305004100036094 | author2 = <!-- (exclude bad crossref data) --> | last2 = Atiyah | title = The single server queue in heavy traffic | journal = [[Mathematical Proceedings of the Cambridge Philosophical Society]]| volume = 57 | issue = 4 | page = 902 | date=October 1961 | jstor = 2984229| bibcode = 1961PCPS...57..902K | s2cid = 62590290 }}</ref>

[[Leonard Kleinrock]] worked on the application of queueing theory to [[message switching]] in the early 1960s and [[packet switching]] in the early 1970s. His initial contribution to this field was his doctoral thesis at the [[Massachusetts Institute of Technology]] in 1962, published in book form in 1964. His theoretical work published in the early 1970s underpinned the use of packet switching in the [[ARPANET]], a forerunner to the Internet.

The [[matrix geometric method]] and [[matrix analytic method]]s have allowed queues with [[phase-type distribution|phase-type distributed]] inter-arrival and service time distributions to be considered.<ref>{{Cite journal | last1 = Ramaswami | first1 = V. | doi = 10.1080/15326348808807077 | title = A stable recursion for the steady state vector in markov chains of m/g/1 type | journal = Communications in Statistics. Stochastic Models | volume = 4 | pages = 183–188 | year = 1988 }}</ref>

Systems with coupled orbits are an important part of queueing theory as applied to wireless networks and signal processing.<ref>{{Cite book | last1 = Morozov | first1 = E. |chapter = Stability analysis of a multiclass retrial system with coupled orbit queues | doi = 10.1007/978-3-319-66583-2_6 | title = Proceedings of 14th European Workshop| series = Lecture Notes in Computer Science | volume = 17| pages = 85–98 | year = 2017 | doi-access = free|isbn=978-3-319-66582-5 }}</ref>

Modern-day applications of queueing theory concern, among other things, [[product development]], where (material) products have a spatiotemporal existence in the sense that they have a certain volume and a certain duration.<ref>{{cite journal |title=Simulation and queueing network modeling of single-product production campaigns |date=1992 |url=https://www.sciencedirect.com/science/article/abs/pii/0098135492800185 |doi=10.1016/0098-1354(92)80018-5 |last1=Carlson |first1=E.C. |last2=Felder |first2=R.M. |journal=Computers & Chemical Engineering |volume=16 |issue=7 |pages=707–718 }}</ref>

Problems such as determining performance metrics for the [[M/G/k queue|M/G/''k'' queue]] remain open.<ref name="century" /><ref name="century1" />

== Service disciplines ==
Various scheduling policies can be used at queueing nodes:

; [[FIFO (computing and electronics)|First in, first out]]: [[File:Fifo queue.png|thumb|First in first out (FIFO) queue example]] Also called ''first-come, first-served'' (FCFS),<ref name="Manuel">{{cite book|last1=Manuel|first1=Laguna|title=Business Process Modeling, Simulation and Design|date=2011|publisher=Pearson Education India|isbn=978-81-317-6135-9|page=178|url=https://books.google.com/books?id=d-V8c8YRJikC&q=%22First-come%2C+first-served%22+business&pg=PA178|access-date=6 October 2017|language=en}}</ref> this principle states that customers are served one at a time and that the customer that has been waiting the longest is served first.<ref name="penttinen">Penttinen A., ''Chapter 8 &ndash; Queueing Systems'', Lecture Notes: S-38.145 - Introduction to Teletraffic Theory.</ref>

; [[LIFO (computing)|Last in, first out]]: This principle also serves customers one at a time, but the customer with the shortest [[Mean sojourn time|waiting time]] will be served first.<ref name="penttinen"/> Also known as a [[Stack (data structure)|stack]].

; [[Processor sharing]]: Service capacity is shared equally between customers.<ref name="penttinen"/>

; Priority: Customers with high priority are served first.<ref name="penttinen"/> Priority queues can be of two types: ''non-preemptive'' (where a job in service cannot be interrupted) and ''preemptive'' (where a job in service can be interrupted by a higher-priority job). No work is lost in either model.<ref>{{Cite book | last1 = Harchol-Balter | first1 = M.|author1-link=Mor Harchol-Balter | chapter = Scheduling: Non-Preemptive, Size-Based Policies | doi = 10.1017/CBO9781139226424.039 | title = Performance Modeling and Design of Computer Systems | pages = 499–507 | year = 2012 | isbn = 978-1-139-22642-4 }}</ref>

; [[Shortest job first]]: The next job to be served is the one with the smallest size.<ref>{{cite book|author1=Andrew S. Tanenbaum|author2=Herbert Bos|title=Modern Operating Systems|url=https://books.google.com/books?id=9gqnngEACAAJ|year=2015|publisher=Pearson|isbn=978-0-13-359162-0}}</ref>

; Preemptive shortest job first: The next job to be served is the one with the smallest original size.<ref>{{Cite book | last1 = Harchol-Balter | first1 = M. |author1-link=Mor Harchol-Balter| chapter = Scheduling: Preemptive, Size-Based Policies | doi = 10.1017/CBO9781139226424.040 | title = Performance Modeling and Design of Computer Systems | pages = 508–517 | year = 2012 | isbn = 978-1-139-22642-4 }}</ref>

; [[Shortest remaining processing time]]: The next job to serve is the one with the smallest remaining processing requirement.<ref>{{Cite book | last1 = Harchol-Balter | first1 = M.|author1-link=Mor Harchol-Balter | chapter = Scheduling: SRPT and Fairness | doi = 10.1017/CBO9781139226424.041 | title = Performance Modeling and Design of Computer Systems | pages = 518–530 | year = 2012 | isbn = 978-1-139-22642-4 }}</ref>

; Service facility
* Single server: customers line up and there is only one server
* Several parallel servers (single queue): customers line up and there are several servers
* Several parallel servers (several queues): there are many counters and customers can decide which one to queue for

; Unreliable server

Server failures occur according to a stochastic (random) process (usually Poisson) and are followed by setup periods during which the server is unavailable. The interrupted customer remains in the service area until the server is fixed.<ref>{{Cite journal | last1 = Dimitriou | first1 = I. | title = A Multiclass Retrial System With Coupled Orbits And Service Interruptions: Verification of Stability Conditions | journal = Proceedings of FRUCT 24 | volume = 7 | pages = 75–82 | year = 2019}}</ref>

; Customer waiting behavior
* Balking: customers decide not to join the queue if it is too long
* Jockeying: customers switch between queues if they think they will get served faster by doing so
* Reneging: customers leave the queue if they have waited too long for service

Arriving customers not served (either due to the queue having no buffer, or due to balking or reneging by the customer) are also known as ''dropouts''. The average rate of dropouts is a significant parameter describing a queue.

== Queueing networks ==

Queue networks are systems in which multiple queues are connected by ''customer routing''. When a customer is serviced at one node, it can join another node and queue for service, or leave the network.

For networks of ''m'' nodes, the state of the system can be described by an ''m''–dimensional vector (''x''<sub>1</sub>, ''x''<sub>2</sub>, ..., ''x''<sub>''m''</sub>) where ''x''<sub>''i''</sub> represents the number of customers at node ''i''.

The simplest non-trivial networks of queues are called [[Jackson network|tandem queues]].<ref>{{Cite web |url=http://www.stats.ox.ac.uk/~winkel/bs3a07l13-14.pdf#page=4 |title=Archived copy |access-date=2018-08-02 |archive-date=2017-03-29 |archive-url=https://web.archive.org/web/20170329085928/http://www.stats.ox.ac.uk/~winkel/bs3a07l13-14.pdf#page=4 |url-status=live }}</ref> The first significant results in this area were [[Jackson network]]s,<ref>{{Cite journal | last1 = Jackson | first1 = J. R. | author-link = James R. Jackson| title = Networks of Waiting Lines | doi = 10.1287/opre.5.4.518 | journal = Operations Research | volume = 5 | issue = 4 | pages = 518–521 | year = 1957 | jstor = 167249}}</ref><ref name="jackson">{{cite journal|title=Jobshop-like Queueing Systems|first=James R.|last=Jackson|journal=[[Management Science: A Journal of the Institute for Operations Research and the Management Sciences|Management Science]]|volume=10|number=1|date=Oct 1963|pages=131–142|doi=10.1287/mnsc.1040.0268|jstor=2627213}}</ref> for which an efficient [[product-form stationary distribution]] exists and the [[mean value analysis]]<ref>{{Cite journal | last1 = Reiser | first1 = M.| last2 = Lavenberg | first2 = S. S. | doi = 10.1145/322186.322195 | title = Mean-Value Analysis of Closed Multichain Queuing Networks | journal = [[Journal of the ACM]]| volume = 27 | issue = 2 | page = 313 | year = 1980 | s2cid = 8694947| doi-access = free }}</ref> (which allows average metrics such as throughput and sojourn times) can be computed.<ref>{{Cite journal | last1 = Van Dijk | first1 = N. M. 
| title = On the arrival theorem for communication networks | doi = 10.1016/0169-7552(93)90073-D | journal = Computer Networks and ISDN Systems | volume = 25 | issue = 10 | pages = 1135–2013 | year = 1993 | s2cid = 45218280 | url = https://research.vu.nl/ws/files/73611045/Scanjob%20199100081 | access-date = 2019-09-24 | archive-date = 2019-09-24 | archive-url = https://web.archive.org/web/20190924062816/https://research.vu.nl/ws/files/73611045/Scanjob%2520199100081 | url-status = live }}</ref> If the total number of customers in the network remains constant, the network is called a ''closed network'' and has been shown to also have a product–form stationary distribution by the [[Gordon–Newell theorem]].<ref>{{Cite journal | last1 = Gordon | first1 = W. J. | last2 = Newell | first2 = G. F. | author-link2 = Gordon F. Newell| doi = 10.1287/opre.15.2.254 | jstor = 168557| title = Closed Queuing Systems with Exponential Servers | journal = [[Operations Research (journal)|Operations Research]]| volume = 15 | issue = 2 | page = 254 | year = 1967 }}</ref> This result was extended to the [[BCMP network]],<ref>{{Cite journal | last1 = Baskett | first1 = F. | last2 = Chandy | first2 = K. Mani | author2-link = K. Mani Chandy | last3 = Muntz | first3 = R.R. | last4 = Palacios | first4 = F.G. | title = Open, closed and mixed networks of queues with different classes of customers | journal = Journal of the ACM | volume = 22 | issue = 2 | pages = 248&ndash;260 | year = 1975 | doi = 10.1145/321879.321887 | s2cid = 15204199 | doi-access = free }}</ref> where a network with very general service time, regimes, and customer routing is shown to also exhibit a product–form stationary distribution. The [[normalizing constant]] can be calculated with the [[Buzen's algorithm]], proposed in 1973.<ref name="buzen-1973">{{Cite journal | last1 = Buzen | first1 = J. P. | author-link = Jeffrey P. 
Buzen | title = Computational algorithms for closed queueing networks with exponential servers | doi = 10.1145/362342.362345 | url = http://www-unix.ecs.umass.edu/~krishna/ece673/buzen.pdf | journal = Communications of the ACM | volume = 16 | issue = 9 | pages = 527–531 | year = 1973 | s2cid = 10702 | access-date = 2015-09-01 | archive-date = 2016-05-13 | archive-url = https://web.archive.org/web/20160513192804/http://www-unix.ecs.umass.edu/~krishna/ece673/buzen.pdf | url-status = live }}</ref>

Networks of customers have also been investigated, such as [[Kelly network]]s, where customers of different classes experience different priority levels at different service nodes.<ref>{{Cite journal | last1 = Kelly | first1 = F. P. | author-link1 = Frank Kelly (mathematician)| title = Networks of Queues with Customers of Different Types | journal = Journal of Applied Probability | volume = 12 | issue = 3 | pages = 542–554 | doi = 10.2307/3212869 | jstor = 3212869| year = 1975 | s2cid = 51917794 }}</ref> Another type of network is the [[G-networks|G-network]], first proposed by [[Erol Gelenbe]] in 1993:<ref>{{cite journal | doi = 10.2307/3214781 | title = G-Networks with Triggered Customer Movement | first = Erol | last = Gelenbe | author-link = Erol Gelenbe | journal = Journal of Applied Probability | volume = 30 | issue = 3 | date = Sep 1993 | pages = 742–748 | jstor = 3214781 | s2cid = 121673725 }}</ref> these networks do not assume exponential time distributions like the classic Jackson network.

=== Routing algorithms ===
{{See also|Stochastic scheduling}}
In discrete-time networks where there is a constraint on which service nodes can be active at any time, the max-weight scheduling algorithm chooses a service policy to give optimal throughput in the case that each job visits only a single-person service node.<ref name="Manuel" /> In the more general case where jobs can visit more than one node, [[backpressure routing]] gives optimal throughput. A [[network scheduler]] must choose a [[queueing algorithm]], which affects the characteristics of the larger network.<ref>{{Cite journal |last=Newell |first=G. F. |date=1982 |title=Applications of Queueing Theory |url=https://doi.org/10.1007/978-94-009-5970-5 |journal=SpringerLink |language=en |doi=10.1007/978-94-009-5970-5|isbn=978-94-009-5972-9 }}</ref>

=== Mean-field limits ===

[[Mean-field model]]s consider the limiting behaviour of the [[empirical measure]] (proportion of queues in different states) as the number of queues ''m'' approaches infinity. The impact of other queues on any given queue in the network is approximated by a differential equation. The deterministic model converges to the same stationary distribution as the original model.<ref>{{Cite book | last1 = Bobbio | first1 = A. | last2 = Gribaudo | first2 = M. | last3 = Telek | first3 = M. S. | doi = 10.1109/QEST.2008.47 | chapter = Analysis of Large Scale Interacting Systems by Mean Field Method | title = 2008 Fifth International Conference on Quantitative Evaluation of Systems | page = 215 | year = 2008 | isbn = 978-0-7695-3360-5 | s2cid = 2714909 }}</ref>

=== Heavy traffic/diffusion approximations ===
{{Main|Heavy traffic approximation}}
In a system with high occupancy rates (utilisation near 1), a heavy traffic approximation can be used to approximate the queueing length process by a [[reflected Brownian motion]],<ref>{{Cite journal | last1 = Chen | first1 = H. | last2 = Whitt | first2 = W. | doi = 10.1007/BF01149260 | title = Diffusion approximations for open queueing networks with service interruptions | journal = [[Queueing Systems]]| volume = 13 | issue = 4 | page = 335 | year = 1993 | s2cid = 1180930 }}</ref> [[Ornstein–Uhlenbeck process]], or more general [[diffusion process]].<ref>{{Cite journal | last1 = Yamada | first1 = K. | title = Diffusion Approximation for Open State-Dependent Queueing Networks in the Heavy Traffic Situation | doi = 10.1214/aoap/1177004602 | journal = The Annals of Applied Probability | volume = 5 | issue = 4 | pages = 958–982 | year = 1995 | jstor = 2245101| doi-access = free }}</ref> The number of dimensions of the Brownian process is equal to the number of queueing nodes, with the diffusion restricted to the non-negative [[orthant]].

=== Fluid limits ===
{{main|Fluid limit}}
Fluid models are continuous deterministic analogs of queueing networks obtained by taking the limit when the process is scaled in time and space, allowing heterogeneous objects. This scaled trajectory converges to a deterministic equation which allows the stability of the system to be proven. It is known that a queueing network can be stable but have an unstable fluid limit.<ref>{{Cite journal | last1 = Bramson | first1 = M. | title = A stable queueing network with unstable fluid model | doi = 10.1214/aoap/1029962815 | journal = The Annals of Applied Probability | volume = 9 | issue = 3 | pages = 818–853 | year = 1999 | jstor = 2667284| doi-access = free }}</ref>

=== Queueing applications ===
Queueing theory is widely applied in computer science and information technology. In networking, for instance, queues are integral to routers and switches, where packets wait for transmission. Applying queueing theory lets designers tune these systems for responsive performance and efficient use of resources.

Queueing theory is also relevant to everyday experiences such as waiting in line at a supermarket or for public transportation, where it can guide changes that improve user satisfaction. A queue that some view as an inconvenience may in fact be the most effective way to provide the service.

More broadly, queueing theory is rooted in applied mathematics and computer science and is used to understand and optimize the efficiency of systems in which queues form, including traffic systems, computer networks, telecommunications, and service operations.

Central to the theory are the arrival process and the service process. The arrival process describes how entities join the queue over time and is often modeled by a stochastic process such as a Poisson process. The efficiency of a queueing system is gauged through performance metrics such as the average queue length, the average wait time, and the system throughput, which guide decisions aimed at improving performance and reducing wait times.
References:
Gross, D., & Harris, C. M. (1998). Fundamentals of Queueing Theory. John Wiley & Sons.
Kleinrock, L. (1976). Queueing Systems: Volume I - Theory. Wiley.
Cooper, B. F., & Mitrani, I. (1985). Queueing Networks: A Fundamental Approach. John Wiley & Sons

== See also ==
{{cols|colwidth=21em}}
* [[Ehrenfest model]]
* [[Erlang unit]]
* [[Jackson network]]
* [[Line management]]
* [[Little's law]]
* [[Network simulation]]
* [[Markovian arrival processes]]
* [[Project production management]]
* [[Queue area]]
* [[Queueing delay]]
* [[Queueing model]]
* [[Queue management system]]
* [[Random early detection]] (RED)
* [[Queuing Rule of Thumb]]
* [[Renewal theory]]
* [[Throughput]]
* [[Scheduling (computing)]]
</div>
* [[Traffic jam]]
* [[Traffic generation model]]
* [[Flow network]]
{{colend}}


== References ==
{{Reflist|30em}}


== Further reading ==
* {{cite book | first=Donald | last=Gross |author2=Carl M. Harris | title=Fundamentals of Queueing Theory | publisher=Wiley | year=1998 | isbn=978-0-471-32812-4}} [https://books.google.com/books?id=K3lQGeCtAJgC Online]
* {{cite book | last=Zukerman | first=Moshe | title=Introduction to Queueing Theory and Stochastic Teletraffic Models | year=2013 | arxiv=1307.2968 | url=http://www.ee.cityu.edu.hk/~zukerman/classnotes.pdf}}
* {{cite book |last=Deitel |first=Harvey M. |title=An introduction to operating systems |orig-date=1982 |url=https://archive.org/details/introductiontoopdeit00deit/page/673 |edition=revisited first |year=1984 |publisher=Addison-Wesley |isbn=978-0-201-14502-1 |page=[https://archive.org/details/introductiontoopdeit00deit/page/673 673] }} chap.15, pp.&nbsp;380–412
* {{cite book | first=Erol | last=Gelenbe |author2=Isi Mitrani | title=Analysis and Synthesis of Computer Systems | publisher=World Scientific 2nd Edition | year=2010| isbn=978-1-908978-42-4| url=https://www.researchgate.net/publication/345903225}}
* {{cite book | last= Newell | first=Gordon F. | title= Applications of Queueing Theory | publisher = Chapman and Hall | date= 1 June 1971}}
* Leonard Kleinrock, [http://www.lk.cs.ucla.edu/bibliography-public_reports.html Information Flow in Large Communication Nets], (MIT, Cambridge, May 31, 1961) Proposal for a Ph.D. Thesis
* Leonard Kleinrock. ''Information Flow in Large Communication Nets'' (RLE Quarterly Progress Report, July 1961)
* Leonard Kleinrock. ''Communication Nets: Stochastic Message Flow and Delay'' (McGraw-Hill, New York, 1964)
*{{cite book |first=Leonard |last=Kleinrock |author-link=Leonard Kleinrock |title=Queueing Systems: Volume I – Theory |url=https://archive.org/details/queueingsystems02klei |url-access=registration |publisher=Wiley Interscience |location=New York |date=2 January 1975 |pages=[https://archive.org/details/queueingsystems02klei/page/417 417] |isbn=978-0-471-49110-1}}
*{{cite book |first=Leonard |last=Kleinrock |author-link=Leonard Kleinrock |title=Queueing Systems: Volume II – Computer Applications |publisher=Wiley Interscience |location=New York |date=22 April 1976 |pages=[https://archive.org/details/queueingsystems00klei/page/576 576] |isbn=978-0-471-49111-8 |url=https://archive.org/details/queueingsystems00klei/page/576 }}
*{{cite book | last=Lazowska | first=Edward D. | author2=John Zahorjan | author3=G. Scott Graham | author4=Kenneth C. Sevcik | publisher=Prentice-Hall, Inc | year=1984 | title=Quantitative System Performance: Computer System Analysis Using Queueing Network Models | url=https://archive.org/details/quantitativesyst00lazo | isbn=978-0-13-746975-8 }}
*{{cite book|author1=Jon Kleinberg|author2=Éva Tardos|title=Algorithm Design|url=https://books.google.com/books?id=ROiUngEACAAJ|date=30 June 2013|publisher=Pearson|isbn=978-1-292-02394-6}}


== External links ==
{{Wiktionary|queueing|queuing}}
* [http://www.shmula.com/series-on-queueing-theory/ Shmula's Queueing Theory Page]
* [http://people.revoledu.com/kardi/tutorial/Queuing/index.html Teknomo's Queueing theory tutorial and calculators]
* [http://www.eventhelix.com/RealtimeMantra/CongestionControl/queueing_theory.htm Queueing Theory Basics]
* [http://www.netlab.tkk.fi/opetus/s383143/kalvot/english.shtml Virtamo's Queueing Theory Course]
* [http://web2.uwindsor.ca/math/hlynka/queue.html Myron Hlynka's Queueing Theory Page]
[[Category:Stochastic processes]]
* [http://line-solver.sf.net LINE: a general-purpose engine to solve queueing models]
[[Category:Production and manufacturing]]

[[Category:Services management and marketing]]

{{Queueing theory}}

{{Authority control}}

[[Category:Queueing theory| ]]
[[Category:Production planning]]
[[Category:Customer experience]]
[[Category:Operations research]]
[[Category:Queueing theory|*]]
[[Category:Formal sciences]]
[[Category:Rationing and licensing]]
[[Category:Rationing]]
[[Category:Network performance]]
[[Category:Markov models]]

[[de:Warteschlangentheorie]]
[[es:Teoría de colas]]
[[fr:Théorie des files d'attente]]
[[it:Teoria delle code]]
[[nl:Wachtrijtheorie]]
[[ja:待ち行列理論]]
[[pl:Teoria kolejek]]
[[ru:Теория массового обслуживания]]
[[zh:等候理論]]

Latest revision as of 15:40, 23 September 2024

Queue networks are systems in which single queues are connected by a routing network. In this image, servers are represented by circles, queues by a series of rectangles and the routing network by arrows. In the study of queue networks one typically tries to obtain the equilibrium distribution of the network, although in many applications the study of the transient state is fundamental.

Queueing theory is the mathematical study of waiting lines, or queues.[1] A queueing model is constructed so that queue lengths and waiting time can be predicted.[1] Queueing theory is generally considered a branch of operations research because the results are often used when making business decisions about the resources needed to provide a service.

Queueing theory has its origins in research by Agner Krarup Erlang, who created models to describe the system of incoming calls at the Copenhagen Telephone Exchange Company.[1] These ideas were seminal to the field of teletraffic engineering and have since seen applications in telecommunications, traffic engineering, computing,[2] project management, and particularly industrial engineering, where they are applied in the design of factories, shops, offices, and hospitals.[3][4]

Spelling

The spelling "queueing" over "queuing" is typically encountered in the academic research field. In fact, one of the flagship journals of the field is Queueing Systems.

Description

Queueing theory is one of the major areas of study in the discipline of management science. Through management science, businesses are able to solve a variety of problems using different scientific and mathematical approaches. Queueing analysis is the probabilistic analysis of waiting lines, and thus the results, also referred to as the operating characteristics, are probabilistic rather than deterministic.[5] The operating characteristics that queueing models compute are the probability that n customers are in the queueing system, the average number of customers in the queueing system, the average number of customers in the waiting line, the average time spent by a customer in the total queueing system, the average time spent by a customer in the waiting line, and the probability that the server is busy or idle.[5] The overall goal of queueing analysis is to compute these characteristics for the current system and then test several alternatives that could lead to improvement. Comparing the operating characteristics of the current system with those of the alternatives allows managers to see the pros and cons of each option, and so supports the final decision by showing ways to increase savings, reduce waiting time, or improve efficiency. The main queueing models that can be used are the single-server waiting line system and the multiple-server waiting line system, which are discussed further below. These models can be further differentiated depending on whether service times are constant or undefined, the queue length is finite, the calling population is finite, etc.[5]

Single queueing nodes

A queue or queueing node can be thought of as nearly a black box. Jobs (also called customers or requests, depending on the field) arrive at the queue, possibly wait some time, take some time being processed, and then depart from the queue.

A black box. Jobs arrive to, and depart from, the queue.

However, the queueing node is not quite a pure black box since some information is needed about the inside of the queueing node. The queue has one or more servers which can each be paired with an arriving job. When the job is completed and departs, that server will again be free to be paired with another arriving job.

A queueing node with 3 servers. Server a is idle, and thus an arrival is given to it to process. Server b is currently busy and will take some time before it can complete service of its job. Server c has just completed service of a job and thus will be next to receive an arriving job.

An analogy often used is that of the cashier at a supermarket. (There are other models, but this one is commonly encountered in the literature.) Customers arrive, are processed by the cashier, and depart. Each cashier processes one customer at a time, and hence this is a queueing node with only one server. A setting where a customer leaves immediately if the cashier is busy when the customer arrives is referred to as a queue with no buffer (or no waiting area). A setting with a waiting zone for up to n customers is called a queue with a buffer of size n.

Birth-death process

The behaviour of a single queue (also called a queueing node) can be described by a birth–death process, which describes the arrivals and departures from the queue, along with the number of jobs currently in the system. If k denotes the number of jobs in the system (either being serviced or waiting if the queue has a buffer of waiting jobs), then an arrival increases k by 1 and a departure decreases k by 1.

The system transitions between values of k by "births" and "deaths", which occur at the arrival rates $\lambda_i$ and the departure rates $\mu_i$ for each job $i$. For a queue, these rates are generally considered not to vary with the number of jobs in the queue, so a single average rate of arrivals/departures per unit time is assumed. Under this assumption, this process has an arrival rate of $\lambda$ and a departure rate of $\mu$.

A birth–death process. The values in the circles represent the state of the system, which evolves based on arrival rates λi and departure rates μi.
A queue with 1 server, arrival rate λ and departure rate μ

Balance equations

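The steady state equations for the birth-and-death process, known as the balance equations, are as follows. Here $P_n$ denotes the steady state probability to be in state n.

$$\mu_1 P_1 = \lambda_0 P_0$$

$$\lambda_0 P_0 + \mu_2 P_2 = (\lambda_1 + \mu_1) P_1$$

$$\lambda_{n-1} P_{n-1} + \mu_{n+1} P_{n+1} = (\lambda_n + \mu_n) P_n$$

The first two equations imply

$$P_1 = \frac{\lambda_0}{\mu_1} P_0$$

and

$$P_2 = \frac{\lambda_1}{\mu_2} P_1 + \frac{1}{\mu_2} (\mu_1 P_1 - \lambda_0 P_0) = \frac{\lambda_1}{\mu_2} P_1 = \frac{\lambda_1 \lambda_0}{\mu_2 \mu_1} P_0.$$

By mathematical induction,

$$P_n = \frac{\lambda_{n-1} \lambda_{n-2} \cdots \lambda_0}{\mu_n \mu_{n-1} \cdots \mu_1} P_0 = P_0 \prod_{i=0}^{n-1} \frac{\lambda_i}{\mu_{i+1}}.$$

The condition $\sum_{n=0}^{\infty} P_n = P_0 + P_0 \sum_{n=1}^{\infty} \prod_{i=0}^{n-1} \frac{\lambda_i}{\mu_{i+1}} = 1$ leads to

$$P_0 = \frac{1}{1 + \sum_{n=1}^{\infty} \prod_{i=0}^{n-1} \frac{\lambda_i}{\mu_{i+1}}},$$

which, together with the equation for $P_n$ $(n \geq 1)$, fully describes the required steady state probabilities.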
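As a minimal sketch of how these probabilities can be evaluated, the Python function below handles a birth–death chain truncated to finitely many states. It uses the standard result that the probability of state n is proportional to the product of λ_i/μ_(i+1) over i < n. The function name, the truncation to a finite state space, and the example rates are assumptions made for illustration.

```python
def birth_death_steady_state(arrival_rates, departure_rates):
    """Steady state probabilities of a finite birth-death chain.

    arrival_rates[i]   is lambda_i, the rate from state i to state i + 1.
    departure_rates[i] is mu_(i+1), the rate from state i + 1 to state i.
    Both lists have length N, which gives the states 0..N.
    """
    if len(arrival_rates) != len(departure_rates):
        raise ValueError("need one departure rate per arrival rate")
    # Unnormalised weights: w_0 = 1 and w_n = w_(n-1) * lambda_(n-1) / mu_n.
    weights = [1.0]
    for lam, mu in zip(arrival_rates, departure_rates):
        weights.append(weights[-1] * lam / mu)
    total = sum(weights)  # equals 1 / P_0
    return [w / total for w in weights]


# Hypothetical example: lambda = 1 and mu = 2 in every state, at most 3 jobs.
probs = birth_death_steady_state([1.0, 1.0, 1.0], [2.0, 2.0, 2.0])
print(probs)
```

For an unbounded queue the sum in the normalisation only converges when the products shrink fast enough, so a truncated chain of this kind is at best an approximation.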

Kendall's notation

Single queueing nodes are usually described using Kendall's notation in the form A/S/c where A describes the distribution of durations between each arrival to the queue, S the distribution of service times for jobs, and c the number of servers at the node.[6][7] For an example of the notation, the M/M/1 queue is a simple model where a single server serves jobs that arrive according to a Poisson process (where inter-arrival durations are exponentially distributed) and have exponentially distributed service times (the M denotes a Markov process). In an M/G/1 queue, the G stands for "general" and indicates an arbitrary probability distribution for service times.
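The three-field form A/S/c can be read mechanically. The sketch below is one possible way to do so in Python; the names `parse_kendall`, `KendallQueue` and `DISTRIBUTION_CODES` are hypothetical, and the code table covers only the letters that appear in this article (M, D, G and GI), not the full notation.

```python
from typing import NamedTuple

# Distribution codes mentioned in this article; the full notation has more.
DISTRIBUTION_CODES = {
    "M": "Markovian (exponential)",
    "D": "deterministic",
    "G": "general",
    "GI": "general independent",
}


class KendallQueue(NamedTuple):
    arrivals: str  # meaning of the A field
    service: str   # meaning of the S field
    servers: int   # the c field


def parse_kendall(label: str) -> KendallQueue:
    """Split a three-field A/S/c label such as 'M/M/1' into its parts."""
    fields = label.strip().split("/")
    if len(fields) != 3:
        raise ValueError(f"expected A/S/c, got {label!r}")
    arrivals, service, servers = fields
    for code in (arrivals, service):
        if code not in DISTRIBUTION_CODES:
            raise ValueError(f"unknown distribution code {code!r}")
    count = int(servers)  # raises ValueError for symbolic counts such as 'k'
    if count < 1:
        raise ValueError("a queueing node needs at least one server")
    return KendallQueue(
        DISTRIBUTION_CODES[arrivals], DISTRIBUTION_CODES[service], count
    )


print(parse_kendall("M/G/1"))
```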

Example analysis of an M/M/1 queue

Consider a queue with one server and the following characteristics:

  • $\lambda$: the arrival rate (the reciprocal of the expected time between each customer arriving, e.g. 10 customers per second)
  • $\mu$: the reciprocal of the mean service time (the expected number of consecutive service completions per the same unit time, e.g. per 30 seconds)
  • n: the parameter characterizing the number of customers in the system
  • $P_n$: the probability of there being n customers in the system in steady state

Further, let $E_n$ represent the number of times the system enters state n, and $L_n$ represent the number of times the system leaves state n. Then $|E_n - L_n| \in \{0, 1\}$ for all n. That is, the number of times the system leaves a state differs by at most 1 from the number of times it enters that state, since it will either return into that state at some time in the future ($E_n = L_n$) or not ($|E_n - L_n| = 1$).

When the system arrives at a steady state, the arrival rate should be equal to the departure rate.

Thus the balance equations

$$\mu P_1 = \lambda P_0$$

$$\lambda P_0 + \mu P_2 = (\lambda + \mu) P_1$$

$$\lambda P_{n-1} + \mu P_{n+1} = (\lambda + \mu) P_n$$

imply

$$P_n = \frac{\lambda}{\mu} P_{n-1}, \quad n = 1, 2, \ldots$$

The fact that $P_0 + P_1 + \cdots = 1$ leads to the geometric distribution formula

$$P_n = (1 - \rho) \rho^n$$

where $\rho = \frac{\lambda}{\mu} < 1$.
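The geometric form is easy to evaluate numerically. The following Python sketch assumes the standard M/M/1 result that the probability of n customers is (1 − ρ)ρ^n with ρ = λ/μ < 1, and uses the mean of that geometric distribution, ρ/(1 − ρ), as the average number of customers in the system. The function names and the example rates are illustrative assumptions.

```python
def mm1_distribution(arrival_rate, service_rate, max_n):
    """Return [P_0, ..., P_max_n] for an M/M/1 queue in steady state."""
    rho = arrival_rate / service_rate
    if not 0 <= rho < 1:
        raise ValueError("steady state requires arrival_rate < service_rate")
    return [(1 - rho) * rho ** n for n in range(max_n + 1)]


def mm1_mean_in_system(arrival_rate, service_rate):
    """Mean number of customers in the system: rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if not 0 <= rho < 1:
        raise ValueError("steady state requires arrival_rate < service_rate")
    return rho / (1 - rho)


# Hypothetical rates: 2 arrivals and 5 service completions per unit time.
print(mm1_distribution(2.0, 5.0, 3))
print(mm1_mean_in_system(2.0, 5.0))
```

As ρ approaches 1 the mean grows without bound, which is why the condition ρ < 1 is checked before anything is computed.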

Simple two-equation queue

A common basic queueing system is attributed to Erlang and is a modification of Little's Law. Given an arrival rate λ, a dropout rate σ, and a departure rate μ, the length of the queue L is defined as:

$$L = \frac{\lambda - \sigma}{\mu}.$$

Assuming an exponential distribution for the rates, the waiting time W can be defined as the proportion of arrivals that are served. This is equal to the exponential survival rate of those who do not drop out over the waiting period, giving:

$$\frac{\mu}{\lambda} = e^{-W\mu}$$

The second equation is commonly rewritten as:

$$W = \frac{1}{\mu} \ln \frac{\lambda}{\mu}$$

The two-stage one-box model is common in epidemiology.[8]

History

In 1909, Agner Krarup Erlang, a Danish engineer who worked for the Copenhagen Telephone Exchange, published the first paper on what would now be called queueing theory.[9][10][11] He modeled the number of telephone calls arriving at an exchange by a Poisson process and solved the M/D/1 queue in 1917 and M/D/k queueing model in 1920.[12] In Kendall's notation:

  • M stands for "Markov" or "memoryless", and means arrivals occur according to a Poisson process
  • D stands for "deterministic", and means jobs arriving at the queue require a fixed amount of service
  • k describes the number of servers at the queueing node (k = 1, 2, 3, ...)

If the node has more jobs than servers, then jobs will queue and wait for service.

The M/G/1 queue was solved by Felix Pollaczek in 1930,[13] a solution later recast in probabilistic terms by Aleksandr Khinchin and now known as the Pollaczek–Khinchine formula.[12][14]

After the 1940s, queueing theory became an area of research interest to mathematicians.[14] In 1953, David George Kendall solved the GI/M/k queue[15] and introduced the modern notation for queues, now known as Kendall's notation. In 1957, Pollaczek studied the GI/G/1 using an integral equation.[16] John Kingman gave a formula for the mean waiting time in a G/G/1 queue, now known as Kingman's formula.[17]

Leonard Kleinrock worked on the application of queueing theory to message switching in the early 1960s and packet switching in the early 1970s. His initial contribution to this field was his doctoral thesis at the Massachusetts Institute of Technology in 1962, published in book form in 1964. His theoretical work published in the early 1970s underpinned the use of packet switching in the ARPANET, a forerunner to the Internet.

The matrix geometric method and matrix analytic methods have allowed queues with phase-type distributed inter-arrival and service time distributions to be considered.[18]

Systems with coupled orbits are an important part of queueing theory in its application to wireless networks and signal processing.[19]

Modern applications of queueing theory include, among other things, product development, where (material) products have a spatiotemporal existence in the sense that they have a certain volume and a certain duration.[20]

Some problems, such as determining performance metrics for the M/G/k queue, remain open.[12][14]

Service disciplines

Various scheduling policies can be used at queueing nodes:

First in, first out
First in first out (FIFO) queue example
Also called first-come, first-served (FCFS),[21] this principle states that customers are served one at a time and that the customer that has been waiting the longest is served first.[22]
Last in, first out
This principle also serves customers one at a time, but the customer with the shortest waiting time will be served first.[22] Also known as a stack.
Processor sharing
Service capacity is shared equally between customers.[22]
Priority
Customers with high priority are served first.[22] Priority queues can be of two types: non-preemptive (where a job in service cannot be interrupted) and preemptive (where a job in service can be interrupted by a higher-priority job). No work is lost in either model.[23]
Shortest job first
The next job to be served is the one with the smallest size.[24]
Preemptive shortest job first
The next job to be served is the one with the smallest original size.[25]
Shortest remaining processing time
The next job to serve is the one with the smallest remaining processing requirement.[26]
Service facility
  • Single server: customers line up and there is only one server
  • Several parallel servers (single queue): customers line up and there are several servers
  • Several parallel servers (several queues): there are many counters and customers can decide which one to queue for
Unreliable server

Server failures occur according to a stochastic (random) process (usually Poisson) and are followed by setup periods during which the server is unavailable. The interrupted customer remains in the service area until the server is fixed.[27]

Customer waiting behavior
  • Balking: customers decide not to join the queue if it is too long
  • Jockeying: customers switch between queues if they think they will get served faster by doing so
  • Reneging: customers leave the queue if they have waited too long for service

Arriving customers not served (either due to the queue having no buffer, or due to balking or reneging by the customer) are also known as dropouts. The average rate of dropouts is a significant parameter describing a queue.
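As a rough illustration of how the choice of discipline changes waiting times, the sketch below simulates a single non-preemptive server under first in, first out, last in, first out, and shortest job first. The function name `mean_wait`, the policy labels, and the example workload are illustrative assumptions, not taken from the cited sources; processor sharing, preemption, balking and reneging are deliberately left out.

```python
import random

def mean_wait(policy, arrivals, sizes):
    """Mean time jobs spend waiting before service starts on one
    non-preemptive server. `arrivals` must be sorted in increasing order;
    `sizes[j]` is the service time of job j. (Hypothetical helper.)"""
    choose = {
        "FIFO": lambda w: min(w, key=lambda j: arrivals[j]),  # longest waiting
        "LIFO": lambda w: max(w, key=lambda j: arrivals[j]),  # most recent arrival
        "SJF": lambda w: min(w, key=lambda j: sizes[j]),      # smallest job
    }[policy]
    n = len(arrivals)
    waiting = []       # indices of jobs that have arrived but not started
    clock = 0.0
    nxt = 0            # index of the next job that has not yet arrived
    total_wait = 0.0
    for _ in range(n):
        # Server idle and nobody waiting: jump ahead to the next arrival.
        if not waiting and arrivals[nxt] > clock:
            clock = arrivals[nxt]
        # Admit every job that has arrived by the current time.
        while nxt < n and arrivals[nxt] <= clock:
            waiting.append(nxt)
            nxt += 1
        job = choose(waiting)
        waiting.remove(job)
        total_wait += clock - arrivals[job]
        clock += sizes[job]   # serve the job to completion (no preemption)
    return total_wait / n

if __name__ == "__main__":
    # Assumed workload: Poisson arrivals (rate 0.8), exponential sizes (mean 1).
    rng = random.Random(0)
    t, arrivals, sizes = 0.0, [], []
    for _ in range(10_000):
        t += rng.expovariate(0.8)
        arrivals.append(t)
        sizes.append(rng.expovariate(1.0))
    for policy in ("FIFO", "LIFO", "SJF"):
        print(policy, round(mean_wait(policy, arrivals, sizes), 2))
```

On a small hand-checkable workload (arrivals at times 0, 1, 2, 3 with sizes 5, 4, 1, 2) the mean waits are 4.5 for FIFO, 3.5 for LIFO and 3.25 for SJF, which shows how serving small jobs first tends to reduce the average wait.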

Queueing networks


Queue networks are systems in which multiple queues are connected by customer routing. After being served at one node, a customer can either join another node and queue for service or leave the network.

For networks of m nodes, the state of the system can be described by an m-dimensional vector (x1, x2, ..., xm) where xi represents the number of customers at node i.

The simplest non-trivial networks of queues are called tandem queues.[28] The first significant results in this area were Jackson networks,[29][30] for which an efficient product-form stationary distribution exists and to which mean value analysis[31] (which allows average metrics such as throughput and sojourn times to be computed) can be applied.[32] If the total number of customers in the network remains constant, the network is called a closed network and has been shown to also have a product-form stationary distribution by the Gordon–Newell theorem.[33] This result was extended to the BCMP network,[34] where a network with very general service times, regimes, and customer routing is shown to also exhibit a product-form stationary distribution. The normalizing constant can be calculated with Buzen's algorithm, proposed in 1973.[35]
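The normalizing constant of a closed product-form network can be computed with a short convolution recursion. The following is a minimal sketch of that idea, assuming every node has a single exponential server with a fixed rate and that `x[i]` is the relative utilisation of node i (visit ratio divided by service rate). The names `buzen` and `brute_force` are illustrative, and the brute-force enumeration is included only as a cross-check.

```python
from itertools import product

def buzen(x, n_customers):
    """Return [G(0), ..., G(N)] for a closed network whose nodes have
    relative utilisations x[0..M-1] (single fixed-rate servers assumed)."""
    g = [1.0] + [0.0] * n_customers
    for xi in x:                               # add one node at a time
        for n in range(1, n_customers + 1):
            g[n] += xi * g[n - 1]              # G(n, m) = G(n, m-1) + x_m * G(n-1, m)
    return g

def brute_force(x, n_customers):
    """Sum prod(x_i ** n_i) over all states with n_1 + ... + n_M = N."""
    total = 0.0
    for state in product(range(n_customers + 1), repeat=len(x)):
        if sum(state) == n_customers:
            term = 1.0
            for xi, ni in zip(x, state):
                term *= xi ** ni
            total += term
    return total

if __name__ == "__main__":
    x = [1.0, 0.5, 2.0]   # assumed example values
    g = buzen(x, 4)
    print(g[4], brute_force(x, 4))
```

The recursion needs on the order of M·N multiplications, whereas the enumeration grows combinatorially with the number of nodes. Under the same single-server assumption, the utilisation of node i follows from two consecutive constants as `x[i] * g[N - 1] / g[N]`.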

Networks of customers have also been investigated, such as Kelly networks, where customers of different classes experience different priority levels at different service nodes.[36] Another type of network is the G-network, first proposed by Erol Gelenbe in 1993:[37] these networks do not assume exponential time distributions like the classic Jackson network.

Routing algorithms


In discrete-time networks where there is a constraint on which service nodes can be active at any time, the max-weight scheduling algorithm chooses a service policy to give optimal throughput in the case that each job visits only a single-person service node.[21] In the more general case where jobs can visit more than one node, backpressure routing gives optimal throughput. A network scheduler must choose a queueing algorithm, which affects the characteristics of the larger network.[38]
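The max-weight rule can be sketched in a few lines: in each time slot, among the sets of nodes that are allowed to be active together, pick the set with the largest sum of queue length times service rate. The function name, the representation of feasible sets as tuples of node indices, and the example numbers are assumptions made for illustration.

```python
def max_weight_schedule(queue_lengths, service_rates, feasible_sets):
    """Return the feasible activation set that maximises
    sum(queue_lengths[i] * service_rates[i]) over its nodes."""
    def weight(active):
        return sum(queue_lengths[i] * service_rates[i] for i in active)
    return max(feasible_sets, key=weight)

# Three nodes; the constraint allows node 0 alone, nodes 1 and 2 together,
# or nodes 0 and 1 together (hypothetical interference constraint).
queues = [3, 1, 4]
rates = [1.0, 1.0, 1.0]
feasible = [(0,), (1, 2), (0, 1)]
print(max_weight_schedule(queues, rates, feasible))  # (1, 2): weight 5 beats 3 and 4
```

Because the weights use the current queue lengths, the rule needs no knowledge of arrival rates; it simply favours whichever allowed combination currently has the most backlog to clear.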

Mean-field limits


Mean-field models consider the limiting behaviour of the empirical measure (proportion of queues in different states) as the number of queues m approaches infinity. The impact of other queues on any given queue in the network is approximated by a differential equation. The deterministic model converges to the same stationary distribution as the original model.[39]

Heavy traffic/diffusion approximations


In a system with high occupancy rates (utilisation near 1), a heavy traffic approximation can be used to approximate the queue length process by a reflected Brownian motion,[40] Ornstein–Uhlenbeck process, or more general diffusion process.[41] The number of dimensions of the Brownian process is equal to the number of queueing nodes, with the diffusion restricted to the non-negative orthant.
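For a single node the approximation can be written out explicitly. The following is a sketch under stated assumptions rather than a result taken from the cited sources: a single-server queue with arrival rate λ, service rate μ > λ, and squared coefficients of variation c_a² and c_s² for the interarrival and service times (all four symbols are introduced here for illustration).

```latex
Q(t) \approx \mathrm{RBM}\!\left(-\theta,\ \sigma^2\right),
\qquad \theta = \mu - \lambda > 0,
\qquad \sigma^2 = \lambda c_a^2 + \mu c_s^2 ,

\mathbb{E}\left[Q(\infty)\right] \approx \frac{\sigma^2}{2\theta}
  = \frac{\lambda c_a^2 + \mu c_s^2}{2(\mu - \lambda)} .
```

The stationary distribution of a one-dimensional reflected Brownian motion with negative drift is exponential, which gives the mean above. For Poisson arrivals and exponential service (c_a² = c_s² = 1) the expression becomes (1 + ρ) / (2(1 − ρ)) with ρ = λ/μ, which approaches the exact M/M/1 value ρ / (1 − ρ) as ρ tends to 1.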

Fluid limits


Fluid models are continuous deterministic analogs of queueing networks obtained by taking the limit when the process is scaled in time and space, allowing heterogeneous objects. This scaled trajectory converges to a deterministic equation which allows the stability of the system to be proven. It is known that a queueing network can be stable but have an unstable fluid limit.[42]
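As a minimal worked example, assuming a single-server queue with arrival rate λ and service rate μ (symbols introduced here for illustration), the fluid scaling and its deterministic limit can be written as:

```latex
\bar{Q}^{n}(t) = \frac{Q(nt)}{n},
\qquad
\bar{q}(t) = \max\left\{ \bar{q}(0) + (\lambda - \mu)\, t,\ 0 \right\} .
```

In this simple case the fluid level drains to zero by time \bar{q}(0) / (\mu - \lambda) exactly when λ < μ, which matches the usual stability condition for the queue. In networks the fluid equations couple several such levels, and the relationship between fluid and stochastic stability is more delicate.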

Queueing applications


Queueing theory finds widespread application in computer science and information technology. In networking, for instance, queues are integral to routers and switches, where packets queue up for transmission. By applying queueing theory, designers can tune these systems for responsive performance and efficient resource utilization. The theory is also relevant to everyday experiences such as waiting in line at a supermarket or for public transportation, where it offers insight into how service can be organized to reduce waiting. Almost everyone takes part in some form of queueing, and what may look like an inconvenience can be the most effective way of allocating service.

As a discipline rooted in applied mathematics and computer science, queueing theory studies queues, or waiting lines, across a diverse range of applications, and it has proven instrumental in understanding and optimizing systems in which queues arise. The study of queues is essential in contexts such as traffic systems, computer networks, telecommunications, and service operations.

Two foundational concepts are the arrival process and the service process. The arrival process describes how entities join the queue over time and is often modeled with a stochastic process such as a Poisson process. The efficiency of a queueing system is gauged through performance metrics including the average queue length, the average wait time, and the system throughput. These metrics guide decisions aimed at improving performance and reducing wait times.

References: Gross, D., & Harris, C. M. (1998). Fundamentals of Queueing Theory. John Wiley & Sons. Kleinrock, L. (1976). Queueing Systems: Volume I - Theory. Wiley. Cooper, B. F., & Mitrani, I. (1985). Queueing Networks: A Fundamental Approach. John Wiley & Sons.

See also


References

  1. ^ a b c Sundarapandian, V. (2009). "7. Queueing Theory". Probability, Statistics and Queueing Theory. PHI Learning. ISBN 978-81-203-3844-9.
  2. ^ Lawrence W. Dowdy, Virgilio A.F. Almeida, Daniel A. Menasce. "Performance by Design: Computer Capacity Planning by Example". Archived from the original on 2016-05-06. Retrieved 2009-07-08.
  3. ^ Schlechter, Kira (March 2, 2009). "Hershey Medical Center to open redesigned emergency room". The Patriot-News. Archived from the original on June 29, 2016. Retrieved March 12, 2009.
  4. ^ Mayhew, Les; Smith, David (December 2006). Using queuing theory to analyse completion times in accident and emergency departments in the light of the Government 4-hour target. Cass Business School. ISBN 978-1-905752-06-5. Archived from the original on September 7, 2021. Retrieved 2008-05-20.
  5. ^ a b c Taylor, Bernard W. (2019). Introduction to management science (13th ed.). New York: Pearson. ISBN 978-0-13-473066-0.
  6. ^ Tijms, H.C, Algorithmic Analysis of Queues, Chapter 9 in A First Course in Stochastic Models, Wiley, Chichester, 2003
  7. ^ Kendall, D. G. (1953). "Stochastic Processes Occurring in the Theory of Queues and their Analysis by the Method of the Imbedded Markov Chain". The Annals of Mathematical Statistics. 24 (3): 338–354. doi:10.1214/aoms/1177728975. JSTOR 2236285.
  8. ^ Hernández-Suarez, Carlos (2010). "An application of queuing theory to SIS and SEIS epidemic models". Math. Biosci. 7 (4): 809–823. doi:10.3934/mbe.2010.7.809. PMID 21077709.
  9. ^ "Agner Krarup Erlang (1878-1929) | plus.maths.org". Pass.maths.org.uk. 1997-04-30. Archived from the original on 2008-10-07. Retrieved 2013-04-22.
  10. ^ Asmussen, S. R.; Boxma, O. J. (2009). "Editorial introduction". Queueing Systems. 63 (1–4): 1–2. doi:10.1007/s11134-009-9151-8. S2CID 45664707.
  11. ^ Erlang, Agner Krarup (1909). "The theory of probabilities and telephone conversations" (PDF). Nyt Tidsskrift for Matematik B. 20: 33–39. Archived from the original (PDF) on 2011-10-01.
  12. ^ a b c Kingman, J. F. C. (2009). "The first Erlang century—and the next". Queueing Systems. 63 (1–4): 3–4. doi:10.1007/s11134-009-9147-4. S2CID 38588726.
  13. ^ Pollaczek, F., Ueber eine Aufgabe der Wahrscheinlichkeitstheorie, Math. Z. 1930
  14. ^ a b c Whittle, P. (2002). "Applied Probability in Great Britain". Operations Research. 50 (1): 227–239. doi:10.1287/opre.50.1.227.17792. JSTOR 3088474.
  15. ^ Kendall, D.G.:Stochastic processes occurring in the theory of queues and their analysis by the method of the imbedded Markov chain, Ann. Math. Stat. 1953
  16. ^ Pollaczek, F., Problèmes Stochastiques posés par le phénomène de formation d'une queue
  17. ^ Kingman, J. F. C.; Atiyah (October 1961). "The single server queue in heavy traffic". Mathematical Proceedings of the Cambridge Philosophical Society. 57 (4): 902. Bibcode:1961PCPS...57..902K. doi:10.1017/S0305004100036094. JSTOR 2984229. S2CID 62590290.
  18. ^ Ramaswami, V. (1988). "A stable recursion for the steady state vector in markov chains of m/g/1 type". Communications in Statistics. Stochastic Models. 4: 183–188. doi:10.1080/15326348808807077.
  19. ^ Morozov, E. (2017). "Stability analysis of a multiclass retrial system with coupled orbit queues". Proceedings of 14th European Workshop. Lecture Notes in Computer Science. Vol. 17. pp. 85–98. doi:10.1007/978-3-319-66583-2_6. ISBN 978-3-319-66582-5.
  20. ^ Carlson, E.C.; Felder, R.M. (1992). "Simulation and queueing network modeling of single-product production campaigns". Computers & Chemical Engineering. 16 (7): 707–718. doi:10.1016/0098-1354(92)80018-5.
  21. ^ a b Manuel, Laguna (2011). Business Process Modeling, Simulation and Design. Pearson Education India. p. 178. ISBN 978-81-317-6135-9. Retrieved 6 October 2017.
  22. ^ a b c d Penttinen A., Chapter 8 – Queueing Systems, Lecture Notes: S-38.145 - Introduction to Teletraffic Theory.
  23. ^ Harchol-Balter, M. (2012). "Scheduling: Non-Preemptive, Size-Based Policies". Performance Modeling and Design of Computer Systems. pp. 499–507. doi:10.1017/CBO9781139226424.039. ISBN 978-1-139-22642-4.
  24. ^ Andrew S. Tanenbaum; Herbert Bos (2015). Modern Operating Systems. Pearson. ISBN 978-0-13-359162-0.
  25. ^ Harchol-Balter, M. (2012). "Scheduling: Preemptive, Size-Based Policies". Performance Modeling and Design of Computer Systems. pp. 508–517. doi:10.1017/CBO9781139226424.040. ISBN 978-1-139-22642-4.
  26. ^ Harchol-Balter, M. (2012). "Scheduling: SRPT and Fairness". Performance Modeling and Design of Computer Systems. pp. 518–530. doi:10.1017/CBO9781139226424.041. ISBN 978-1-139-22642-4.
  27. ^ Dimitriou, I. (2019). "A Multiclass Retrial System With Coupled Orbits And Service Interruptions: Verification of Stability Conditions". Proceedings of FRUCT 24. 7: 75–82.
  28. ^ "Archived copy" (PDF). Archived (PDF) from the original on 2017-03-29. Retrieved 2018-08-02.{{cite web}}: CS1 maint: archived copy as title (link)
  29. ^ Jackson, J. R. (1957). "Networks of Waiting Lines". Operations Research. 5 (4): 518–521. doi:10.1287/opre.5.4.518. JSTOR 167249.
  30. ^ Jackson, James R. (Oct 1963). "Jobshop-like Queueing Systems". Management Science. 10 (1): 131–142. doi:10.1287/mnsc.1040.0268. JSTOR 2627213.
  31. ^ Reiser, M.; Lavenberg, S. S. (1980). "Mean-Value Analysis of Closed Multichain Queuing Networks". Journal of the ACM. 27 (2): 313. doi:10.1145/322186.322195. S2CID 8694947.
  32. ^ Van Dijk, N. M. (1993). "On the arrival theorem for communication networks". Computer Networks and ISDN Systems. 25 (10): 1135–2013. doi:10.1016/0169-7552(93)90073-D. S2CID 45218280. Archived from the original on 2019-09-24. Retrieved 2019-09-24.
  33. ^ Gordon, W. J.; Newell, G. F. (1967). "Closed Queuing Systems with Exponential Servers". Operations Research. 15 (2): 254. doi:10.1287/opre.15.2.254. JSTOR 168557.
  34. ^ Baskett, F.; Chandy, K. Mani; Muntz, R.R.; Palacios, F.G. (1975). "Open, closed and mixed networks of queues with different classes of customers". Journal of the ACM. 22 (2): 248–260. doi:10.1145/321879.321887. S2CID 15204199.
  35. ^ Buzen, J. P. (1973). "Computational algorithms for closed queueing networks with exponential servers" (PDF). Communications of the ACM. 16 (9): 527–531. doi:10.1145/362342.362345. S2CID 10702. Archived (PDF) from the original on 2016-05-13. Retrieved 2015-09-01.
  36. ^ Kelly, F. P. (1975). "Networks of Queues with Customers of Different Types". Journal of Applied Probability. 12 (3): 542–554. doi:10.2307/3212869. JSTOR 3212869. S2CID 51917794.
  37. ^ Gelenbe, Erol (Sep 1993). "G-Networks with Triggered Customer Movement". Journal of Applied Probability. 30 (3): 742–748. doi:10.2307/3214781. JSTOR 3214781. S2CID 121673725.
  38. ^ Newell, G. F. (1982). "Applications of Queueing Theory". SpringerLink. doi:10.1007/978-94-009-5970-5. ISBN 978-94-009-5972-9.
  39. ^ Bobbio, A.; Gribaudo, M.; Telek, M. S. (2008). "Analysis of Large Scale Interacting Systems by Mean Field Method". 2008 Fifth International Conference on Quantitative Evaluation of Systems. p. 215. doi:10.1109/QEST.2008.47. ISBN 978-0-7695-3360-5. S2CID 2714909.
  40. ^ Chen, H.; Whitt, W. (1993). "Diffusion approximations for open queueing networks with service interruptions". Queueing Systems. 13 (4): 335. doi:10.1007/BF01149260. S2CID 1180930.
  41. ^ Yamada, K. (1995). "Diffusion Approximation for Open State-Dependent Queueing Networks in the Heavy Traffic Situation". The Annals of Applied Probability. 5 (4): 958–982. doi:10.1214/aoap/1177004602. JSTOR 2245101.
  42. ^ Bramson, M. (1999). "A stable queueing network with unstable fluid model". The Annals of Applied Probability. 9 (3): 818–853. doi:10.1214/aoap/1029962815. JSTOR 2667284.

Further reading
