Applying Queueing Theory in Simulations

Benjamin Disraeli: There are three kinds of lies: lies, damned lies, and statistics. And what is the deal with Queueing Theory?

Introduction

Queueing Theory deals with the servicing of requests where the requests occur in a somewhat random manner and the servicing of those requests takes a somewhat random amount of time. Such a situation usually results in the formation of a line, or queue, of requests waiting to be serviced.

Queueing Theory applies to queues formed at store checkouts, banks, drive-thru fast food restaurants, and toll road booths. A less concrete type of queue is formed when a caller is put on hold until the 'next available service representative'. Numerous queues are formed inside of modern computers by individual programs awaiting access to the disk, communications port or CPU. In general, any time a queue forms, queueing theory probably applies.

Consequently, queueing theory is used somewhere in most computer simulations. Computer simulations, used to model real world systems, are valuable because important information about a system can be learned at a fraction of the cost of learning that information from the system itself or from a physical prototype.

This document provides a primer into Queueing Theory from the viewpoint of how Queueing Theory is applied to the Ticket Agent Simulation, the code and design of which is described here.

The first section of this document consists of a Primer on Probability and Statistics. The second section assumes a rudimentary knowledge of statistics and deals with the specifics of how Queueing Theory was used in the Ticket Agent Simulation. If the reader is already familiar with probability and statistics, Section One can be skipped, or just used as a reference.

It is assumed that the reader has an understanding of simple graphing, such as knowing what a bar graph is. Although it will be briefly presented, it would be beneficial (though not absolutely necessary) for the reader to also understand the conceptual difference between discrete and continuous functions.

A Probability and Statistics Primer

Definition of Probability and Statistics

Although Probability and Statistics are 2 distinct fields of science, they share a symbiotic relationship. One is not often thought of without the other.

Probability is the prediction of the chance that a particular outcome will occur as the result of some test (often referred to as a trial). Statistics is the collection, analysis and interpretation of the data taken from the actual outcomes. Consequently, the common thread between probability and statistics is not hard to see. Both fields operate within the realm of outcomes, which are also called events. Probability predicts outcomes, while statistics records and analyzes them.

Probability will be explored by examining all the possible outcomes of the rolling of dice. Please note that when rolling dice (or an individual die), each separate roll is an individual trial.

The Rolling of a Single Die

A single die is a 6-sided cube. On each of the 6 sides, either 1, 2, 3, 4, 5 or 6 dots are placed in such a way that the number of dots on any 2 opposite sides always adds up to 7. The side of the die that ends up on top after a roll is its outcome, and its value is the number of dots on that top side.

Considering that one member of the set of all 6 possible outcomes, {1, 2, 3, 4, 5, 6}, must occur, the probability that a rolled die will show one of these 6 values is 100%, or certainty. This set of all possible outcomes is called the Sample Space. By definition, the sample space must be defined so that every possible outcome is contained within it, and correspondingly the probability that some outcome within the sample space will occur is 100% (certainty).

Please note that probabilities are always given as either a percentage, or a fraction of 1.0, as in 16.6% or 0.1666. Certainty, or the total probability, is described as either 100% or 1.0.

The generic term for the value of the outcome generated from a trial, such as the rolling of a die, is the random variable. So when rolling a die, the random variable Y represents one of the values in the set of possible outcomes, {1, 2, 3, 4, 5, 6}.

By the intended design and rules of commonly used 6-sided dice, the probability of any one of the 6 values occurring from a roll is equal. In other words, the probability for each one of the 6 values of a die is 1/6, or 0.1666. Since all outcomes have an equal chance of occurring, the probability distribution is called the Uniform distribution.
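As an illustrative sketch (in Python, not the simulation's own code), the uniform distribution of a single die can be checked by simulating many rolls and tallying how often each face comes up:

```python
import random
from collections import Counter

# Roll a single die many times and tally the outcomes.
rolls = [random.randint(1, 6) for _ in range(60_000)]
counts = Counter(rolls)

for face in range(1, 7):
    share = counts[face] / len(rolls)
    print(f"{face}: {share:.4f}")   # each share should be near 1/6
```

With 60,000 trials, each face's share settles close to the predicted 0.1666, illustrating the uniform distribution empirically.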

Craps - A Game of Rolling 2 Dice

Craps is a game of chance consisting of the rolling of 2 common dice. When 2 dice are rolled, a 2-dimensional sample space of 36 ordered outcomes is formed: the 6 possible faces of the first die paired with the 6 possible faces of the second.

However, in the game of craps what matters is not which side of each die is face up, but the sum of the numerical values represented by the face-up sides of both dice. So although there are 36 possible unique outcomes, also called atomic events, in the game of craps these 36 atomic events are grouped into 11 subsets. The numerical sum of the dice is the same for every atomic event within a particular subset, and that sum is unique among the subsets. Each of the 11 subsets thus corresponds to one of the sums from 2 through 12. In other words, in craps the random variable is a value within the set of values from 2 through 12.

Mathematically, this grouping of atomic events into subsets representing unique random variable values is sometimes called the random variable function.

Just as with 1 die, each unique outcome or atomic event in the sample space of 2 dice has an equal probability of occurring. In fact, it is always true that each atomic event has an equal probability of occurring. But as in the game of craps, most probability problems do not deal directly with atomic events, but with the groups or subsets of atomic events that together form the possible values of the random variable of interest.

How the probabilities of random variables formed by subsets of atomic events are computed is straightforward: whatever percentage or proportion of the total sample space a particular subset contains is the probability of the random variable value that subset represents.

Each atomic event, or unique dice outcome, has a 1/36 (2.78%) chance of occurring. So the probability of each subset of atomic events (random variable value) is 2.78% multiplied by the number of atomic events in that subset. For example, the subset of atomic events that total 2 contains only 1 atomic event, thus the probability of rolling a 2 (snake eyes) is 2.78%. The subset containing totals of 6 has 5 atomic events, so its probability is 5 multiplied by 2.78%, or 13.89%.

The probabilities of all the outcomes, or random variables, of the rolling of 2 dice are shown in the table below.

Total    Number of combinations    Probability
  2                1                   2.78%
  3                2                   5.56%
  4                3                   8.33%
  5                4                  11.11%
  6                5                  13.89%
  7                6                  16.67%
  8                5                  13.89%
  9                4                  11.11%
 10                3                   8.33%
 11                2                   5.56%
 12                1                   2.78%
Total             36                 100.00%

Suddenly, by rolling 2 dice instead of just 1 die, one transcends the stark simplicity of the uniform distribution of 1 die to a much more interesting probability distribution that begins to roughly approximate the shape of the well known Bell Curve. This quick introduction into the probability distribution of 2 dice establishes the fundamental concepts upon which all probability is based.
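The table above can be reproduced by enumerating the 36 atomic events directly. This Python sketch (illustrative only, not part of the simulation) groups the ordered pairs by their sum, which is exactly the random variable function described earlier:

```python
from collections import Counter
from itertools import product

# Enumerate all 36 atomic events (ordered pairs of die faces) and
# group them by their sum -- the random variable function.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

for total in range(2, 13):
    prob = counts[total] / 36
    print(f"{total:2d}  {counts[total]}  {prob:.4f}")
```

Running this reproduces the combination counts 1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1 and the corresponding probabilities from the table.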

Nondeterministic Probability

If we rolled 2 dice 36 times, it is highly unlikely that the results would exactly match the predicted probability distribution. But if we rolled 2 dice 10,000 times, the results would match the predictions very accurately, as shown in the Histogram section. In this sense, the probabilities of games of chance, like craps, roulette and slot machines, are deterministic. The historic and consistent success of casinos in predicting their take from games of chance is a testament to the deterministic quality of these games.

Unfortunately, most things in life do not have the same deterministic quality as games of chance. Projections such as how many cars will come down a given street within a given time are ultimately educated guesses. How people will vote, who will win the Super Bowl and how much rain will fall are all nondeterministic.

The dualistic science of Probability and Statistics helps to deal with the areas of the real world where probabilities are not deterministic. With the combination of mathematical theory and empirical data, probability and statistics have evolved into a sophisticated and accurate science, yet they are still based upon the same fundamental principles as the rolling of 2 dice.

Statistics

Statistics is the gathering, storing and analyzing of data. The most fundamental entity of statistics is the individual data value of a single outcome. A single data value in statistics is often referred to as a data point to represent its position on a graph. In itself, a single data point is not very useful.

When a number of data values are taken, called a sample, then there is enough information from which one can draw conclusions. A sample of data values can be put into different forms, like a table. A very useful form of sample data is the Histogram.

The Histogram

A Histogram takes the group of data values of a sample and forms a graph, usually a bar graph. Looking back at the rolling of 2 dice, consider a histogram formed from 10,000 trials of the rolling of 2 dice.

In the case of 2 rolled dice, each of the 11 possible sums from 2 through 12 has a bar which represents the number of times that sum was the outcome. However, usually it is not the absolute numbers of a sample that are of interest, but the ratios of the various outcomes to the sample size. In the case of a histogram, the shape of the graph can almost instantly communicate those ratios. By comparing the histogram with the predicted probability distribution (if one exists), one can instantly determine how well the sample distribution matched the prediction.

If no probability distribution exists, the histogram can be used as a basis for forming an empirical probability distribution. This technique, very useful in simulations, will be discussed in more detail in the section on Using the Empirical Histogram.
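As a sketch of how such a histogram is built, the following Python snippet (illustrative only) simulates 10,000 rolls of 2 dice and prints a crude text histogram, one '*' per 25 occurrences:

```python
import random
from collections import Counter

# Simulate 10,000 trials of rolling 2 dice and tabulate the sums.
N = 10_000
counts = Counter(random.randint(1, 6) + random.randint(1, 6)
                 for _ in range(N))

# Crude text histogram: one '*' per 25 occurrences of each sum.
for total in range(2, 13):
    print(f"{total:2d} | {'*' * (counts[total] // 25)}")
```

The shape of the bars roughly matches the predicted distribution, peaking at 7 and tapering symmetrically toward 2 and 12.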

Queueing Theory

Mathematical Theory

Probability and Statistics are major fields of mathematics. This document is not meant to be a tutorial on either mathematics in general or Probability and Statistics in particular, but is intended to show as directly as possible the techniques used to create the Ticket Agent Simulation.

In a nutshell, the mathematical theory used by the Ticket Agent Simulation consists of the application of the Poisson Distribution, which models the arrival of service requests, and the Exponential Distribution, which models the service time taken for each request. The rationale for these distributions will be discussed. The mathematical representations of these distributions will be accepted as given and not derived, although mathematical operations such as the integration of the Exponential Distribution and the derivation of its generator will be detailed.

Using the Poisson Probability Distribution

Binomial Distribution

The Poisson Distribution is associated with the Binomial Distribution. The Binomial Distribution is used in situations, called experiments, where:

- the experiment consists of a number of identical trials,
- each trial has exactly 2 possible outcomes, conventionally labeled success (S) and failure (F),
- the probability of success is the same for every trial, and
- the trials are independent of one another.

The 2 possible outcomes of any trial in such an experiment form a binary result, hence the name Binomial Distribution.

Poisson Distribution

The Poisson Distribution is similar to the Binomial Distribution by the nature of its binary properties. However, the Poisson Distribution is sort of the flip side of the Binomial Distribution. Instead of being concerned with the sum of trials that had a particular result, such as S, the Poisson Distribution focuses on each individual trial and attempts to predict how many trials it will take for the next S result to occur.

For example, if we want to predict the traffic flow down a particular street, we would use the Poisson Distribution and analyze it in the following manner. A length of each individual lane would be chosen such that only one vehicle would be expected to be in that section of lane at any point in time. With a section length selected using this criterion, each section of a lane essentially becomes a binary trial: a given section either contains a vehicle, or it does not. After a section that contains a vehicle is encountered, the Poisson Distribution can be used to predict how many subsequent sections it will take to find the next section that contains a vehicle.

In this example we used a section of street as the individual trial, but the analysis can be done using length, area, volume, time, or just about any other metric. In the Ticket Agent Simulation, the metric used was time, and the outcome of interest, mathematically called the random variable, is the length of time between subsequent incoming phone calls.

Mathematically, the formula for the Poisson Distribution is:

    P(y) = (lambda^y * e^(-lambda)) / y!

Lambda is the mean, y is the random variable, and of course e is the base of the natural logarithm. Lambda must be in the same discrete incremental units as y.

Applying Poisson Distribution in a Simulation

The Roles of Statistical Entities in a Simulation

When applying the Poisson (or any other) Distribution in a simulation, it is important to be clear about the meaning of the various aspects and properties of the probability distributions and their empirical histograms.

A Probability Distribution is essentially a 'black box' that generates a likelihood of occurrence (probability) for each value of the random variable. This black box can either be a formulaic mathematical function, like the Poisson Distribution, or a table-driven mechanism (still a function) that assigns a single output to each input.

A Probability Distribution function is either discrete, like the rolling of 2 dice, where the domain of the function (all the possible inputs) is contained in the finite set {2..12}, or continuous. A discrete function produces a chart or bar graph like the 2-dice table shown earlier. The sum of all the probability values of a bar graph of a discrete function corresponds to the total sample space and equals 1.0, or 100%. Analogously, the total area under a graphed continuous function represents the total sample space, or 100% of the total probability; the Exponential Probability Distribution discussed later is an example of such a continuous function.

Unfortunately, in a simulation we do not want the probability that a particular value will occur. What is needed in a simulation is a generator that produces a series of values of the random variable, that is, what would be the input to the probability distribution. The random nature in which these values occur is where the random variable gets its moniker.

It may initially appear that such a generator could be implemented as the inverse of the Probability Distribution function, but this is a false impression. An inverse function for most Probability Distributions, like that of the rolling of 2 dice, does not exist. The reason is that for many inputs of this theoretical 2-dice inverse function, such as the probability 0.0833, there exists more than one output (4 and 10), which cannot happen in a function.

Random Variable Generators

Consequently, a common approach to implementing value generators in simulations, also used here, is to form a 1:1 correspondence between a generated random real number between 0.0000 and 1.0000 and the probability that a given random variable value, or any random variable value less than it, will occur.

For example, referring to an expanded version of the table containing the Probability Distribution of the throwing of 2 Dice, shown below, the column labelled Accumulated Probability is the accumulation of the probability at that value plus the probabilities of all the lesser values. The column labelled Probability Range is the range of real numbers such that a generated random number falling within it produces the value of interest, or random variable value. Thus, if we were simulating the rolling of 2 dice and a random number of 0.377 were generated, it would fall within the range of 0.2777 to 0.4166, which corresponds to the random variable value of 6. In other words, given the random number 0.377 as input, the 2-dice generator would output 6.

Probability Distribution of the throwing of 2 Dice

Random Variable Value   Number of Atomic Events   Probability   Accumulated Probability   Probability Range
          2                        1                 0.0278             0.0278            0.0000 to 0.0278
          3                        2                 0.0556             0.0833            0.0278 to 0.0833
          4                        3                 0.0833             0.1667            0.0833 to 0.1667
          5                        4                 0.1111             0.2777            0.1667 to 0.2777
          6                        5                 0.1389             0.4166            0.2777 to 0.4166
          7                        6                 0.1667             0.5833            0.4166 to 0.5833
          8                        5                 0.1389             0.7222            0.5833 to 0.7222
          9                        4                 0.1111             0.8333            0.7222 to 0.8333
         10                        3                 0.0833             0.9166            0.8333 to 0.9166
         11                        2                 0.0556             0.9722            0.9166 to 0.9722
         12                        1                 0.0278             1.0000            0.9722 to 1.0000
       Total                      36                 1.0000             1.0000            0.0000 to 1.0000
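The table above translates directly into a discrete random variable generator. The Python sketch below (illustrative, not the simulation's actual code) stores the accumulated probabilities and maps a generated random number into the corresponding probability range:

```python
import random
from bisect import bisect_right

# Accumulated probabilities for the sums 2..12, kept exact in 36ths
# to match the table above.
values      = [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
accumulated = [1, 3, 6, 10, 15, 21, 26, 30, 33, 35, 36]
cum_probs   = [a / 36 for a in accumulated]

def roll_two_dice() -> int:
    """Generate a 2-dice sum by mapping a uniform random number
    onto the accumulated-probability ranges."""
    r = random.random()                       # uniform in [0.0, 1.0)
    return values[bisect_right(cum_probs, r)]

# A few generated values (no fixed output; they are random):
print([roll_two_dice() for _ in range(10)])
```

For instance, an input of 0.377 falls between the accumulated values 0.2777 and 0.4166, so the generator returns 6, exactly as the worked example in the text describes.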

Poisson Random Variable Generator

Before the Poisson Distribution can be used, the parameter lambda, the mean of the Poisson Distribution, must be directly assigned. In addition, y, the particular value of the random variable, must indirectly have its scaling factor assigned.

The parameter y is mathematically a simple scalar value of some assigned range from the set of positive integers. However, the user must know whether this scalar value represents length, area, volume, time, or some other metric, and the scaling value of that metric. In other words, if time is the metric, as it is in the Ticket Agent Simulation, the unit of time that each incremental increase of y represents must also be known.

For the Ticket Agent Simulation, 1 second was chosen as the scaling unit, which seemed appropriate for measuring the length of time between arriving phone calls. If one were simulating how often large meteors strike the earth, each unit of time might be years, decades or centuries.

Once the scaling factor of y is known, lambda, the mean, must be chosen in the same scaling units. Consequently, 2 parameters must be assigned to the Poisson Distribution before it can be used. The accuracy of the Poisson Distribution and its Random Variable Generator depends upon the optimal assignment of those parameters. Although there is powerful mathematics behind the Poisson and other commonly used probability distributions, it is easy to see how difficult it is to model nondeterministic real world phenomena with the accuracy of the rolling of 2 dice. Nevertheless, these less than perfect solutions are still very useful and can be surprisingly accurate.

The algorithm used by the Poisson Random Variable Generator illustrates the programmatic steps taken by the Poisson Generator software. (At this point we are assuming that the scaling factor of y and the value of lambda are assigned.) First, a random number is generated, symbolized by the function rand(). Next, starting by assigning y a value of 1, the Poisson Distribution is summed until the summation is greater than the random number. The value of y at the transition where the summation becomes greater than the random number becomes the generated random variable.
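The steps just described can be sketched in Python as follows (illustrative only). Two details differ from the text: the summation here starts at y = 0, the conventional lower bound of the Poisson Distribution, and the probability is computed in log space, which sidesteps the overflow problem the text notes for large values of lambda:

```python
import math
import random

def poisson_pmf(y: int, lam: float) -> float:
    # P(Y = y) = lambda^y * e^(-lambda) / y!, computed in log space so
    # that large lambda does not overflow lambda^y or y!.
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))

def poisson_generate(lam: float) -> int:
    """Sum the Poisson probabilities from y = 0 upward until the running
    sum exceeds a generated random number; that y is the generated value."""
    r = random.random()
    y, running = 0, 0.0
    while running <= r:
        running += poisson_pmf(y, lam)
        y += 1
    return y - 1
```

Averaged over many calls, `poisson_generate(100.0)` produces values whose mean settles near 100, the assigned lambda.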

In the Ticket Agent Simulation, the output of the Poisson Distribution Generator was offset to produce the value actually used. This offset became necessary as the value of lambda approached 100: even with a data type of double, a 64-bit floating point number, computational overflow errors were occurring, because intermediate terms such as lambda^y and y! grow extremely large. To compensate, lambda was offset so that the first value that registered corresponded to 1.

Below is a histogram of the output of the Poisson Random Variable Generator that was used in the Ticket Agent Simulation. Lambda, the mean used, had a value of 100, representing an average time between incoming calls of 100 seconds. However, the pad offset of 65 means that internally the mean was actually 35: every y generated had 65 added to it before it was plotted. The minimum value is 65 because it is the plotted value that corresponds to an internal value of 1.

Poisson Distribution Generator Histogram

Count
   |                                 **    *
   |                                 **    *   *
   |                                 **    *   *    Mean:         100
 30+                                ***    *   *    Pad Offset:   -65
   |                                ***    *   *    Minimum value: 65
   |                            *   *** *  *   *    Sample Size:  750
   |                            *   *** *  *  **
   |                            ** **** ** *  **
 25+                         *  ** ********** ** *
   |                       * *  ** ********** ** *
   |                       ***  ** ********** ** *
   |                       ***  ************* ** *
   |                       *** ***************** *
 20+                       *** ***************** *
   |                       *** ***************** *
   |                       *** ***************** *
   |                       ********************* *
   |                       ********************* *
 15+                       ********************* **
   |                       ************************ *
   |                      ************************* *
   |                     ****************************
   |                    *****************************
 10+                 *  *****************************
   |                 *  *****************************
   |                 *  *****************************
   |                 * ******************************  *
   |               * ***********************************
  5+               * ***********************************
   |             ***************************************
   |             ***************************************
   |      *      ***************************************      * **  *
   |     **    ******************************************    ** **  ***
   +----+----+----+----+----+----+----+----+----+----+----+----+----+----+
  65   70   75   80   85   90   95  100  105  110  115  120  125  130  135
                    Random Variable y in seconds

Exponential Probability Distribution

The Exponential Probability Distribution is used to simulate service times. In the case of the Ticket Agent Simulation, the service time is how long it takes for a ticket agent to service a call.

The theory behind the Exponential Probability Distribution might be better understood by considering the service time of a passive component like a light bulb. A light bulb may fail at just about any time. Just because a given light bulb has operated for the past month does not necessarily make it more likely to fail in the next hour than another light bulb that has only operated for a week. This property is sometimes called being memoryless.

Another factor in service times is that when a component fails, it is replaced with another component. Consequently, long-lasting components will essentially crowd out replacements, whereas short-lasting components allow for more components to be put into service. This pattern of replacement tends to result in more components of shorter service time and fewer components of longer service time.

Human-oriented service times also fit this model. Taking the example of ticket agents, the length of any given call is essentially independent. If a particular call has taken a long time, perhaps 5 minutes, those 5 minutes have no real bearing on how much additional time the call is likely to take; the call time is based upon too many factors. In addition, when one call is completed, that ticket agent will immediately take another call. Thus, passive component and human service times both tend to fit the same Exponential Distribution model.

The formula for the Exponential Distribution is:

    f(y) = (1/lambda) * e^(-y/lambda),  for y >= 0

where lambda is the mean.

Note that unlike the discrete Poisson Distribution, the Exponential Distribution is continuous. Mathematically, there is a significant difference between discrete and continuous distributions, or functions. However, in the application of probability distributions in the realm of human endeavor, the results of continuous functions are usually converted to discrete outputs. This action of making a continuous realm discrete is often called digitizing.

For example, time, a continuous phenomenon, is usually digitized to some granularity (lowest unit of measurement) that makes sense for that endeavor. For ticket agent service times, time was digitized to the nearest second. For the lifespan of some long-life components, the lifespan may be digitized to the nearest hour, or day.

Histograms emphasize this point: all histograms are discrete. A continuous histogram would be unworkable.

Nevertheless, the mathematics of a continuous function can be advantageous. For continuous probability distributions, the random variable generator is a much simpler, one-step process.

Remember, the sum of all the probability values of a bar graph of a discrete function corresponds to the total sample space and equals 1.0, or 100%. Analogously, the total area under a graphed continuous function represents the total sample space, or 100% of the total probability. The algorithm implementing a random variable generator for a discrete probability distribution involves incrementally summing the probabilities of each discrete value until that sum exceeds a generated random number. The analog of summing a discrete function is integrating a continuous function, which is a one-step process. Integrating the Exponential Distribution yields its cumulative probability:

    F(y) = 1 - e^(-y/lambda)

Setting F(y) equal to a generated random number r and solving for y gives the one-step generator:

    y = -lambda * ln(1 - r)

Consequently, the implementation of a random variable generator for a continuous function, such as the Exponential Probability Distribution, is this one-step process.
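The one-step generator can be sketched in Python as follows (illustrative only, not the simulation's actual listing). The service_time wrapper mirrors the pad offset used for the simulation's service times; the internal mean of 90 seconds is an assumption inferred from the histogram's Mean of 150 and Pad Offset of -60:

```python
import math
import random

def exponential_generate(mean: float) -> float:
    """One-step Exponential generator: integrating f(y) = (1/mean)*e^(-y/mean)
    gives the cumulative F(y) = 1 - e^(-y/mean); setting F(y) equal to a
    uniform random number r and solving for y gives the value below."""
    r = random.random()
    return -mean * math.log(1.0 - r)

def service_time() -> int:
    # Hypothetical wrapper: 60-second pad offset plus an assumed internal
    # mean of 90 seconds (overall mean 150), digitized to whole seconds.
    return 60 + int(round(exponential_generate(90.0)))
```

Note that no loop is needed: one random number and one logarithm produce the generated value, in contrast to the summation loop the discrete Poisson generator requires.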

Choosing the Appropriate Probability Distribution

Setting the Correct Parameters

For the Ticket Agent Simulation, the conventional Probability Distributions were used: the Poisson for the incoming calls and the Exponential for the ticket agent service times. As was pointed out with the Poisson Distribution, the proper scaling must be assigned for each incremental value of y. However, regardless of the Probability Distribution used, the meaning of the mean and how it is calculated are independent of the particular probability distribution chosen. The mean can either be determined from a data sample taken from an actual system, or guessed at. Since the mean must be determined before any probability distribution is used, it certainly makes sense to use the distribution that is the best fit.

Using the Empirical Histogram

There is an alternative to theorizing about which Probability Distribution is appropriate and how to use it. If there is an actual system from which one can collect data samples, one can create histograms from the data and then convert those histograms into Probability Distributions.

In the case of a Ticket Agent Simulation, histograms can be created from the actual incoming call frequency and the actual ticket agent service times. Remember, when doing a simulation, it is important to keep in mind the goal of what exactly is to be determined. In a ticket agent simulation, incoming call frequency and ticket agent service times are not themselves the issue. (If service times are the issue, that is a separate matter.) The issue the Ticket Agent Simulation is concerned with is how the overall system throughput is affected by the number of ticket agents and/or the number of holding lines.

Consequently, if a large enough sample is taken to give one confidence that the complete Probability Distribution has been captured, the first step is to make a histogram. We will use one of the histograms created by simulation as an example.

Looking at our simulated Exponential Histogram, we will pretend that this histogram is from an actual data sample. Of primary significance is the sample size, which is 10,000 in our case. By simply interpreting the sample size as the total sample space, we have our probability distribution. Each individual sample has a probability of the inverse of the sample size, in our case 1/10000 or 0.0001. Consequently, by replacing the sample count for each random variable value with that count * 0.0001, we have a discrete probability distribution. More generically, the probability for each random variable value is its count / (sample size).
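This technique can be sketched in Python (illustrative only, with a tiny made-up sample standing in for real measured data): a sample is tallied into a histogram, converted to accumulated probabilities, and used as a generator via the same range-lookup method described in the section on Random Variable Generators:

```python
import random
from bisect import bisect_right
from collections import Counter

def empirical_generator(sample):
    """Turn a data sample into a generator of new values drawn from
    the sample's empirical (discrete) probability distribution."""
    tally = sorted(Counter(sample).items())     # (value, count) pairs
    values = [v for v, _ in tally]
    total = len(sample)
    cumulative, running = [], 0
    for _, c in tally:
        running += c
        cumulative.append(running / total)      # accumulated probability
    def generate():
        return values[bisect_right(cumulative, random.random())]
    return generate

# Hypothetical service times "measured" from a real system, in seconds:
measured = [60, 61, 61, 63, 63, 63, 70, 70, 85, 120]
gen = empirical_generator(measured)
print(gen())   # one of the measured values, weighted by its frequency
```

No theoretical distribution is chosen at all; the sample itself defines the probabilities, and the generator reproduces values in proportion to how often they were observed.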

Below is the simulated Exponential Distribution, which we are using as a hypothetical sample, shown in the form of a probability distribution. Please note that both the vertical and horizontal axes have been scaled for this graphic representation. In a computation, the probability distribution function would be taken from the values contained in tables, or some other data structure, where the exact count would be stored for each random variable value.

(Please note that this simulation is also offset, by 60. The rationale was that unless there was a wrong number, it would take a minimum of 60 seconds for a customer to be serviced for even the simplest transaction.)

Exponential Distribution Generator Histogram

Count
   |  *
   |  *       Mean:          150
450+ ***      Pad Offset:    -60
   | ***      Minimum value:  60
   | ***      Sample Size: 10000
   |****
   |****            xDIVISOR:  7  
375+******          yDIVISOR: 15
   |******
   |********
   |********  *
   |********* *
300+********* *
   |********* * *
   |**************
   |**************  *
   |**************  *
225+**************  *
   |*************** * *
   |*******************
   |******************* **
   |*************************
150+*************************
   |************************* **
   |***************************** *
   |*******************************
   |********************************** *  *
 75+************************************* *
   |******************************************** * *
   |****************************************************
   |***************************************************** *****
   |***************************************************************
   +----+----+----+----+----+----+----+----+----+----+----+----+----+
  60   95  130  165  200  235  270  305  340  375  410  445  480  515
                Random Variable y in seconds

As shown in this example using the Exponential Distribution, the technique of using a large sample to create a probability distribution from a histogram also works for continuous probabilities. Any probability distribution derived from a histogram, regardless of its theoretical form, will be a discrete distribution. It also follows that any such discrete probability distribution will use the discrete algorithm described in the section on Random Variable Generators.


Copyright (C) 2003 by Cary Rhode,
Software Developer, occasional student and all around great guy. If you have any comments or questions, email me at: caryrhode@yahoo.com