|
The Markov Chain model and process uses
three important kinds of modern mathematics:
probability,
matrices, and
time-stepping. To see how these work together, consider the following.
 |
Sam and Joan are two teenagers living in a small
town, Nowhere, Indiana. In this town there are only two outdoor
evening entertainment centers, a miniature golf and a frisbee golf. Not
having much to do in the evenings, Sam and Joan regularly hang out
at one of these places. They begin to notice that there are regular
customers, but also that some customers switch from one place to the
other. Intrigued by this, they decide to do an analysis. At first
they try to follow particular people and record their habits, but
this is too hard even in the little town. So they decide to keep
track of how many people repeat and how many change in each of the
two places. They enlist several of their (also bored) friends and
after many nights of observations, they arrive at the following: 60%
of those attending miniature golf on any given night also go there the
next night. At frisbee golf, 70% of those playing on a given night
also play golf the next night. |
One particular night they notice that
exactly half of the people play at miniature golf and half play
frisbee golf. Sam
and Joan decided that they should be able to predict how many will attend
each place on the following night. Sam reasons that the number at
miniature golf
the next night should be 60% of the people in the town who are there this
night, that is 60% of 50% of the town. Joan points out that they need to add
those who have change from frisbee golf, that is 30% of the 50% playing
golf. Figure 1 shows at a glance the initial state (first night) and
the transition used to get to the second state.
 |
| Figure 1: Flowchart
of pie charts of probabilities |
They arrive at (0.6)(0.5)+(0.3)(0.5) =
0.45 or 45% of the town. They continue to reason that those playing frisbee
golf on the following night should be the 70% of the 50% who stay plus the
40% of the 50% at miniature golf who leave there. They arrive at
(0.7)(0.5)+(0.4)(0.5)=0.55 or 55%. They are happy to see that 45% +55%
= 100%, so everyone is somewhere (like they should!). They wait until the
next night and sure enough they are right. They decide to predict again. Now
they have new numbers. They arrive at (0.6)(0.45)+(0.3)(0.55)=0.435 and
(0.7)(0.55)+(0.4)(0.45)=0.565 and again their prediction works.
Being intelligent but also practical,
they reason that they should be able to come up with a mathematical method
to do their predictions. They also want to be able to project their
predictions long term, such as, how many will be at each place a month from
now.
Sam realizes that the % they have used
are can be converted to probabilities. He names miniature golf ‘1’ and
frisbee golf ‘2’ and writes p(12) to mean the %, or probability, of people at
frisbee golf who changed from there to miniature golf. In this notation,
p(11)=0.6, p(21)=0.4, p(12)=0.3 and p(22)=0.7. Sam points out to Joan that
these are actually conditional probabilities: p(21)=p(2|1), etc. He then
calls the actual percentage of people at a place on any given day the state
of that place, so s(2)(1) means the %, or probability, of the
whole population who are place 1 on day 2.
He then rewrites his first
calculation, (0.6)(0.5) + (0.3)(0.5) = 0.45 as p(11)s(1)(1) +
p(12)s(1)(2) = s(2)(1).
Likewise his second calculation
(0.7)(0.5) + (0.4)(0.5) = .55 becomes p(22)s(1)(2)+p(21)s(1)(1)=s(2)(2).
 |
| Figure 2:
Probability Tree Diagram |
Figure 2 is a probability tree
diagram, a combination of visual thinking and arithmetic calculation. The
numbers on the branches are the conditional probabilities. If one follows a
given branch and multiplies the numbers, one arrives at the probability of
that sequence of events. To find the probability that a given event, say
miniature golf, occurs on a given night, one adds up all the individual
probabilities that come from any given sequence that terminates at that
event. For instance, one can use this strategy to find the probability of
someone being at miniature golf on night two. Notice that there are two ways
to end up at a miniature golf according to the Figure 2. One way to
do this is to follow the sequence: miniature golf on the first night
followed by miniature golf on the second night. This gives (0.5)(0.6) = 0.3.
Another way to do this is to follow the branch sequence frisbee golf on the
first night followed by miniature golf on the second. This gives (0.5)(0.3)
= 0.15. Now we add up these branches 0.3 + 0.15 = 0.45.
 |
| Figure 3: Markov
Chain |
Figure 3 is a schematic
representation of a Markov chain.
The numbers refer to the activities. The arrows refer to paths from one
activity to another, including itself. The numbers on the arrows indicate
the probability of taking any given path.
Joan has been looking at these
equations, and realizes that she can write them in a different form using
matrices. She rewrites them as follows.

She calls the 2x2 matrix the
transition
matrix, T, since it contains all the information of the various movements,
or transitions, from any one day to the next. The 2x1 matrices she calls the
state vectors, Sd, since they contain all information of the
state of the population on any given day, d. Thus T*S(1)=S(2)
is the above matrix multiplication in matrix language.
Continuing in this way, they reason that
T*S(2)=S(3). So in order to predict the state in 30
days, they just need to do this process thirty times, taking the answer they
get at any given day and using it as the starting state for the next day.
Sam remembers that this is a recursive process: T*S(n) = S(n+1).
Joan continues to think about matrices.
She writes T*S(2) =T*(T*S(1)), which then becomes T*S(2)
= T2*S(1). She writes Sam’s recursive
formula as T*S(n) = Tn*S(1).
|
x = [.5; .5];
% define the initial state vector
T = [.6 .3; .4 .7];
% define the transition matrix
for n = 1:25
% do multiplication 25 times
x = T*x
end
disp(x) |
|
|
| Figure
4: Matlab code calculating the 25th iteration |
Joan and Sam try their respective
processes out for a number of future states and get the same answers. With
S(1) = [ .5 .5], they obtain S(2) = [.45 .55], S(3) =
[.435 .565], S(4) = [.4305 .5695], … S(25) = [.4286
.5714].
 |
| Figure 5: Evolution
of states over time |
Joan thinks about the
initial state, and
realizes that the process will work with other initial states. She tries
S(1) = [.6 .4]. She obtains S(2) =
[.48 .52],
S(3) = [.444 .556], and finally S(25) = [.4286 .5714].
Sam thinks about the pattern of answers.
He notices that the states for larger numbers are getting closer and closer
together. It seems to him that after about a month, the number of people
switching activities becomes even. He asks whether a type of
dynamic
equilibrium has been reached, or a
steady state, and what is it. He reasons
that if he eventually reaches a point where T*S(n) = S(n), the S(n) is the
steady state. He looks at his numbers and concludes that after n=25, there
is no significant change in states and calls S(25) = [.4286 .5714] =
S, the steady state. Figure 5 shows the change in the percentage of people
at each activity on successive nights and hints at this percentage achieving
a steady state.
Joan looks at what Sam is doing and
writes this in her language. This means that Tn*S(1) = T(n+1)*S(1).
Therefore Joan just needs to find a larger enough n such that T^n =T^(n+1).
Of course she also finds that T^25 will work.
Joan has done some advanced work in
matrices. As an aside, she notices that the steady state equation that Sam
wrote, T*S(n) = S(n), can be interpreted as an
eigen-equation, with
eigenvalue 1 and corresponding
normalized eigenvector equal to the steady
state. She solves the problem this way also, just to convince herself. Sam
and Joan go home happy, feeling as if they did something productive with their
summer-time hours.
Let us analyze this example. First of
all, our model involved a
mathematical process. We did
something. Second, our information was
probabilistic. The information
we obtained did not tell where a specific individual will go on any given
day. It did tell us the probability of someone changing activities (or
not). A process based on probability is called a
stochastic process.
Third, it was convenient to represent this information in a square matrix,
called a transition matrix. We could then use matrix multiplication
to get from one day to the next. We used the word state to describe
the situation on a given day. This too was a matrix, a column matrix.
Time stepping occurs because we asked the question in discrete
time intervals (days) rather than continuously. Fourth, in this example we
eventually reach a state in which subsequent moves do not change the state
matrix. When this occurs, we say that we have a steady state (or an
equilibrium state). This is actually a dynamic equilibrium: the
probabilities are constant although individual people could still be
changing activities. Fifth, there is another property in this example which
is important but is easily overlooked. To predict what will happen on the
next day, we only need the transition matrix and the current situation – no
previous days’ information is necessary. The process “has no memory”. This
property is what makes the process Markov. This is most evident in
Sam’s recursive approach. We summarize by giving the following definition.
|
A Markov chain is a discrete
stochastic process in which the future depends only on the present and not
on past history. Suppose for instance we want to know the probability that
a student will get an A in his math class the 4th year of high school and he
has completed his 3rd year. In general, we might expect that this prediction
will depend upon what grades he got in his 1st, 2nd, and 3rd years. This
would be a Markov process if only the 3rd (current ) year had any bearing,
and that the grades in the previous two years could be ignored.
|
|
|