Hidden Markov Model using TensorFlow

Hello readers, this blog will take you through the basics of the Hidden Markov Model (HMM) using TensorFlow in Python. The model is built on a core mathematical topic: probability distributions.

Markov Property

Markov Property – the probability of future events depends only on the present state and is independent of the past events that led to it.

Let’s take an example to understand the property. Suppose we flip a coin. There are two possibilities for the coin to land on either heads or tails. You will agree that the probability of both heads and tails showing up is 50%. Suppose in the first trial, we get heads, and then we again flip the coin. Will the previous result (heads) affect the output of this flip? Can the coin store the previous result to change our outcome this time? The answer is No, this time also the probability of both heads and tails showing up will remain 50%. The outcome of the present event is oblivious to the outcome of the past event. This is the Markov property.
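The coin example can be made concrete with a quick simulation (my own illustrative sketch, not part of the original example): even if we condition on the previous flip having been heads, the chance of heads stays around 50%.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible
flips = [random.random() < 0.5 for _ in range(100_000)]  # True = heads

# Empirical P(heads | previous flip was heads)
after_heads = [curr for prev, curr in zip(flips, flips[1:]) if prev]
p_heads_after_heads = sum(after_heads) / len(after_heads)
print(round(p_heads_after_heads, 3))  # stays near 0.5: the past flip has no influence
```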

Hidden Markov Model

A Hidden Markov Model deals in probability distributions to predict future events or states. The model consists of a given number of states, each with its own probability distribution. The change between any two states is called a transition, and the probabilities associated with these transitions are the transition probabilities.

Since the HMM is an abstract concept, I will use a weather model as a running example throughout.

Components of the Markov Model

  • States: Each Markov model has a finite set of states, which can be anything: “sleeping”, “eating”, and “working”, or “warm” and “cold”, or “red”, “yellow”, and “green”. These states are not directly observable and are therefore called hidden.
  • Observations: Each state has a particular outcome or observation associated with it based on a probability distribution. These observations are visible to us. For example: when it is a sunny day, there is an 80% probability that Joe will eat ice cream whereas a 20% probability that he won’t.
  • Transitions: Each state has probabilities defining the likelihood of moving to another state. For example, there is a 90% chance that when today is rainy the next day will also be rainy, and a 10% chance that the next day will be sunny. Similarly, there is an 85% chance that when today is sunny the next day will be sunny, and a 15% chance that the next day will be rainy.
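The three components above can be written down as plain Python data. This is a sketch using the same numbers as the transition matrix in the code later in this post (state 0 is rainy, state 1 is sunny):

```python
states = ["rainy", "sunny"]

# Transitions: row i gives P(next state | current state i)
transition_probs = [[0.90, 0.10],   # rainy -> rainy, rainy -> sunny
                    [0.15, 0.85]]   # sunny -> rainy, sunny -> sunny

# Observations: each hidden state emits a temperature
# drawn from its own Normal distribution (mean, std deviation)
observation_params = {"rainy": (0.0, 5.0), "sunny": (20.0, 15.0)}

# Sanity check: every transition row must be a probability distribution
for row in transition_probs:
    assert abs(sum(row) - 1.0) < 1e-9
```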

Formally, a Hidden Markov Model is specified by three components: an initial state distribution, a transition distribution, and an observation (emission) distribution.

Now that you have read the basics of the Hidden Markov Model, we take weather/temperature as the observation for our states and implement the HMM.

Weather model

We model a simple weather system and try to predict the temperature based on given information:

  • 0 encodes for a Rainy day and 1 encodes for a Sunny day.
  • The first day in our sequence has an 85% chance of being rainy.
  • There is a 10% chance that a sunny day follows a rainy day.
  • There is a 15% chance that a rainy day follows a sunny day.
  • The temperature is normally distributed every day. The mean and standard deviations are 0 and 5 on a rainy day and 20 and 15 on a sunny day.
  • In our example, the average temperature on a sunny day is 20 with a standard deviation of 15, so most sunny-day temperatures fall roughly between 5 and 35. The table below shows the transition matrix for the transition distributions used in our example:

                     Rainy (next day)   Sunny (next day)
    Rainy (today)          0.90               0.10
    Sunny (today)          0.15               0.85
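As a sanity check on these numbers, here is a hand calculation (my own, assuming day 1 follows the initial distribution): the chance that day 2 is sunny comes from summing over both possible day-1 states.

```python
# P(sunny day 2) = P(rainy day 1)*P(rainy -> sunny) + P(sunny day 1)*P(sunny -> sunny)
p_rainy_1, p_sunny_1 = 0.85, 0.15
p_sunny_2 = p_rainy_1 * 0.10 + p_sunny_1 * 0.85
print(round(p_sunny_2, 4))  # 0.2125
```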

Import the necessary Libraries

import tensorflow as tf 
import tensorflow_probability as tfp

To implement the Hidden Markov Model, we use the TensorFlow Probability module, with Python as our programming language.

tfd = tfp.distributions  # shorthand alias
initial_dist = tfd.Categorical(probs=[0.85, 0.15])  # 85% rainy, 15% sunny on day 1
transition_dist = tfd.Categorical(probs=[[0.9, 0.1],
                                         [0.15, 0.85]])
observation_dist = tfd.Normal(loc=[0., 20.], scale=[5., 15.])  # temperature per state

We specify the initial, transition, and observation distributions for our Hidden Markov Model. The initial distribution specifies the probability of the sequence beginning on a rainy day.
You can observe that the 2D transition array in our code is exactly the transition matrix we saw above. Its dimensions are 2×2, corresponding to the two states in our model: rainy and sunny.
In the observation distribution, you see a Normal distribution with location `loc` and `scale` parameters.
Here `loc = mu` is the mean and `scale = sigma` is the standard deviation. These define the PDF (probability density function):

pdf(x; mu, sigma) = exp(-0.5 (x - mu)**2 / sigma**2) / Z 
Z = (2 pi sigma**2)**0.5 (Z - normalization constant)
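The formula above can be checked with a few lines of plain Python using only the standard library (`normal_pdf` is my own helper name, not part of TFP):

```python
import math

def normal_pdf(x, mu, sigma):
    # PDF of a Normal distribution: exp(-0.5 (x - mu)**2 / sigma**2) / Z
    z = math.sqrt(2 * math.pi * sigma ** 2)  # normalization constant Z
    return math.exp(-0.5 * (x - mu) ** 2 / sigma ** 2) / z

# Density of the rainy-day temperature (mu=0, sigma=5) at its mean
print(round(normal_pdf(0.0, 0.0, 5.0), 5))  # 0.07979
```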

The Hidden Markov Model

model = tfd.HiddenMarkovModel(initial_distribution=initial_dist, 
                              transition_distribution=transition_dist, 
                              observation_distribution=observation_dist,
                              num_steps=7)

In the above lines of code, we call the built-in HMM with the distributions we defined.
`num_steps` is the number of days for which we wish to predict the average temperature; we run it for an entire week (7 days).
Technically, it is the number of sequential steps the model runs through, i.e. how many times it applies the transition distribution.

Printing the outputs

mean = model.mean() 
print(mean.numpy())

model.mean() in the above code returns a tensor holding the expected temperature for each day. In TensorFlow 2, eager execution is enabled by default, so calling .numpy() evaluates the tensor directly; wrapping it in a TF1-style tf.compat.v1.Session is not needed.

Output: [3. 4.25 5.1875 5.890625 6.4179688 6.8134775 7.110109 ]

You can observe that the output starts at a temperature of 3 degrees. The first-day temperature is low because we defined the initial distribution with an 85% probability of a rainy day.
The temperature gradually rises as we predict further into the week, because the model now takes the transition probabilities into account.
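The means TFP reports can be reproduced by hand with the forward recurrence (my own sketch: propagate the state distribution one day at a time and take the expected temperature under it):

```python
initial = [0.85, 0.15]            # day-1 distribution: rainy, sunny
transition = [[0.90, 0.10],       # rainy -> rainy, rainy -> sunny
              [0.15, 0.85]]       # sunny -> rainy, sunny -> sunny
state_means = [0.0, 20.0]         # expected temperature in each state

p, means = initial, []
for _ in range(7):
    # expected temperature today, averaged over the hidden state
    means.append(p[0] * state_means[0] + p[1] * state_means[1])
    # push the state distribution one day forward through the transition matrix
    p = [p[0] * transition[0][0] + p[1] * transition[1][0],
         p[0] * transition[0][1] + p[1] * transition[1][1]]

print([round(m, 4) for m in means])  # matches the model.mean() output
```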

You can play with the probability values to see the changes in temperature for yourself.
For example, we interchange the initial distribution probabilities to 15% and 85%. This means there is only a 15% probability that the first day is rainy.

With this change, we observe the output below:

Output: [17. 14.750002 13.062504 11.796877 10.847659 10.135745 9.60181 ]

Now it is evident that the temperature starts much higher, as the first day is far more likely to be sunny.
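The day-1 value in both runs is just the expectation under the initial distribution (a small check of my own, reusing the state means from the model):

```python
state_means = [0.0, 20.0]  # rainy, sunny

def day1_mean(initial):
    # Expected day-1 temperature under an initial state distribution
    return sum(p * m for p, m in zip(initial, state_means))

print(day1_mean([0.85, 0.15]))  # ≈ 3  (original setup: mostly rainy)
print(day1_mean([0.15, 0.85]))  # ≈ 17 (swapped setup: mostly sunny)
```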

I hope you understood the basic implementation of this model. Keep reading!
