Time Distributed Layer in Keras with example in Python

In this blog, we will learn about the Time Distributed layer in Keras with an example in Python. Neural networks solve traditional prediction problems well in general, but sequence learning problems with temporal dependencies are best solved using LSTM models. Time Distributed layers are used to avoid training a separate convolutional flow for every image of a sequence.

Sequence Learning Problems and LSTM Models

In any machine learning project, the task is to predict values based on input. The input can be of any type, including chronological data such as video frames. Sequence problems differ from the problems traditionally solved by neural networks because the observations depend on each other over time, and an LSTM uses memory to learn this temporal dependence. In the example later in this blog, the sequences consist of real values in the range [0,1]. To handle such time-ordered data, LSTM models are generally used. An LSTM model predicts the next item in a sequence from one or more time steps of past values. For video input, it extracts features and tries to react to movements, actions, and directions in every time frame. In short, given a sequential input, the model is expected to produce a description of the sequence.

What is a Time Distributed Layer?

A common sequential neural network is not well suited to this work, because training a separate convolutional flow for every image of the sequence is time-consuming and inefficient at learning distinctive features. To overcome this, we need a module that can apply the same layer to each item in a list of inputs. Keras provides a wrapper for this, the TimeDistributed layer, which helps in detecting intentions behind chronological inputs. It applies a layer to every temporal slice of the input, which keeps a one-to-one relation between each input and its corresponding output.
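To make "applies a layer to every temporal slice" concrete, here is a minimal NumPy sketch rather than Keras itself. The sizes, the weights W and b, and the helper dense() are all assumptions made up for illustration. It shows that running one shared dense layer on each timestep gives the same result as a single broadcast operation over the time axis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from any real model)
batch, timesteps, in_features, out_features = 2, 5, 3, 4
x = rng.random((batch, timesteps, in_features))

# One set of dense-layer weights, shared by every timestep
W = rng.random((in_features, out_features))
b = rng.random(out_features)

def dense(inputs):
    """A plain linear dense layer acting on the last axis."""
    return inputs @ W + b

# Time-distributed application: run the same layer on each temporal slice
per_step = np.stack([dense(x[:, t, :]) for t in range(timesteps)], axis=1)

# Equivalent vectorised form: the matmul broadcasts over the time axis
vectorised = dense(x)

print(per_step.shape)                     # (2, 5, 4)
print(np.allclose(per_step, vectorised))  # True
```

The point of the sketch is that there is only one W and one b, no matter how many timesteps the input has.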

Why do we need a Time Distributed Layer?

If a Time Distributed layer is not used for sequential data, the output of the model is flattened and the time steps are mixed together. If the layer is applied, the output is obtained separately for each timestep. Long Short-Term Memory (LSTM) networks are a powerful variant of Recurrent Neural Networks (RNNs). RNNs can handle several types of input/output combinations: one-to-one, one-to-many, many-to-many, and many-to-one.

The Time Distributed layer (and the former TimeDistributedDense layer) is widely used for one-to-many and many-to-many architectures, where the same function should be applied to the output at every timestep. Without it, the network would have one flattened output. A flattened output does not keep the values of different timesteps separate, which leads to unwanted interference between timesteps. You can find further discussion on this here
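The interference between timesteps can be illustrated with a small NumPy experiment. This is only a sketch under assumed sizes and random weights; time_distributed() and flattened() are hypothetical stand-ins for the two designs, not Keras layers. The input is changed at a single timestep, and we check which output timesteps react.

```python
import numpy as np

rng = np.random.default_rng(1)
timesteps, in_features, out_features = 6, 3, 2
x = rng.random((1, timesteps, in_features))

# Time-distributed dense: one small weight matrix shared across timesteps
W_td = rng.random((in_features, out_features))

def time_distributed(inputs):
    return inputs @ W_td  # shape (1, timesteps, out_features)

# Flattened dense: one large weight matrix over all timesteps at once
W_flat = rng.random((timesteps * in_features, timesteps * out_features))

def flattened(inputs):
    flat = inputs.reshape(inputs.shape[0], -1)  # time and features merged
    return (flat @ W_flat).reshape(-1, timesteps, out_features)

# Perturb the input at timestep 2 only
x_perturbed = x.copy()
x_perturbed[:, 2, :] += 1.0

def changed_steps(fn):
    """Indices of timesteps whose output differs after the perturbation."""
    diff = np.abs(fn(x_perturbed) - fn(x)).sum(axis=(0, 2))
    return np.flatnonzero(diff > 1e-12).tolist()

print(changed_steps(time_distributed))  # [2]
print(changed_steps(flattened))         # [0, 1, 2, 3, 4, 5]
```

With the shared per-timestep layer, only timestep 2 changes. With the flattened layer, every output timestep changes, because each output is connected to every input timestep.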

Many-to-one implies return_sequences=False.

Many-to-many implies return_sequences=True together with a Time Distributed layer.
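The two cases can be compared by looking at output shapes. The following is a small sketch with arbitrary layer sizes (8 LSTM units, 2 outputs, 10 timesteps, all assumptions chosen for illustration); the models are left untrained because only the shapes matter here.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Input, LSTM, TimeDistributed

timesteps, features = 10, 4  # illustrative sizes
x = np.random.random((3, timesteps, features))

# Many-to-one: the LSTM returns only its final output
many_to_one = Sequential([
    Input(shape=(timesteps, features)),
    LSTM(8, return_sequences=False),
    Dense(2),
])

# Many-to-many: the LSTM returns one output per timestep,
# and TimeDistributed applies the same Dense layer to each of them
many_to_many = Sequential([
    Input(shape=(timesteps, features)),
    LSTM(8, return_sequences=True),
    TimeDistributed(Dense(2)),
])

shape_one = many_to_one.predict(x, verbose=0).shape
shape_many = many_to_many.predict(x, verbose=0).shape
print(shape_one)   # (3, 2)
print(shape_many)  # (3, 10, 2)
```

The many-to-one model produces one vector per sample, while the many-to-many model keeps the time axis and produces one vector per timestep.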

Implementation of Time Distributed Layer in Keras with an example in Python

X and y are drawn from a continuous uniform distribution using np.random.random(). X has shape (Sample_Size, Length, Input_Dim) and y has shape (Sample_Size, Length, Output_Dim).

LSTM and TimeDistributed layers are used to create the model. Once the model is created, you can configure it with a loss and metrics using model.compile(), train it with model.fit(), and use it to make predictions with model.predict().

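import numpy as np
from keras.models import Sequential
from keras.layers import Dense, TimeDistributed, LSTM

# Dimensions of the toy problem
Input_Dim, Output_Dim = 15, 8
Length = 64
Sample_Size = 50

# Random inputs and targets, uniformly distributed in [0, 1)
X = np.random.random([Sample_Size, Length, Input_Dim])
y = np.random.random([Sample_Size, Length, Output_Dim])

model = Sequential()
# return_sequences=True makes the LSTM emit one output per timestep
model.add(LSTM(32, input_shape=(Length, Input_Dim), return_sequences=True))
# Apply the same Dense layer to each of the 64 timesteps
model.add(TimeDistributed(Dense(Output_Dim)))

# The targets are continuous values rather than class labels,
# so mean squared error is a suitable loss
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=500)

# result has shape (Sample_Size, Length, Output_Dim) = (50, 64, 8)
result = model.predict(X, batch_size=10, verbose=2)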
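As a rough check on the weight sharing in this model, the parameter counts can be worked out by hand. The sketch below uses the sizes from the example and assumes a standard LSTM (four gates, each with input weights, recurrent weights and a bias). The flattened alternative is a hypothetical comparison and is not part of the example.

```python
# Sizes taken from the example
input_dim, output_dim = 15, 8
length, lstm_units = 64, 32

# LSTM: four gates, each with input weights, recurrent weights and a bias
lstm_params = 4 * ((input_dim + lstm_units) * lstm_units + lstm_units)

# TimeDistributed(Dense(8)): one Dense layer shared by all 64 timesteps
td_dense_params = lstm_units * output_dim + output_dim

# Hypothetical alternative: flatten the LSTM output and use a single Dense
# layer to produce all 64 * 8 output values at once
flat_dense_params = (length * lstm_units) * (length * output_dim) + length * output_dim

print(lstm_params)        # 6144
print(td_dense_params)    # 264
print(flat_dense_params)  # 1049088
```

The time-distributed Dense layer needs 264 parameters regardless of the sequence length, whereas the flattened alternative grows with the square of the sequence length.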

In this blog, we learned what a Time Distributed layer is and how to implement it in Keras.

 
