Take a look into RNN and LSTM

Ritika Singh
3 min read · Aug 18, 2021

Can a Neural Network generate text based on the corpus it is trained on?

A Recurrent Neural Network takes into account the sequence of the data while it is learning. (Order is very important when creating text by predicting the next word!)

So, how do we fit this into Neural Networks?

First, we change our list data into sequential data. A neural network for classification or regression looks something like a function which takes in 2 parameters to feed -

  1. Data
  2. Labels

and then infers the rules that fit the data to the labels.
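As a minimal sketch of this idea in TensorFlow/Keras (the numbers below are made-up toy data, and the hidden rule y = 0.5x + 0.5 is just for illustration):

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: the hidden rule is y = 0.5x + 0.5.
xs = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
ys = np.array([[1.0], [1.5], [2.0], [2.5], [3.0], [3.5]])

# A tiny model that is given data and labels and infers the rule.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(units=1),
])
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(xs, ys, epochs=500, verbose=0)

print(model.predict(np.array([[10.0]])))  # should come out close to 5.5
```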

A very famous sequence to understand this function is the FIBONACCI SERIES.

Let's denote the numbers in the sequence as x₁, x₂, x₃, … xₙ,

and the rule that describes the sequence is that the current number is the sum of the previous 2 numbers: xₙ = xₙ₋₁ + xₙ₋₂.

So every number has the previous numbers as a part of it, and this is carried forward till the very end of the sequence. It can be visualized in computational form, as in the sketch below.
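To make that concrete, here is a tiny plain-Python sketch of the recurrence (the function name is just for illustration):

```python
def fibonacci(n):
    # Each new number is the sum of the previous two, so every value
    # carries the earlier values forward through the whole sequence.
    sequence = [1, 1]
    while len(sequence) < n:
        sequence.append(sequence[-1] + sequence[-2])
    return sequence[:n]

print(fibonacci(8))  # [1, 1, 2, 3, 5, 8, 13, 21]
```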

This is the basis of a RECURRENT NEURAL NETWORK !

A recurrent neuron can be drawn as a function which takes in an input value and produces an output value.

In addition, it also produces a feed-forward value which gets passed on to the next neuron, and this process continues throughout the network.

This is the recurrence of data.
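A rough sketch of a single recurrent step in plain NumPy (the tanh cell and the tiny weight shapes here are illustrative assumptions, not the article's exact formulation):

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    # One recurrent step: combine the current input with the value
    # carried over from the previous step, producing a new value that
    # is both this step's output and the state passed to the next step.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Hypothetical tiny sizes: 3 input features, 2 hidden units.
rng = np.random.default_rng(0)
W_x, W_h, b = rng.normal(size=(2, 3)), rng.normal(size=(2, 2)), np.zeros(2)

h = np.zeros(2)                        # initial carried-over value
for x_t in rng.normal(size=(5, 3)):    # a sequence of 5 inputs
    h = rnn_step(x_t, h, W_x, W_h, b)  # each step feeds into the next
print(h)
```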

LET'S SEE HOW TO CODE THIS RNN LAYER
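A minimal sketch of such an RNN layer in TensorFlow/Keras could look like the following (the vocabulary size, embedding dimension, sequence length and unit count are illustrative assumptions):

```python
import tensorflow as tf

# Illustrative assumptions: 10,000-word vocabulary, sequences of 20 tokens.
vocab_size, embedding_dim, max_len = 10000, 64, 20

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_len,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.SimpleRNN(64),                             # the recurrent layer
    tf.keras.layers.Dense(vocab_size, activation='softmax'),   # next-word prediction
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
```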

As we can observe, in an RNN a word which is farther away will have much less impact on the current word to be predicted. Hence comes the need to go beyond the short-term memory of the Recurrent Neural Network to Long Short-Term Memory (LSTM).

The LSTM architecture introduces a cell state, which is a context that can be maintained across many timesteps.

How is it different from RNN?

  1. It can bring meaning from the beginning of the sentence to bear on words much later in the sequence.
  2. It can also be made Bidirectional, which helps in understanding the semantics of the sentence more clearly!

LET'S SEE HOW TO CODE THIS LSTM LAYER

We first define the LSTM layer, which takes a numeric parameter for the number of hidden units, which is also the output dimensionality of the layer. We then wrap the layer in the Bidirectional wrapper.
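A minimal sketch of this, assuming TensorFlow/Keras (the 64 units and the vocabulary/sequence sizes are illustrative choices):

```python
import tensorflow as tf

vocab_size, embedding_dim, max_len = 10000, 64, 20  # illustrative sizes

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_len,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # LSTM(64): 64 is the number of hidden units / output dimensionality.
    # Bidirectional(...) reads the sequence both forwards and backwards.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(vocab_size, activation='softmax'),
])
```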

We can also stack LSTM layers, so that the output of one LSTM layer gets fed into the next; this is done by setting the return_sequences property of the earlier layer to True.
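Putting this together, a stacked Bidirectional LSTM model might be sketched like this (again with illustrative sizes):

```python
import tensorflow as tf

vocab_size, embedding_dim, max_len = 10000, 64, 20  # illustrative sizes

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_len,)),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # return_sequences=True makes this layer emit an output at every timestep,
    # so the next LSTM layer receives a full sequence instead of a single vector.
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(vocab_size, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
```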
