Poetry Generation with LSTMs

When people think about AI, the first things that come to mind are automating mundane but learnable tasks, segmenting or classifying data, and predicting the outcome of some complex, data-driven task. In this post, I’ll explain how you can do something more creative: training your AI to generate poems.

Language Modelling

From a modelling standpoint, language is extremely complex. It spans many tasks, such as text classification, syntactic/semantic parsing, POS tagging, and more. But let’s take a moment to think about how we would build a generative model using what we know about Recurrent Neural Networks (RNNs). An RNN applies the same parameters at every time step, and those shared parameters are trained with a procedure called Backpropagation Through Time (BPTT), which unrolls the network over the sequence and propagates gradients back through every step. This lets the network carry information from earlier inputs forward, which makes it well suited to modelling sequences. Text is by nature a sequence, so RNNs tend to do well at text processing and have achieved state-of-the-art results on many NLP tasks.
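To make the idea of parameters shared through time concrete, here is a minimal NumPy sketch of a vanilla RNN forward pass. The names (rnn_forward, W_xh, W_hh, b_h) and the sizes are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence, reusing the same weights at every step."""
    h = np.zeros(W_hh.shape[0])          # initial hidden state
    states = []
    for x_t in inputs:                   # one time step per input vector
        # The same W_xh, W_hh and b_h are applied at every time step.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

# Illustrative sizes and random (untrained) weights.
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 3, 5
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
b_h = np.zeros(hidden_size)

sequence = rng.normal(size=(seq_len, input_size))
hidden_states = rnn_forward(sequence, W_xh, W_hh, b_h)
print(hidden_states.shape)  # (5, 3): one hidden state per time step
```

Because the loop reuses one set of weights, BPTT accumulates the gradient for those weights across all time steps of the unrolled sequence.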

A specific RNN architecture that has been extremely successful is the Long Short-Term Memory (LSTM) network. As the name suggests, it is designed to learn long-term dependencies. It does this with “gates”: vectors of values between 0 and 1, produced by a sigmoid activation applied to weights that are learned through backpropagation. The gates let the network decide how much of the previous cell state to keep, how much new information to write into it, and how much of it to expose as the hidden state. In essence, the architecture mitigates the vanishing gradient problem from which a regular RNN suffers. The result is an RNN unit with more parameters, but one whose structure lets it learn from much longer sequences.
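As a rough illustration of the gating described above, here is a sketch of a single LSTM time step in NumPy, using the standard input/forget/output gate formulation. The function name lstm_step, the stacked weight layout, and the sizes are assumptions for this example; a real project would normally use a deep learning library's LSTM layer instead.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W has shape (4*H, D+H) and b has shape (4*H,)."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    i = sigmoid(z[0:H])          # input gate: how much new information to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much cell state to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate values for the cell state
    c_t = f * c_prev + i * g     # new cell state
    h_t = o * np.tanh(c_t)       # new hidden state
    return h_t, c_t, (i, f, o)

# Illustrative sizes and random (untrained) weights.
rng = np.random.default_rng(0)
D, H = 4, 3
W = rng.normal(scale=0.1, size=(4 * H, D + H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.normal(size=(5, D)):   # a sequence of 5 input vectors
    h, c, gates = lstm_step(x_t, h, c, W, b)

print(h.shape, c.shape)  # (3,) (3,)
```

The additive update of the cell state (f * c_prev + i * g) is the part that helps gradients survive over many time steps.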

Predicting Next Words

Text generation can be modelled as predicting the next word from the sequence seen so far (isn’t that how humans do “fill in the blank”?). To train the network, we slide a window over the text: the words inside the window are the input, and the word right after the window is the target. For example, in the sentence “I love ice cream”, you’d use the phrase “I love ice” to predict “cream”. After training, the LSTM takes the sequence so far and outputs a probability distribution over the next word. The only thing left is to sample from that distribution. Don’t just take the argmax, or else the same seed will always produce exactly the same poem, and the output will often fall into repetitive loops.

In “The Unreasonable Effectiveness of Recurrent Neural Networks” (http://karpathy.github.io/2015/05/21/rnn-effectiveness/), Andrej Karpathy demonstrates many interesting applications of RNNs. In his post, he trained a character-level RNN on the works of Shakespeare, predicting the next character rather than the next word, and generated this sample:

"PANDARUS:

Alas, I think he shall be come approached and the day

When little srain would be attain'd into being never fed,

And who is but a chain and subjects of his death,

I should not sleep.

Second Senator:

They are away this miseries, produced upon my soul,

Breaking and strongly should be buried, when I perish

The earth and thoughts of many states.

DUKE VINCENTIO:

Well, your wit is in the care of side and that.

Second Lord:

They would be ruled after this chamber, and

my fair nues begun out of the fact, to be conveyed,

Whose noble souls I'll have the heart of the wars.

Clown:

Come, sir, I will make did behold your worship.

VIOLA:

I'll drink it."

The RNN also seems to have learned that characters sometimes deliver long monologues. For example:

"VIOLA:

Why, Salisbury must find his flesh and thought

That which I am not aps, not a man and in fire,

To show the reining of the raven and the wars

To grace my hand reproach within, and not a fair are hand,

That Caesar and my goodly father's world;

When I was heaven of presence and our fleets,

We spare with hours, but cut thy council I am great,

Murdered and by thy master's ready there

My power to give thee but so much as hell:

Some service in the noble bondman here,

Would show him to her wine.

KING LEAR:

O, if you were a feeble sight, the courtesy of your law,

Your sight and several breath, will wear the gods

With his heads, and my hands are wonder'd at the deeds,

So drop upon your lordship's head, and your opinion

Shall be against your honour."
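The sliding-window training pairs and the sampling step described earlier can be sketched in a few lines of Python. This is only an illustration: make_windows and sample_next are hypothetical helper names, and the three-word vocabulary and its probabilities are made up to stand in for the output of a trained model.

```python
import numpy as np

def make_windows(tokens, window_size):
    """Return (context, target) pairs from a sliding window over the tokens."""
    return [
        (tokens[i:i + window_size], tokens[i + window_size])
        for i in range(len(tokens) - window_size)
    ]

def sample_next(probs, rng):
    """Sample an index from a next-word probability distribution."""
    p = np.asarray(probs, dtype=float)
    p = p / p.sum()                      # guard against rounding error
    return int(rng.choice(len(p), p=p))

# Training pairs for the example sentence from the text.
pairs = make_windows("I love ice cream".split(), window_size=3)
print(pairs)  # [(['I', 'love', 'ice'], 'cream')]

# Made-up distribution standing in for a trained model's output.
vocab = ["cream", "skating", "cold"]
probs = [0.7, 0.2, 0.1]

rng = np.random.default_rng(0)
samples = [vocab[sample_next(probs, rng)] for _ in range(1000)]
greedy = vocab[int(np.argmax(probs))]

print(greedy)               # always the single most likely word
print(sorted(set(samples))) # sampling produces a variety of words
```

With argmax, every run from the same context picks the same word, while sampling picks each word roughly in proportion to its probability, which is what gives the generated text its variety.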

This kind of sampling to generate the next piece of text works for any kind of corpus, so you can generate text learned from articles, books, news, essays, and more. I hope you enjoyed this post. Have a nice day, and have fun making poems! ;)
