
Character-level text generation with LSTM

Author: fchollet
Date created: 2015/06/15
Last modified: 2020/04/30
Description: Generate text from Nietzsche's writings with a character-level LSTM.

Introduction

This example demonstrates how to use an LSTM model to generate text character-by-character.

At least 20 epochs are required before the generated text starts sounding locally coherent.

It is recommended to run this script on GPU, as recurrent networks are quite computationally intensive.

If you try this script on new data, make sure your corpus has at least ~100k characters. ~1M is better.
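A quick way to verify this before training (not part of the original script; the file path below is a placeholder):

# Sanity-check corpus size; "my_corpus.txt" is a hypothetical path.
with open("my_corpus.txt", encoding="utf-8") as f:
    n_chars = len(f.read())
print(f"Corpus has {n_chars} characters")
assert n_chars >= 100_000, "Corpus is likely too small for this example"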

Setup

import keras
from keras import layers

import numpy as np
import random
import io

Prepare the data

path = keras.utils.get_file(
    "nietzsche.txt",
    origin="https://s3.amazonaws.com/text-datasets/nietzsche.txt",
)
with io.open(path, encoding="utf-8") as f:
    text = f.read().lower()
text = text.replace("\n", " ")  # We remove newline chars for nicer display
print("Corpus length:", len(text))

chars = sorted(list(set(text)))
print("Total chars:", len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 40
step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i : i + maxlen])
    next_chars.append(text[i + maxlen])
print("Number of sequences:", len(sentences))

x = np.zeros((len(sentences), maxlen, len(chars)), dtype="bool")
y = np.zeros((len(sentences), len(chars)), dtype="bool")
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1
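To confirm the vectorization round-trips correctly, you can decode the first training example back into characters (a quick check of our own, not part of the original notebook):

# Decode the first one-hot encoded sequence back to characters.
decoded = "".join(indices_char[int(np.argmax(x[0, t]))] for t in range(maxlen))
print(decoded)  # should equal text[:maxlen]
print(indices_char[int(np.argmax(y[0]))])  # should equal text[maxlen]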

Build the model: a single LSTM layer

model = keras.Sequential(
    [
        keras.Input(shape=(maxlen, len(chars))),
        layers.LSTM(128),
        layers.Dense(len(chars), activation="softmax"),
    ]
)
optimizer = keras.optimizers.RMSprop(learning_rate=0.01)
model.compile(loss="categorical_crossentropy", optimizer=optimizer)
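As a quick sanity check (our addition, not in the original notebook), you can print the model summary to confirm the layer output shapes and parameter counts:

# Inspect layer output shapes and parameter counts.
model.summary()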

Prepare the text sampling function

def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype("float64")
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
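The temperature parameter rescales the log-probabilities before sampling: values below 1 sharpen the distribution toward the most likely character, while values above 1 flatten it, producing more surprising output. A small demo on a toy distribution (our addition, not in the original notebook) makes this visible:

# Toy distribution over 3 "characters"; index 0 is strongly favored.
toy_preds = np.array([0.7, 0.2, 0.1])
for temp in [0.2, 1.0, 1.2]:
    draws = [sample(toy_preds, temp) for _ in range(1000)]
    print("temperature", temp, "counts:", np.bincount(draws, minlength=3))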

Train the model

epochs = 40
batch_size = 128

for epoch in range(epochs):
    model.fit(x, y, batch_size=batch_size, epochs=1)
    print()
    print("Generating text after epoch: %d" % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print("...Diversity:", diversity)

        generated = ""
        sentence = text[start_index : start_index + maxlen]
        print('...Generating with seed: "' + sentence + '"')

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.0
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            sentence = sentence[1:] + next_char
            generated += next_char

        print("...Generated: ", generated)
        print("-")
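After training, you may want to generate from a seed of your choice instead of a random corpus slice. One way to package the generation loop (a sketch of our own; generate_text is not defined in the notebook) is:

def generate_text(seed, length=400, diversity=0.5):
    # Seed must be exactly `maxlen` lowercase chars drawn from `chars`.
    assert len(seed) == maxlen
    sentence = seed
    generated = ""
    for _ in range(length):
        x_pred = np.zeros((1, maxlen, len(chars)))
        for t, char in enumerate(sentence):
            x_pred[0, t, char_indices[char]] = 1.0
        preds = model.predict(x_pred, verbose=0)[0]
        next_char = indices_char[sample(preds, diversity)]
        sentence = sentence[1:] + next_char
        generated += next_char
    return generated

print(generate_text(text[:maxlen]))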