# -*- coding: utf-8 -*-
"""
Deploying a Seq2Seq Model with TorchScript
==================================================
**Author:** `Matthew Inkawhich <https://github.com/MatthewInkawhich>`_
"""


######################################################################
# This tutorial will walk through the process of transitioning a
# sequence-to-sequence model to TorchScript using the TorchScript
# API. The model that we will convert is the chatbot model from the
# `Chatbot tutorial <https://pytorch.org/tutorials/beginner/chatbot_tutorial.html>`__.
# You can either treat this tutorial as a “Part 2” to the Chatbot tutorial
# and deploy your own pretrained model, or you can start with this
# document and use a pretrained model that we host. In the latter case,
# you can reference the original Chatbot tutorial for details
# regarding data preprocessing, model theory and definition, and model
# training.
#
# What is TorchScript?
# ----------------------------
#
# During the research and development phase of a deep learning-based
# project, it is advantageous to interact with an **eager**, imperative
# interface like PyTorch’s. This gives users the ability to write
# familiar, idiomatic Python, allowing for the use of Python data
# structures, control flow operations, print statements, and debugging
# utilities. Although the eager interface is a beneficial tool for
# research and experimentation applications, when it comes time to deploy
# the model in a production environment, having a **graph**-based model
# representation is very beneficial. A deferred graph representation
# allows for optimizations such as out-of-order execution, and the ability
# to target highly optimized hardware architectures. Also, a graph-based
# representation enables framework-agnostic model exportation. PyTorch
# provides mechanisms for incrementally converting eager-mode code into
# TorchScript, a statically analyzable and optimizable subset of Python
# that Torch uses to represent deep learning programs independently from
# the Python runtime.
#
# The API for converting eager-mode PyTorch programs into TorchScript is
# found in the ``torch.jit`` module. This module has two core modalities for
# converting an eager-mode model to a TorchScript graph representation:
# **tracing** and **scripting**. The ``torch.jit.trace`` function takes a
# module or function and a set of example inputs. It then runs the example
# input through the function or module while tracing the computational
# steps that are encountered, and outputs a graph-based function that
# performs the traced operations. **Tracing** is great for straightforward
# modules and functions that do not involve data-dependent control flow,
# such as standard convolutional neural networks. However, if a function
# with data-dependent if statements and loops is traced, only the
# operations called along the execution route taken by the example input
# will be recorded. In other words, the control flow itself is not
# captured. To convert modules and functions containing data-dependent
# control flow, a **scripting** mechanism is provided. The
# ``torch.jit.script`` function/decorator takes a module or function and
# does not require example inputs. Scripting then explicitly converts
# the module or function code to TorchScript, including all control flow.
# One caveat with using scripting is that it only supports a subset of
# Python, so you might need to rewrite the code to make it compatible
# with the TorchScript syntax.
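#
# As a quick illustration of the difference (a minimal sketch, using a
# made-up ``choose`` function that is not part of this tutorial's model):
#
# .. code-block:: python
#
#    import torch
#
#    def choose(x):
#        # Data-dependent control flow: the branch taken depends on the input
#        if x.sum() > 0:
#            return x * 2
#        return x + 10
#
#    traced = torch.jit.trace(choose, (torch.ones(3),))  # records only the branch taken
#    scripted = torch.jit.script(choose)                 # compiles both branches
#
#    print(traced(-torch.ones(3)))    # tensor([-2., -2., -2.]) -- wrong branch baked in
#    print(scripted(-torch.ones(3)))  # tensor([9., 9., 9.])    -- control flow preserved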
#
# For all details relating to the supported features, see the `TorchScript
# language reference <https://pytorch.org/docs/master/jit.html>`__.
# To provide the maximum flexibility, you can also mix tracing and scripting
# modes together to represent your whole program, and these techniques can
# be applied incrementally.
#
# .. figure:: /_static/img/chatbot/pytorch_workflow.png
#    :align: center
#    :alt: workflow
#


######################################################################
# Acknowledgments
# ----------------
#
# This tutorial was inspired by the following sources:
#
# 1) Yuan-Kuei Wu's pytorch-chatbot implementation:
#    https://github.com/ywk991112/pytorch-chatbot
#
# 2) Sean Robertson's practical-pytorch seq2seq-translation example:
#    https://github.com/spro/practical-pytorch/tree/master/seq2seq-translation
#
# 3) FloydHub's Cornell Movie Corpus preprocessing code:
#    https://github.com/floydhub/textutil-preprocess-cornell-movie-corpus
#


######################################################################
# Prepare Environment
# -------------------
#
# First, we will import the required modules and set some constants. If
# you are planning on using your own model, be sure that the
# ``MAX_LENGTH`` constant is set correctly. As a reminder, this constant
# defines the maximum allowed sentence length during training and the
# maximum-length output that the model is capable of producing.
#

import torch
import torch.nn as nn
import torch.nn.functional as F
import re
import os
import unicodedata
import numpy as np

device = torch.device("cpu")


MAX_LENGTH = 10  # Maximum sentence length

# Default word tokens
PAD_token = 0  # Used for padding short sentences
SOS_token = 1  # Start-of-sentence token
EOS_token = 2  # End-of-sentence token


######################################################################
# Model Overview
# --------------
#
# As mentioned, the model that we are using is a
# `sequence-to-sequence <https://arxiv.org/abs/1409.3215>`__ (seq2seq)
# model. This type of model is used in cases when our input is a
# variable-length sequence, and our output is also a variable-length
# sequence that is not necessarily a one-to-one mapping of the input. A
# seq2seq model is comprised of two recurrent neural networks (RNNs) that
# work cooperatively: an **encoder** and a **decoder**.
#
# .. figure:: /_static/img/chatbot/seq2seq_ts.png
#    :align: center
#    :alt: model
#
#
# Image source:
# https://jeddy92.github.io/JEddy92.github.io/ts_seq2seq_intro/
#
# Encoder
# ~~~~~~~
#
# The encoder RNN iterates through the input sentence one token
# (e.g. word) at a time, at each time step outputting an “output” vector
# and a “hidden state” vector. The hidden state vector is then passed to
# the next time step, while the output vector is recorded.
# The encoder transforms the context it saw at each point in the
# sequence into a set of points in a high-dimensional space, which the
# decoder will use to generate a meaningful output for the given task.
#
# Decoder
# ~~~~~~~
#
# The decoder RNN generates the response sentence in a token-by-token
# fashion. It uses the encoder’s context vectors and internal hidden
# states to generate the next word in the sequence. It continues
# generating words until it outputs an *EOS_token*, representing the end
# of the sentence. We use an `attention
# mechanism <https://arxiv.org/abs/1409.0473>`__ in our decoder to help it
# to “pay attention” to certain parts of the input when generating the
# output. For our model, we implement `Luong et
# al. <https://arxiv.org/abs/1508.04025>`__\ ’s “Global attention” module,
# and use it as a submodule in our decoder model.
#


######################################################################
# Data Handling
# -------------
#
# Although our models conceptually deal with sequences of tokens, in
# reality, they deal with numbers like all machine learning models do. In
# this case, every word in the model’s vocabulary, which was established
# before training, is mapped to an integer index. We use a ``Voc`` object
# to contain the mappings from word to index, as well as the total number
# of words in the vocabulary. We will load the object later, before we run
# the model.
#
# Also, in order for us to be able to run evaluations, we must provide a
# tool for processing our string inputs. The ``normalizeString`` function
# converts all characters in a string to lowercase and removes all
# non-letter characters. The ``indexesFromSentence`` function takes a
# sentence of words and returns the corresponding sequence of word
# indexes.
#

class Voc:
    def __init__(self, name):
        self.name = name
        self.trimmed = False
        self.word2index = {}
        self.word2count = {}
        self.index2word = {PAD_token: "PAD", SOS_token: "SOS", EOS_token: "EOS"}
        self.num_words = 3  # Count SOS, EOS, PAD

    def addSentence(self, sentence):
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        if word not in self.word2index:
            self.word2index[word] = self.num_words
            self.word2count[word] = 1
            self.index2word[self.num_words] = word
            self.num_words += 1
        else:
            self.word2count[word] += 1

    # Remove words below a certain count threshold
    def trim(self, min_count):
        if self.trimmed:
            return
        self.trimmed = True
        keep_words = []
        for k, v in self.word2count.items():
            if v >= min_count:
                keep_words.append(k)

        print('keep_words {} / {} = {:.4f}'.format(
            len(keep_words), len(self.word2index), len(keep_words) / len(self.word2index)
        ))
        # Reinitialize dictionaries
        self.word2index = {}
        self.word2count = {}
        self.index2word = {PAD_token: "PAD", SOS_token: "SOS", EOS_token: "EOS"}
        self.num_words = 3  # Count default tokens
        for word in keep_words:
            self.addWord(word)


# Lowercase and remove non-letter characters
def normalizeString(s):
    s = s.lower()
    s = re.sub(r"([.!?])", r" \1", s)
    s = re.sub(r"[^a-zA-Z.!?]+", r" ", s)
    return s


# Takes string sentence, returns sentence of word indexes
def indexesFromSentence(voc, sentence):
    return [voc.word2index[word] for word in sentence.split(' ')] + [EOS_token]
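

######################################################################
# As a quick illustration (a minimal sketch using a throwaway ``Voc``
# instance, not the vocabulary we load later):
#
# .. code-block:: python
#
#    demo_voc = Voc("demo")
#    demo_voc.addSentence(normalizeString("Hello, World!"))  # adds "hello world !"
#
#    print(normalizeString("Hello, World!"))                # hello world !
#    print(indexesFromSentence(demo_voc, "hello world !"))  # [3, 4, 5, 2]  (2 == EOS_token)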


######################################################################
# Define Encoder
# --------------
#
# We implement our encoder’s RNN with the ``torch.nn.GRU`` module, which we
# feed a batch of sentences (vectors of word embeddings), and which internally
# iterates through the sentences one token at a time, calculating the
# hidden states. We initialize this module to be bidirectional, meaning
# that we have two independent GRUs: one that iterates through the
# sequences in chronological order, and another that iterates in reverse
# order. We ultimately return the sum of these two GRUs’ outputs. Since
# our model was trained using batching, our ``EncoderRNN`` model’s
# ``forward`` function expects a padded input batch. To batch
# variable-length sentences, we allow a maximum of *MAX_LENGTH* tokens in
# a sentence, and all sentences in the batch that have fewer than
# *MAX_LENGTH* tokens are padded at the end with our dedicated *PAD_token*
# tokens. To use padded batches with a PyTorch RNN module, we must wrap
# the forward pass call with ``torch.nn.utils.rnn.pack_padded_sequence``
# and ``torch.nn.utils.rnn.pad_packed_sequence`` data transformations.
# Note that the ``forward`` function also takes an ``input_lengths`` list,
# which contains the length of each sentence in the batch. This input is
# used by the ``torch.nn.utils.rnn.pack_padded_sequence`` function when
# packing.
#
# TorchScript Notes:
# ~~~~~~~~~~~~~~~~~~~~~~
#
# Since the encoder’s ``forward`` function does not contain any
# data-dependent control flow, we will use **tracing** to convert it to
# TorchScript. When tracing a module, we can leave the module definition
# as-is. We will initialize all models towards the end of this document
# before we run evaluations.
#

class EncoderRNN(nn.Module):
    def __init__(self, hidden_size, embedding, n_layers=1, dropout=0):
        super(EncoderRNN, self).__init__()
        self.n_layers = n_layers
        self.hidden_size = hidden_size
        self.embedding = embedding

        # Initialize GRU; the ``input_size`` and ``hidden_size`` parameters are both set to 'hidden_size'
        # because our input size is a word embedding with number of features == hidden_size
        self.gru = nn.GRU(hidden_size, hidden_size, n_layers,
                          dropout=(0 if n_layers == 1 else dropout), bidirectional=True)

    def forward(self, input_seq, input_lengths, hidden=None):
        # type: (Tensor, Tensor, Optional[Tensor]) -> Tuple[Tensor, Tensor]
        # Convert word indexes to embeddings
        embedded = self.embedding(input_seq)
        # Pack padded batch of sequences for RNN module
        packed = torch.nn.utils.rnn.pack_padded_sequence(embedded, input_lengths)
        # Forward pass through GRU
        outputs, hidden = self.gru(packed, hidden)
        # Unpack padding
        outputs, _ = torch.nn.utils.rnn.pad_packed_sequence(outputs)
        # Sum bidirectional GRU outputs
        outputs = outputs[:, :, :self.hidden_size] + outputs[:, :, self.hidden_size:]
        # Return output and final hidden state
        return outputs, hidden
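

######################################################################
# For example (a minimal sketch with made-up sizes, not the trained
# model), a padded batch of two sentences flows through as follows:
#
# .. code-block:: python
#
#    emb = nn.Embedding(10, 8)            # vocab of 10, hidden_size of 8
#    enc = EncoderRNN(hidden_size=8, embedding=emb)
#    # Shape (max_length, batch) = (4, 2); second sentence padded with PAD_token
#    seqs = torch.LongTensor([[4, 6], [5, 7], [3, 8], [9, PAD_token]])
#    lengths = torch.tensor([4, 3])       # sorted longest-first
#    outputs, hidden = enc(seqs, lengths)
#    print(outputs.shape)                 # torch.Size([4, 2, 8])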


######################################################################
# Define Decoder’s Attention Module
# ---------------------------------
#
# Next, we’ll define our attention module (``Attn``). Note that this
# module will be used as a submodule in our decoder model. Luong et
# al. consider various “score functions”, which take the current decoder
# RNN output and the entire encoder output, and return attention
# “energies”. The resulting attention energies tensor is the same size as
# the encoder output, and the two are ultimately multiplied, resulting in a
# weighted tensor whose largest values represent the most important parts
# of the query sentence at a particular time step of decoding.
#

# Luong attention layer
class Attn(nn.Module):
    def __init__(self, method, hidden_size):
        super(Attn, self).__init__()
        self.method = method
        if self.method not in ['dot', 'general', 'concat']:
            raise ValueError(self.method, "is not an appropriate attention method.")
        self.hidden_size = hidden_size
        if self.method == 'general':
            self.attn = nn.Linear(self.hidden_size, hidden_size)
        elif self.method == 'concat':
            self.attn = nn.Linear(self.hidden_size * 2, hidden_size)
            self.v = nn.Parameter(torch.FloatTensor(hidden_size))

    def dot_score(self, hidden, encoder_output):
        return torch.sum(hidden * encoder_output, dim=2)

    def general_score(self, hidden, encoder_output):
        energy = self.attn(encoder_output)
        return torch.sum(hidden * energy, dim=2)

    def concat_score(self, hidden, encoder_output):
        energy = self.attn(torch.cat((hidden.expand(encoder_output.size(0), -1, -1), encoder_output), 2)).tanh()
        return torch.sum(self.v * energy, dim=2)

    def forward(self, hidden, encoder_outputs):
        # Calculate the attention weights (energies) based on the given method
        if self.method == 'general':
            attn_energies = self.general_score(hidden, encoder_outputs)
        elif self.method == 'concat':
            attn_energies = self.concat_score(hidden, encoder_outputs)
        elif self.method == 'dot':
            attn_energies = self.dot_score(hidden, encoder_outputs)

        # Transpose max_length and batch_size dimensions
        attn_energies = attn_energies.t()

        # Return the softmax normalized probability scores (with added dimension)
        return F.softmax(attn_energies, dim=1).unsqueeze(1)
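

######################################################################
# For instance (a minimal sketch with made-up shapes), ``Attn`` maps a
# single decoder step and the full encoder output to normalized weights:
#
# .. code-block:: python
#
#    attn = Attn('dot', hidden_size=8)
#    dec_out = torch.randn(1, 2, 8)    # (1, batch, hidden): one decoder step
#    enc_outs = torch.randn(4, 2, 8)   # (max_length, batch, hidden)
#    weights = attn(dec_out, enc_outs)
#    print(weights.shape)              # torch.Size([2, 1, 4])
#    print(weights.sum(dim=2))         # each row sums to 1 (softmax over time)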


######################################################################
# Define Decoder
# --------------
#
# Similarly to the ``EncoderRNN``, we use the ``torch.nn.GRU`` module for
# our decoder’s RNN. This time, however, we use a unidirectional GRU. It
# is important to note that unlike the encoder, we will feed the decoder
# RNN one word at a time. We start by getting the embedding of the current
# word and applying a
# `dropout <https://pytorch.org/docs/stable/nn.html?highlight=dropout#torch.nn.Dropout>`__.
# Next, we forward the embedding and the last hidden state to the GRU and
# obtain a current GRU output and hidden state. We then use our ``Attn``
# module as a layer to obtain the attention weights, which we multiply by
# the encoder’s output to obtain our attended encoder output. We use this
# attended encoder output as our ``context`` tensor, which represents a
# weighted sum indicating what parts of the encoder’s output to pay
# attention to. From here, we use a linear layer and softmax normalization
# to select the next word in the output sequence.
#
# TorchScript Notes:
# ~~~~~~~~~~~~~~~~~~~~~~
#
# Similarly to the ``EncoderRNN``, this module does not contain any
# data-dependent control flow. Therefore, we can once again use
# **tracing** to convert this model to TorchScript after it
# is initialized and its parameters are loaded.
#

class LuongAttnDecoderRNN(nn.Module):
    def __init__(self, attn_model, embedding, hidden_size, output_size, n_layers=1, dropout=0.1):
        super(LuongAttnDecoderRNN, self).__init__()

        # Keep for reference
        self.attn_model = attn_model
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.n_layers = n_layers
        self.dropout = dropout

        # Define layers
        self.embedding = embedding
        self.embedding_dropout = nn.Dropout(dropout)
        self.gru = nn.GRU(hidden_size, hidden_size, n_layers, dropout=(0 if n_layers == 1 else dropout))
        self.concat = nn.Linear(hidden_size * 2, hidden_size)
        self.out = nn.Linear(hidden_size, output_size)

        self.attn = Attn(attn_model, hidden_size)

    def forward(self, input_step, last_hidden, encoder_outputs):
        # Note: we run this one step (word) at a time
        # Get embedding of current input word
        embedded = self.embedding(input_step)
        embedded = self.embedding_dropout(embedded)
        # Forward through unidirectional GRU
        rnn_output, hidden = self.gru(embedded, last_hidden)
        # Calculate attention weights from the current GRU output
        attn_weights = self.attn(rnn_output, encoder_outputs)
        # Multiply attention weights to encoder outputs to get new "weighted sum" context vector
        context = attn_weights.bmm(encoder_outputs.transpose(0, 1))
        # Concatenate weighted context vector and GRU output using Luong eq. 5
        rnn_output = rnn_output.squeeze(0)
        context = context.squeeze(1)
        concat_input = torch.cat((rnn_output, context), 1)
        concat_output = torch.tanh(self.concat(concat_input))
        # Predict next word using Luong eq. 6
        output = self.out(concat_output)
        output = F.softmax(output, dim=1)
        # Return output and final hidden state
        return output, hidden
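

######################################################################
# A single decoding step then looks like this (a minimal sketch with
# made-up shapes and untrained weights):
#
# .. code-block:: python
#
#    emb = nn.Embedding(10, 8)
#    dec = LuongAttnDecoderRNN('dot', emb, hidden_size=8, output_size=10)
#    step = torch.LongTensor([[SOS_token]])   # (1, 1): one word, batch of 1
#    hidden = torch.zeros(1, 1, 8)            # (n_layers, batch, hidden)
#    enc_outs = torch.randn(4, 1, 8)
#    probs, hidden = dec(step, hidden, enc_outs)
#    print(probs.shape)                       # torch.Size([1, 10]): softmax over the vocab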


######################################################################
# Define Evaluation
# -----------------
#
# Greedy Search Decoder
# ~~~~~~~~~~~~~~~~~~~~~
#
# As in the chatbot tutorial, we use a ``GreedySearchDecoder`` module to
# facilitate the actual decoding process. This module has the trained
# encoder and decoder models as attributes, and drives the process of
# encoding an input sentence (a vector of word indexes), and iteratively
# decoding an output response sequence one word (word index) at a time.
#
# Encoding the input sequence is straightforward: simply forward the
# entire sequence tensor and its corresponding lengths vector to the
# ``encoder``. It is important to note that this module only deals with
# one input sequence at a time, **NOT** batches of sequences. Therefore,
# when the constant **1** is used for declaring tensor sizes, this
# corresponds to a batch size of 1. To decode a given decoder output, we
# must iteratively run forward passes through our decoder model, which
# outputs softmax scores corresponding to the probability of each word
# being the correct next word in the decoded sequence. We initialize the
# ``decoder_input`` to a tensor containing an *SOS_token*. After each pass
# through the ``decoder``, we *greedily* append the word with the highest
# softmax probability to the ``decoded_words`` list. We also use this word
# as the ``decoder_input`` for the next iteration. The decoding process
# terminates either if the ``decoded_words`` list has reached a length of
# *MAX_LENGTH* or if the predicted word is the *EOS_token*.
#
# TorchScript Notes:
# ~~~~~~~~~~~~~~~~~~~~~~
#
# The ``forward`` method of this module involves iterating over the range
# of :math:`[0, max\_length)` when decoding an output sequence one word at
# a time. Because of this, we should use **scripting** to convert this
# module to TorchScript. Unlike with our encoder and decoder models,
# which we can trace, we must make some necessary changes to the
# ``GreedySearchDecoder`` module in order to initialize an object without
# error. In other words, we must ensure that our module adheres to the
# rules of the TorchScript mechanism, and does not utilize any language
# features outside of the subset of Python that TorchScript includes.
#
# To get an idea of some manipulations that may be required, we will go
# over the diffs between the ``GreedySearchDecoder`` implementation from
# the chatbot tutorial and the implementation that we use in the cell
# below. Note that the lines highlighted in red are lines removed from the
# original implementation and the lines highlighted in green are new.
#
# .. figure:: /_static/img/chatbot/diff.png
#    :align: center
#    :alt: diff
#
# Changes:
# ^^^^^^^^
#
# - Added ``decoder_n_layers`` to the constructor arguments
#
#    - This change stems from the fact that the encoder and decoder
#      models that we pass to this module will be a child of
#      ``TracedModule`` (not ``Module``). Therefore, we cannot access the
#      decoder’s number of layers with ``decoder.n_layers``. Instead, we
#      plan for this, and pass this value in during module construction.
#
#
# - Store away new attributes as constants
#
#    - In the original implementation, we were free to use variables from
#      the surrounding (global) scope in our ``GreedySearchDecoder``\ ’s
#      ``forward`` method. However, now that we are using scripting, we
#      do not have this freedom, as the assumption with scripting is that
#      we cannot necessarily hold on to Python objects, especially when
#      exporting. An easy solution to this is to store these values from
#      the global scope as attributes to the module in the constructor,
#      and add them to a special list called ``__constants__`` so that
#      they can be used as literal values when constructing the graph in
#      the ``forward`` method. An example of this usage is on NEW line
#      19, where instead of using the ``device`` and ``SOS_token`` global
#      values, we use our constant attributes ``self._device`` and
#      ``self._SOS_token``.
#
#
# - Enforce types of ``forward`` method arguments
#
#    - By default, all parameters to a TorchScript function are assumed
#      to be Tensor. If we need to pass an argument of a different type,
#      we can use function type annotations as introduced in `PEP
#      3107 <https://www.python.org/dev/peps/pep-3107/>`__. In addition,
#      it is possible to declare arguments of different types using
#      Mypy-style type annotations (see
#      `doc <https://pytorch.org/docs/master/jit.html#types>`__).
#
#
# - Change initialization of ``decoder_input``
#
#    - In the original implementation, we initialized our
#      ``decoder_input`` tensor with ``torch.LongTensor([[SOS_token]])``.
#      When scripting, we are not allowed to initialize tensors in a
#      literal fashion like this. Instead, we can initialize our tensor
#      with an explicit torch function such as ``torch.ones``. In this
#      case, we can easily replicate the scalar ``decoder_input`` tensor
#      by multiplying 1 by our SOS_token value stored in the constant
#      ``self._SOS_token``.
#
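
######################################################################
# To make the ``__constants__`` and type-annotation requirements concrete,
# here is a minimal sketch with a made-up ``Scale`` module (not part of
# the chatbot model):
#
# .. code-block:: python
#
#    class Scale(nn.Module):
#        __constants__ = ['factor']
#
#        def __init__(self, factor):
#            super(Scale, self).__init__()
#            self.factor = factor
#
#        def forward(self, x: torch.Tensor, n: int):
#            # ``factor`` is inlined as a constant; ``n`` is typed via annotation
#            return x * self.factor + n
#
#    scripted = torch.jit.script(Scale(2.0))
#    print(scripted(torch.ones(2), 3))  # tensor([5., 5.])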

class GreedySearchDecoder(nn.Module):
    def __init__(self, encoder, decoder, decoder_n_layers):
        super(GreedySearchDecoder, self).__init__()
        self.encoder = encoder
        self.decoder = decoder
        self._device = device
        self._SOS_token = SOS_token
        self._decoder_n_layers = decoder_n_layers

    __constants__ = ['_device', '_SOS_token', '_decoder_n_layers']

    def forward(self, input_seq: torch.Tensor, input_length: torch.Tensor, max_length: int):
        # Forward input through encoder model
        encoder_outputs, encoder_hidden = self.encoder(input_seq, input_length)
        # Prepare encoder's final hidden layer to be first hidden input to the decoder
        decoder_hidden = encoder_hidden[:self._decoder_n_layers]
        # Initialize decoder input with SOS_token
        decoder_input = torch.ones(1, 1, device=self._device, dtype=torch.long) * self._SOS_token
        # Initialize tensors to append decoded words to
        all_tokens = torch.zeros([0], device=self._device, dtype=torch.long)
        all_scores = torch.zeros([0], device=self._device)
        # Iteratively decode one word token at a time
        for _ in range(max_length):
            # Forward pass through decoder
            decoder_output, decoder_hidden = self.decoder(decoder_input, decoder_hidden, encoder_outputs)
            # Obtain most likely word token and its softmax score
            decoder_scores, decoder_input = torch.max(decoder_output, dim=1)
            # Record token and score
            all_tokens = torch.cat((all_tokens, decoder_input), dim=0)
            all_scores = torch.cat((all_scores, decoder_scores), dim=0)
            # Prepare current token to be next decoder input (add a dimension)
            decoder_input = torch.unsqueeze(decoder_input, 0)
        # Return collections of word tokens and scores
        return all_tokens, all_scores
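

######################################################################
# The greedy choice above relies on ``torch.max`` returning both the best
# score and its index along the vocabulary dimension (a minimal sketch
# with made-up scores):
#
# .. code-block:: python
#
#    decoder_output = torch.tensor([[0.1, 0.7, 0.2]])  # (1, vocab_size)
#    score, token = torch.max(decoder_output, dim=1)
#    print(score, token)  # tensor([0.7000]) tensor([1])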


######################################################################
# Evaluating an Input
# ~~~~~~~~~~~~~~~~~~~
#
# Next, we define some functions for evaluating an input. The ``evaluate``
# function takes a normalized string sentence, processes it to a tensor of
# its corresponding word indexes (with batch size of 1), and passes this
# tensor to a ``GreedySearchDecoder`` instance called ``searcher`` to
# handle the encoding/decoding process. The searcher returns the output
# word index vector and a scores tensor corresponding to the softmax
# scores for each decoded word token. The final step is to convert each
# word index back to its string representation using ``voc.index2word``.
#
# We also define two functions for evaluating an input sentence. The
# ``evaluateInput`` function prompts a user for an input, and evaluates
# it. It will continue to ask for another input until the user enters ‘q’
# or ‘quit’.
#
# The ``evaluateExample`` function simply takes a string input sentence as
# an argument, normalizes it, evaluates it, and prints the response.
#

def evaluate(searcher, voc, sentence, max_length=MAX_LENGTH):
    ### Format input sentence as a batch
    # words -> indexes
    indexes_batch = [indexesFromSentence(voc, sentence)]
    # Create lengths tensor
    lengths = torch.tensor([len(indexes) for indexes in indexes_batch])
    # Transpose dimensions of batch to match models' expectations
    input_batch = torch.LongTensor(indexes_batch).transpose(0, 1)
    # Use appropriate device
    input_batch = input_batch.to(device)
    lengths = lengths.to(device)
    # Decode sentence with searcher
    tokens, scores = searcher(input_batch, lengths, max_length)
    # indexes -> words
    decoded_words = [voc.index2word[token.item()] for token in tokens]
    return decoded_words


# Evaluate inputs from user input (``stdin``)
def evaluateInput(searcher, voc):
    input_sentence = ''
    while True:
        try:
            # Get input sentence
            input_sentence = input('> ')
            # Check if it is quit case
            if input_sentence == 'q' or input_sentence == 'quit': break
            # Normalize sentence
            input_sentence = normalizeString(input_sentence)
            # Evaluate sentence
            output_words = evaluate(searcher, voc, input_sentence)
            # Format and print response sentence
            output_words[:] = [x for x in output_words if not (x == 'EOS' or x == 'PAD')]
            print('Bot:', ' '.join(output_words))

        except KeyError:
            print("Error: Encountered unknown word.")


# Normalize input sentence and call ``evaluate()``
def evaluateExample(sentence, searcher, voc):
    print("> " + sentence)
    # Normalize sentence
    input_sentence = normalizeString(sentence)
    # Evaluate sentence
    output_words = evaluate(searcher, voc, input_sentence)
    output_words[:] = [x for x in output_words if not (x == 'EOS' or x == 'PAD')]
    print('Bot:', ' '.join(output_words))


######################################################################
# Load Pretrained Parameters
# --------------------------
#
# Now, let’s load our model!
#
# Use hosted model
# ~~~~~~~~~~~~~~~~
#
# To load the hosted model:
#
# 1) Download the model `here <https://download.pytorch.org/models/tutorials/4000_checkpoint.tar>`__.
#
# 2) Set the ``loadFilename`` variable to the path to the downloaded
#    checkpoint file.
#
# 3) Leave the ``checkpoint = torch.load(loadFilename)`` line uncommented,
#    as the hosted model was trained on CPU.
#
# Use your own model
# ~~~~~~~~~~~~~~~~~~
#
# To load your own pretrained model:
#
# 1) Set the ``loadFilename`` variable to the path to the checkpoint file
#    that you wish to load.
#    Note that if you followed the convention for
#    saving the model from the chatbot tutorial, this may involve changing
#    the ``model_name``, ``encoder_n_layers``, ``decoder_n_layers``,
#    ``hidden_size``, and ``checkpoint_iter`` (as these values are used in
#    the model path).
#
# 2) If you trained the model on a CPU, make sure that you are opening the
#    checkpoint with the ``checkpoint = torch.load(loadFilename)`` line.
#    If you trained the model on a GPU and are running this tutorial on a
#    CPU, uncomment the
#    ``checkpoint = torch.load(loadFilename, map_location=torch.device('cpu'))``
#    line.
#
# TorchScript Notes:
# ~~~~~~~~~~~~~~~~~~~~~~
#
# Notice that we initialize and load parameters into our encoder and
# decoder models as usual. If you are using tracing mode (``torch.jit.trace``)
# for some part of your models, you must call ``.to(device)`` to set the device
# options of the models and ``.eval()`` to set the dropout layers to test mode
# **before** tracing the models, because ``TracedModule`` objects do not inherit
# the ``to`` or ``eval`` methods. Since we trace the encoder and decoder below,
# we do both of these here, right after loading the parameters (which is the
# same as we normally do in eager mode).
#

save_dir = os.path.join("data", "save")
corpus_name = "cornell movie-dialogs corpus"

# Configure models
model_name = 'cb_model'
attn_model = 'dot'
#attn_model = 'general'
#attn_model = 'concat'
hidden_size = 500
encoder_n_layers = 2
decoder_n_layers = 2
dropout = 0.1
batch_size = 64

# If you're loading your own model
# Set checkpoint to load from
checkpoint_iter = 4000

#############################################################
# Sample code to load from a checkpoint:
#
# .. code-block:: python
#
#    loadFilename = os.path.join(save_dir, model_name, corpus_name,
#                                '{}-{}_{}'.format(encoder_n_layers, decoder_n_layers, hidden_size),
#                                '{}_checkpoint.tar'.format(checkpoint_iter))

# If you're loading the hosted model
loadFilename = 'data/4000_checkpoint.tar'

# Load model
# Force CPU device options (to match tensors in this tutorial)
checkpoint = torch.load(loadFilename, map_location=torch.device('cpu'))
encoder_sd = checkpoint['en']
decoder_sd = checkpoint['de']
encoder_optimizer_sd = checkpoint['en_opt']
decoder_optimizer_sd = checkpoint['de_opt']
embedding_sd = checkpoint['embedding']
voc = Voc(corpus_name)
voc.__dict__ = checkpoint['voc_dict']


print('Building encoder and decoder ...')
# Initialize word embeddings
embedding = nn.Embedding(voc.num_words, hidden_size)
embedding.load_state_dict(embedding_sd)
# Initialize encoder & decoder models
encoder = EncoderRNN(hidden_size, embedding, encoder_n_layers, dropout)
decoder = LuongAttnDecoderRNN(attn_model, embedding, hidden_size, voc.num_words, decoder_n_layers, dropout)
# Load trained model parameters
encoder.load_state_dict(encoder_sd)
decoder.load_state_dict(decoder_sd)
# Use appropriate device
encoder = encoder.to(device)
decoder = decoder.to(device)
# Set dropout layers to ``eval`` mode
encoder.eval()
decoder.eval()
print('Models built and ready to go!')


######################################################################
# Convert Model to TorchScript
# -----------------------------
#
# Encoder
# ~~~~~~~
#
# As previously mentioned, to convert the encoder model to TorchScript,
# we use **tracing**. The encoder model takes an input sequence and
# a corresponding lengths tensor. Therefore, we create an example input
# sequence tensor ``test_seq``, which is of appropriate size (MAX_LENGTH,
# 1), contains numbers in the appropriate range
# :math:`[0, voc.num\_words)`, and is of the appropriate type (int64). We
# also create a ``test_seq_length`` scalar which realistically contains
# the value corresponding to how many words are in the ``test_seq``. The
# next step is to use the ``torch.jit.trace`` function to trace the model.
# Notice that the first argument we pass is the module that we want to
# trace, and the second is a tuple of arguments to the module’s
# ``forward`` method.
#
# Decoder
# ~~~~~~~
#
# We perform the same process for tracing the decoder as we did for the
# encoder. Notice that we call forward on a set of random inputs to the
# traced_encoder to get the output that we need for the decoder. This is
# not required, as we could also simply manufacture a tensor of the
# correct shape, type, and value range. This method is possible because in
# our case we do not have any constraints on the values of the tensors,
# because we do not have any operations that could fault on out-of-range
# inputs.
#
# GreedySearchDecoder
# ~~~~~~~~~~~~~~~~~~~
#
# Recall that we scripted our searcher module due to the presence of
# data-dependent control flow. In the case of scripting, we made the
# necessary language changes to ensure that the implementation complies
# with TorchScript. We initialize the scripted searcher the same way that
# we would initialize an unscripted variant.
#

### Compile the whole greedy search model to TorchScript model
# Create artificial inputs
test_seq = torch.LongTensor(MAX_LENGTH, 1).random_(0, voc.num_words).to(device)
test_seq_length = torch.LongTensor([test_seq.size()[0]]).to(device)
# Trace the model
traced_encoder = torch.jit.trace(encoder, (test_seq, test_seq_length))

### Convert decoder model
# Create and generate artificial inputs
test_encoder_outputs, test_encoder_hidden = traced_encoder(test_seq, test_seq_length)
test_decoder_hidden = test_encoder_hidden[:decoder.n_layers]
test_decoder_input = torch.LongTensor(1, 1).random_(0, voc.num_words)
# Trace the model
traced_decoder = torch.jit.trace(decoder, (test_decoder_input, test_decoder_hidden, test_encoder_outputs))

### Initialize searcher module by wrapping ``torch.jit.script`` call
scripted_searcher = torch.jit.script(GreedySearchDecoder(traced_encoder, traced_decoder, decoder.n_layers))


######################################################################
# Print Graphs
# ------------
#
# Now that our models are in TorchScript form, we can print the graphs of
# each to ensure that we captured the computational graph appropriately.
# Since TorchScript allows us to recursively compile the whole model
# hierarchy and inline the ``encoder`` and ``decoder`` graphs into a single
# graph, we just need to print the ``scripted_searcher`` graph.

print('scripted_searcher graph:\n', scripted_searcher.graph)
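

######################################################################
# Beyond the raw graph IR, a ScriptModule also exposes a ``code`` property
# that prints the same program as Python-like source, which is often easier
# to read (a quick optional check):
#
# .. code-block:: python
#
#    print(scripted_searcher.code)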


######################################################################
# Run Evaluation
# --------------
#
# Finally, we will run evaluation of the chatbot model using the TorchScript
# models. If converted correctly, the models will behave exactly as they
# would in their eager-mode representation.
#
# By default, we evaluate a few common query sentences. If you want to
# chat with the bot yourself, uncomment the ``evaluateInput`` line and
# give it a spin.
#


# Use appropriate device
scripted_searcher.to(device)
# Set dropout layers to ``eval`` mode
scripted_searcher.eval()

# Evaluate examples
sentences = ["hello", "what's up?", "who are you?", "where am I?", "where are you from?"]
for s in sentences:
    evaluateExample(s, scripted_searcher, voc)

# Evaluate your input by running
# ``evaluateInput(scripted_searcher, voc)``


######################################################################
# Save Model
# ----------
#
# Now that we have successfully converted our model to TorchScript, we
# will serialize it for use in a non-Python deployment environment. To do
# this, we can simply save our ``scripted_searcher`` module, as this is
# the user-facing interface for running inference against the chatbot
# model. When saving a Script module, use ``script_module.save(PATH)``
# instead of ``torch.save(model, PATH)``.
#

scripted_searcher.save("scripted_chatbot.pth")
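

######################################################################
# As a quick sanity check (a minimal sketch), the serialized module can be
# loaded back with ``torch.jit.load`` and run without the original class
# definitions:
#
# .. code-block:: python
#
#    reloaded = torch.jit.load("scripted_chatbot.pth")
#    tokens, scores = reloaded(test_seq, test_seq_length, MAX_LENGTH)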