Monday, July 30, 2018

Text Classification with Deep Neural Network in TensorFlow - Simple Explanation

Text classification with TensorFlow can be simple to implement. One of the areas where text classification can be applied is chatbot text processing and intent resolution. In this post I will describe, step by step, how to build a TensorFlow model for text classification and how the classification is done. Please refer to my previous post on a similar topic - Contextual Chatbot with TensorFlow, Node.js and Oracle JET - Steps How to Install and Get It Working. I would also recommend going through this great post about chatbot implementation - Contextual Chatbots with Tensorflow.

The complete source code is available in the GitHub repo (refer to the steps described in the blog post referenced above).

Text classification implementation:

Step 1: Preparing Data
  • Tokenise patterns into arrays of words
  • Lower-case and stem all words. Example: pharmacy = pharm. This helps represent related words with a single stem
  • Create a list of classes - the intents
  • Create a list of documents - combinations of patterns and their intents
Python implementation:
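A minimal sketch of this step, assuming the intents are defined in an intents.json file with the usual {"intents": [{"tag": ..., "patterns": [...]}]} layout (the file name and layout are assumptions) and using NLTK's Lancaster stemmer:

import json
import nltk
from nltk.stem.lancaster import LancasterStemmer

# nltk.download('punkt')  # required once for word_tokenize
stemmer = LancasterStemmer()

with open('intents.json') as f:
    intents = json.load(f)

words = []       # stemmed vocabulary
classes = []     # list of intents
documents = []   # (tokenised pattern, intent) pairs
ignore_words = ['?']

for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenise each pattern into an array of words
        w = nltk.word_tokenize(pattern)
        words.extend(w)
        documents.append((w, intent['tag']))
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

# lower case, stem and deduplicate the vocabulary
words = sorted(set(stemmer.stem(w.lower()) for w in words if w not in ignore_words))
classes = sorted(set(classes))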


Step 2: Preparing TensorFlow Input
  • [X: [0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, ...N], Y: [0, 0, 1, 0, 0, 0, ...M]]
  • [X: [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, ...N], Y: [0, 0, 0, 1, 0, 0, ...M]]
  • X is an array representing the pattern with 0/1 values. N = vocabulary size. The value is 1 when the word at that position in the vocabulary matches a word from the pattern
  • Y is an array representing the intent with 0/1 values. M = number of intents. The value is 1 when the position in the list of intents/classes matches the current intent
Python implementation:
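A sketch continuing from the words, classes and documents lists built in Step 1; each document becomes one X/Y training row:

import random

training = []
output_empty = [0] * len(classes)

for doc in documents:
    # bag of words: 1 when a vocabulary word appears in the pattern
    pattern_words = [stemmer.stem(w.lower()) for w in doc[0]]
    bag = [1 if w in pattern_words else 0 for w in words]

    # output row: 1 at the position of the current intent
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1

    training.append([bag, output_row])

random.shuffle(training)
train_x = [row[0] for row in training]
train_y = [row[1] for row in training]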


Step 3: Training Neural Network
  • Use tflearn - a deep learning library featuring a higher-level API for TensorFlow
  • Define the X input shape - equal to the vocabulary size
  • Define two layers with 8 hidden neurons each - this worked well for the text classification task (based on experiments)
  • Define the Y output shape - equal to the number of intents
  • Apply regression to find the best network parameters
  • Define the Deep Neural Network (DNN) model
  • Run model.fit to construct the classification model. Provide the X/Y inputs, the number of epochs and the batch size
  • In each epoch, multiple operations are executed to find the model parameters that best classify future input converted to an array of 0/1
  • Batch size
    • A smaller batch size requires less memory. This is especially important for datasets with a large vocabulary
    • Networks typically train faster with smaller batches, since weights and network parameters are updated after each propagation
    • The smaller the batch, the less accurate the estimate of the gradient can be
Python implementation:
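A sketch of the tflearn model definition and training, assuming train_x/train_y from Step 2; the number of epochs and the batch size shown here are illustrative values:

import tensorflow as tf
import tflearn

tf.reset_default_graph()  # tflearn relies on the TensorFlow 1.x graph API

# input layer shape equals the vocabulary size
net = tflearn.input_data(shape=[None, len(train_x[0])])
# two fully connected layers with 8 hidden neurons each
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
# output layer size equals the number of intents
net = tflearn.fully_connected(net, len(train_y[0]), activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net, tensorboard_dir='tflearn_logs')
model.fit(train_x, train_y, n_epoch=1000, batch_size=8, show_metric=True)
model.save('model.tflearn')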


Step 4: Initial Model Testing
  • Tokenise the input sentence - split it into an array of words
  • Create a bag of words (array of 0/1) for the input sentence - an array the size of the vocabulary, with 1 for each vocabulary word found in the input sentence
  • Run model.predict with the bag of words array; this returns a probability for each intent
Python implementation:
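A sketch of the helper functions and a test call, continuing from the stemmer and model defined above; the sample sentence is only an illustration:

def clean_up_sentence(sentence):
    # tokenise and stem the input sentence
    sentence_words = nltk.word_tokenize(sentence)
    return [stemmer.stem(w.lower()) for w in sentence_words]

def bow(sentence, words):
    # bag of words: array of 0/1 the size of the vocabulary
    sentence_words = clean_up_sentence(sentence)
    bag = [0] * len(words)
    for s in sentence_words:
        for i, w in enumerate(words):
            if w == s:
                bag[i] = 1
    return bag

p = bow('Is your pharmacy open today?', words)
print(model.predict([p]))  # probability for each intent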


Step 5: Reuse Trained Model
  • For better reusability, it is recommended to create a separate TensorFlow notebook to handle classification requests
  • We can reuse the previously trained DNN model by loading the saved data structures with pickle and restoring the model itself with model.load
Python implementation:
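A sketch of the separate classification notebook, assuming the training notebook also saved the helper data with pickle, e.g. pickle.dump({'words': words, 'classes': classes, 'train_x': train_x, 'train_y': train_y}, open('training_data', 'wb')); the file names are assumptions:

import pickle
import tensorflow as tf
import tflearn

data = pickle.load(open('training_data', 'rb'))
words = data['words']
classes = data['classes']
train_x = data['train_x']
train_y = data['train_y']

# rebuild the same network structure before loading the saved weights
tf.reset_default_graph()
net = tflearn.input_data(shape=[None, len(train_x[0])])
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, len(train_y[0]), activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net)
model.load('model.tflearn')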


Step 6: Text Classification
  • Define a REST interface, so that the classification function is accessible outside TensorFlow
  • Convert the incoming sentence into a bag of words array and run model.predict
  • Keep only results with probability higher than 0.25, to filter out noise
  • Return multiple identified intents (if any), together with their assigned probabilities
Python implementation:
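An illustrative sketch of the REST endpoint; Flask is used here purely as an example framework (an assumption - the original setup may differ), and the code reuses the bow() helper from Step 4 plus the model, words and classes loaded in Step 5:

from flask import Flask, request, jsonify

app = Flask(__name__)
ERROR_THRESHOLD = 0.25  # ignore low-probability matches to filter noise

def classify(sentence):
    # bag of words for the incoming sentence, then model.predict
    results = model.predict([bow(sentence, words)])[0]
    # keep intents above the threshold, sorted by probability
    results = [(i, r) for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    results.sort(key=lambda x: x[1], reverse=True)
    return [(classes[i], float(r)) for i, r in results]

@app.route('/classify', methods=['POST'])
def classify_endpoint():
    sentence = request.json.get('sentence', '')
    return jsonify(classify(sentence))

if __name__ == '__main__':
    app.run(port=5000)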
