Monday, July 30, 2018

Text Classification with Deep Neural Network in TensorFlow - Simple Explanation

Text classification with TensorFlow can be simple to implement. One of the areas where text classification can be applied is chatbot text processing and intent resolution. In this post I will describe, step by step, how to build a TensorFlow model for text classification and how the classification is done. Please refer to my previous post on a similar topic - Contextual Chatbot with TensorFlow, Node.js and Oracle JET - Steps How to Install and Get It Working. I would also recommend going through this great post about chatbot implementation - Contextual Chatbots with Tensorflow.

The complete source code is available in the GitHub repo (refer to the steps described in the blog post referenced above).

Text classification implementation:

Step 1: Preparing Data
  • Tokenise patterns into arrays of words
  • Lower-case and stem all words. Example: pharmacy = pharm. This helps represent related words with a single stem
  • Create a list of classes - the intents
  • Create a list of documents - combinations of patterns and their intents
Python implementation:
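A minimal sketch of this step, assuming the intents are defined in an intents.json file with the usual {"intents": [{"tag": ..., "patterns": [...]}]} layout (the file name and layout are assumptions) and using NLTK's Lancaster stemmer:

import json
import nltk
from nltk.stem.lancaster import LancasterStemmer

# nltk.download('punkt')  # required once for word_tokenize
stemmer = LancasterStemmer()

with open('intents.json') as f:
    intents = json.load(f)

words = []       # stemmed vocabulary
classes = []     # list of intents
documents = []   # (tokenised pattern, intent) pairs
ignore_words = ['?']

for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenise each pattern into an array of words
        w = nltk.word_tokenize(pattern)
        words.extend(w)
        documents.append((w, intent['tag']))
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

# lower case, stem and deduplicate the vocabulary
words = sorted(set(stemmer.stem(w.lower()) for w in words if w not in ignore_words))
classes = sorted(set(classes))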


Step 2: Preparing TensorFlow Input
  • [X: [0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, ...N], Y: [0, 0, 1, 0, 0, 0, ...M]]
  • [X: [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, ...N], Y: [0, 0, 0, 1, 0, 0, ...M]]
  • X is an array representing the pattern with 0/1 values. N = vocabulary size. The value is 1 when the word at that position in the vocabulary matches a word from the pattern
  • Y is an array representing the intent with 0/1 values. M = number of intents. The value is 1 when the position in the list of intents/classes matches the current intent
Python implementation:
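A sketch continuing from the words, classes and documents lists built in Step 1; each document becomes one X/Y training row:

import random

training = []
output_empty = [0] * len(classes)

for doc in documents:
    # bag of words: 1 when a vocabulary word appears in the pattern
    pattern_words = [stemmer.stem(w.lower()) for w in doc[0]]
    bag = [1 if w in pattern_words else 0 for w in words]

    # output row: 1 at the position of the current intent
    output_row = list(output_empty)
    output_row[classes.index(doc[1])] = 1

    training.append([bag, output_row])

random.shuffle(training)
train_x = [row[0] for row in training]
train_y = [row[1] for row in training]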


Step 3: Training Neural Network
  • Use tflearn - a deep learning library featuring a higher-level API for TensorFlow
  • Define the X input shape - equal to the vocabulary size
  • Define two layers with 8 hidden neurons each - this worked well for the text classification task (based on experiments)
  • Define the Y output shape - equal to the number of intents
  • Apply regression to find the best network parameters
  • Define the Deep Neural Network (DNN) model
  • Run model.fit to construct the classification model. Provide the X/Y inputs, the number of epochs and the batch size
  • In each epoch, multiple operations are executed to find the model parameters that best classify future input converted to an array of 0/1
  • Batch size
    • A smaller batch size requires less memory. This is especially important for datasets with a large vocabulary
    • Networks typically train faster with smaller batches, since weights and network parameters are updated after each propagation
    • The smaller the batch, the less accurate the estimate of the gradient can be
Python implementation:
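A sketch of the tflearn model definition and training, assuming train_x/train_y from Step 2; the number of epochs and the batch size shown here are illustrative values:

import tensorflow as tf
import tflearn

tf.reset_default_graph()  # tflearn relies on the TensorFlow 1.x graph API

# input layer shape equals the vocabulary size
net = tflearn.input_data(shape=[None, len(train_x[0])])
# two fully connected layers with 8 hidden neurons each
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
# output layer size equals the number of intents
net = tflearn.fully_connected(net, len(train_y[0]), activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net, tensorboard_dir='tflearn_logs')
model.fit(train_x, train_y, n_epoch=1000, batch_size=8, show_metric=True)
model.save('model.tflearn')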


Step 4: Initial Model Testing
  • Tokenise the input sentence - split it into an array of words
  • Create a bag of words (array of 0/1) for the input sentence - an array the size of the vocabulary, with 1 for each vocabulary word found in the input sentence
  • Run model.predict with the bag of words array; this returns a probability for each intent
Python implementation:
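A sketch of the helper functions and a test call, continuing from the stemmer and model defined above; the sample sentence is only an illustration:

def clean_up_sentence(sentence):
    # tokenise and stem the input sentence
    sentence_words = nltk.word_tokenize(sentence)
    return [stemmer.stem(w.lower()) for w in sentence_words]

def bow(sentence, words):
    # bag of words: array of 0/1 the size of the vocabulary
    sentence_words = clean_up_sentence(sentence)
    bag = [0] * len(words)
    for s in sentence_words:
        for i, w in enumerate(words):
            if w == s:
                bag[i] = 1
    return bag

p = bow('Is your pharmacy open today?', words)
print(model.predict([p]))  # probability for each intent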


Step 5: Reuse Trained Model
  • For better reusability, it is recommended to create a separate TensorFlow notebook to handle classification requests
  • We can reuse the previously trained DNN model by loading the saved data structures with pickle and restoring the model itself with model.load
Python implementation:
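A sketch of the separate classification notebook, assuming the training notebook also saved the helper data with pickle, e.g. pickle.dump({'words': words, 'classes': classes, 'train_x': train_x, 'train_y': train_y}, open('training_data', 'wb')); the file names are assumptions:

import pickle
import tensorflow as tf
import tflearn

data = pickle.load(open('training_data', 'rb'))
words = data['words']
classes = data['classes']
train_x = data['train_x']
train_y = data['train_y']

# rebuild the same network structure before loading the saved weights
tf.reset_default_graph()
net = tflearn.input_data(shape=[None, len(train_x[0])])
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, len(train_y[0]), activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net)
model.load('model.tflearn')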


Step 6: Text Classification
  • Define a REST interface, so that the classification function is accessible outside TensorFlow
  • Convert the incoming sentence into a bag of words array and run model.predict
  • Keep only results with probability higher than 0.25, to filter out noise
  • Return multiple identified intents (if any), together with their assigned probabilities
Python implementation:
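An illustrative sketch of the REST endpoint; Flask is used here purely as an example framework (an assumption - the original setup may differ), and the code reuses the bow() helper from Step 4 plus the model, words and classes loaded in Step 5:

from flask import Flask, request, jsonify

app = Flask(__name__)
ERROR_THRESHOLD = 0.25  # ignore low-probability matches to filter noise

def classify(sentence):
    # bag of words for the incoming sentence, then model.predict
    results = model.predict([bow(sentence, words)])[0]
    # keep intents above the threshold, sorted by probability
    results = [(i, r) for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    results.sort(key=lambda x: x[1], reverse=True)
    return [(classes[i], float(r)) for i, r in results]

@app.route('/classify', methods=['POST'])
def classify_endpoint():
    sentence = request.json.get('sentence', '')
    return jsonify(classify(sentence))

if __name__ == '__main__':
    app.run(port=5000)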
