Monday, July 30, 2018

Text Classification with Deep Neural Network in TensorFlow - Simple Explanation

Text classification implementation with TensorFlow can be simple. One of the areas where text classification can be applied is chatbot text processing and intent resolution. In this post I will describe, step by step, how to build a TensorFlow model for text classification and how classification is done. Please refer to my previous post on a similar topic - Contextual Chatbot with TensorFlow, Node.js and Oracle JET - Steps How to Install and Get It Working. I would also recommend going through this great post about chatbot implementation - Contextual Chatbots with Tensorflow.

The complete source code is available in the GitHub repo (refer to the steps described in the blog post referenced above).

Text classification implementation:

Step 1: Preparing Data
  • Tokenise patterns into arrays of words
  • Lower case and stem all words. Example: Pharmacy = pharm. This attempts to represent related words with a common stem
  • Create a list of classes - intents
  • Create a list of documents - combinations of patterns and their intents
Python implementation:
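A minimal sketch of this step could look like the snippet below. It assumes an intents.json file with the usual {"intents": [{"tag": ..., "patterns": [...]}]} structure used in the referenced chatbot posts; the file name and structure are assumptions, not taken from the repo itself.

import json
import nltk
from nltk.stem.lancaster import LancasterStemmer

stemmer = LancasterStemmer()

# load intents file - the tag/patterns structure is an assumption
with open('intents.json') as f:
    intents = json.load(f)

words = []      # vocabulary of stemmed words
classes = []    # list of intents (classes)
documents = []  # (tokenised pattern, intent tag) pairs

for intent in intents['intents']:
    for pattern in intent['patterns']:
        # tokenise each pattern into an array of words
        tokens = nltk.word_tokenize(pattern)
        words.extend(tokens)
        documents.append((tokens, intent['tag']))
        if intent['tag'] not in classes:
            classes.append(intent['tag'])

# lower case and stem all words, remove duplicates
ignore = ['?']
words = sorted(set(stemmer.stem(w.lower()) for w in words if w not in ignore))
classes = sorted(set(classes))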


Step 2: Preparing TensorFlow Input
  • [X: [0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, ...N], Y: [0, 0, 1, 0, 0, 0, ...M]]
  • [X: [0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, ...N], Y: [0, 0, 0, 1, 0, 0, ...M]]
  • X: array representing a pattern with 0/1 values. N = vocabulary size. A position is set to 1 when the word at that position in the vocabulary matches a word from the pattern
  • Y: array representing an intent with 0/1 values. M = number of intents. A position is set to 1 when it matches the current intent in the list of intents/classes
Python implementation:
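A minimal sketch of building the 0/1 training arrays, assuming the words, classes and documents lists from Step 1:

import numpy as np

training = []
output_empty = [0] * len(classes)

for tokens, tag in documents:
    # X: 1 when a vocabulary word appears in the pattern, 0 otherwise
    pattern_words = [stemmer.stem(w.lower()) for w in tokens]
    bag = [1 if w in pattern_words else 0 for w in words]

    # Y: 1 at the position of the current intent, 0 elsewhere
    output_row = list(output_empty)
    output_row[classes.index(tag)] = 1

    training.append([bag, output_row])

train_x = np.array([row[0] for row in training])
train_y = np.array([row[1] for row in training])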


Step 3: Training Neural Network
  • Use tflearn - a deep learning library featuring a higher-level API for TensorFlow
  • Define X input shape - equal to word vocabulary size
  • Define two layers with 8 hidden neurons - optimal for this text classification task (based on experiments)
  • Define Y input shape - equal to number of intents
  • Apply regression to find the best equation parameters
  • Define Deep Neural Network model (DNN)
  • Run model.fit to construct the classification model. Provide X/Y inputs, the number of epochs and the batch size
  • In each epoch, multiple operations are executed to find optimal model parameters for classifying future input converted to an array of 0/1
  • Batch size
    • A smaller batch size requires less memory. This is especially important for datasets with a large vocabulary
    • Networks typically train faster with smaller batches, as weights and network parameters are updated after each propagation
    • The smaller the batch, the less accurate the estimate of the gradient can be
Python implementation:
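A minimal sketch of the network described above. The layer sizes follow the bullets; the epoch count, batch size and file names are assumptions and may differ from the repo:

import tensorflow as tf
import tflearn

tf.reset_default_graph()

# X input shape - equal to word vocabulary size
net = tflearn.input_data(shape=[None, len(train_x[0])])
# two layers with 8 hidden neurons
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
# Y output shape - equal to number of intents
net = tflearn.fully_connected(net, len(train_y[0]), activation='softmax')
# apply regression to find the best parameters
net = tflearn.regression(net)

# define Deep Neural Network model and train it
model = tflearn.DNN(net, tensorboard_dir='tflearn_logs')
model.fit(train_x, train_y, n_epoch=1000, batch_size=8, show_metric=True)
model.save('model.tflearn')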


Step 4: Initial Model Testing
  • Tokenise the input sentence - split it into an array of words
  • Create a bag of words (array of 0/1 values) for the input sentence - an array equal to the size of the vocabulary, with 1 for each word found in the input sentence
  • Run model.predict with the given bag of words array; this will return a probability for each intent
Python implementation:
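A minimal sketch of the testing step, reusing the stemmer, vocabulary and model from the previous steps; the example sentence is just an illustration:

import numpy as np

def clean_up_sentence(sentence):
    # tokenise the input sentence and stem each word
    sentence_words = nltk.word_tokenize(sentence)
    return [stemmer.stem(w.lower()) for w in sentence_words]

def bow(sentence, words):
    # bag of words: array of vocabulary size, 1 for each word found in the sentence
    sentence_words = clean_up_sentence(sentence)
    bag = [0] * len(words)
    for s in sentence_words:
        for i, w in enumerate(words):
            if w == s:
                bag[i] = 1
    return np.array(bag)

# model.predict returns a probability for each intent
p = bow('is your shop open today?', words)
print(model.predict([p]))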


Step 5: Reuse Trained Model
  • For better reusability, it is recommended to create a separate TensorFlow notebook to handle classification requests
  • We can reuse the previously created DNN model by loading the pickled training data and restoring the saved model
Python implementation:
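A minimal sketch of restoring the model in a separate notebook. It assumes the vocabulary and training data were saved with pickle during training and the model was saved as model.tflearn; both file names are assumptions:

import pickle
import tensorflow as tf
import tflearn

# restore pickled vocabulary and training data
data = pickle.load(open('training_data', 'rb'))
words = data['words']
classes = data['classes']
train_x = data['train_x']
train_y = data['train_y']

# rebuild the same network structure before loading the saved weights
tf.reset_default_graph()
net = tflearn.input_data(shape=[None, len(train_x[0])])
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, 8)
net = tflearn.fully_connected(net, len(train_y[0]), activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net)
model.load('./model.tflearn')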


Step 6: Text Classification
  • Define a REST interface, so that the classification function is accessible outside TensorFlow
  • Convert the incoming sentence into a bag of words array and run model.predict
  • Consider only results with a probability higher than 0.25, to filter out noise
  • Return multiple identified intents (if any), together with their assigned probabilities
Python implementation:
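A minimal sketch of such a REST interface with Flask, reusing the model, vocabulary and bow helper from the previous steps; the endpoint name, port and payload format are assumptions:

from flask import Flask, request, jsonify

app = Flask(__name__)
ERROR_THRESHOLD = 0.25  # filter out noise below this probability

def classify(sentence):
    # convert the sentence into a bag of words array and run prediction
    results = model.predict([bow(sentence, words)])[0]
    # keep only intents with probability higher than the threshold
    results = [{'intent': classes[i], 'probability': float(r)}
               for i, r in enumerate(results) if r > ERROR_THRESHOLD]
    # strongest intent first
    return sorted(results, key=lambda r: r['probability'], reverse=True)

@app.route('/api/classify', methods=['POST'])
def classify_endpoint():
    sentence = request.json['sentence']
    return jsonify(classify(sentence))

app.run(host='0.0.0.0', port=5000)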

Thursday, July 19, 2018

Oracle VBCS - Pay As You Go Cloud Model Experience Explained

If you are considering starting to use the VBCS cloud service from Oracle, maybe this post will be useful. I will share my experience with the Pay As You Go model.

Two payment models are available:

1. Pay As You Go - good when accessing VBCS from time to time. Can be terminated at any time
2. Monthly Flex - good when you need to run VBCS 24/7. Requires commitment and can't be terminated at any time

When you create an Oracle Cloud account, you initially get a 30-day free trial period. At the end of that period (or earlier), you can upgrade to a billable plan. To upgrade, go to account management and choose to upgrade the promotional offer - you will be given the choice to go with Pay As You Go or Monthly Flex:


As soon as you upgrade to Pay As You Go, you will start seeing the monthly usage amount in the dashboard. It also shows the hourly usage of the VBCS instance, for which you will be billed:


Click on the monthly usage amount to see a detailed view of billing per service. When the VBCS instance is stopped (in the case of Pay As You Go), you are billed only for hardware storage (Compute Classic) - this is a relatively small amount:


There are two options for creating a VBCS instance - either autonomous VBCS or customer managed VBCS. To be able to stop/start the VBCS instance and avoid billing when the instance is not used (in the case of Pay As You Go), make sure to go with customer managed VBCS. In this example, the VBCS instance was used for only 1 hour and then stopped; it can be started again at any time:


To manage the VBCS instance, you need to navigate to the Oracle Cloud Stack UI. From here you can start/stop both the DB and VBCS in a single action. It is not enough to stop VBCS - make sure to stop the DB too, if you are not using it:

Sunday, July 15, 2018

ADF Postback Payload Size Optimization

Recently I came across a property called oracle.adf.view.rich.POSTBACK_PAYLOAD_TYPE. This property helps to optimize postback payload size. It is described in the ADF Faces configuration section - A.2.3.16 Postback Payload Size Optimization. An ADF partial request executes an HTTP POST with values from all fields included. When the postback property is set to dirty, only changed values are included in the HTTP POST. As a result, the server receives only the changed attributes; this can potentially reduce server processing time and make the HTTP request smaller. It can be especially important for large forms with many fields.

Let's take a look at an example. After clicking any button in the form, go to the network monitor and study the Form Data section. You will see IDs and values for all fields included in the UI. By default, all fields are submitted with the HTTP request, even if they were not changed:


The postback optimization property can be set in web.xml. By default its value is full; change it to dirty:
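The context-param entry in web.xml would look roughly like this:

<context-param>
  <param-name>oracle.adf.view.rich.POSTBACK_PAYLOAD_TYPE</param-name>
  <param-value>dirty</param-value>
</context-param>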


With the value set to dirty, try changing at least one field and then press any button. Observe the Form Data section in the network monitor - only fields with changed values will be submitted:


Try it in your project and see the difference.

Check my sample app for this use case on GitHub.

Tuesday, July 10, 2018

Contextual Chatbot with TensorFlow, Node.js and Oracle JET - Steps How to Install and Get It Working

A blog reader asked me to provide a list of steps to guide them through the install and run process for the chatbot solution with TensorFlow, Node.js and Oracle JET.

Resources:

1. Chatbot UI and context handling backend implementation - Machine Learning Applied - TensorFlow Chatbot UI with Oracle JET Custom Component

2. Classification implementation - Classification - Machine Learning Chatbot with TensorFlow

3. TensorFlow installation - TensorFlow - Getting Started with Docker Container and Jupyter Notebook

4. Source code - GitHub

Install and run steps:

1. Download the source code from the GitHub repository:


2. Install TensorFlow and configure Flask (TensorFlow Linear Regression Model Access with Custom REST API using Flask)

3. Upload the intents.json file to the TensorFlow root folder:


4. Upload both TensorFlow notebooks:


5. Open and execute (click Run for each section, step by step) the model notebook:


6. Repeat the training step a few times to get the minimum loss:


7. Open and execute the response notebook:


8. Make sure the REST interface is running, see the message below:


9. Test classification from an external REST client:
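If you don't have a REST client at hand, a quick check can also be done from Python. The URL and payload below follow the assumptions made in the classification notebook sketch above and may differ in your setup:

import requests

# hypothetical endpoint and payload - adjust to match your REST interface
response = requests.post('http://localhost:5000/api/classify',
                         json={'sentence': 'is your shop open today?'})
print(response.json())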


10. Go to the socketioserver folder and run (install Node.js before that) the npm install express --save and npm install socket.io --save commands:


11. Run npm start to start up the Node.js backend:


12. Go to the socketiojet folder and run (install Oracle JET before that) ojet restore:


13. Run ojet serve to start the chatbot UI. Type questions into the chatbot prompt: