Linear Model With TensorFlow

5 minute read

Published: July 26, 2017

The linear model is a foundational model in Machine Learning, and this short article explores building a simple one with TensorFlow. The aim is to lay the groundwork for a model that is important for understanding deep neural networks. Intuitively, a Deep Neural Network (DNN) learns a good representation of your input data, one that a model can use to predict the information contained in the data much as a human would. If that sounds a little complex: a DNN takes your data, which may be images, text or even stock market data, and passes it through layers of neurons. Here a neuron just means a function, such as the linear function we are trying to model here. The network then uses the resulting representations to predict the information contained in the input data.

Why TensorFlow?

TensorFlow is one of many state-of-the-art Deep Learning libraries that make training a DNN fast and scalable. It is well optimized and abstracts away much of the code you would otherwise write to train a neural network by coding all the mathematics by hand. Note that TensorFlow itself is a library for numerical computation. It is built to be scalable and distributed, and is designed to work in production.

Linear Regression

Linear Regression fits a function $f(x, W, b)$, where $f(x) = Wx + b$, that maps an input $x$ to a continuous value in $R^n$. $W$ is the weight that scales the data $x$ to produce the predicted value, and $b$ is the bias, an offset that lets the model fit data that does not pass through the origin.

$f(x) = Wx + b$
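As a quick illustration of the function above, here is a minimal NumPy sketch. The helper name `predict` and the sample values are made up for this example; they are not part of the article's TensorFlow code.

```python
import numpy as np

def predict(x, W, b):
    """Evaluate the linear model f(x) = W * x + b for scalar W and b."""
    return W * np.asarray(x, dtype=np.float64) + b

x = np.array([0.0, 1.0, 2.0])
print(predict(x, W=2.0, b=1.0))  # [1. 3. 5.]
```

With W = 2 and b = 1, each input is doubled and shifted up by one.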

To minimize the error of the function, one applies gradient descent. The idea is to compute the error of the prediction and then minimize that error with gradient descent:

$Error = \frac{1}{m} \sum_{i=1}^m (f(x_i) - y_i)^2$
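The error formula can be sketched in a few lines of NumPy. The function name `mean_squared_error` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def mean_squared_error(x, y, W, b):
    """Average of the squared differences between W * x + b and the targets y."""
    predictions = W * x + b
    return np.mean((predictions - y) ** 2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])          # generated by y = 2x

print(mean_squared_error(x, y, W=2.0, b=0.0))  # 0.0, the model matches the data
print(mean_squared_error(x, y, W=1.0, b=0.0))  # about 4.667, i.e. (1 + 4 + 9) / 3
```

A perfect fit gives an error of zero, and the error grows as the predictions drift away from the targets.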

On every iteration over the data samples (Batch Gradient Descent), gradient descent intuitively decreases the prediction error and optimizes the weight to generalize across all samples. It multiplies the prediction error ( $f(x_i) - y_i$ ) of each sample by that sample's input $x_i$, averages these products over all $m$ training cases, scales the result by a learning rate $\alpha$, and finally subtracts this value from the weight of the previous iteration.

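Putting it into an equation, where $W_j$ is the weight at iteration $j$ and the factor of 2 from differentiating the square is absorbed into $\alpha$:

$W_{j+1} = W_j - \alpha \frac{1}{m} \sum_{i=1}^m (f(x_i) - y_i)x_i$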
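The update rule can be tried out in plain NumPy before moving to TensorFlow. This is only a sketch under stated assumptions: the data is synthetic (generated from y = 3x + 2), the function name `gradient_step` is hypothetical, and the bias update is added by analogy with the weight update, since the equation above only covers $W$.

```python
import numpy as np

def gradient_step(x, y, W, b, alpha):
    """One batch gradient descent step for f(x) = W * x + b."""
    error = (W * x + b) - y              # f(x_i) - y_i for every sample
    grad_W = np.mean(error * x)          # (1/m) * sum(error_i * x_i)
    grad_b = np.mean(error)              # assumed analogue for the bias
    return W - alpha * grad_W, b - alpha * grad_b

# Synthetic data, not the article's dataset
x = np.linspace(0.0, 1.0, 50)
y = 3.0 * x + 2.0

W, b = 0.0, 1.0                          # same starting values as the article
for _ in range(5000):
    W, b = gradient_step(x, y, W, b, alpha=0.5)

print(W, b)  # close to 3 and 2
```

Because the synthetic inputs lie between 0 and 1, a fairly large learning rate is stable here; data on a larger scale would generally need a smaller one.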

Coding it up:

1. Import Required Packages and data

import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import xlrd
import pandas as pd

import python_utils

DATA_FILE = 'data/fire_theft.xls'

2. Read in Data using xlrd and specify learning rate

# Open the spreadsheet and read every row after the header into an array
book = xlrd.open_workbook(DATA_FILE, encoding_override='utf-8')
sheet = book.sheet_by_index(0)
data = np.asarray([sheet.row_values(i) for i in range(1, sheet.nrows)])
n_samples = sheet.nrows - 1

# The first column is the input, the second column is the target
train_X = data[:, 0].astype(np.float32)
train_Y = data[:, 1].astype(np.float32)
print(train_X.dtype)

n_sample = data.shape[0]  # same value as n_samples
learning_rate = 0.001

3. Define TensorFlow objects that will compute the model

For input, weight, bias and target:

# Placeholders receive the data at run time through feed_dict
X_n = tf.placeholder(tf.float32)  # specifying shape=[n_samples] also works
Y_n = tf.placeholder(tf.float32)

# Trainable parameters of the model
W = tf.Variable(0., name="W")
b = tf.Variable(1., name="b")

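Compute the error, then minimize it by computing the gradient from scratch and descending toward the minimum error:

prediction_n = X_n * W + b
error = prediction_n - Y_n
mse = tf.reduce_mean(tf.square(error))  # mean, to match the error formula

# Gradients for W and b, averaged over the batch (the 1/m factor)
gradient_W = tf.reduce_mean(tf.multiply(X_n, error))
gradient_b = tf.reduce_mean(error)
# Update both parameters in a single training op
training_op = tf.group(tf.assign(W, W - learning_rate * gradient_W),
                       tf.assign(b, b - learning_rate * gradient_b))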

4. Train the model

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        mse_val, _ = sess.run([mse, training_op], feed_dict={X_n:train_X, Y_n:train_Y})
        print(mse_val)
    best_theta = W.eval()
    print("Best Theta is: ", W.eval())

Okay, so we are done training a Linear Model with TensorFlow. However, we can save ourselves the headache of the gradient descent mathematics, and this is where the power of TensorFlow really comes into play.

5. Replace Step 3 with a TensorFlow optimizer

prediction_n = tf.add(tf.multiply(X_n, W), b)
error = prediction_n - Y_n
mse = tf.reduce_mean(tf.square(error))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse)

6. Retrain the model with the TensorFlow optimizer

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        mse_val, _ = sess.run([mse, optimizer], feed_dict={X_n:train_X, Y_n:train_Y})
        print(mse_val)
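One way to judge whether a training run has gone far enough is to compare the learned parameters with a closed-form least-squares fit. The sketch below uses NumPy's `np.polyfit` on made-up data; the helper name `reference_fit` is hypothetical, and in practice you would pass `train_X` and `train_Y` instead of the toy arrays.

```python
import numpy as np

def reference_fit(x, y):
    """Least-squares slope and intercept for y ~ W * x + b."""
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 polynomial fit
    return slope, intercept

# Toy data generated from y = 2x + 1 (not the article's dataset)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

W_ref, b_ref = reference_fit(x, y)
print(W_ref, b_ref)  # approximately 2 and 1
```

If the values of W and b after training are far from this reference, more iterations or a different learning rate may be needed.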

And We are Truly Done!!!

Key TensorFlow Issues to note:

tf.matmul() expects tensors of rank 2 or higher, so for a dot product between two 1-D tensors the available approach is tf.reduce_sum(tf.multiply(Vec_1, Vec_2)).
The tf.train.*Optimizer classes take your loss tensor and build the training op for you: minimize() computes the gradient of the loss with respect to every trainable variable (here W and b) and applies the update, so none of the gradient mathematics from step 3 has to be written by hand.
You can evaluate any of the Variables while training is going on or after it has concluded; see the best_theta line in step 4.
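The dot-product note can be illustrated with NumPy, whose `np.multiply`, `np.sum` and `np.matmul` behave analogously for this purpose. This is a NumPy sketch of the idea rather than TensorFlow code, and the vectors are made up for the example.

```python
import numpy as np

vec_1 = np.array([1.0, 2.0, 3.0])
vec_2 = np.array([4.0, 5.0, 6.0])

# Element-wise product followed by a sum gives the dot product of two 1-D vectors
dot_via_sum = np.sum(np.multiply(vec_1, vec_2))

# The matrix-multiplication route needs 2-D operands: a (1, 3) row times a (3, 1) column
dot_via_matmul = np.matmul(vec_1.reshape(1, 3), vec_2.reshape(3, 1))[0, 0]

print(dot_via_sum, dot_via_matmul)  # 32.0 32.0
```

Both routes give 1*4 + 2*5 + 3*6 = 32; the multiply-and-sum form simply avoids the reshaping.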

Once you define the error, you can call an optimization function to minimize it. TensorFlow ships with different optimization techniques that make the work a little easier. Examples of optimizers that exist in tf.train:

AdadeltaOptimizer
AdagradDAOptimizer
AdagradOptimizer
AdamOptimizer
FtrlOptimizer
GradientDescentOptimizer
MomentumOptimizer
ProximalGradientDescentOptimizer
ProximalAdagradOptimizer
RMSPropOptimizer
SyncReplicasOptimizer

Understanding the process of building a linear model is important for Deep Learning networks overall, because linear units, which contain the basic affine transformation $y = Wx + b$, are core to Deep Learning. In fact, the linear unit is the basic building block of the Perceptron, one of the earliest neural networks.


Adekunle Babatunde
