Linear Model With TensorFlow
The Linear Model is a foundational model in Machine Learning, and this article explores building a simple linear model with TensorFlow. The goal is to lay the groundwork for a model that is very important for understanding Deep Neural Networks. A Deep Neural Network (DNN) intuitively learns a good representation of your input data, one that a model can use to correctly predict the information contained in the data much as a human would. Okay, that sounds a little complex: a DNN takes your data, which may be images, text or even stock market data, passes it through layers of neurons - here a neuron is just a function, such as the linear function we are modeling here - and uses these new representations to predict the information contained in the input data.
Why TensorFlow?
TensorFlow is one of the many state-of-the-art Deep Learning libraries that make training a DNN fast and scalable. It is well optimized and abstracts away much of the code that would otherwise go into training a neural network by hand-coding all the mathematics; note that TensorFlow itself is a library for numerical computation. It is built to be scalable and distributed, and can work perfectly in production.
Linear Regression
Linear Regression is the mapping of an input $x$ to an output $\hat{y} = Wx + b$, where $\hat{y}$ predicts a continuous value $y$. $W$ is the weight that gives a representation of the data for the predicted value, and $b$ is the bias that helps the weight generalize on the data.
To minimize the error of the function, one applies gradient descent. The idea is to compute the error of the prediction and then minimize that error with gradient descent.
For every iteration over the data samples (Batch Gradient Descent), gradient descent intuitively decreases the prediction error and optimizes the weight to generalize across all samples: multiply the prediction error $(\hat{y}^{(i)} - y^{(i)})$ by the data sample $x^{(i)}$ of that case, scale by a learning rate $\alpha$, average over all $m$ training cases, and finally subtract this value from the weight of the previous iteration.
Putting it into an equation:
$$W \leftarrow W - \frac{\alpha}{m} \sum_{i=1}^{m} \left(\hat{y}^{(i)} - y^{(i)}\right) x^{(i)}$$
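Before moving to TensorFlow, here is a minimal NumPy sketch of a single update step, with made-up toy data (an illustration added here, not part of the original tutorial):
import numpy as np

x = np.array([1.0, 2.0, 3.0], dtype=np.float32)  # toy inputs
y = np.array([2.0, 4.0, 6.0], dtype=np.float32)  # toy targets (true slope is 2)

W, b = 0.0, 1.0   # initial weight and bias
alpha = 0.01      # learning rate
m = len(x)        # number of training cases

y_hat = W * x + b             # prediction
error = y_hat - y             # prediction error
# Batch gradient descent update for W: average the error-weighted inputs
W = W - alpha / m * np.sum(error * x)
print(W)                      # W has moved toward the data's true slope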
Coding it up:
1. Import Required Packages and data
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # silence TensorFlow's verbose startup logs
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import xlrd
DATA_FILE = 'data/fire_theft.xls'
2. Read in Data using xlrd and specify learning rate
book = xlrd.open_workbook(DATA_FILE, encoding_override='utf-8')
sheet = book.sheet_by_index(0)
data = np.asarray([sheet.row_values(i) for i in range(1, sheet.nrows)])
n_samples = sheet.nrows - 1  # number of training cases
train_X = data[:, 0].astype(np.float32)
train_Y = data[:, 1].astype(np.float32)
print(train_X.dtype)  # sanity check: should be float32
learning_rate = 0.001
3. Define TensorFlow objects that will compute the model
- For input, weight, bias and target
X_n = tf.placeholder(tf.float32)  # specifying shape=[n_samples] works just fine too
Y_n = tf.placeholder(tf.float32)
W = tf.Variable(0., name="W")
b = tf.Variable(1., name="b")
- Compute the error, then compute the gradient from scratch and descend it toward the minimum error
prediction_n = X_n * W + b
error = prediction_n - Y_n
mse = tf.reduce_mean(tf.square(error))  # mean squared error over the batch
# Hand-written batch gradient for W; the constant factor of 2 is folded into the learning rate
gradient = 1.0 / n_samples * tf.reduce_sum(tf.multiply(X_n, error))
# Note: only W is updated here; b keeps its initial value in this manual version
training_op = tf.assign(W, W - learning_rate * gradient)
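As a quick sanity check (an addition to the original, using TensorFlow's tf.gradients API), automatic differentiation can derive the same gradient:
# Automatic differentiation of the loss w.r.t. W; this evaluates to exactly
# twice the manual `gradient` above, because d(e^2)/dW brings down a factor
# of 2 that the hand-written version folds into the learning rate.
auto_grad = tf.gradients(mse, [W])[0]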
4. Train the model
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        mse_val, _ = sess.run([mse, training_op], feed_dict={X_n: train_X, Y_n: train_Y})
        print(mse_val)
    best_theta = W.eval()
print("Best Theta is: ", best_theta)
Okay, so we are done training a linear model with TensorFlow. However, we can save ourselves the headache of the gradient descent mathematics; this is where the power of TensorFlow really comes into play.
5. Replace Step 3 with a TensorFlow optimizer
prediction_n = tf.add(tf.multiply(X_n, W), b)
error = prediction_n - Y_n
mse = tf.reduce_mean(tf.square(error))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse)
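One difference from the manual version is worth noting: minimize() computes and applies gradients for every trainable variable, so here both W and b get updated. A sketch making that explicit via minimize()'s var_list parameter:
# Equivalent to the line above, with the trained variables spelled out
optimizer_explicit = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse, var_list=[W, b])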
6. Retrain the model with the TensorFlow optimizer
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10):
        mse_val, _ = sess.run([mse, optimizer], feed_dict={X_n: train_X, Y_n: train_Y})
        print(mse_val)
And We are Truly Done!!!
Key TensorFlow issues to note:
tf.matmul()
will only multiply 2-D tensors, hence for a dot product between two 1-D tensors the available approach is to use tf.reduce_sum(tf.multiply(Vec_1, Vec_2)), as in the sketch below.
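A minimal sketch (with made-up vectors) of the dot-product workaround, plus an alternative that reshapes to 2-D so tf.matmul applies:
v1 = tf.constant([1.0, 2.0, 3.0])  # 1-D tensors: tf.matmul(v1, v2) would raise an error
v2 = tf.constant([4.0, 5.0, 6.0])

dot = tf.reduce_sum(tf.multiply(v1, v2))  # element-wise multiply, then sum: 32.0

# Alternative: reshape to (1, 3) and (3, 1) so tf.matmul can do the product
dot_2d = tf.matmul(tf.reshape(v1, [1, 3]), tf.reshape(v2, [3, 1]))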
tf.train.*Optimizer()
These optimizers are well built to take your loss function and train it well. In the first example, even while training on the examples one by one, the optimizer is still able to optimize the loss function, since it knows what to combine to give the necessary output. You can also evaluate any of the Variables while training is running or after it has concluded; see the best_theta part above.
Once you define the error, you can call an optimization function to minimize it. Different optimization techniques exist in TensorFlow that you can make use of, and they make the work a little easier. Examples of optimizers that exist (a swap-in sketch follows the list):
- AdadeltaOptimizer
- AdagradDAOptimizer
- AdagradOptimizer
- AdamOptimizer
- FtrlOptimizer
- GradientDescentOptimizer
- MomentumOptimizer
- ProximalGradientDescentOptimizer
- ProximalAdagradOptimizer
- RMSPropOptimizer
- SyncReplicasOptimizer
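Swapping optimizers is a one-line change. For example (a sketch; the same learning_rate is reused here, though each optimizer has its own sensible defaults):
# Same graph and loss, different optimizer: Adam adapts per-parameter step sizes
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(mse)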
Understanding the process of building a linear model is important for Deep Learning at large. This is because linear units - which contain the basic affine transformation $Wx + b$ - are core to Deep Learning. In fact, the linear unit is the basic building block of the first ever neural network, the Perceptron.