Zero Initialiser for Biases using get_variable in TensorFlow

Introduction

In TensorFlow, initialising variables with appropriate values is crucial for efficient model training. For biases, which are typically added to the output of a layer, a common practice is to initialise them to zero. This article will delve into using the get_variable function with a zero initialiser for biases.

Understanding the Importance of Bias Initialisation

Bias terms in neural networks have a significant impact on model performance. Here’s why proper initialisation matters:

  • Symmetry Breaking: biases can safely start at zero because the random initialisation of the weights already breaks the symmetry between neurons; random biases are not needed for neurons to learn unique features.
  • Activation Function Impact: biases shift the input to the activation function. Initialising them to zero means the initial pre-activations are determined by the weighted inputs alone, so activations are not skewed at the start of gradient-based learning.
  • Avoiding Dead Neurons: with activation functions like ReLU, a strongly negative bias can produce “dead neurons” that never activate. Starting at zero avoids this; for ReLU layers, some practitioners even prefer a small positive starting bias, as sketched below.
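For ReLU layers specifically, here is a minimal sketch of the small-positive-bias variant, assuming the same TF1 get_variable API used throughout this article. The value 0.1 and the variable name relu_bias are illustrative choices, not prescribed constants:

import tensorflow as tf

output_size = 5  # illustrative layer width

# Hypothetical alternative for ReLU layers: a small positive starting bias,
# intended to keep units active early in training. 0.1 is an assumption
# for illustration, not a recommended value.
relu_bias = tf.get_variable(
    'relu_bias',
    shape=[output_size],
    initializer=tf.constant_initializer(0.1)
)

For all other layers, the zero initialiser described in the rest of this article remains the standard default.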

Implementing Zero Initialiser with get_variable

TensorFlow’s get_variable function provides a convenient way to create and initialise variables. Here’s a step-by-step approach to using a zero initialiser:

1. Import Necessary Libraries

import tensorflow as tf

2. Define the Zero Initialiser

zero_initializer = tf.zeros_initializer()

3. Create the Bias Variable

bias = tf.get_variable(
    'bias',
    shape=[output_size],
    initializer=zero_initializer,
    dtype=tf.float32
)

Explanation:

  • 'bias': The name of the variable.
  • [output_size]: The shape of the bias tensor (number of output units).
  • zero_initializer: The initialiser to use (in this case, zero initialisation).
  • tf.float32: The data type of the bias variable.
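As an aside, get_variable is not the only route to a zero-initialised bias. The lower-level tf.Variable constructor achieves the same result, though it always creates a fresh variable and does not participate in variable-scope sharing. A minimal sketch:

# Equivalent zero-initialised bias via the tf.Variable constructor.
bias_var = tf.Variable(tf.zeros([output_size], dtype=tf.float32), name='bias')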

4. Use the Bias in Your Model

Once created, the bias can be added to the output of a layer in your model:

output = tf.matmul(input, weights) + bias
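In practice, the weight and bias creation are often bundled into a reusable layer function. Below is a sketch of such a helper; the name dense_layer and the use of tf.variable_scope are illustrative assumptions, not part of the TensorFlow API:

import tensorflow as tf

def dense_layer(inputs, output_size, scope_name):
    # A hypothetical fully connected layer with a zero-initialised bias.
    input_size = inputs.get_shape().as_list()[-1]
    with tf.variable_scope(scope_name):
        weights = tf.get_variable(
            'weights',
            shape=[input_size, output_size],
            initializer=tf.random_normal_initializer()
        )
        bias = tf.get_variable(
            'bias',
            shape=[output_size],
            initializer=tf.zeros_initializer()
        )
    return tf.matmul(inputs, weights) + bias

Wrapping the variables in a scope like this also keeps their names unique when the helper is called more than once.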

Example: A Simple Neural Network with Zero Bias Initialisation

import tensorflow as tf

# Define model parameters
input_size = 10
output_size = 5

# Input placeholder
input_placeholder = tf.placeholder(tf.float32, shape=[None, input_size])

# Weights (randomly initialised)
weights = tf.get_variable(
    'weights',
    shape=[input_size, output_size],
    initializer=tf.random_normal_initializer()
)

# Zero initialiser for bias
zero_initializer = tf.zeros_initializer()
bias = tf.get_variable(
    'bias',
    shape=[output_size],
    initializer=zero_initializer
)

# Model output
output = tf.matmul(input_placeholder, weights) + bias

# Session and execution
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    input_data = [[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]]
    result = sess.run(output, feed_dict={input_placeholder: input_data})
    # Print the bias values to verify the zero initialisation
    print(sess.run(bias))

Output:

[0. 0. 0. 0. 0.]

The printed bias values are all zero, as expected. (The model output result itself is non-zero, since it depends on the randomly initialised weights.) These zero biases are then updated away from zero during subsequent optimisation steps as the model trains.
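For completeness: in TensorFlow 2.x, get_variable is only available as tf.compat.v1.get_variable, and the idiomatic way to request zero-initialised biases is through Keras layers. A rough TF 2.x equivalent, as a sketch rather than this article's original API:

import tensorflow as tf

# TF 2.x: Dense layers already default to zero-initialised biases;
# bias_initializer='zeros' simply makes that explicit.
layer = tf.keras.layers.Dense(units=5, bias_initializer='zeros')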

Conclusion

Using a zero initialiser for biases with get_variable in TensorFlow is a common and effective practice. It ensures that biases start with a neutral value, allowing the model to learn appropriate biases during training, leading to more robust and well-performing neural networks.
