Our analysis is motivated by investigative experiments designed to observe the effect of initialization schemes together with the activation functions used across the layers. The goal of the study is to evaluate how these choices affect the performance of the network and, in particular, their influence on the saturation of hidden layers.
During the study, we explored different combinations of the following hyper-parameters for each of the models used (a configuration sketch is given after the list):
- Initialization of parameters: the scheme used to initialize the weights and biases of each layer.
- Activation functions: the non-linearity applied after each layer.
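To make these combinations concrete, the sketch below shows one way a layer-wise initialization scheme and activation function could be selected in PyTorch. The build_mlp helper, the layer sizes, and the particular schemes listed are illustrative assumptions, not the exact models used in our experiments.

```python
import torch.nn as nn

def build_mlp(layer_sizes, activation="sigmoid", init_scheme="xavier"):
    """Illustrative helper: fully connected network with a chosen
    activation function and weight/bias initialization scheme."""
    acts = {"sigmoid": nn.Sigmoid, "tanh": nn.Tanh, "relu": nn.ReLU}
    layers = []
    for in_dim, out_dim in zip(layer_sizes[:-1], layer_sizes[1:]):
        linear = nn.Linear(in_dim, out_dim)
        # Weight and bias initialization
        if init_scheme == "xavier":
            nn.init.xavier_uniform_(linear.weight)            # Glorot/Xavier
        elif init_scheme == "he":
            nn.init.kaiming_normal_(linear.weight, nonlinearity="relu")
        elif init_scheme == "small_random":
            nn.init.normal_(linear.weight, mean=0.0, std=0.01)
        nn.init.zeros_(linear.bias)
        layers += [linear, acts[activation]()]
    return nn.Sequential(*layers[:-1])  # no activation after the output layer

# Example: a 5-layer network with sigmoid activations and Xavier initialization
model = build_mlp([784, 256, 256, 256, 10], activation="sigmoid", init_scheme="xavier")
```

The same helper can be re-instantiated with a different activation or initialization scheme to produce each configuration explored in the study.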
Designing the network requires consideration of several aspects, and the initialization of parameters is one of the most important among them. All of these factors influence the performance of the developed network. Exploding or vanishing activations and gradients are common problems that arise when the chosen initial values are too small or too large. With weights that are too small, the variance of the activations shrinks in each successive layer; with the sigmoid activation function, for example, the pre-activations stay close to 0, where the function is approximately linear, so the non-linearity is lost and there is no benefit in having multiple layers. Conversely, if the activations grow larger with successive layers, they saturate and the gradients approach 0.
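This behaviour can be illustrated numerically by pushing random inputs through a stack of sigmoid layers and recording per-layer statistics. The NumPy sketch below uses assumed layer widths and weight scales purely for illustration: very small weights collapse the activation variance (the layers become effectively linear), very large weights saturate the sigmoid so the local gradients approach 0, and Xavier-style scaling keeps both quantities comparatively stable.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def layer_stats(weight_std, n_layers=10, width=256, n_samples=1000, seed=0):
    """Pass random inputs through a stack of sigmoid layers and record, per layer,
    the std of the activations and the mean local gradient of the sigmoid."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((n_samples, width))
    stats = []
    for _ in range(n_layers):
        W = rng.normal(0.0, weight_std, size=(width, width))
        h = sigmoid(h @ W)
        local_grad = h * (1.0 - h)          # derivative of the sigmoid
        stats.append((round(float(h.std()), 4), round(float(local_grad.mean()), 4)))
    return stats

# Weights too small: activation variance collapses, each layer is nearly linear.
print(layer_stats(weight_std=0.001))
# Weights too large: activations saturate near 0/1, local gradients approach 0.
print(layer_stats(weight_std=5.0))
# Xavier-style scaling (std = 1/sqrt(fan_in)) keeps both quantities stable.
print(layer_stats(weight_std=1.0 / np.sqrt(256)))
```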
Experimental Setting and Datasets:
For the experimental study, we used public liver datasets. CT data from the LiTS and 3Dircadb datasets were used for training and testing the pre-trained models. The LiTS dataset consists of 130 training and 70 test volumes.
LiTS Dataset:
The LiTS (Liver Tumor Segmentation) benchmark provides contrast-enhanced abdominal CT volumes together with reference segmentations of the liver and liver tumours. The scans were collected from multiple clinical sites and therefore vary in in-plane resolution, slice thickness, and the number of axial slices per volume.
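As an illustration of how a CT volume and its reference mask can be read for training, the sketch below loads one LiTS case with nibabel. The file names, directory layout, and the intensity window used here are assumptions made for the example, not the exact preprocessing pipeline of our study.

```python
import nibabel as nib
import numpy as np

# Illustrative paths; the actual layout depends on how the LiTS download is organized.
volume_path = "LiTS/train/volume-0.nii"
mask_path = "LiTS/train/segmentation-0.nii"

ct = nib.load(volume_path).get_fdata()    # CT intensities in Hounsfield units
mask = nib.load(mask_path).get_fdata()    # LiTS labels: 0 = background, 1 = liver, 2 = tumour

# Example preprocessing: clip to a soft-tissue window and rescale to [0, 1]
ct = np.clip(ct, -200, 250)
ct = (ct - ct.min()) / (ct.max() - ct.min())

print(ct.shape, mask.shape, np.unique(mask))
```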
3Dircadb Dataset: