Artificial neural networks are biologically inspired and consist of elements that perform, more or less, in a manner analogous to the human brain. Some of the brain's characteristics they exhibit are that they learn from experience, generalize from examples, and can extract the important information from noisy or irrelevant data.

The artificial neuron

The complexity of the human nervous system, built up of neurons that receive, process, and transmit electrochemical signals, is quite staggering. The artificial neuron is designed to duplicate this very complex task in a simplified form.
Each unit performs a relatively simple job: it takes in values from external sources and uses them to compute an output signal, which is propagated to other neurons. In addition to this processing, another task is the adjustment of the weights. Within a neural network it is useful to distinguish three types of units: input units, which receive data from outside the neural network; output units, which send data out of the neural network; and hidden units, whose input and output signals remain within the neural network.

During training, weights can be updated either synchronously or asynchronously. With synchronous updating, all units update their activations simultaneously; with asynchronous updating, each unit has a (usually fixed) probability of updating its activation at time t, and usually only one unit is able to do this at a time. In some cases the latter model has advantages.

The basic methodology is as follows: a set of inputs is applied (each of which may be the output of some other neuron). Each input is multiplied by an initially arbitrary weight, the weighted inputs are summed, and the sum is checked to determine the activation level. For the applied input there is a fixed target response desired at the output. The error between the output produced and the desired output is used to change the weights, so that the desired output is gradually approached within an acceptable error. This process is called learning.
For the learning process to be accurate, the training data set should be large, should cover a significant part of the operating conditions, and should contain relatively little noise.

Disadvantages: neural networks require a large amount of data to train, and training large networks takes considerable processing time.

The mathematical model of the artificial neuron

A functional model duplicating a biological neuron has three basic parts: the weights, an adder that sums the weighted inputs, and an activation function that controls the output. The acceptable range of the output is usually between 0 and 1 or between -1 and 1.
Let there be inputs X0, …, Xp
and a weight for each input, W0, …, Wp.
Each input is multiplied by its weight and the products are summed:
net = X0W0 + X1W1 + X2W2 + … + XpWp
If net is greater than the threshold value, the output is 1; otherwise it is 0.
The weight is analogous to the synaptic strength in a biological neuron.
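As a concrete sketch (the function and variable names below are my own, not from any particular library), the computation of net and the threshold test can be written in Python:

    def neuron_output(inputs, weights, threshold):
        # net = X0*W0 + X1*W1 + ... + Xp*Wp
        net = sum(x * w for x, w in zip(inputs, weights))
        # output 1 only if net exceeds the threshold
        return 1 if net > threshold else 0

    # With weights 0.6 on two inputs and threshold 1.0 this behaves as an AND gate:
    print(neuron_output([1, 1], [0.6, 0.6], 1.0))   # 1, since net = 1.2 > 1.0
    print(neuron_output([1, 0], [0.6, 0.6], 1.0))   # 0, since net = 0.6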

Activation function: the function applied to net to produce the output is termed the squashing function. Common non-linear functions, sketched in code after the list, are:

• Sigmoid function: out = 1 / (1 + e^(−net))

• Hyperbolic tangent function: out = tanh(net)

• Step function: out = 1 if net > 0,
out = −1 if net < 0,
and undefined at net = 0.
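A minimal Python sketch of these three functions (the function names are my own):

    import math

    def sigmoid(net):
        # squashes net smoothly into the range (0, 1)
        return 1.0 / (1.0 + math.exp(-net))

    def hyperbolic_tangent(net):
        # squashes net into the range (-1, 1)
        return math.tanh(net)

    def step(net):
        # hard threshold; the text leaves net == 0 undefined
        if net > 0:
            return 1
        if net < 0:
            return -1
        raise ValueError("step is undefined at net = 0")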

Single layer and multi-layer artificial neural network

The simplest neural network is the single-layer network, but it fails to solve some very simple problems (such as the exclusive-OR problem). Larger and more complex (multi-layer) networks provide greater computational capability, provided the activation function is non-linear.

Training of neural networks

A neural network is trained so that application of a set of inputs produces the desired set of outputs. Training is done by sequentially applying input vectors and adjusting the network's weights according to the output. Training algorithms ensure that the weights gradually converge so as to obtain the desired output. There are two main types of training, supervised and unsupervised.

Supervised training: in this type of training, a training pair is made up of an input vector and a target vector. The output of the network is computed and compared to the target vector, and the difference is sent back through the network to adjust the weights.

Unsupervised training: this kind of training requires no target vector, and no comparison of the output is made with a desired output. The training set contains only the input vectors.

Reinforcement learning: this type of learning is an intermediate form of the above two. Here the learning machine performs some action on the environment and receives a feedback response from it. The learning system grades its action as acceptable or unacceptable based on the response received and adjusts its parameters accordingly. Generally, parameter adjustment continues until an equilibrium state is reached, after which the parameters no longer change. Self-organizing neural learning may be categorized under this type of learning.

Perceptrons

Perceptrons are single-layer artificial neural networks whose output is one or zero. A perceptron's weights and biases are trained with a target vector to obtain the desired output vector; the training procedure is called perceptron learning. One of its greatest advantages is the ability to generalize from training data. Perceptrons are mostly used in pattern classification. They do have drawbacks: being single-layer networks, they cannot simulate even a simple exclusive-OR gate.
The perceptron network consists of a single neuron connected to two or more inputs through an equal number of weights, with an additional bias input.
The neuron calculates its output using the following equation:
out = 1 if X · W + b > T
out = 0 if X · W + b ≤ T
where X is the input vector fed into the network, W is the weight vector, b is the bias, and T is the threshold.
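As a sketch (names are my own), the decision rule is a few lines of Python:

    def perceptron_predict(X, W, b, T=0.0):
        # weighted sum of the inputs plus the bias
        net = sum(x * w for x, w in zip(X, W)) + b
        # fire only if the net input exceeds the threshold T
        return 1 if net > T else 0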

Perceptron learning:

Perceptron learning is of the supervised type, i.e. the error in the output is used to change the weights. The input vectors are fed to the network one after another. If the network's output is correct, no changes are made; otherwise, the weights and biases are updated using the perceptron learning rule.

One complete pass of the training vectors through the network is called an epoch. When an epoch completes without any error (or within an acceptable error), the perceptron's learning is said to be done. After learning, if a training input vector is supplied, the perceptron gives the correct output; and if a new input vector is presented (one that was not in the training data), the network exhibits generalization, producing an output nearly the same as the output for a similar training input.
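A minimal training-loop sketch in Python, reusing perceptron_predict from above and anticipating the update rule given in the next section (all names and the learning rate are my own illustrative choices):

    def train_perceptron(samples, num_inputs, rate=0.1, max_epochs=100):
        # samples: list of (input_vector, target) pairs, targets 0 or 1
        W = [0.0] * num_inputs
        b = 0.0
        for epoch in range(max_epochs):
            errors = 0
            for X, T in samples:
                A = perceptron_predict(X, W, b)          # actual output
                if A != T:                               # update only on error
                    W = [w + rate * (T - A) * x for w, x in zip(W, X)]
                    b = b + (T - A)                      # bias update per the delta rule below
                    errors += 1
            if errors == 0:                              # error-free epoch: learning is done
                break
        return W, b

    # Example: learning the logical AND function
    and_samples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
    W, b = train_perceptron(and_samples, num_inputs=2)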

The delta rule:

The delta rule is a learning rule used for updating the weights of the artificial neurons in a single-layer perceptron.
The delta rule can be summarized as follows. For all i:
δ = T − A (if δ = 0, the output obtained is already correct)
Wn+1(i) = Wn(i) + (T − A) · X(i)

A learning-rate coefficient α is also sometimes used:
Wn+1(i) = Wn(i) + α · (T − A) · X(i)

The default value of α is: α = (max X − min X) / N

If biases are used, they are modified by: b = b + (T − A),
where T is the expected result, A is the actual output of the neuron, W is the vector of weights, X is the input vector to the network, and b is the bias.
If a solution exists, an algorithm using this rule has been proven to converge on it in finite time.
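A single update step in Python, following the rule above (a sketch with my own names; α defaults to 0.1 here purely for illustration):

    def delta_rule_step(W, b, X, T, A, alpha=0.1):
        delta = T - A                     # zero delta means the output was already correct
        W_new = [w + alpha * delta * x for w, x in zip(W, X)]
        b_new = b + delta                 # bias update uses the raw error, as in the text
        return W_new, b_new

    # One step with X = [1, 0], target T = 1, actual output A = 0:
    W, b = delta_rule_step([0.2, -0.4], 0.0, [1, 0], T=1, A=0)
    # W is now [0.3, -0.4] and b is now 1.0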

Linear separability

Like the exclusive-OR gate, there are some other functions which cannot be computed by a single neuron layer; such functions are called linearly inseparable. When a single straight line completely separates two different clusters of points in a two-dimensional mapping, they are said to be linearly separable. For a three-input case the separation is done by a flat plane cutting through three-dimensional space. In general, for the n-input case, the points are linearly separable in n-dimensional space if they can be separated by an (n − 1)-dimensional hyperplane.
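To make this concrete, the following Python sketch (my own illustration, not from the text) searches a coarse grid of lines w1·x + w2·y + b = 0 and finds one that separates the AND points, but none that separates the XOR points:

    import itertools

    def separable(points):
        # points: list of ((x, y), label) pairs with labels 0/1
        grid = [i / 4 for i in range(-8, 9)]        # coarse search over w1, w2, b
        for w1, w2, b in itertools.product(grid, repeat=3):
            if all((w1 * x + w2 * y + b > 0) == (label == 1)
                   for (x, y), label in points):
                return True
        return False

    AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
    XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    print(separable(AND))   # True, e.g. x + y - 1.5 > 0
    print(separable(XOR))   # False: no line on the grid works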

Limitations

Perceptron networks have several limitations. First, linear separability limits the representational power of the perceptron: it can classify only linearly separable sets of vectors. If a straight line (or, in general, an (n − 1)-dimensional hyperplane) can be drawn to separate the input vectors into their correct categories, the input vectors are linearly separable and the perceptron will find the solution. If the vectors are not linearly separable, learning will never reach a point where all vectors are classified properly and will always produce errors.
Secondly, output values of a perceptron can take on only one of two values (True or False).
The most famous example of the perceptron’s inability to solve problems with linearly nonseparable vectors is the logical exclusive-or problem.
To overcome this limitation, more layers are added. A multi-layer network is formed by cascading a number of single-layer networks, and such networks can perform a great deal more than single-layer ones.

Back propagation

Since single-layer networks are severely limited in what they can represent, algorithms were needed to train multi-layer networks. Backpropagation is a systematic way to train multilayer neural networks: the artificial neurons are organized in layers and send their signals "forward", and then the errors are propagated backwards.

Multilayer network: the first set of neurons serves only as distribution points for the input; the input signal is simply passed through the weights, and each neuron in a subsequent layer produces an output. The network receives inputs through neurons in the input layer, and the output of the network is given by the neurons of the output layer. There may be one or more intermediate hidden layers. The backpropagation algorithm uses supervised learning: we provide the algorithm with examples of the inputs and outputs we want the network to compute, and the error (the difference between actual and expected results) is calculated. With this error the weights are readjusted. The idea of the backpropagation algorithm is to reduce this error until the neural network learns the training data. Training begins with random weights, and the goal is to adjust them so that the error is minimal.
The weights of the output layer are changed according to:
Wpq,k(i+1) = Wpq,k(i) + α · Aq(1 − Aq)(T − Aq) · Ap
where Wpq,k is the weight from neuron p to neuron q in layer k, T is the target output, Aq is the actual output of neuron q, Ap is the output of the source neuron p (the input to the weight), and α is the training-rate coefficient.
Adjusting the weights of the hidden layers: here the training is more complicated because the hidden layers do not have any target value for comparison. The backpropagation algorithm trains the hidden layers by propagating the output error back through the network layer by layer, adjusting the weights of each layer.
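Putting the two update steps together, here is a compact, self-contained sketch of backpropagation for one hidden layer, trained on XOR (my own minimal illustration of the sigmoid-based update above, with assumed layer sizes and learning rate; not a production implementation):

    import math, random

    random.seed(1)

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    n_in, n_hid = 2, 3                    # 2 inputs -> 3 hidden neurons -> 1 output
    W_h = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
    b_h = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]
    W_o = [random.uniform(-0.5, 0.5) for _ in range(n_hid)]
    b_o = random.uniform(-0.5, 0.5)

    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR
    alpha = 0.5

    def forward(X):
        H = [sigmoid(sum(w * x for w, x in zip(W_h[j], X)) + b_h[j])
             for j in range(n_hid)]
        A = sigmoid(sum(w * h for w, h in zip(W_o, H)) + b_o)
        return H, A

    for epoch in range(20000):            # epochs needed vary with the random start
        for X, T in data:
            H, A = forward(X)
            d_o = A * (1 - A) * (T - A)                  # output-layer delta
            d_h = [H[j] * (1 - H[j]) * d_o * W_o[j]      # error pushed back layer by layer
                   for j in range(n_hid)]
            for j in range(n_hid):                       # update output weights
                W_o[j] += alpha * d_o * H[j]
            b_o += alpha * d_o
            for j in range(n_hid):                       # update hidden weights
                for i in range(n_in):
                    W_h[j][i] += alpha * d_h[j] * X[i]
                b_h[j] += alpha * d_h[j]

    for X, T in data:
        print(X, T, round(forward(X)[1], 2))  # outputs approach the 0/1 XOR targets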

Before training starts, all the weights are set to small random numbers, to eliminate the possibility of the network being saturated by large weight values. A drawback here is that if the weights start at the same value and the desired solution requires different values, the network will never learn.