Introduction to loss functions

Mar 29, 2021

A loss function measures how well a model's predictions match your dataset. A neural network, for example, passes its input through many layers of arithmetic operations to produce a prediction. The loss function turns the gap between that prediction and the true value into a single number, and training adjusts the network to make that number as small as possible.

Difference between a Loss Function and a Cost Function

A loss function is computed for a single training example. It is also called an error function.

A cost function is computed over the entire training dataset, usually as the average of the per-example losses.
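The distinction can be sketched in a few lines of Python. This is only an illustration: squared error is assumed as the per-example loss, and the names `squared_error` and `cost` are made up for this example.

```python
def squared_error(y_true, y_pred):
    """Loss: the error for a single training example (squared error assumed)."""
    return (y_true - y_pred) ** 2


def cost(ys_true, ys_pred):
    """Cost: the per-example losses averaged over the whole dataset."""
    losses = [squared_error(t, p) for t, p in zip(ys_true, ys_pred)]
    return sum(losses) / len(losses)


targets = [3.0, 5.0, 2.0]
predictions = [2.0, 5.0, 4.0]

# One loss value per example, one cost value for the dataset.
per_example = [squared_error(t, p) for t, p in zip(targets, predictions)]
total_cost = cost(targets, predictions)
print(per_example)  # [1.0, 0.0, 4.0]
print(total_cost)
```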

Regression Loss Functions

1. Mean Squared Error

Mean squared error (MSE) is the average of the squared differences between predictions and actual observations. It is widely used because it is simple, continuous and differentiable everywhere, which makes its gradients easy to calculate.

MSE is not robust to outliers: squaring the errors lets a single large error dominate the average.

The lower the value of MSE, the better the predictions.
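As a rough sketch of why the gradients are easy to calculate, the snippet below computes MSE and its derivative with respect to each prediction, which is simply 2(ŷ - y)/n. The function names `mse` and `mse_grad` are assumptions made for this example, not part of any library.

```python
def mse(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def mse_grad(y_true, y_pred):
    """Derivative of MSE with respect to each prediction: 2 * (p - t) / n."""
    n = len(y_true)
    return [2.0 * (p - t) / n for t, p in zip(y_true, y_pred)]


y_true = [1.0, 2.0, 3.0]
y_pred = [1.5, 2.0, 2.0]

loss = mse(y_true, y_pred)
grad = mse_grad(y_true, y_pred)
print(loss)
print(grad)
```

A positive gradient entry means the prediction is too high and should move down, while a negative entry means it should move up.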

2. Mean Absolute Error

Mean absolute error (MAE) is the average of the absolute differences between predictions and actual observations. The main issue with MAE is that it is not differentiable at its minimum, where the error is zero.

MAE is more robust to outliers than MSE, because each error contributes in proportion to its size rather than to its square.
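The difference in robustness is easy to see on a toy example. The data below is invented purely for illustration, and `mae` and `mse` are hypothetical helper names. One target is replaced with an outlier and both metrics are recomputed.

```python
def mae(y_true, y_pred):
    """Average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


predictions = [2.0, 3.0, 4.0, 5.0]
clean_targets = [2.5, 3.0, 3.5, 5.0]
outlier_targets = [2.5, 3.0, 3.5, 25.0]  # last target is an outlier

mae_clean = mae(clean_targets, predictions)
mse_clean = mse(clean_targets, predictions)
mae_outlier = mae(outlier_targets, predictions)
mse_outlier = mse(outlier_targets, predictions)

print(mae_clean, mae_outlier)  # 0.25 5.25
print(mse_clean, mse_outlier)  # 0.125 100.125
```

In this made-up case the single outlier multiplies MAE by 21 but MSE by 801.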

3. Huber Loss

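For a true value y and a prediction ŷ, the Huber loss is defined piecewise:

L = (1/2) (y - ŷ)^2, if |y - ŷ| <= δ
L = δ |y - ŷ| - (1/2) δ^2, otherwise

where δ is a hyperparameter that controls the split between the two sub-functions.

The Huber loss combines the properties of both MSE and MAE. It is quadratic for small errors (the first equation, used when the absolute error is at most δ) and linear for large errors (the second equation, used when the absolute error exceeds δ). This keeps it differentiable at zero like MSE while limiting the influence of outliers like MAE.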
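A minimal sketch of the Huber loss follows, assuming the usual piecewise definition: half the squared error when the absolute error is at most δ, and δ times the absolute error minus δ²/2 otherwise. The function name `huber` and the default `delta=1.0` are choices made for this example.

```python
def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = abs(t - p)
        if err <= delta:
            # Small error: behaves like (half of) the squared error.
            total += 0.5 * err ** 2
        else:
            # Large error: grows linearly, like the absolute error.
            total += delta * err - 0.5 * delta ** 2
    return total / len(y_true)


small = huber([0.0], [0.5])     # quadratic branch
large = huber([0.0], [3.0])     # linear branch
boundary = huber([0.0], [1.0])  # both branches agree at |error| == delta
print(small, large, boundary)   # 0.125 2.5 0.5
```

Raising `delta` makes the loss behave more like MSE, and lowering it makes the loss behave more like MAE.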

Classification Loss Functions

1. Binary Cross Entropy Loss

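This function is used in classification problems where each example belongs to one of two classes and the network outputs a probability p for the positive class. For a true label y (0 or 1), the loss is -[y log(p) + (1 - y) log(1 - p)]. The second term of the sum, (1 - y) log(1 - p), applies when the true label is 0 and penalises false positives, that is, high predicted probabilities for negative examples. This function is also used in anomaly detection, where each example is either normal or anomalous.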
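As an illustration, binary cross-entropy can be sketched with the standard library alone, using the usual form -[y log(p) + (1 - y) log(1 - p)] averaged over the examples. The name `binary_cross_entropy` and the clipping constant `eps` (used to avoid taking log(0)) are assumptions of this example.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy for labels in {0, 1} and probabilities in [0, 1]."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip the probability so that log() never receives exactly 0.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


confident_right = binary_cross_entropy([1], [0.9])
confident_wrong = binary_cross_entropy([1], [0.1])
false_positive = binary_cross_entropy([0], [0.9])
print(confident_right, confident_wrong, false_positive)
```

A confident wrong prediction is penalised far more heavily than a confident correct one, and a false positive (label 0, predicted 0.9) costs the same as the mirrored false negative.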

2. Multi-Class Cross Entropy Loss

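Multi-class classification covers predictive models in which each data point is assigned to one of more than two classes.

Multi-class cross-entropy is a common default loss function for such problems, including text classification. The model outputs a probability for each class, the loss is the negative log of the probability given to the true class, and the predicted class is the one with the maximum probability for the given input.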
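A small sketch of the multi-class case, assuming the model already outputs one probability per class and the true label is given as a class index. The names `cross_entropy` and `predict_class`, and the example probabilities, are hypothetical.

```python
import math


def cross_entropy(probs, true_class, eps=1e-12):
    """Negative log of the probability assigned to the true class."""
    return -math.log(max(probs[true_class], eps))


def predict_class(probs):
    """Index of the class with the maximum probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


probs = [0.7, 0.2, 0.1]  # made-up model output for three classes

predicted = predict_class(probs)
loss_if_class_0 = cross_entropy(probs, 0)  # true class got high probability
loss_if_class_2 = cross_entropy(probs, 2)  # true class got low probability
print(predicted)  # 0
print(loss_if_class_0, loss_if_class_2)
```

The loss is small when the true class receives most of the probability and grows quickly as that probability approaches zero.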

I hope this article helps you learn and understand loss functions.