A hyperparameter is a configuration variable, external to the model, whose value is not learned from the data during training. Hyperparameters directly control the training process and the structure of the model. They are usually set before training begins and typically remain fixed throughout it. Common examples include the learning rate, the number of hidden layers in a deep learning model, the ‘k’ in k-nearest neighbors, and the ‘C’ and ‘gamma’ in support vector machines.
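
To make the distinction between a learned parameter and a hyperparameter concrete, here is a minimal sketch rather than a reference implementation. It fits a one-weight linear model by gradient descent. The function name `train`, the toy data, and the specific values of `learning_rate` and `n_steps` are illustrative assumptions.

```python
# Fit y = w * x by gradient descent on mean squared error.
# `w` is a parameter: its value is learned from the data.
# `learning_rate` and `n_steps` are hyperparameters: they are chosen
# before training starts and stay fixed while it runs.

def train(xs, ys, learning_rate, n_steps):
    w = 0.0  # initial parameter value
    n = len(xs)
    for _ in range(n_steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad
    return w

# Toy data generated from y = 2 * x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w_good = train(xs, ys, learning_rate=0.05, n_steps=200)    # converges close to 2
w_slow = train(xs, ys, learning_rate=0.0001, n_steps=200)  # still far from 2
```

With the same data and the same number of steps, only the choice of learning rate decides whether the learned weight ends up near its ideal value.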
Hyperparameter tuning, also known as hyperparameter optimization, is the process of choosing the set of hyperparameter values that maximizes a model's predictive performance, usually measured on data held out from training. This search is a critical step in building a powerful and efficient machine learning model, because the performance of machine learning algorithms depends heavily on the choice of hyperparameters.
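
As a hedged illustration of this process, the sketch below picks the ‘k’ of a k-nearest-neighbors classifier by comparing accuracy on a small validation set. The one-dimensional toy data, the candidate values, and the helper names `knn_predict` and `accuracy` are assumptions made for the example.

```python
from collections import Counter

def knn_predict(train, x, k):
    # train is a list of (feature, label) pairs with 1-D features.
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    # Majority vote among the k nearest training points.
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def accuracy(train, val, k):
    correct = sum(knn_predict(train, x, k) == y for x, y in val)
    return correct / len(val)

# Two clusters, each containing one mislabeled-looking point (2.6 and 5.4).
train = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (2.6, "b"),
         (5.4, "a"), (6.0, "b"), (6.5, "b"), (7.0, "b")]
val = [(1.2, "a"), (2.5, "a"), (5.5, "b"), (6.8, "b")]

# Evaluate each candidate k on the validation set and keep the best one.
candidates = [1, 3, 5, 7]
scores = {k: accuracy(train, val, k) for k in candidates}
best_k = max(scores, key=scores.get)  # first k with the highest accuracy
```

On this toy data k = 1 is misled by the two noisy training points, while larger values of k vote them down, which is the kind of dependence on hyperparameters that tuning is meant to expose.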
There are several strategies for hyperparameter tuning. Grid search builds and evaluates a model for every combination of the specified hyperparameter values. Random search instead samples combinations at random from the search space and evaluates a model for each sample. Bayesian optimization is a more advanced method that builds a probabilistic model of the objective function and uses it to decide which hyperparameters to try next, while gradient-based optimization, where it is applicable, computes gradients of the objective with respect to the hyperparameters and follows them. In every case the goal is to navigate the hyperparameter search space efficiently and reach the highest model performance.
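
Grid search and random search can be sketched in a few lines, under stated assumptions. Here `validation_loss` is a hypothetical stand-in for "train a model with these hyperparameters and return its validation loss". The hyperparameter names, ranges, and trial count are likewise illustrative.

```python
import itertools
import random

def validation_loss(learning_rate, n_layers):
    # Hypothetical objective with its minimum at learning_rate=0.1, n_layers=3.
    # In practice this would train and evaluate a real model.
    return (learning_rate - 0.1) ** 2 + 0.01 * (n_layers - 3) ** 2

def grid_search(grid):
    # Evaluate every combination of the listed values.
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        loss = validation_loss(**config)
        if best is None or loss < best[0]:
            best = (loss, config)
    return best

def random_search(n_trials, seed=0):
    # Evaluate a fixed budget of randomly sampled combinations.
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        config = {
            # Log-uniform sample between 0.001 and 1.
            "learning_rate": 10 ** rng.uniform(-3, 0),
            "n_layers": rng.randint(1, 5),  # inclusive on both ends
        }
        loss = validation_loss(**config)
        if best is None or loss < best[0]:
            best = (loss, config)
    return best

grid = {"learning_rate": [0.001, 0.01, 0.1, 1.0], "n_layers": [1, 2, 3, 4, 5]}
grid_loss, grid_config = grid_search(grid)           # 4 * 5 = 20 evaluations
rand_loss, rand_config = random_search(n_trials=20)  # also 20 evaluations
```

Both searches use the same budget of 20 evaluations here. Grid search can only return values that appear in the grid, whereas random search can land on learning rates between the grid points. Bayesian and gradient-based methods are not shown, since they need considerably more machinery than fits in a short sketch.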