What is Learning Rate?

Learning Rate

Deep Learning

The learning rate is a hyperparameter that controls the step size during gradient descent optimization. Too high a learning rate causes instability, while too low a rate leads to slow convergence.

Understanding Learning Rate

The learning rate is a critical hyperparameter that controls how much a model's weights are adjusted during each step of gradient descent optimization. A learning rate that is too high causes training to overshoot optimal values and diverge, while one that is too low results in painfully slow convergence or getting stuck in poor local minima. Finding the right learning rate is one of the most impactful aspects of hyperparameter tuning, and techniques like learning rate schedulers, warmup periods, and cyclical learning rates dynamically adjust this value during training. The Adam optimizer and other adaptive methods automatically scale learning rates per parameter, reducing some of the sensitivity. In practice, researchers often use learning rate finder techniques to identify a good starting value. The interaction between learning rate and batch size is also important, with larger batches generally tolerating higher learning rates.

Linear Regression

Back to glossary

Learning Rate

Understanding Learning Rate

Related in Deep Learning

Activation Function

Adam Optimizer

Adapter Layers

Attention Mechanism

Autoencoder

Backpropagation

Batch Normalization

Batch Size