What is Positional Encoding?

Deep Learning

Positional Encoding

Positional encoding adds information about token position to input embeddings in Transformer models, which otherwise have no inherent sense of sequence order. This enables the model to understand word order and sentence structure.

Understanding Positional Encoding

Positional encoding is a technique used in transformer architectures to inject information about the order and position of tokens in a sequence, since the self-attention mechanism itself is permutation-invariant and has no inherent notion of sequence order. The original transformer paper introduced sinusoidal positional encodings that use sine and cosine functions of different frequencies to represent each position, allowing the model to learn relative positioning patterns. Without positional encoding, a transformer would treat "the cat sat on the mat" identically to any rearrangement of those same words. Modern variants include learned positional embeddings, where position representations are trained alongside other parameters, and rotary position embeddings (RoPE) used in models like LLaMA. Positional encoding is essential for tasks where word order carries meaning, which encompasses virtually all natural language processing and sequence modeling applications.

Is AI recommending your brand?

Find out if ChatGPT, Perplexity, and Gemini mention you when people search your industry.

Check your brand — $9

Pre-training

Back to full glossary

Positional Encoding

Understanding Positional Encoding

Is AI recommending your brand?

Related Deep Learning Terms

Activation Function

Adam Optimizer

Adapter Layers

Attention Mechanism

Autoencoder

Backpropagation

Batch Normalization

Batch Size