Chapter 5: Positional Encoding in Transformers
5.2 Understanding Positional Encoding
The positional encoding for a position $p$ in the sequence and a dimension $i$ in the embedding space is computed as:
$PE_{(p,\,2i)} = \sin\!\left(p / 10000^{2i/d_{model}}\right)$
$PE_{(p,\,2i+1)} = \cos\!\left(p / 10000^{2i/d_{model}}\right)$
where:
- $PE_{(p,2i)}$ and $PE_{(p,2i+1)}$ are the positional encodings for the position $p$ and dimensions $2i$ and $2i+1$.
- $p$ is the position in the sequence.
- $i$ indexes the dimension pairs, so $2i$ and $2i+1$ run over the even and odd embedding dimensions ($0 \le i < d_{model}/2$).
- $d_{model}$ is the dimensionality of the embeddings.
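As a quick worked example with $d_{model} = 512$: the first dimension pair ($i = 0$) has frequency $1/10000^{0} = 1$, so at position $p = 3$ we get $PE_{(3,0)} = \sin(3) \approx 0.141$ and $PE_{(3,1)} = \cos(3) \approx -0.990$. The second pair ($i = 1$) has frequency $1/10000^{2/512} \approx 0.965$, so $PE_{(3,2)} = \sin(3 \times 0.965) \approx 0.245$.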
These formulas assign each position a vector of sinusoids. Each dimension pair $(2i, 2i+1)$ oscillates at its own fixed frequency $1/10000^{2i/d_{model}}$, so the wavelengths form a geometric progression from $2\pi$ (for $i = 0$) up to $10000 \cdot 2\pi$ (for the last pair); nothing about this encoding is learned. This structure makes it easy for the model to attend to relative positions, because for any fixed offset $k$, $PE_{pos+k}$ can be expressed as a linear function of $PE_{pos}$, as the short derivation below shows.
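Writing $\omega_i = 1/10000^{2i/d_{model}}$ for the frequency of pair $i$ (a shorthand introduced here only for compactness), the angle-addition identities give:
$PE_{(p+k,\,2i)} = \sin((p+k)\,\omega_i) = PE_{(p,\,2i)}\cos(k\,\omega_i) + PE_{(p,\,2i+1)}\sin(k\,\omega_i)$
$PE_{(p+k,\,2i+1)} = \cos((p+k)\,\omega_i) = PE_{(p,\,2i+1)}\cos(k\,\omega_i) - PE_{(p,\,2i)}\sin(k\,\omega_i)$
The coefficients $\cos(k\,\omega_i)$ and $\sin(k\,\omega_i)$ depend only on the offset $k$, not on the position $p$, so moving $k$ steps along the sequence amounts to applying a fixed rotation to each dimension pair.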
Let's take a look at how to implement these positional encoding formulas in Python:
import numpy as np
import matplotlib.pyplot as plt

def positional_encoding(sequence_length, d_model):
    # Column vector of positions: shape (sequence_length, 1).
    positions = np.arange(sequence_length)[:, np.newaxis]
    # One frequency per dimension pair: 1 / 10000^(2i / d_model), shape (d_model // 2,).
    div_terms = np.exp(np.arange(0, d_model, 2) * -(np.log(10000.0) / d_model))
    pos_enc = np.zeros((sequence_length, d_model))
    # Even dimensions get the sine, odd dimensions the cosine.
    pos_enc[:, 0::2] = np.sin(positions * div_terms)
    pos_enc[:, 1::2] = np.cos(positions * div_terms)
    return pos_enc

pos_enc = positional_encoding(50, 512)

# Visualize the encodings as a heatmap: one row per position, one column per dimension.
plt.figure(figsize=(12, 8))
plt.pcolormesh(pos_enc, cmap='viridis')
plt.xlabel('Depth')
plt.xlim((0, 512))
plt.ylim((50, 0))
plt.ylabel('Position')
plt.colorbar()
plt.show()
This code generates a 2D NumPy array of positional encodings for a sequence of length 50 and an embedding dimension of 512, then renders it as a heatmap with one row per position and one column per embedding dimension. All values lie between -1 and 1; the low-index dimensions oscillate rapidly as you move through the positions, while the high-index dimensions vary so slowly that they are nearly constant over 50 positions.
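In a Transformer, this encoding is added element-wise to the token embeddings before the first attention layer. The snippet below is a minimal sketch of that step; `token_embeddings` is a randomly generated stand-in introduced here purely for illustration (in a real model it would come from a learned embedding lookup).

# Hypothetical stand-in for the output of an embedding layer (illustration only).
sequence_length, d_model = 50, 512
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(sequence_length, d_model))

# The positional encoding is added element-wise, row by row, so each token's
# vector now also carries information about where it sits in the sequence.
model_input = token_embeddings + positional_encoding(sequence_length, d_model)
print(model_input.shape)  # (50, 512)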