What is Reed Solomon Code?

Reliable communication over noisy channels is not just about detecting errors, but also about correcting them efficiently. While many error-correcting codes operate at the bit level, Reed–Solomon (RS) codes take a fundamentally different approach by working on symbols, making them especially effective against burst errors.

What is a Reed–Solomon Code?

A Reed–Solomon code is a block-based error-correcting code that operates on fixed-size symbols rather than individual bits. Each symbol typically represents multiple bits (for example, 8 bits = 1 byte).

RS codes belong to the class of maximum distance separable (MDS) codes, meaning they achieve the best possible trade-off between redundancy and error correction capability.

Why Reed–Solomon Codes Were Developed

Many real-world channels do not produce random single-bit errors. Instead, they produce burst errors, where multiple consecutive bits or symbols are corrupted together due to:

Fading in wireless and satellite links
Impulse noise
Synchronization loss
Storage media defects

Unlike bit-level codes:

A single symbol error may involve multiple corrupted bits
Reed–Solomon treats this as one error, regardless of how many bits inside the symbol are wrong

Bit-level codes struggle in such scenarios. Reed–Solomon codes were designed to correct entire corrupted symbols, making them extremely effective for these conditions.

RS Code Parameters: (n, k)

A Reed–Solomon code is usually described as RS(n, k):

$k$ → Number of data symbols
$n$ → Total symbols after encoding (data + parity)
$n − k$ → Number of parity symbols

The encoder takes k data symbols and adds redundancy to form an n symbol codeword. Please note that the data is represented as symbols instead of bits, one symbol can be 4 bits, 1 byte, 4 bytes and so on. depending on the implementation.

Error Correction Capability

A Reed–Solomon code can correct up to t symbol errors, where:

t = \left\lfloor \frac{n - k}{2} \right\rfloor

This means:

If 4 parity symbols are added, the code can correct 2 symbol errors
The position of errors does not need to be known in advance

How to generate RS Codes?

Parameters

RS(7,3) has the following parameters:

Parameter	Value	Meaning
n	7	Total symbols in the codeword
k	3	Number of data symbols
n-k	4	Number of parity symbols
GF size	8	Each symbol is 3 bits (0–7)

So there are 3 data symbols and 4 parity symbols.

Implementation

Let us understand how a data symbol $m$ with length $k$ is coded using RS codes to a Codeword symbol of length $n$ .

Let the data symbol have 3 bits each. So the data symbols can have values from 0 to 7. So for example, let’s take the data as below and has a length of 3. These are the symbols we want to transmit,

[m_0, m_1, m_2] = [2, 5, 3]

We now map the message $m$ on to a polynomial,

m(x) = m_2 x^2 + m_1 x + m_0

m(x) = 3 x^2 + 5 x + 2

Let’s generate the generator polynomial,

RS(7,3) → $n-k = 4$ parity symbols → generator polynomial has degree 4:

g(x) = (x-\alpha)(x-\alpha^2)(x-\alpha^3)(x-\alpha^4)

$\alpha$ is a primitive element of GF(2³)
Expanded $g(x)$ in GF(2³) will be:

g(x) = x^4 + \alpha^3 x^3 + \alpha x^2 + \alpha^2 x + 1

x^{n-k} m(x) = x^4 m(x) = 3x^6 + 5x^5 + 2x^4

Next we divide this with the generator polynomial, Divide $x^4 m(x)$ by $g(x)$ in GF(2³). Now the remainder $r(x)$ is the parity polynomial.

r(x) = r_3 x^3 + r_2 x^2 + r_1 x + r_0 = 6 x^3 + 4 x^2 + 2 x + 1

These are the 4 parity symbols, $[6,4,2,1]$ .

It’s now time to form the final RS Codeword,

c(x) = x^{n-k} m(x) + r(x)

c(x) = 3x^6 + 5x^5 + 2x^4 + 6x^3 + 4x^2 + 2x + 1

Data symbols occupy the higher-degree positions
Parity symbols occupy the lower-degree positions

Final transmitted codeword (symbols in order):

c = [c_0, c_1, c_2, c_3, c_4, c_5, c_6] = [3, 5, 2, 6, 4, 2, 1]

Real-World Applications

QR Codes: Use RS codes to remain readable even when partially damaged
CDs/DVDs: Correct scratches and fingerprints (up to 4000 consecutive error bits)
Satellite communications: NASA’s Voyager missions use RS codes
DSL/Cable modems: To correct impulse noise which causes burst errors

Why do we use polynomial representation?

RS codes use polynomial representation because polynomial division over finite fields provides a systematic way to generate parity symbols. The remainder from polynomial division naturally gives us the error-checking information we need.

Why GF(2³)?

The field size GF(2³) = 8 symbols means each symbol represents 3 bits. Larger fields allow longer codewords but require more complex arithmetic. Common choices are GF(2⁸) for byte-oriented systems.

Now we want the highest power of the message polynomial to be the same as the highest power of codeword which is 7 in our case, so we multiply the message polynomial with $x^{n-k}$ ,

You can try hands-on how the Reed Solomon encoder and decoder work together using the below jupyter notebook. We make use of the Python galois Library to understand Error detection and correction using RS Codes.