# Raptor code

In computer science, **Raptor codes** (* rapid tornado*;[1] see Tornado codes) are the first known class of fountain codes with linear time encoding and decoding. They were invented by Amin Shokrollahi in 2000/2001 and were first published in 2004 as an extended abstract. Raptor codes are a significant theoretical and practical improvement over LT codes, which were the first practical class of fountain codes.

Raptor codes, as with fountain codes in general, encode a given source block of data consisting of a number *k* of equal size symbols into a potentially limitless sequence of encoding symbols such that reception of any *k* or more encoding symbols allows the source block to be recovered with some non-zero probability. The probability that the source block can be recovered increases with the number of encoding symbols received above *k* becoming very close to 1, once the number of received encoding symbols is only very slightly larger than *k*. For example, with the latest generation of Raptor codes, the RaptorQ codes, the chance of decoding failure when *k* encoding symbols have been received is less than 1%, and the chance of decoding failure when *k+2* encoding symbols have been received is less than one in a million. (See *Recovery probability and overhead* section below for more discussion on this.) A symbol can be any size, from a single byte to hundreds or thousands of bytes.

Raptor codes may be systematic or non-systematic. In the systematic case, the symbols of the original source block, i.e. the source symbols, are included within the set of encoding symbols. An example of a systematic Raptor code is the code defined by the 3rd Generation Partnership Project for use in mobile cellular wireless broadcast and multicast and also used by DVB-H standards for IP datacast to handheld devices (see external links). The Raptor codes in these standards is defined also in IETF RFC 5053. The most advanced version of a practical Raptor code is RaptorQ defined in IETF RFC 6330.

Information about an efficient software implementation of the RaptorQ code specified in IETF RFC 6330 (the most advanced fountain code), can be found at the website for the Codornices project at ICSI .

Online codes are another example of a non-systematic fountain code.

## Overview

Raptor codes are formed by the concatenation of two codes.

A fixed rate erasure code, usually with a fairly high rate, is applied as a 'pre-code' or 'outer code'. This pre-code may itself be a concatenation of multiple codes, for example in the code standardized by 3GPP a high density parity check code derived from the binary Gray sequence is concatenated with a simple regular low density parity check code. Another possibility would be a concatenation of a Hamming code with a low density parity check code.

The inner code takes the result of the pre-coding operation and generates a sequence of encoding symbols. The inner code is a form of LT codes. Each encoding symbol is the XOR of a pseudo-randomly chosen set of symbols from the pre-code output. The number of symbols which are XOR'ed together to form an output symbol is chosen pseudo-randomly for each output symbol according to a specific probability distribution.

This distribution, as well as the mechanism for generating pseudo-random numbers for sampling this distribution and for choosing the symbols to be XOR'ed, must be known to both sender and receiver. In one approach, each symbol is accompanied with an identifier which can be used as a seed to a pseudo-random number generator to generate this information, with the same process being followed by both sender and receiver.

In the case of non-systematic Raptor codes, the source data to be encoded is used as the input to the pre-coding stage.

In the case of systematic Raptor codes, the input to the pre-coding stage is obtained by first applying the inverse of the encoding operation that generates the first *k* output symbols to the source data. Thus, applying the normal encoding operation to the resulting symbols causes the original source symbols to be regenerated as the first *k* output symbols of the code. It is necessary to ensure that the pseudo-random
processes which generate the first *k* output symbols generate an operation which is invertible.

## Decoding

Two approaches are possible for decoding Raptor codes. In a concatenated approach, the inner code is decoded first, using a belief propagation algorithm, as used for the LT codes. Decoding succeeds if this operation recovers a sufficient number of symbols, such that the outer code can recover the remaining symbols using the decoding algorithm appropriate for that code.

In a combined approach, the relationships between symbols defined by both the inner and outer codes are considered as a single combined set of simultaneous equations which can be solved by the usual means, typically by Gaussian elimination.

## Computational complexity

Raptor codes require *O(symbol size)* time to generate an encoding symbol from a source block, and require *O(source block size)* time to recover a source block from at least *k* encoding symbols.

## Recovery probability and overhead

The overhead is how many additional encoding symbols beyond the number *k* of source symbols in the original source block need to be
received to completely recover the source block.
(Based on elementary information theory considerations, complete recovery of a source block with *k* source symbols is not possible if less than *k* encoding symbols are received.)
The recovery probability is the probability that the source block is completely recovered upon receiving a given number of random encoding symbols generated from the source block.
The RaptorQ code specified in IETF RFC 6330 has the following trade-off between recovery probability and recovery overhead:

- Greater than 99% recovery probability with overhead of 0 symbols (recovery from
*k*received encoding symbols). - Greater than 99.99% recovery probability with overhead of 1 symbol (recovery from
*k+1*received encoding symbols). - Greater than 99.9999% recovery probability with overhead of 2 symbols (recovery from
*k+2*received encoding symbols).

These statements hold for the entire range of *k* supported in IETF RFC 6330, i.e., *k*=1,...,56403. See IETF RFC 6330 for more details.

## Legal status

Qualcomm, Inc. has published an IPR statement for the Raptor code specified in IETF RFC 5053, and an IPR statement for the more advanced RaptorQ code specified in IETF RFC 6330. These statements mirror the licensing commitment Qualcomm, Inc. has made with respect to the MPEG DASH standard. The MPEG DASH standard has been deployed by a wide variety of companies, including DASH Industry Forum member companies.

## Notes

- Amin Shokrollahi (31 January 2011).
*The Development of Raptor Codes*(Speech). Invited talk at the Kungliga Tekniska högskolan. Retrieved 24 February 2012.

2. The Open Source Implementation of Raptor Code RFC5053 can be found here : https://code.google.com/p/raptor-code-rfc/

3. Information about an efficient software implementation of the RaptorQ code specified in IETF RFC 6330 (the most advanced fountain code), can be found at the website for the Codornices project at ICSI .

## References

- Amin Shokrollahi and Michael Luby (2011). "Raptor Codes".
*Foundations and Trends in Communications and Information Theory*. Now Publishers.**6**(3–4): 213–322. doi:10.1561/0100000060. - Amin Shokrollahi, "Raptor Codes," IEEE Transactions on Information Theory, vol. 52, pp. 2551-2567, 2006. PDF (requires login)
- 3GPP The 3rd Generation Partnership Project
- DVB Digital Video Broadcasting
- 3GPP TS26.346 3GPP Technical Specification for Multimedia Broadcast/Multicast Service: Protocols and Codecs.
- RFC5053 Raptor Forward Error Correction Scheme for Object Delivery
- DVB-H IP Datacasting specifications
- RFC6330 RaptorQ Forward Error Correction Scheme for Object Delivery
- "IPR" Search Result for RFC 5053, with statements by some patent owners
- "IPR" Search Result for RFC 6330, with statements by some patent owners