











Real-time streaming and choice of an encoder
For real-time media streaming with a limited buffering
time, and therefore a relatively small number M
of media packets carried in a transmission block, the required length of the
transmission block cannot be derived from M
assuming the
. Hence we must suggest a specific encoder and propose the
number of media packets M to be
carried by the transmission block. The smaller M is, the shorter the buffering delay, which is important for
real-time media, but the higher the cost of the FEC overhead. Before playback, the
receiver must hold enough packets in its buffer to restore the recoverable
losses. The receiving side of the media application is already equipped with a
playback buffer that compensates for network jitter and reorders packets
arriving out of order. Thus, if necessary, the playback buffer must be
extended to also accommodate the packets held back for decoding.
Both [Huang05] and [Johansson02] (for combination with AMR source coding) adopted simple FEC schemes: packet-level XOR and duplication. To add redundancy, [Huang05] proposed three modes that attach one, two or four redundant packets to every four media packets. In the first mode, the redundant packet is the XOR of the four media packets. In the second mode, the two redundant packets are the XOR of the first two media packets and the XOR of the last two media packets. In the third mode, the four redundant packets are simply duplicates of the four media packets. [Johansson02] proposed to extend each packet with a copy of the previous packet or with the XOR of the two previous packets (or of their corresponding samples encoded at a lower rate).
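The three redundancy modes described above for [Huang05] can be illustrated with a minimal Python sketch. The function names `xor_packets` and `redundancy` and the mode numbering 1 to 3 are assumptions of this sketch, not identifiers taken from [Huang05], and packets are assumed to have equal length.

```python
from functools import reduce


def xor_packets(*packets: bytes) -> bytes:
    """Byte-wise XOR of equal-length packets (hypothetical helper)."""
    if len({len(p) for p in packets}) != 1:
        raise ValueError("packets must have equal length")
    # zip(*packets) yields one tuple of byte values per byte position
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*packets))


def redundancy(media: list[bytes], mode: int) -> list[bytes]:
    """Redundant packets for one group of four media packets."""
    if len(media) != 4:
        raise ValueError("a group holds exactly four media packets")
    p1, p2, p3, p4 = media
    if mode == 1:  # one packet: XOR of all four media packets
        return [xor_packets(p1, p2, p3, p4)]
    if mode == 2:  # two packets: XOR of the first pair, XOR of the last pair
        return [xor_packets(p1, p2), xor_packets(p3, p4)]
    if mode == 3:  # four packets: plain duplicates
        return list(media)
    raise ValueError("mode must be 1, 2 or 3")
```

In mode 1, for example, a single lost media packet is restored by XOR-ing the redundant packet with the three media packets that did arrive.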
A slightly more elaborate erasure resilient code, however, can
provide much stronger protection. According to the third FEC mode of [Huang05], a duplicate packet is sent for every media packet.
If, for two original media packets a
and b, i.e. out of four transmitted packets,
we receive only two packets, there is a chance of about 33% that one of the
packets will be left unrecoverable: this happens when we receive a (or b) together with its own
duplicate. The situation can be improved, however, if we transmit a as the first redundant packet and
$a \oplus b$ (the operation $\oplus$ is the binary XOR) as
the second redundant packet. Under the same loss scenario, where we receive
only two packets out of the four transmitted, the chance that one of the packets
cannot be recovered is lower, but it still exists and is about 17% (that is,
when we receive a together with its own
duplicate). Yet another improvement is possible. We can split the packet a into two equal portions
$a = (a_1, a_2)$. If the packet size is 20 bytes, we thus have 10 bytes
in each of the halves $a_1$ and $a_2$. Similarly, we break the second packet into two halves:
$b = (b_1, b_2)$. As before, we transmit the two original packets a and b unchanged, and additionally two redundant packets. This time the
first redundant packet is
and the second redundant
packet is
. With such an FEC scheme, we observe that under the same
loss scenario (two packets lost out of four transmitted) we can always recover
the two original media packets from any two received packets.
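The figures quoted above (about 33%, about 17%, and no unrecoverable case) can be checked by enumerating the six ways of receiving two packets out of four, assuming every such pattern is equally likely. The sketch below is illustrative only. Each transmitted packet is modelled as a list of XOR equations over the source units, and a pattern is recoverable when the received equations have full rank over GF(2). The half-packet combinations in `HALVES` are an assumption of this sketch: they are one possible MDS construction, obtained from GF(4) arithmetic, and not necessarily the combinations intended in the text. The names `A1`, `A2`, `B1`, `B2` stand for the two halves of a and the two halves of b.

```python
from itertools import combinations


def gf2_rank(rows):
    """Rank over GF(2) of equations given as integer bitmasks."""
    basis = {}  # highest set bit -> reduced row
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in basis:
                basis[top] = row
                break
            row ^= basis[top]  # cancel the leading bit and keep reducing
    return len(basis)


# One bit per source unit: whole packets for the first two schemes,
# half packets for the third one.
A, B = 0b01, 0b10
A1, A2, B1, B2 = 0b0001, 0b0010, 0b0100, 0b1000

DUPLICATES = [[A], [B], [A], [B]]      # a, b, copy of a, copy of b
PLAIN_XOR = [[A], [B], [A], [A ^ B]]   # a, b, copy of a, a XOR b
HALVES = [                             # assumed construction, see lead-in
    [A1, A2],                          # a sent unchanged
    [B1, B2],                          # b sent unchanged
    [A1 ^ B1, A2 ^ B2],                # first redundant packet
    [A1 ^ B2, A2 ^ B1 ^ B2],           # second redundant packet
]


def unrecoverable_patterns(packets, unknowns, received=2):
    """Return (failing patterns, all patterns) for a given scheme."""
    patterns = list(combinations(range(len(packets)), received))
    failures = sum(
        1
        for kept in patterns
        if gf2_rank([eq for i in kept for eq in packets[i]]) < unknowns
    )
    return failures, len(patterns)


if __name__ == "__main__":
    for name, scheme, unknowns in [
        ("duplicates", DUPLICATES, 2),
        ("plain XOR", PLAIN_XOR, 2),
        ("half packets", HALVES, 4),
    ]:
        failures, total = unrecoverable_patterns(scheme, unknowns)
        print(f"{name}: {failures} of {total} patterns lose a media packet")
```

Under these assumptions the enumeration gives 2 of 6 failing patterns for duplicates, 1 of 6 for the plain XOR variant and 0 of 6 for the half-packet construction, in line with the percentages in the text.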
The author of [Nguyen03] chose a Reed-Solomon RS(30,23) code with 23 data packets and 7 redundant packets in each FEC block. Hence, as in the last example above, the lost packets can be entirely recovered as long as no more than 7 packets of a block are lost, whatever their type (data or redundant).
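As a small illustration of the RS(30,23) figures, the following sketch computes the redundancy of an (n, k) MDS block and applies the decodability condition stated above. The function names are hypothetical, and expressing the overhead as redundant packets per data packet is a convention chosen for this sketch.

```python
def mds_block_summary(n: int, k: int) -> dict:
    """Basic figures of an (n, k) MDS block: k data packets, n - k redundant."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return {
        "redundant_packets": n - k,
        "code_rate": k / n,          # share of the block carrying data
        "overhead": (n - k) / k,     # redundant packets per data packet
    }


def block_is_decodable(n: int, k: int, lost: int) -> bool:
    """An MDS block is fully recoverable when at most n - k packets are lost."""
    if not 0 <= lost <= n:
        raise ValueError("lost must be between 0 and n")
    return lost <= n - k
```

For RS(30,23), `mds_block_summary(30, 23)` reports 7 redundant packets, and `block_is_decodable(30, 23, 7)` holds while `block_is_decodable(30, 23, 8)` does not.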
Similarly to [Nguyen03], in
our choice of an encoder we rely on any Maximum Distance Separable (MDS)
erasure resilient code; Reed-Solomon is an example. We do not, however, fix the
FEC block length, and we use different RS codes for changing loss
patterns. The small example above is an MDS code and is denoted RS(4,2),
indicating 2 media packets in an FEC block of 4 packets. RS(4,2) is an extended
Reed-Solomon code: if the packet size (here the same as the symbol size) s is reduced to only two bits (not less,
because our method must split the packets into two halves), then a classical
Reed-Solomon codeword cannot be longer than
$2^s - 1 = 3$ (codeword length,
number of symbols and number of packets are all the same here). A systematic erasure
resilient MDS code adds the redundant packets alongside the original media packets,
which are sent unchanged. This is a good choice for real-time media because,
when there are too many losses for the FEC block to be decoded, the few
original media packets that survive, if any, are media fragments that can still be
useful, e.g. via passive error concealment or thanks to human perception. An MDS code is not
rateless, unlike LT [Luby02] or Raptor [Shokrollahi04] codes, but its rate range is largely
sufficient for real-time streaming. The RS algorithm based on 8-bit symbols is
widely used and can be scaled to packets, producing FEC blocks of 255
packets, which is 10 times longer than our maximum guess of the number of media
packets in a block. The possibility of a tenfold rate increase is more than
sufficient.
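Two properties used in this paragraph can be sketched in a few lines: the length limit of a classical Reed-Solomon code for a given symbol size, and the fallback behaviour of a systematic MDS code when a block cannot be decoded. The function names are hypothetical, and the sketch assumes that the k media packets occupy the first k positions of the block.

```python
def classical_rs_max_length(symbol_bits: int) -> int:
    """Longest classical Reed-Solomon codeword for symbols of the given size."""
    if symbol_bits < 1:
        raise ValueError("symbol size must be at least one bit")
    return 2 ** symbol_bits - 1


def usable_media_packets(received: list[bool], k: int) -> list[int]:
    """Indices of the media packets available to the player.

    received[i] tells whether packet i of the block arrived; positions
    0..k-1 are assumed to hold the unchanged media packets.
    """
    if not 0 < k <= len(received):
        raise ValueError("need 0 < k <= block length")
    if sum(received) >= k:
        # MDS property: any k received packets restore the whole block
        return list(range(k))
    # Too many losses: a systematic code still delivers the survivors
    return [i for i in range(k) if received[i]]
```

With 8-bit symbols the first function gives 255, and with 2-bit symbols it gives 3, which is why a block of 4 packets requires an extended code. The second function shows why systematic encoding suits real-time media: an undecodable block still yields whichever media packets arrived.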