Generalized minimum-distance decoding

In coding theory, generalized minimum-distance (GMD) decoding provides an efficient algorithm for decoding concatenated codes, which is based on using an errors-and-erasures decoder for the outer code.

A naive decoding algorithm for concatenated codes is not optimal, because it does not take into account the information that maximum likelihood decoding (MLD) of the inner code provides. In the naive algorithm, the decoded inner symbols are all treated alike, regardless of how far each received block is from its nearest inner codeword. Intuitively, the outer decoder should place higher confidence in symbols whose inner encodings are close to the received word. In 1966, David Forney devised a better algorithm, called generalized minimum distance (GMD) decoding, which makes better use of this information: it measures the confidence of each decoded inner symbol and erases the symbols whose confidence falls below a desired threshold. GMD decoding was one of the first examples of soft-decision decoding. We will present three versions of the GMD decoding algorithm: the first two are randomized algorithms, while the last one is deterministic.

Setup

  • Hamming distance : Given two vectors u , v Σ n {\displaystyle u,v\in \Sigma ^{n}} the Hamming distance between u {\displaystyle u} and v {\displaystyle v} , denoted by Δ ( u , v ) {\displaystyle \Delta (u,v)} , is defined to be the number of positions in which u {\displaystyle u} and v {\displaystyle v} differ.
  • Minimum distance: Let C Σ n {\displaystyle C\subseteq \Sigma ^{n}} be a code. The minimum distance of code C {\displaystyle C} is defined to be d = min Δ ( c 1 , c 2 ) {\displaystyle d=\min \Delta (c_{1},c_{2})} where c 1 c 2 C {\displaystyle c_{1}\neq c_{2}\in C}
  • Code concatenation: Given m = ( m 1 , , m K ) [ Q ] K {\displaystyle m=(m_{1},\cdots ,m_{K})\in [Q]^{K}} , consider two codes which we call outer code and inner code
C out : [ Q ] K → [ Q ] N , C in : [ q ] k → [ q ] n , {\displaystyle C_{\text{out}}:[Q]^{K}\to [Q]^{N},\qquad C_{\text{in}}:[q]^{k}\to [q]^{n},}
whose minimum distances are D {\displaystyle D} and d {\displaystyle d} , respectively. The concatenated code is obtained as C out ∘ C in ( m ) = ( C in ( C out ( m ) 1 ) , … , C in ( C out ( m ) N ) ) {\displaystyle C_{\text{out}}\circ C_{\text{in}}(m)=(C_{\text{in}}(C_{\text{out}}(m)_{1}),\ldots ,C_{\text{in}}(C_{\text{out}}(m)_{N}))} where C out ( m ) = ( C out ( m ) 1 , … , C out ( m ) N ) . {\displaystyle C_{\text{out}}(m)=(C_{\text{out}}(m)_{1},\ldots ,C_{\text{out}}(m)_{N}).} Finally, we will take C out {\displaystyle C_{\text{out}}} to be a Reed–Solomon (RS) code, which has an errors-and-erasures decoder, and k = O ( log N ) {\displaystyle k=O(\log N)} , which in turn implies that MLD on the inner code can be performed in time polynomial in N {\displaystyle N} .
  • Maximum likelihood decoding (MLD): MLD is a decoding method for error correcting codes, which outputs the codeword closest to the received word in Hamming distance. The MLD function denoted by D M L D : Σ n C {\displaystyle D_{MLD}:\Sigma ^{n}\to C} is defined as follows. For every y Σ n , D M L D ( y ) = arg min c C Δ ( c , y ) {\displaystyle y\in \Sigma ^{n},D_{MLD}(y)=\arg \min _{c\in C}\Delta (c,y)} .
  • Probability distribution: A probability distribution Pr {\displaystyle \Pr } on a sample space S {\displaystyle S} is a mapping from events of S {\displaystyle S} to real numbers such that Pr [ A ] ≥ 0 {\displaystyle \Pr[A]\geq 0} for any event A {\displaystyle A} , Pr [ S ] = 1 {\displaystyle \Pr[S]=1} , and Pr [ A ∪ B ] = Pr [ A ] + Pr [ B ] {\displaystyle \Pr[A\cup B]=\Pr[A]+\Pr[B]} for any two mutually exclusive events A {\displaystyle A} and B {\displaystyle B} .
  • Expected value: The expected value of a discrete random variable X {\displaystyle X} is
E [ X ] = ∑ x x Pr [ X = x ] . {\displaystyle \mathbb {E} [X]=\sum _{x}x\Pr[X=x].}
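To make these definitions concrete, the following is a minimal Python sketch of the Hamming distance, inner MLD, and code concatenation, using a toy repetition code in the role of C in; the outer code is left abstract, and the names (INNER, mld_inner, concat_encode) are illustrative assumptions rather than anything taken from the source.

def hamming_distance(u, v):
    """Delta(u, v): the number of positions in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))

# Toy inner code C_in: 3-fold repetition of a single bit, so n = 3 and d = 3.
INNER = {0: (0, 0, 0), 1: (1, 1, 1)}

def mld_inner(block):
    """MLD for C_in: return the message whose encoding is closest to `block`."""
    return min(INNER, key=lambda m: hamming_distance(INNER[m], block))

def concat_encode(outer_codeword):
    """C_out o C_in: encode each outer-codeword symbol with the inner code."""
    return tuple(INNER[sym] for sym in outer_codeword)

# Example: encode the outer codeword (1, 0, 1) block by block; a corrupted
# block (1, 0, 1) is still decoded to the symbol 1 by inner MLD.
print(concat_encode((1, 0, 1)))   # ((1, 1, 1), (0, 0, 0), (1, 1, 1))
print(mld_inner((1, 0, 1)))       # 1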

Randomized algorithm

Consider the received word y = ( y 1 , … , y N ) ∈ [ q n ] N {\displaystyle \mathbf {y} =(y_{1},\ldots ,y_{N})\in [q^{n}]^{N}} which was corrupted by a noisy channel. The following is the algorithm description for the general case: y is decoded by declaring erasures at low-confidence positions and then running the errors-and-erasures decoding algorithm for C out {\displaystyle C_{\text{out}}} on the resulting vector. A Python sketch of this procedure follows the step list below.

Randomized_Decoder
Given : y = ( y 1 , , y N ) [ q n ] N {\displaystyle \mathbf {y} =(y_{1},\dots ,y_{N})\in [q^{n}]^{N}} .

  1. For every 1 i N {\displaystyle 1\leq i\leq N} , compute y i = M L D C in ( y i ) {\displaystyle y_{i}'=MLD_{C_{\text{in}}}(y_{i})} .
  2. Set ω i = min ( Δ ( C in ( y i ) , y i ) , d 2 ) {\displaystyle \omega _{i}=\min(\Delta (C_{\text{in}}(y_{i}'),y_{i}),{\tfrac {d}{2}})} .
  3. For every 1 i N {\displaystyle 1\leq i\leq N} , repeat : With probability 2 ω i d {\displaystyle 2\omega _{i} \over d} , set y i ? , {\displaystyle y_{i}''\leftarrow ?,} otherwise set y i = y i {\displaystyle y_{i}''=y_{i}'} .
  4. Run the errors-and-erasures algorithm for C out {\displaystyle C_{\text{out}}} on y ″ = ( y 1 ″ , … , y N ″ ) {\displaystyle \mathbf {y} ''=(y_{1}'',\ldots ,y_{N}'')} .
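The following is a minimal Python sketch of Randomized_Decoder under the same toy setup as above. Here inner_code is a dict mapping outer symbols to inner codewords, d is the inner minimum distance, and decode_outer stands in for a hypothetical errors-and-erasures decoder for C out (not implemented here), with '?' marking an erasure.

import random

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def randomized_gmd_decode(y, inner_code, d, decode_outer):
    """One run of Randomized_Decoder on the received word y (a list of N blocks)."""
    y_pp = []  # the vector y'' handed to the outer decoder
    for block in y:
        # Step 1: inner MLD gives the closest outer symbol y_i'.
        y_i = min(inner_code, key=lambda m: hamming_distance(inner_code[m], block))
        # Step 2: confidence weight w_i = min(Delta(C_in(y_i'), y_i), d/2).
        w_i = min(hamming_distance(inner_code[y_i], block), d / 2)
        # Step 3: declare an erasure at this position with probability 2*w_i/d.
        y_pp.append('?' if random.random() < 2 * w_i / d else y_i)
    # Step 4: errors-and-erasures decoding of C_out on y''.
    return decode_outer(y_pp)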

Theorem 1. Let y be a received word such that there exists a codeword c = ( c 1 , , c N ) C out C in [ q n ] N {\displaystyle \mathbf {c} =(c_{1},\cdots ,c_{N})\in C_{\text{out}}\circ {C_{\text{in}}}\subseteq [q^{n}]^{N}} such that Δ ( c , y ) < D d 2 {\displaystyle \Delta (\mathbf {c} ,\mathbf {y} )<{\tfrac {Dd}{2}}} . Then the deterministic GMD algorithm outputs c {\displaystyle \mathbf {c} } .

Note that a naive decoding algorithm for concatenated codes can correct up to D d 4 {\displaystyle {\tfrac {Dd}{4}}} errors.
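The D d / 4 bound for the naive decoder comes from a standard counting argument (sketched here for context, not quoted from the source): if fewer than D d / 4 positions of y are in error, then fewer than D / 2 inner blocks can contain d / 2 or more errors; every other block is decoded correctly by inner MLD, so the outer errors-only decoder sees fewer than D / 2 wrong symbols and succeeds:

| { i : Δ ( y i , c i ) ≥ d / 2 } | ≤ Δ ( y , c ) / ( d / 2 ) < ( D d / 4 ) / ( d / 2 ) = D / 2. {\displaystyle |\{i:\Delta (y_{i},c_{i})\geq {\tfrac {d}{2}}\}|\leq {\frac {\Delta (\mathbf {y} ,\mathbf {c} )}{d/2}}<{\frac {Dd/4}{d/2}}={\frac {D}{2}}.}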

Lemma 1. Let the assumption in Theorem 1 hold. If y ″ {\displaystyle \mathbf {y} ''} has e ′ {\displaystyle e'} errors and s ′ {\displaystyle s'} erasures (when compared with c {\displaystyle \mathbf {c} } ) after Step 3, then E [ 2 e ′ + s ′ ] < D . {\displaystyle \mathbb {E} [2e'+s']<D.}

Remark. If 2 e ′ + s ′ < D {\displaystyle 2e'+s'<D} , then the errors-and-erasures algorithm in Step 4 will output c {\displaystyle \mathbf {c} } . The lemma above says that, in expectation, this is indeed the case. Note that this is not enough to prove Theorem 1, but it is crucial for the deterministic version of the algorithm developed below.

Proof of lemma 1. For every 1 i N , {\displaystyle 1\leq i\leq N,} define e i = Δ ( y i , c i ) . {\displaystyle e_{i}=\Delta (y_{i},c_{i}).} This implies that

i = 1 N e i < D d 2 ( 1 ) {\displaystyle \sum _{i=1}^{N}e_{i}<{\frac {Dd}{2}}\qquad \qquad (1)} Next for every 1 i N {\displaystyle 1\leq i\leq N} , we define two indicator variables:

X i ? = 1 ⟺ y i ″ = ? and X i e = 1 ⟺ C in ( y i ″ ) ≠ c i and y i ″ ≠ ? {\displaystyle {\begin{aligned}X_{i}^{?}=1&\Leftrightarrow y_{i}''=?\\X_{i}^{e}=1&\Leftrightarrow C_{\text{in}}(y_{i}'')\neq c_{i}\ {\text{and}}\ y_{i}''\neq ?\end{aligned}}} We claim that we are done if we can show that for every 1 ≤ i ≤ N {\displaystyle 1\leq i\leq N} :

E [ 2 X i e + X i ? ] ≤ 2 e i d ( 2 ) {\displaystyle \mathbb {E} \left[2X_{i}^{e}+X_{i}^{?}\right]\leqslant {\frac {2e_{i}}{d}}\qquad \qquad (2)} Clearly, by definition

e = i X i e and s = i X i ? . {\displaystyle e'=\sum _{i}X_{i}^{e}\quad {\text{and}}\quad s'=\sum _{i}X_{i}^{?}.} Further, by the linearity of expectation, we get

E [ 2 e + s ] 2 d i e i < D . {\displaystyle \mathbb {E} [2e'+s']\leqslant {\frac {2}{d}}\sum _{i}e_{i}<D.} To prove (2) we consider two cases: i {\displaystyle i} -th block is correctly decoded (Case 1), i {\displaystyle i} -th block is incorrectly decoded (Case 2):

Case 1: ( c i = C in ( y i ) ) {\displaystyle (c_{i}=C_{\text{in}}(y_{i}'))}

Note that if y i = ? {\displaystyle y_{i}''=?} then X i e = 0 {\displaystyle X_{i}^{e}=0} , and Pr [ y i = ? ] = 2 ω i d {\displaystyle \Pr[y_{i}''=?]={\tfrac {2\omega _{i}}{d}}} implies E [ X i ? ] = Pr [ X i ? = 1 ] = 2 ω i d , {\displaystyle \mathbb {E} [X_{i}^{?}]=\Pr[X_{i}^{?}=1]={\tfrac {2\omega _{i}}{d}},} and E [ X i e ] = Pr [ X i e = 1 ] = 0 {\displaystyle \mathbb {E} [X_{i}^{e}]=\Pr[X_{i}^{e}=1]=0} .

Further, by definition we have

ω i = min ( Δ ( C in ( y i ′ ) , y i ) , d 2 ) ≤ Δ ( C in ( y i ′ ) , y i ) = Δ ( c i , y i ) = e i {\displaystyle \omega _{i}=\min \left(\Delta (C_{\text{in}}(y_{i}'),y_{i}),{\tfrac {d}{2}}\right)\leqslant \Delta (C_{\text{in}}(y_{i}'),y_{i})=\Delta (c_{i},y_{i})=e_{i}} Hence E [ 2 X i e + X i ? ] = 2 ω i d ≤ 2 e i d {\displaystyle \mathbb {E} [2X_{i}^{e}+X_{i}^{?}]={\tfrac {2\omega _{i}}{d}}\leq {\tfrac {2e_{i}}{d}}} , which proves (2) in this case.

Case 2: ( c i ≠ C in ( y i ′ ) ) {\displaystyle (c_{i}\neq C_{\text{in}}(y_{i}'))}

In this case, E [ X i ? ] = 2 ω i d {\displaystyle \mathbb {E} [X_{i}^{?}]={\tfrac {2\omega _{i}}{d}}} and E [ X i e ] = Pr [ X i e = 1 ] = 1 2 ω i d . {\displaystyle \mathbb {E} [X_{i}^{e}]=\Pr[X_{i}^{e}=1]=1-{\tfrac {2\omega _{i}}{d}}.}

Since c i ≠ C in ( y i ′ ) {\displaystyle c_{i}\neq C_{\text{in}}(y_{i}')} , we have e i + ω i ≥ d {\displaystyle e_{i}+\omega _{i}\geqslant d} . This follows from a further case analysis on whether ω i = Δ ( C in ( y i ′ ) , y i ) < d 2 {\displaystyle \omega _{i}=\Delta (C_{\text{in}}(y_{i}'),y_{i})<{\tfrac {d}{2}}} or ω i = d 2 {\displaystyle \omega _{i}={\tfrac {d}{2}}} .

Finally, this implies

E [ 2 X i e + X i ? ] = 2 ( 1 − 2 ω i d ) + 2 ω i d = 2 − 2 ω i d ≤ 2 e i d , {\displaystyle \mathbb {E} [2X_{i}^{e}+X_{i}^{?}]=2\left(1-{\tfrac {2\omega _{i}}{d}}\right)+{\tfrac {2\omega _{i}}{d}}=2-{\tfrac {2\omega _{i}}{d}}\leq {\tfrac {2e_{i}}{d}},} where the last inequality uses e i ≥ d − ω i {\displaystyle e_{i}\geq d-\omega _{i}} . This proves (2) in this case as well, and hence Lemma 1. In the following sections, we will show that the deterministic version of the algorithm above can do unique decoding of C out ∘ C in {\displaystyle C_{\text{out}}\circ C_{\text{in}}} up to half its design distance.

Modified randomized algorithm

Note that, in Step 3 of the previous version of the GMD algorithm, we do not really need to use "fresh" randomness for each i {\displaystyle i} . Now we come up with another randomized version of the GMD algorithm that uses the same randomness for every i {\displaystyle i} . The algorithm below implements this idea; a Python sketch follows its step list.

Modified_Randomized_Decoder
Given : y = ( y 1 , … , y N ) ∈ [ q n ] N {\displaystyle \mathbf {y} =(y_{1},\ldots ,y_{N})\in [q^{n}]^{N}} , pick θ ∈ [ 0 , 1 ] {\displaystyle \theta \in [0,1]} uniformly at random. Then for every 1 ≤ i ≤ N {\displaystyle 1\leq i\leq N} :

  1. Set y i = M L D C in ( y i ) {\displaystyle y_{i}'=MLD_{C_{\text{in}}}(y_{i})} .
  2. Compute ω i = min ( Δ ( C in ( y i ) , y i ) , d 2 ) {\displaystyle \omega _{i}=\min(\Delta (C_{\text{in}}(y_{i}'),y_{i}),{d \over 2})} .
  3. If θ < 2 ω i d {\displaystyle \theta <{\tfrac {2\omega _{i}}{d}}} , set y i ? , {\displaystyle y_{i}''\leftarrow ?,} otherwise set y i = y i {\displaystyle y_{i}''=y_{i}'} .
  4. Run the errors-and-erasures algorithm for C out {\displaystyle C_{\text{out}}} on y ″ = ( y 1 ″ , … , y N ″ ) {\displaystyle \mathbf {y} ''=(y_{1}'',\ldots ,y_{N}'')} .
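A corresponding Python sketch of Modified_Randomized_Decoder, under the same assumptions as the earlier sketch (inner_code, d, and a hypothetical outer errors-and-erasures decoder decode_outer), differs only in that a single θ is drawn once and reused at every position:

import random

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def modified_randomized_gmd_decode(y, inner_code, d, decode_outer):
    """Modified_Randomized_Decoder: one theta shared by all N positions."""
    theta = random.random()   # theta chosen uniformly at random from [0, 1]
    y_pp = []
    for block in y:
        y_i = min(inner_code, key=lambda m: hamming_distance(inner_code[m], block))
        w_i = min(hamming_distance(inner_code[y_i], block), d / 2)
        # Erase exactly when theta < 2*w_i/d, so Pr[erasure] is still 2*w_i/d.
        y_pp.append('?' if theta < 2 * w_i / d else y_i)
    return decode_outer(y_pp)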

For the proof of Lemma 1, we only use the randomness to show that

Pr [ y i = ? ] = 2 ω i d . {\displaystyle \Pr[y_{i}''=?]={2\omega _{i} \over d}.} In this version of the GMD algorithm, we note that

Pr [ y i ″ = ? ] = Pr [ θ ∈ [ 0 , 2 ω i d ] ] = 2 ω i d . {\displaystyle \Pr[y_{i}''=?]=\Pr \left[\theta \in \left[0,{\tfrac {2\omega _{i}}{d}}\right]\right]={\tfrac {2\omega _{i}}{d}}.} The second equality above follows from the choice of θ {\displaystyle \theta } . The proof of Lemma 1 can also be used to show E [ 2 e ′ + s ′ ] < D {\displaystyle \mathbb {E} [2e'+s']<D} for the second version of the GMD algorithm. In the next section, we will see how to get a deterministic version of the GMD algorithm by choosing θ {\displaystyle \theta } from a polynomially sized set, as opposed to the current infinite set [ 0 , 1 ] {\displaystyle [0,1]} .

Deterministic algorithm

Let Q = { 0 , 1 } ∪ { 2 ω 1 d , … , 2 ω N d } {\displaystyle Q=\{0,1\}\cup \{{2\omega _{1} \over d},\ldots ,{2\omega _{N} \over d}\}} . Since for each i {\displaystyle i} , ω i = min ( Δ ( C in ( y i ′ ) , y i ) , d 2 ) {\displaystyle \omega _{i}=\min(\Delta (C_{\text{in}}(y_{i}'),y_{i}),{d \over 2})} , we have

Q = { 0 , 1 } ∪ { q 1 , … , q m } {\displaystyle Q=\{0,1\}\cup \{q_{1},\ldots ,q_{m}\}} where q 1 < ⋯ < q m {\displaystyle q_{1}<\cdots <q_{m}} for some m ≤ ⌊ d 2 ⌋ {\displaystyle m\leq \left\lfloor {\frac {d}{2}}\right\rfloor } . Note that for every θ ∈ [ q i , q i + 1 ) {\displaystyle \theta \in [q_{i},q_{i+1})} , the second version of the randomized algorithm produces the same y ″ {\displaystyle \mathbf {y} ''} . Thus, it suffices to consider all possible values of θ ∈ Q {\displaystyle \theta \in Q} . This gives the deterministic algorithm below; a Python sketch follows its step list.

Deterministic_Decoder
Given : y = ( y 1 , , y N ) [ q n ] N {\displaystyle \mathbf {y} =(y_{1},\ldots ,y_{N})\in [q^{n}]^{N}} , for every θ Q {\displaystyle \theta \in Q} , repeat the following.

  1. Compute y i = M L D C in ( y i ) {\displaystyle y_{i}'=MLD_{C_{\text{in}}}(y_{i})} for 1 i N {\displaystyle 1\leq i\leq N} .
  2. Set ω i = min ( Δ ( C in ( y i ) , y i ) , d 2 ) {\displaystyle \omega _{i}=\min(\Delta (C_{\text{in}}(y_{i}'),y_{i}),{d \over 2})} for every 1 i N {\displaystyle 1\leq i\leq N} .
  3. For every 1 ≤ i ≤ N {\displaystyle 1\leq i\leq N} , if θ < 2 ω i d {\displaystyle \theta <{2\omega _{i} \over d}} , set y i ″ ← ? , {\displaystyle y_{i}''\leftarrow ?,} otherwise set y i ″ = y i ′ {\displaystyle y_{i}''=y_{i}'} .
  4. Run errors-and-erasures algorithm for C out {\displaystyle C_{\text{out}}} on y = ( y 1 , , y N ) {\displaystyle \mathbf {y} ''=(y_{1}'',\ldots ,y_{N}'')} . Let c θ {\displaystyle c_{\theta }} be the codeword in C out C in {\displaystyle C_{\text{out}}\circ C_{\text{in}}} corresponding to the output of the algorithm, if any.
  5. Among all the c θ {\displaystyle c_{\theta }} output in Step 4, output the one closest to y {\displaystyle \mathbf {y} } .
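The following is a minimal Python sketch of Deterministic_Decoder under the same assumptions as the earlier sketches; decode_outer is a hypothetical outer errors-and-erasures decoder that returns the decoded outer codeword (a sequence of outer symbols) or None on failure. Since Steps 1 and 2 do not depend on θ, the sketch computes y i ′ and ω i once and only re-derives the erasure pattern for each θ ∈ Q.

def hamming_distance(u, v):
    return sum(a != b for a, b in zip(u, v))

def deterministic_gmd_decode(y, inner_code, d, decode_outer):
    """Deterministic_Decoder: try every threshold theta in Q, keep the best output."""
    decoded, weights = [], []
    for block in y:
        # Steps 1-2: inner MLD and confidence weights, independent of theta.
        y_i = min(inner_code, key=lambda m: hamming_distance(inner_code[m], block))
        decoded.append(y_i)
        weights.append(min(hamming_distance(inner_code[y_i], block), d / 2))

    Q = {0.0, 1.0} | {2 * w / d for w in weights}
    best, best_dist = None, float('inf')
    for theta in Q:
        # Step 3: erasure pattern determined by the threshold theta.
        y_pp = ['?' if theta < 2 * w / d else s for s, w in zip(decoded, weights)]
        # Step 4: outer errors-and-erasures decoding; assumed to return None on failure.
        c_theta = decode_outer(y_pp)
        if c_theta is None:
            continue
        # Step 5: keep the candidate whose concatenated encoding is closest to y.
        dist = sum(hamming_distance(inner_code[s], block)
                   for s, block in zip(c_theta, y))
        if dist < best_dist:
            best, best_dist = c_theta, dist
    return best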

Since every iteration of Steps 1–4 can be run in polynomial time, the algorithm above also runs in polynomial time. Specifically, each call to an errors-and-erasures decoder correcting fewer than d D / 2 {\displaystyle dD/2} errors takes O ( d ) {\displaystyle O(d)} time. Finally, the runtime of the algorithm above is O ( N | Q | n O ( 1 ) + N T out ) {\displaystyle O(N|Q|n^{O(1)}+NT_{\text{out}})} where T out {\displaystyle T_{\text{out}}} is the running time of the outer errors-and-erasures decoder.

References

  • University at Buffalo Lecture Notes on Coding Theory – Atri Rudra
  • MIT Lecture Notes on Essential Coding Theory – Madhu Sudan
  • University of Washington – Venkatesan Guruswami
  • G. David Forney. Generalized Minimum Distance decoding. IEEE Transactions on Information Theory, 12:125–131, 1966.