# Zemor's decoding algorithm

This is the approved revision of this page, as well as being the most recent.

Zemor's algorithm, designed and developed by Gilles Zemor, is a recursive low-complexity approach to code construction. It is an improvement over the algorithm of Sipser and Spielman.

Zemor considered a typical class of Sipser–Spielman construction of expander codes, where the underlying graph is bipartite graph. Sipser and Spielman introduced a constructive family of asymptotically good linear-error codes together with a simple parallel algorithm that will always remove a constant fraction of errors. The article is based on Dr. Venkatesan Guruswami's course notes

## Code construction

Zemor's algorithm is based on a type of expander graphs called Tanner graph. The construction of code was first proposed by Tanner. The codes are based on double cover $d$, regular expander $G$, which is a bipartite graph. $G$ =$\left(V,E\right)$, where $V$ is the set of vertices and $E$ is the set of edges and $V$ = $A$$\cup$$B$ and $A$$\cap$$B$ = $\emptyset$, where $A$ and $B$ denotes the set of 2 vertices. Let $n$ be the number of vertices in each group, i.e, $|A| =|B| =n$. The edge set $E$ be of size $N$ =$nd$ and every edge in $E$ has one endpoint in both $A$ and $B$. $E(v)$ denotes the set of edges containing $v$.

Assume an ordering on $V$, therefore ordering will be done on every edges of $E(v)$ for every $v \in V$. Let finite field $\mathbb{F}=GF(2)$, and for a word $x=(x_e), e\in E$ in $\mathbb{F}^N$, let the subword of the word will be indexed by $E(v)$. Let that word be denoted by $(x)_v$. The subset of vertices $A$ and $B$ induces every word $x\in \mathbb{F}^N$ a partition into $n$ non-overlapping sub-words $\left(x\right)_v\in \mathbb{F}^d$, where $v$ ranges over the elements of $A$. For constructing a code $C$, consider a linear subcode $C_o$, which is a $[d,r_od,\delta]$ code, where $q$, the size of the alphabet is $2$. For any vertex $v \in V$, let $v(1), v(2),\ldots,v(d)$ be some ordering of the $d$ vertices of $E$ adjacent to $v$. In this code, each bit $x_e$ is linked with an edge $e$ of $E$.

We can define the code $C$ to be the set of binary vectors $x = \left( x_1,x_2,\ldots,x_N \right)$ of $\{0,1\}^N$ such that, for every vertex $v$ of $V$, $\left(x_{v(1)}, x_{v(2)},\ldots, x_{v(d)}\right)$ is a code word of $C_o$. In this case, we can consider a special case when every vertex of $E$ is adjacent to exactly $2$ vertices of $V$. It means that $V$ and $E$ make up, respectively, the vertex set and edge set of $d$ regular graph $G$.

Let us call the code $C$ constructed in this way as $\left(G,C_o\right)$ code. For a given graph $G$ and a given code $C_o$, there are several $\left(G,C_o\right)$ codes as there are different ways of ordering edges incident to a given vertex $v$, i.e., $v(1), v(2),\ldots,v(d)$. In fact our code $C$ consist of all codewords such that $x_v\in C_o$ for all $v \in A, B$. The code $C$ is linear $[N,K,D]$ in $\mathbb{F}$ as it is generated from a subcode $C_o$, which is linear. The code $C$ is defined as $C=\{c\in \mathbb{F}^N :(c)_v \in C_o\}$ for every $v\in V$.

In this figure, $(x)_v =\left(x_{e1}, x_{e2}, x_{e3}, x_{e4}\right)\in C_o$. It shows the graph $G$ and code $C$.

In matrix $G$, let $\lambda$ is equal to the second largest eigen value of adjacency matrix of $G$. Here the largest eigen value is $d$. Two important claims are made:

### Claim 1

$\left(\dfrac{K}{N}\right)\geq 2r_o-1$
. Let $R$ be the rate of a linear code constructed from a bipartite graph whose digit nodes have degree $m$ and whose subcode nodes have degree $n$. If a single linear code with parameters $\left(n,k\right)$ and rate $r = \left(\dfrac{k}{n}\right)$ is associated with each of the subcode nodes, then $k\geq 1-\left(1-r\right)m$.

#### Proof

Let $R$ be the rate of the linear code, which is equal to $K/N$ Let there are $S$ subcode nodes in the graph. If the degree of the subcode is $n$, then the code must have $\left(\dfrac{n}{m}\right) S$ digits, as each digit node is connected to $m$ of the $\left(n\right)S$ edges in the graph. Each subcode node contributes $(n-k)$ equations to parity check matrix for a total of $\left(n-k\right) S$. These equations may not be linearly independent. Therefore, $\left(\dfrac{K}{N}\right)\geq \left(\dfrac{(\dfrac{n}{m})S - (n-k)S}{(\dfrac{n}{m}) S}\right)$
$\geq 1-m\left(\dfrac{n-k}{n}\right)$
$\geq 1-m \left(1-r\right)$, Since the value of $m$,i.e., the digit node of this bipartite graph is $2$ and here $r = r_o$, we can write as:
$\left(\dfrac{K}{N}\right)\geq 2r_o -1$

### Claim 2

$D\geq N\left(\dfrac{(\delta-(\dfrac{\lambda}{d}))}{(1-(\dfrac{\lambda}{d})})\right)^2$
$=N\left(\delta^2- O\left(\dfrac{\lambda}{d}\right)\right)$$\rightarrow (1)$

If $S$ is linear code of rate $r$, block code length $d$, and minimum relative distance $\delta$, and if $B$ is the edge vertex incidence graph of a $d$ – regular graph with second largest eigen value $\lambda$, then the code $C(B,S)$ has rate at least $2r_o -1$ and minimum relative distance at least $\left(\left(\dfrac{\delta- \left(\dfrac{\lambda}{d}\right)}{1-\left(\dfrac{\lambda}{d}\right)}\right)\right)^2$.

#### Proof

Let $B$ be derived from the $d$ regular graph $G$. So, the number of variables of $C(B,S)$ is $\left(\dfrac{dn}{2}\right)$ and the number of constraints is $n$. According to Alon - Chung, if $X$ is a subset of vertices of $G$ of size $\gamma n$, then the number of edges contained in the subgraph is induced by $X$ in $G$ is at most $\left(\dfrac{dn}{2}\right) \left(\gamma^2 + (\dfrac{\lambda}{d})\gamma \left(1-\gamma\right)\right)$.

As a result, any set of $\left(\dfrac{dn}{2}\right) \left(\gamma^2 + \left(\dfrac{\lambda}{d}\right)\gamma \left(1-\gamma\right)\right)$ variables will be having at least $\gamma n$ constraints as neighbours. So the average number of variables per constraint is : $\left(\dfrac{(\dfrac{2nd}{2}) \left(\gamma^2 + (\dfrac{\lambda}{d})\gamma \left(1-\gamma\right)\right)}{\gamma n}\right)$$= d\left( \gamma + (\dfrac{\lambda}{d}) \left( 1-\gamma\right)\right)$$\rightarrow (2)$

So if $d\left( \gamma + (\dfrac{\lambda}{d}) \left( 1-\gamma\right)\right) < \gamma d$, then a word of relative weight $\left(\gamma^2 + (\dfrac{\lambda}{d})\gamma \left(1-\gamma\right)\right)$, cannot be a codeword of $C(B,S)$. The inequality $(2)$ is satisfied for $\gamma < \left(\dfrac{1-(\dfrac{\lambda}{d})}{\delta-(\dfrac{\lambda}{d})}\right)$. Therefore, $C(B,S)$ cannot have a non zero codeword of relative weight $\left(\dfrac{\delta-(\dfrac{\lambda}{d})}{1-(\dfrac{\lambda}{d})}\right)^2$ or less.

In matrix $G$, we can assume that $\lambda/d$ is bounded away from $1$. For those values of $d$ in which $d-1$ is odd prime, there are explicit constructions of sequences of $d$ - regular bipartite graphs with arbitrarily large number of vertices such that each graph $G$ in the sequence is a Ramanujan graph. It is called Ramanujan graph as it satisfies the inequality $\lambda(G) \leq 2\sqrt{d-1}$. Certain expansion properties are visible in graph $G$ as the separation between the eigen values $d$ and $\lambda$. If the graph $G$ is Ramanujan graph, then that expression $(1)$ will become $0$ eventually as $d$ becomes large.

## Zemor's algorithm

The iterative decoding algorithm written below alternates between the vertices $A$ and $B$ in $G$ and corrects the codeword of $C_o$ in $A$ and then it switches to correct the codeword $C_o$ in $B$. Here edges associated with a vertex on one side of a graph are not incident to other vertex on that side. In fact, it doesn't matter in which order, the set of nodes $A$ and $B$ are processed. The vertex processing can also be done in parallel.

The decoder $\mathbb{D}:\mathbb{F}^d \rightarrow C_o$stands for a decoder for $C_o$ that recovers correctly with any codewords with less than $\left(\dfrac{d}{2}\right)$ errors.

### Decoder algorithm

Received word : $w=(w_e), e\in E$
 $z \leftarrow w$ For $t \leftarrow 1$ to $m$ do //$m$ is the number of iterations { if ($t$ is odd) // Here the algorithm will alternate between its two vertex sets.$X \leftarrow A$ else $X \leftarrow B$ Iteration $t$: For every $v \in X$, let $(z)_v \leftarrow \mathbb{D}((z)_v)$ // Decoding $z_v$ to its nearest codeword. } Output: $z$

### Explanation of the algorithm

Since $G$ is bipartite, the set $A$ of vertices induces the partition of the edge set $E$ = $\cup_{v\in A} E_v$ . The set $B$ induces another partition, $E$ = $\cup_{v\in B} E_v$ .

Let $w \in \{0,1\}^N$ be the received vector, and recall that $N=dn$. The first iteration of the algorithm consists of applying the complete decoding for the code induced by $E_v$ for every $v \in A$ . This means that for replacing, for every $v \in A$, the vector $\left( w_{v(1)}, w_{v(2)}, \ldots, w_{v(d)}\right)$ by one of the closest codewords of $C_o$. Since the subsets of edges $E_v$ are disjoint for $v \in A$, the decoding of these $n$ subvectors of $w$ may be done in parallel.

The iteration will yield a new vector $z$. The next iteration consists of applying the preceding procedure to $z$ but with $A$ replaced by $B$. In other words, it consists of decoding all the subvectors induced by the vertices of $B$. The coming iterations repeat those two steps alternately applying parallel decoding to the subvectors induced by the vertices of $A$ and to the subvectors induced by the vertices of $B$.
Note: [If $d=n$ and $G$ is the complete bipartite graph, then $C$ is a product code of $C_o$ with itself and the above algorithm reduces to the natural hard iterative decoding of product codes].

Here, the number of iterations, $m$ is $\left(\dfrac{(\log{n})}{\log(2-\alpha)}\right)$. In general, the above algorithm can correct a code word whose Hamming weight is no more than $(\dfrac{1}{2}).\alpha N \delta \left((\dfrac{\delta}{2})-(\dfrac{\lambda}{d})\right) =\left((\dfrac{1}{4}).\alpha N (\delta^2- O(\dfrac{\lambda}{d})\right)$ for values of $\alpha < 1$. Here, the decoding algorithm is implemented as a circuit of size $O(N \log{N} )$ and depth $O(\log{N})$ that returns the codeword given that error vector has weight less than $\alpha N \delta^2 (1-\epsilon)/4$ .

### Theorem

If $G$ is a Ramanujan graph of sufficiently high degree, for any $\alpha < 1$, the decoding algorithm can correct $(\dfrac{\alpha \delta_o^2}{4})(1-\in) N$ errors, in $O(\log {n})$ rounds ( where the big- $O$ notation hides a dependence on $\alpha$). This can be implemented in linear time on a single processor; on $n$ processors each round can be implemented in constant time.

#### Proof

Since the decoding algorithm is insensitive to the value of the edges and by linearity, we can assume that the transmitted codeword is the all zeros - vector. Let the received codeword be $w$. The set of edges which has an incorrect value while decoding is considered. Here by incorrect value, we mean $1$ in any of the bits. Let $w=w^0$ be the initial value of the codeword, $w^1, w^2,\ldots, w^t$ be the values after first, second . . . $t$ stages of decoding. Here, $X^i={e\in E|x_e^i =1}$, and $S^i ={v\in V^i | E_v \cap X^{i+1} !=\emptyset}$. Here $S^i$ corresponds to those set of vertices that was not able to successfully decode their codeword in the $i^{th}$ round. From the above algorithm $S^1 as number of unsuccessful vertices will be corrected in every iteration. We can prove that $S^0>S^1>S^2>\cdots$is a decreasing sequence. In fact, $|S_{i+1}|<=(\dfrac{1}{2-\alpha})|S_i|$. As we are assuming, $\alpha<1$, the above equation is in a geometric decreasing sequence. So, when $|S_i|, more than $log_{2-\alpha} n$ rounds are necessary. Furthermore, $\sum|S_i|=n\sum(\dfrac{1}{(2-\alpha)^i})=O(n)$, and if we implement the $i^{th}$ round in $O(|S_i|)$ time, then the total sequential running time will be linear.

## Drawbacks of Zemor's algorithm

1. It is lengthy process as the number of iterations $m$ in decoder algorithm takes is $[(\log{ n})/(\log(2-\alpha))]$
2. Zemor's decoding algorithm finds it difficult to decode erasures. A detailed way of how we can improve the algorithm is

given in.