# AN codes

AN codes are error-correcting code that are used in arithmetic applications. Arithmetic codes were commonly used in computer processors to ensure the accuracy of its arithmetic operations when electronics were more unreliable. Arithmetic codes help the processor to detect when an error is made and correct it. Without these codes, processors would be unreliable since any errors would go undetected. AN codes are arithmetic codes that are named for the integers $A$ and $N$ that are used to encode and decode the codewords.

These codes differ from most other codes in that they use arithmetic weight to maximize the arithmetic distance between codewords as opposed to the hamming weight and hamming distance. The arithmetic distance between two words is a measure of the number of errors made while computing an arithmetic operation. Using the arithmetic distance is necessary since one error in an arithmetic operation can cause a large hamming distance between the received answer and the correct answer.

## Arithmetic Weight and Distance

The arithmetic weight of an integer $x$ in base $r$ is defined by

$w(x) = \min \{t | x=\sum_{i=1}^t a_i r^{n(i)}\}$

where $|{a_i}|$< $r$, $n(i) \geq 0$, and $r, n(i) \in \mathbb{Z}$. The arithmetic distance of a word is upper bounded by its hamming weight since any integer can be represented by its standard polynomial form of $x=\sum_{i=1}^n b_i r^i$ where the $b_i$ are the digits in the integer. Removing all the terms where $b_i = 0$ will simulate a $t$ equal to its hamming weight. The arithmetic weight will usually be less than the hamming weight since the $a_i$ are allowed to be negative. For example, the integer $x = 29$ which is $11101$ in binary has a hamming weight of $4$. This is a quick upper bound on the arithmetic weight since $x = 2^0 + 2^2 + 2^3 + 2^4$. However, since the $a_i$ can be negative, we can write $x = 2^5 - 2^1 - 2^0$ which makes the arithmetic weight equal to $3$.

The arithmetic distance between two integers is defined by

$d(x,y) = w(x-y)$

This is one of the primary metrics used when analyzing arithmetic codes.

## AN Codes

AN codes are defined by integers $A$ and $B$ and are used to encode integers from $0$ to $B-1$ such that

$C = \{ AN|N\in \mathbb{Z}, 0 \leq N$<$B \}$

Each choice of $A$ will result in a different code, while $B$ serves as a limiting factor to ensure useful properties in the distance of the code. If $B$ is too large, it could let a codeword with a very small arithmetic weight into the code which will degrade the distance of the entire code. To utilize these codes, before an arithmetic operation is performed on two integers, each integer is multiplied by $A$. Let the result of the operation on the codewords be $R$. Note that $R$ must also be between $0$ to $B-1$ for proper decoding. To decode, simply divide $R/A$. If $A$ is not a factor of $R$, then at least one error has occurred and the most likely solution will be the codeword with the least arithmetic distance from $R$. As with codes using hamming distance, AN codes can correct up to $\lfloor \frac{d-1}{2} \rfloor$ errors where $d$ is the distance of the code.

For example, an AN code with $A = 3$, the operation of adding $15$ and $16$ will start by encoding both operands. This results in the operation $R = 45 + 48 = 93$. Then, to find the solution we divide $93/3 = 31$. As long as $B$>$31$, this will be a possible operation under the code. Suppose an error occurs in each of the binary representation of the operands such that $45 = 101101 \rightarrow 101111$ and $48 = 110000 \rightarrow 110001$, then $R = 101111 + 110001 = 1100000$. Notice that since $93 = 1011101$, the hamming weight between the received word and the correct solution is $5$ after just $2$ errors. To compute the arithmetic weight, we take $1100000 - 1011101 = 11$ which can be represented as $11 = 2^0 + 2^1$ or $11 = 2^2 - 2^0$. In either case, the arithmetic distance is $2$ as expected since this is the number of errors that were made. To correct this error, an algorithm would be used to compute the nearest codeword to the received word in terms of arithmetic distance. We will not describe the algorithms in detail.

To ensure that the distance of the code will not be too small, we will define modular AN codes. A modular AN code $C$ is a subgroup of $\mathbb{Z}/m \mathbb{Z}$, where $m = AB$. The codes are measured in terms of modular distance which is defined in terms of a graph with vertices being the elements of $\mathbb{Z}/m \mathbb{Z}$. Two vertices $x \pmod{m}$ and $x' \pmod{m}$ are connected iff

$x - x' \equiv \pm c \cdot r^j \pmod{m}$

where $c,j \in \mathbb{Z}$ and $0$<$c$<$r$, $j \geq 0$. Then the modular distance between two words is the length of the shortest path between their nodes in the graph. The modular weight of a word is its distance from $0$ which is equal to

$w_m(x) = min\{w(y)|y \in \mathbb{Z}, y \equiv x \pmod{m} \}$

In practice, the value of $m$ is typically chosen such that $m=r^n -1$ since most computer arithmetic is computed $\mod 2^n-1$ so there is no additional loss of data due to the code going out of bounds since the computer will also be out of bounds. Choosing $m=r^n-1$ also tends to result in codes with larger distances than other codes.

By using modular weight with $m=r^n-1$, the AN codes will be cyclic code.

definition: A cyclic AN code is a code $C$ that is a subgroup of $[r^n-1]$, where $[r^n-1] = \{0,1,2,\dots,r^n-1\}$.

A cyclic AN code is a principal ideal of the ring $[r^n-1]$. There are integers $A$ and $B$ where $AB = r^n-1$ and $A,B$ satisfy the definition of an AN code. Cyclic AN codes are a subset of cyclic codes and have the same properties.

## Mandelbaum-Barrows Codes

The Mandelbaum-Barrows Codes are a type of cyclic AN codes introduced by D. Mandelbaum and J. T. Barrows. These codes are created by choosing $B$ to be a prime number that does not divide $r$ such that $\mathbb{Z}/B\mathbb{Z}$ is generated by $r$ and $-1$, and $m=r^n-1$. Let $n$ be a positive integer where $r^n \equiv 1 \pmod{B}$ and $A = (r^n -1)/B$. For example, choosing $r=2, B=5, n=4$, and $A= (r^n-1)/B = 3$ the result will be a Mandelbaum-Barrows Code such that $C = \{ 3N|N\in \mathbb{Z}, 0 \leq N$<$5 \}$ in base $2$.

To analyze the distance of the Mandelbaum-Barrows Codes, we will need the following theorem.

theorem: Let $C \subset [r^n - 1]$ be a cyclic AN code with generator $A$, and

$B = |C| = (r^n - 1)/A$

Then,

$\sum_{x \in C} w_m(x) = n(\lfloor\frac{rB}{r+1}\rfloor - \lfloor\frac{B}{r+1}\rfloor)$

proof: Assume that each $x \in C$ has a unique cyclic NAF representation which is

$x \equiv \sum_{i=0}^{n-1} c_{i,x}r^i \pmod{r^n -1}$

We define an $n \times B$ matrix with elements $c_{i,x}$ where $0 \leq i \leq n-1$ and $x \in C$. This matrix is essentially a list of all the codewords in $C$ where each column is a codeword. Since $C$ is cyclic, each column of the matrix has the same number of zeros. We must now calculate $n|\{ x \in C | c_{n-1,x} \neq 0\}|$, which is $n$ times the number of codewords that don't end with a $0$. As a property of being in cyclic NAF, $c_{n-1,x} \neq 0$ iff there is a $y \in \mathbb{Z}$ with $y \equiv x \pmod{ r^n -1}, \frac{m}{r+1}$<$y \leq \frac{mr}{r+1}$. Since $x = AN \pmod{r^n-1}$ with $0 \leq N$<$B$, then $\frac{B}{r+1}$<$N \leq \frac{Br}{r+1}$. Then the number of integers that have a zero as their last bit are $\lfloor\frac{rB}{r+1}\rfloor - \lfloor\frac{B}{r+1}\rfloor$. Multiplying this by the $n$ characters in the codewords gives us a sum of the weights of the codewords of $n(\lfloor\frac{rB}{r+1}\rfloor - \lfloor\frac{B}{r+1}\rfloor)$ as desired.

We will now use the previous theorem to show that the Mandelbaum-Barrows Codes are equidistant(which means that every pair of codewords have the same distance), with a distance of

$\frac{n}{B-1}(\lfloor\frac{rB}{r+1}\rfloor - \lfloor\frac{B}{r+1}\rfloor)$

proof: Let $x \in C, x \neq 0$, then $x = AN \pmod{r^n-1}$ and $N$ is not divisible by $B$. This implies there $\exists j (N \equiv \pm r^j \pmod{B})$. Then $w_m(x) = w_m(\pm r^j A) = w_m(A)$. This proves that $C$ is equidistant since all codewords have the same weight as $A$. Since all codewords have the same weight, and by the previous theorem we know the total weight of all codewords, the distance of the code is found by dividing the total weight by the number of codewords(excluding 0).