# Coding Algebra

[linear

algebra

math

is

fun

]
One pretty neat, and yet, little-known application of Linear Algebra is coding~encoding schemes that can detect errors in transmission on the spot.

### UPC

Perhaps, the most used code of these type is UPC. Introduced in 1973, and now ubiquitous, consists of 11 numeric digits and a 12th digit used to check the validity of the code.

We can also think the code as a vector with 11 components in the interval [0…9] (from the set of Integers). To those 11 components, we add a 12th component (called the *check digit*) composed in such a way that, when we compute the dot product of that vector with a vector called *check vector*, the result, in mod 10, is zero. The check vector is `{3,1,3,1,3,1,3,1,3,1,3,1}`

.

For instance, let’s consider the code `19019807207`

(which happens to be the UPC for an iPhone 7). We need to compute the *check digit*. If we call the *check vector* `v`

, the code vector `c`

and the check digit `d`

, we know that, after rearranging the components, we have

```
3 * (v1 + v3 + v5 + v7 + v9 + v11) + (v2 + v4 + v6 + v8 + v10) + d = 0 mod 10
```

In our example we have

```
3 * (1 + 0 + 9 + 0 + 2 + 7) + (9 + 1 + 8 + 7 + 0) + d
57 + 25 + d = 82 + d
```

`d`

has to be 8.

UPC will detect all single errors and most transposition errors for adjacent components, for instance, if the scanner reads the code as `{190198702078}`

, the result of `c.v`

is going to be `4`

in mod 10. Something went wrong.

### ISBN-10

In 1970 ISO published ISO 2108, standardizing codes for books.

This time, the code vector is composed of 9 components in the interval [0…10], encoding information such as country of origin, publisher, and book id. To those, we append the *check digit*, but this time the rules to compute it at a little different. For starters, the `check vector`

is `{10, 9, 8, 7, 6, 5, 4, 3, 2, 1}`

, and the result of `v.c`

should be zero, but in mod 11. Let’s consider `{9, 5, 0, 7, 4, 2, 4, 5, 6}`

^{1}

```
9 * 10 + 5 * 9 + 0 * 8 + 7 * 7 + 4 * 6 + 2 * 5 + 4 * 4 + 5 * 3 + 6 * 2 + d
261 + d
```

In mod 11, 261 is 8, the *check digit* should be 3 then.

ISBN-10 has been designed to detect all adjacent transposition errors.

### ISBN-13

To gain compatibility with EAN-13, ISBN has been extended to 13 digits. The *check vector* has the same structure of the check vector in UPC, and the dot product should be zero in mod 13.

Let’s try `{9, 7, 8, 9, 8, 7, 4, 5, 3, 5, 3, 9}`

^{2}

```
3 * (v1 + v3 + v5 + v7 + v9 + v11) + (v2 + v4 + v6 + v8 + v10 + v12) + d
3 * (9 + 8 + 8 + 4 + 3 + 3) + (7 + 9 + 7 + 5 + 5 + 9) + d
147 + d
```

147 mod 13 is 4, so `d`

has to be `9`

.

ISBN-13 improves on ISBN-10 and is also capable to detect all single errors and adjacent transposition.

### Codabar System

Pretty much the same scheme is used to validate credit card numbers.

The number is composed of 15 digits, assigned by the Bank issuing the card, and a *check digit*.

Most banks use a system called Codabar to compute the check digit. Because Codabar is self-checking, most standards do not require the check digit, that’s not the case with Luhn algorithm, the one actually used in credit cards. We are going to use a vector with 16 components from the Integer interval `[0…9]`

, and now, we have to add the number of odd components greater than 4. We will call this tally `h`

.

```
c.v + h = 0 (mod 10)
```

With the *code vector* `{5, 6, 1, 0, 5, 9, 1, 0, 8, 1, 0, 1, 8, 2, 5, d}`

we compute the dot product. Note that the check vector is pretty much the one used in UPC, except the odd component is `2`

and not `3`

. In this case `h = 5`

```
{5, 6, 1, 0, 5, 9, 1, 0, 8, 1, 0, 1, 8, 2, 5, d}.{2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1} + 5
2 * (5 + 1 + 5 + 1 + 8 + 0 + 8 + 5) + (6 + 0 + 9 + 0 + 1 + 1 + 2 + d) = 85 + d + 5
90 + d = 0 (mod 10)
```

Now we have our *check digit*, `0`

^{3}.