*By Louis Philippe Doyon-Lessard*

The redundant binary representation (RBR) has many advantages over traditional binary representation. This article tries to demonstrate the usefulness of RBR on FPGA. The performance and characteristics of apparatus for arithmetic operations (addition, subtraction and multiplication) and conversions between two’s complement and RBR are evaluated. A faster conversion scheme from RBR to two’s complement and moving average FIR filter are also demonstrated.

The redundant binary representation (RBR) is a numeral system that uses more bits than needed to represent a single binary digit so that most numbers have several representations. RBR is unlike usual binary numeral systems, including two's complement, which use a single bit for each digit. Many of RBR's properties differ from those of regular binary representation systems. Most importantly, RBR allows addition without using a typical carry, but makes bitwise logical operation slower. Usually, every bit has a sign that is not necessarily the same as the sign of the number represented. When digits have signs, RBR is also a signed-digit representation (1).

RBR is a place-value notation system (2). In RBR, digits are pairs of bits, that is, for every place RBR uses a pair of bits. The value represented by an RBR digit can be found using a translation table. This table indicates the mathematical value of each possible pair of bits.

As in conventional binary representation, the integer value of a given representation is a weighted sum of the values of the digits. The weight starts at 1 for the rightmost position and goes up by a factor of 2 for each next position. Usually, RBR allows negative values. There is no single sign bit that tells if a RBR represented number is positive or negative. Most integers have several possible representations in a RBR.

An integer value can be converted back from RBR using the following formula, where *n* is the number of digit and *dk* is the interpreted value of the *k*-th digit, where *k* starts at 0 at the rightmost position:

The following redundant binary representation is used in this article:

Digit |
Represented value |

0 0 |
-1 |

0 1 |
0 |

1 0 |
0 |

1 1 |
1 |

**Table 1 Example of redundant binary translation table**

This notation has advantages not found in other redundant binary representation. It is possible to easily find the additive inverse of a value by flipping all the bits of the represented value using NOT gates. This allows building adder/subtracter unit more easily. (8)

Addition in redundant binary representation can be done in constant time contrarily to addition in two’s complement notation. This can be explained by the fact that the carry does have to propagate through all the width of the addition unit. This does not imply that the addition is always faster in RBR than is two's complement representation, but that the addition will eventually be faster in RBR with increasing bit width because the two's complement addition unit's delay is proportional to log(n) (where n is the bit width) (3).

Bearing in mind, it is interesting to compare the performance of typical binary adder unit and redundant binary adder unit considering their bit width. The following results have been obtained by using Altera and Xilinx platform.

**Figure 2 Combinatorial delay for redundant binary adder on Xilinx Virtex 5**

Number of bits |
xc5vlx85 3ff676 |
xc4vlx80 12ff1148 |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |

8 |
1,736 |
2,368 |
1,499 |
1,831 |

16 |
1,736 |
2,368 |
1,681 |
2,135 |

32 |
1,736 |
2,368 |
2,045 |
2,758 |

64 |
1,736 |
2,368 |
2,773 |
3,959 |

128 |
1,736 |
2,368 |
4,229 |
6,391 |

256 |
1,736 |
2,368 |
7,141 |
15,723 |

**Table 2 Redundant binary adder delay on Xilinx platforms**

**Figure 3 Combinatorial delay for redundant binary adder on Altera Stratix III**

EP3SE80F1152C2 Redundant |
EP3SE80F1152C2 Two's complement |

0,803 |
1,225 |

0,803 |
1,730 |

0,803 |
2,240 |

0,803 |
3,254 |

0,803 |
4,631 |

**Table 3 Combinatorial delay for redundant binary adder on Altera Stratix III**

Performance of RBR adder unit is much better on the Altera Platform. The RBR adder is always faster than the regular two’s complement adder. This can be explained by the fact that Stratix III FPGA architecture is much better at seven inputs to one output combinatorial function (4).

The subtraction is the same as the addition except that the additive inverse of one of the operand needs to be found. Essentially, the subtraction is the addition of one of the operand to the additive inverse of the other operand.

Using the notation used in this article, the additive inverse of a value is easily found by inverting every bit of an operand.

The multiplication unit evaluated here is made of many adder unit arranged in a tree. The multiplication unit is not pipelined although it could easily be.

Firstly, partials are computed by multiplying every digit of an operand with every digit of the other operand using usual arithmetic (5):

-1 |
-1 |
1 |

-1 |
0 |
0 |

-1 |
1 |
-1 |

0 |
-1 |
0 |

0 |
0 |
0 |

0 |
1 |
0 |

1 |
-1 |
-1 |

1 |
0 |
0 |

1 |
1 |
1 |

**Table 4 Rule to compute the partial results**

**Figure 4 Combinatorial delay in multiplication unit on Xilinx Virtex 5**

Number of bit |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |

8 |
5,668 |
6,312 |
2,91 |
4,152 |

16 |
6,443 |
7,713 |
2,91 |
4,152 |

32 |
7,662 |
9,049 |
7,923 |
8,559 |

64 |
8,921 |
10,623 |
11,705 |
13,346 |

128 |
10,14 |
12,319 |
14,165 |
18,75 |

**Table 5 Combinatorial delay in multiplication unit on Xilinx Virtex 5**

The delay in the multiplier unit as seen in Figure 4 and Table 5 is proportional to log *n* where *n* is the bit width of the operands. This is an inherent characteristic of the adder tree used in the design of the multiplier unit.

The performance of a RBR multiplier unit begins to be interesting when the multiplier unit have bit width greater than 32 bit. When this is the case, the delay of the multiplier becomes smaller than the delay of the specialized multiplier circuit of Xilinx FPGAs. However, it comes at a significant cost of LUT (Table 6). Moreover, this design could be easily pipelined leading to a significant lessening of delay. For example, a 64 bits RBR multiplier unit could be implemented as a 6 stages pipeline by adding D flip-flop between each level of the adder tree thus utilizing more of the FPGA resources. Considering this, the frequency is expected to be six times higher. This design would be interesting for a processor which would use only the redundant binary representation.

Number of bit |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |

8 |
272 |
328 |

16 |
1204 |
1385 |

32 |
5009 |
5632 |

64 |
20346 |
22636 |

128 |
81833 |
90684 |

**Table 6 Resources used by RBR multiplier unit on Xilinx Virtex**

This multiplication unit does not perform well compared to the built-in multiplication unit on Altera platform. This was to be expected because Stratix III FPGA have hardwired dedicated multiplication unit with native support up 36 bits (6).

**Figure 5 Delay in Altera multiplication unit**

Number of bit |
EP3SE80F1152C2 |
EP3SE80F1152C2 |

8 |
3,48 |
3,35 |

16 |
4,78 |
2,84 |

32 |
6,24 |
3,47 |

64 |
10,04 |
8,70 |

128 |
N/A |
12,95 |

**Table 7 Delay in Altera multiplication unit**

The converter presented here uses Xilinx and Altera specialized carry propagation circuit to accelerate redundant binary to binary conversion. The basic idea is to convert the redundant binary number to two two’s complement number that then can be added with any two’s complement adder (7). This allows using the FPGA resources more efficiently and is most of the time faster than a regular redundant binary to binary converter using a sequential adder.

*X* being a redundant binary number, it is possible to convert it to 2 two’s complement number:

Those 3 numbers need to be added using standard two’s complement adder. A single adder is needed since ‘1’ can be added as a “carry in”. Thus the specialized adder circuit of FPGAs can be used to make the conversion from RBR to binary representation.

**Figure 6 Delay in RBR to two's complement conversion unit on Xilinx platform**

Number of bit |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |

8 |
1,682 |
1,661 |
1,99 |
2,186 |

16 |
1,864 |
1,933 |
2,676 |
3,22 |

32 |
2,228 |
2,477 |
3,221 |
3,913 |

64 |
2,956 |
3,565 |
4,112 |
4,676 |

128 |
4,412 |
5,741 |
4,63 |
5,329 |

**Table 8 Delay in RBR to two's complement conversion unit on Xilinx platform**

**Figure 7 Resources used by RBR to binary converter on Xilinx platform**

xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |

9 |
9 |
20 |
26 |

17 |
17 |
48 |
71 |

33 |
33 |
112 |
144 |

65 |
65 |
212 |
323 |

129 |
129 |
456 |
653 |

**Table 9 Resources used by RBR to binary converter on Xilinx platform**

Specialized adder based converters present, for operands up to 128 bits, the best performances in terms of speed and resources used.

FIR filters are used in a variety of communication applications. FIR filter can be represented as a series of arithmetic operation. Because better arithmetic speed can be achieved with RBR on FPGA, an increase in FIR filter speed is expected. The FIR filter studied here is the *moving average*.

The output* * is the sum of the *n* last inputs. is implemented as a circular buffer. During each cycle a value is read and the new input value is written at the head of the circular buffer. The *moving average *filter can also be expressed recursively in the following way:

The last value is stored in a register so that can be calculated using the above formula during the next cycle. During each cycle, we need to add and subtract to . An addition unit and a subtraction unit are used to calculate the next value of .

**Figure 8 Maximum delay of moving average filter on Xilinx platform**

Number of bit |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |
xc5vlx85-3ff676 |
xc4vlx80-12ff1148 |

8 |
2,651 |
4,264 |
2,946 |
4,748 |

16 |
2,659 |
4,79 |
3,128 |
5,079 |

32 |
2,651 |
5,192 |
3,492 |
5,993 |

64 |
2,659 |
5,219 |
4,22 |
7,204 |

128 |
2,692 |
5,518 |
5,676 |
8,357 |

**Table 10 Maximum delay of moving average filter on Xilinx platform**

**Figure 9 Maximum delay of moving average filter on Altera EP3SE80F1152C2**

Number of bit |
Two's Complement (ns) |
RBR (ns) |

8 |
3,13 |
2,89 |

16 |
3,40 |
3,33 |

32 |
4,08 |
3,11 |

64 |
5,34 |
3,80 |

128 |
7,53 |
4,11 |

**Table 11 Maximum delay of moving average filter on Altera EP3SE80F1152C2**

The result of the redundant binary unit is outstanding. Most notably, the Xilinx 5 platform performs at almost constant time using redundant binary.

The result displayed above show that redundant binary representation is useful to speed up arithmetic operation even on FPGA. Addition, subtraction and multiplication have been show to be faster when RBR is used, but sometime only for bit width of 64 bit and higher. However, this speed comes at a cost which is often acceptable considering the time critical nature of digital filter. Also, it is tolerable to use RBR in only a subset of a large circuit considering that RBR to two’s complement conversion is relatively fast when done properly. Further study on larger circuit using RBR would be interesting. Easy place and route operation are expected because of the symmetric nature of RBR apparatus.

I hereby release this source code under GPL.

You can download it here.

The source is in VHDL. `moving_avg.vhd`

is a moving average filter. `radder.vhd`

is a RBR adder. `raddsub.vhs`

is a RBR adder/substractor. `rmul.vhd`

is a RBR multiplier. `rconv.vhs`

is RBR to two's complement converter. `tworconv.vhd`

is a two's complement to RBR converter.

1. **Panami, Behrooz.** Generalized Signed-Digit Number Systems : A Unifying Framework for Redundant Number Representations. *IEEE TRANSACTIONS ON COMPUTERS, VOL. 39, NO. 1. *pp. 89-98.

2. **Wikipedia.** Positional Notation. [Online] [Cited: august 18, 2008.] http://en.wikipedia.org/wiki/Positional_notation.

3. **Pai, Yu-Ting and Chen, Yu-Kumg.** The fastest carry lookahead adder. *Electronic Design, Test and Applications. *Second IEEE International Workshop, 2004.

4. **Altera.** Stratix II: 8-Input Fracturable LUT in the ALM. *Altera Web site. *[Online] [Cited: August 20, 2008.] http://www.altera.com/products/devices/stratix-fpgas/stratix-ii/stratix-ii/features/architecture/st2-lut.html.

5. **Guoping Wang, Murad Ozaydin, Monte Tull.** High performance divider using redundant binary representation. *IEEE. *2002.

6. DSP System Design in Stratix III devices. *Altera. *[Online] [Cited: october 6, 2008.] http://www.altera.com/literature/an/an504.pdf.

7. **Iljoo Choo, R.G. Deshmukh.** A Novel Conversion Scheme From a Redundant Binary Number to Two's Complement Binary Number For Parallel Architecture. *IEEE. *2001.

8. *Systematic Design of Pipelined Recursive Filters. ***Lapointe, Marcel, Huynh, Huu Tue and Fortier, Paul.** s.l. : IEEE TRANSACTIONS ON COMPUTERS, 1993.