Power consumption optimization in Reed Solomon encoders over FPGA

Sandoval, C

Latin American applied research

Print version ISSN 0327-0793; online version ISSN 1851-8796

Lat. Am. appl. res. vol. 44 no. 1, Bahía Blanca, Jan. 2014

C. Sandoval

Postgrado. Universidad de Carabobo - GITDAT. UNEFA, Maracay, Venezuela. csandoval1@uc.edu.ve

Abstract: This paper presents an analysis of the Reed Solomon encoder model and its GF(2^m) multiplier component, with the aim of optimizing power consumption on reconfigurable hardware. The methods used consisted of concatenating and reassigning circuit signals in the VHDL description. This treatment reduced the consumption of hardware resources and lowered the power consumption of the multiplier by 7.89%, which results in a 42.42% reduction of the dynamic power in the optimized encoder design. This development provides a design method with good performance that can be applied to other circuits.

Keywords: Optimization; Low Power Consumption; FPGA; Reed Solomon Encoder; VHDL Design.

I. INTRODUCTION

Low-power design is nowadays a central concern in the construction of integrated systems. It allows expensive packaging to be avoided, chip reliability to be increased, cooling to be simplified, and battery life to be extended or battery weight to be reduced (Sutter et al., 2002). Since FPGA users can only optimize the dynamic power component (Sutter and Boemo, 2007), this study is based on the design of a Reed Solomon encoder with dynamic power optimization. The three major design targets with respect to hardware realization are: optimization for area or cost, low latency that minimizes processing time, and high throughput obtained by processing multiple blocks in parallel (Amaar et al., 2011). All these design criteria involve a trade-off between area and speed (Liberatori and Bonadero, 2007).

The advent of the mobile age has heavily changed the requirements of today's communication devices. Data transmission over interference-prone wireless channels requires additional data processing steps, such as forward error correction, to ensure reliable communication (Genser et al., 2009). The Reed-Solomon code is one of the most efficient codes; its application requires a multitude of specialized functional units to encode message data and to correct errors in data coming from a noisy channel (Allen, 2008). GF(2^m) multipliers perform the most basic operations in Reed Solomon codes, where they are widely used. This paper presents a brief review of Galois field multipliers and the Reed Solomon code.

In previous research (Sandoval and Fedón, 2008), we developed the multiplier in VHDL (VHSIC, Very High Speed Integrated Circuit, Hardware Description Language) and presented a methodology for programming functional modules for the Reed Solomon code, using lookup tables as a solution for the multiplier circuits. From that work, we identified the need to redesign the parallel multiplier over the finite field as a combinational solution.

For this reason, the efficient design of a parallel multiplier over the finite field GF(2^m) was studied. Generally, the GF(2^m) multiplier architecture is more complex than that of a standard multiplier. The efficient computation of arithmetic operations in finite fields is closely related to the particular way in which the field elements are represented (Kim et al., 2002). Efficiency is measured by the number of gates and by the total gate delay of the circuit (Halbutogullari and Koc, 2000); both aspects have been considered in the design evaluation.

In this paper, we propose the design of some basic components in VHDL. The design is modular, implements the arithmetic of the Galois field GF(2^m) in hardware using concurrent circuits, and includes the description of RS(n,k) encoders with adjustable parameters (n, the number of code symbols, and k, the number of data symbols). The purpose is to support standard parameters through a versatile and reusable module that can be integrated into concatenated code configurations. The efficiency of the proposed design is based on the behavioral model of the module, with parallel processing.

II. FUNDAMENTALS OF REED SOLOMON CODE

A. CODING THEORY FOR RS CODES

Let α be a primitive element in GF(q^m) and let n = q^m - 1. Let m = (m_0, m_1, ..., m_{k-1}) ∈ GF(q^m)^k be a message vector and let m(x) = m_0 + m_1 x + ... + m_{k-1} x^{k-1} ∈ GF(q^m)[x] be its associated polynomial. Then the encoding is defined by the mapping p : m(x) → c

In constructing BCH codes, we looked for generator polynomials over GF(q) (the small field) so we dealt with minimal polynomials. Since the minimal polynomial for an element β must have all the conjugates of β as roots, the product of the minimal polynomials usually exceeds the number 2t of roots specified (Moon, 2005).

A Reed-Solomon code is a q^m-ary BCH code of length q^m - 1. In GF(q^m), the minimal polynomial for any element β is simply (x - β). The generator for an RS code is therefore given by Eq. (1).

g(x) = (x - α^i)(x - α^{i+1}) ... (x - α^{i+2t-1}) (1)

where α is a primitive element. There are no extra roots of g(x) included due to conjugates in the minimal polynomials, so the degree of g is exactly equal to 2t. Thus n - k = 2t for an RS code. The design distance is d = n - k + 1.
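The construction of g(x) can be illustrated with a minimal Python sketch. It is not the VHDL description used in this work. It assumes the small field GF(2^3) with primitive polynomial x^3 + x + 1, α = 2, first root exponent i = 1 and 2t = 4 (an RS(7,3) code); the function names are hypothetical.

```python
# Illustrative sketch only: field, polynomial and names are assumptions.
M = 3                # symbol width in bits, GF(2^3)
PRIM_POLY = 0b1011   # x^3 + x + 1 (assumed primitive polynomial)

def gf_mul(a, b):
    """Shift-and-XOR product of two field elements, reduced modulo PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):      # degree reached M: subtract (XOR) the polynomial
            a ^= PRIM_POLY
    return result

def gf_pow(a, exponent):
    """Repeated multiplication; enough for the small exponents used here."""
    result = 1
    for _ in range(exponent):
        result = gf_mul(result, a)
    return result

def poly_mul(p, q):
    """Product of two polynomials with coefficients stored lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] ^= gf_mul(p_i, q_j)
    return out

def rs_generator(two_t, first_root=1, alpha=2):
    """g(x) = (x - alpha^i)...(x - alpha^(i+2t-1)); minus equals plus in GF(2^m)."""
    g = [1]
    for j in range(two_t):
        g = poly_mul(g, [gf_pow(alpha, first_root + j), 1])
    return g

g = rs_generator(4)   # coefficients g_0 .. g_4, lowest degree first
```

Under these assumptions g has exactly 2t + 1 = 5 coefficients and is monic, which matches the statement that the degree of g equals 2t.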

Chih (2000) presents the code words; the parity-check polynomial is given by r(x) = m(x) x^{2t} mod g(x).

Figure 1 shows the architecture of the RS encoder (mod operation). It comprises shift registers, XOR gates, a multiplexer and finite field multipliers, where the corresponding coefficient has been assigned to each multiplier; it has been configured in VHDL (Sandoval and Fedón, 2007).
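The behavior of a shift-register encoder of this kind can be sketched in software. The following Python model is an illustration under stated assumptions, not the VHDL design of Fig. 1: it uses GF(2^3) with primitive polynomial x^3 + x + 1 and the RS(7,3) generator with roots α^1 to α^4, and the names `rs_encode_lfsr` and `poly_eval` are hypothetical.

```python
# Illustrative sketch only: field, generator and names are assumptions.
M = 3
PRIM_POLY = 0b1011            # x^3 + x + 1 (assumed)
G = [3, 2, 1, 3, 1]           # g_0 .. g_4 of the assumed RS(7,3) generator

def gf_mul(a, b):
    """Shift-and-XOR product of two field elements, reduced modulo PRIM_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= PRIM_POLY
    return result

def rs_encode_lfsr(message, g=G):
    """Systematic encoding: message symbols enter highest degree first.

    The register ends up holding r(x) = m(x) x^(2t) mod g(x).
    """
    n_parity = len(g) - 1
    reg = [0] * n_parity
    for symbol in message:
        feedback = symbol ^ reg[-1]            # XOR gate at the register output
        for i in range(n_parity - 1, 0, -1):   # shift, adding feedback * g_i
            reg[i] = reg[i - 1] ^ gf_mul(feedback, g[i])
        reg[0] = gf_mul(feedback, g[0])
    return list(message) + reg[::-1]           # codeword, highest degree first

def poly_eval(coeffs_high_first, x):
    """Horner evaluation over the field."""
    acc = 0
    for c in coeffs_high_first:
        acc = gf_mul(acc, x) ^ c
    return acc

codeword = rs_encode_lfsr([1, 2, 3])
# Every valid codeword is a multiple of g(x), so it vanishes at the roots of g.
roots = [2, 4, 3, 6]                           # alpha^1 .. alpha^4
checks = [poly_eval(codeword, r) for r in roots]
```

Each loop iteration corresponds to one clock cycle of the register, and each `gf_mul(feedback, g[i])` corresponds to one constant-coefficient multiplier in the feedback path.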

Figure 1. Reed Solomon Encoder Architecture

In this investigation, several concepts have been studied to optimize the configuration of the proposed encoder (Sandoval and Fedón, 2013). This encoder requires a finite field multiplier, which is the subject of the rest of this section.

B. MODEL OF FINITE FIELD MULTIPLIER

Fast multipliers are essential parts of digital signal processing systems. The speed of the product operation is of great importance in digital signal processing. Multiplication can be considered as a series of repeated additions. The number to be added is the multiplicand, the number of times that it is added is the multiplier, and the result is the product. Each step of addition generates a partial product. In most computers, the operands usually contain the same number of bits. When the operands are interpreted as integers, the product is generally twice the length of the operands in order to preserve the information content. The repeated addition method suggested by the arithmetic definition is so slow that it is almost always replaced by an algorithm that makes use of positional representation. It is possible to decompose multipliers into two parts. The first part is dedicated to the generation of partial products, and the second one collects and adds them (Marimuthu and Thangaraj, 2008). One of the basic operations is the modulo m reduction, which is studied through mod reduction algorithms (Deschamps and Sutter, 2007).

Consider a multiplier for two elements of the finite field GF(q), where GF(q) is defined as GF(p)[x]/p(x) and p(x) is an irreducible polynomial of degree n. The product of elements of GF(q) is defined as polynomial multiplication over GF(p)[x] modulo p(x).

Let A(x) be the coefficients of the generator polynomial, p(x) the primitive polynomial, and B(x) the input data to encode. The product can be represented as in Eq. (2).

(2)

To implement this calculation, the concept of polynomial division was applied; the calculation is based on the operation represented by Eq. (3).

(3)

where r(x) = A(x) mod p(x) is the remainder of dividing one operand of the multiplication by the irreducible polynomial of the finite field GF(2^m).
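The operation described in this subsection, polynomial multiplication followed by reduction modulo p(x), can be sketched in Python as follows. This is an illustrative model, not the VHDL multiplier of this work: the 8-bit field and the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) are assumptions, as are the function names.

```python
# Illustrative sketch only: the polynomial 0x11D and all names are assumptions.
P = 0x11D   # x^8 + x^4 + x^3 + x^2 + 1, a common choice for GF(2^8)

def clmul(a, b):
    """Carry-less product: polynomial multiplication over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a       # partial product, added without carries
        a <<= 1
        b >>= 1
    return result

def poly_mod(a, p=P):
    """Remainder of the polynomial division a(x) / p(x) over GF(2)."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        a ^= p << (a.bit_length() - 1 - deg_p)
    return a

def gf_mul(a, b, p=P):
    """(A(x) mod p(x)) * B(x), reduced again modulo p(x)."""
    r = poly_mod(a, p)        # r(x) = A(x) mod p(x)
    return poly_mod(clmul(r, b), p)

product_ab = gf_mul(0x53, 0xCA)
product_ba = gf_mul(0xCA, 0x53)   # operands exchanged
```

Both operand orders yield the same field element, since the product is commutative; only the hardware signal activity can differ between the two orderings.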

C. DYNAMIC POWER DISSIPATION

Power dissipation is recognized as a critical parameter in modern VLSI design. Dynamic power dissipation, which represents a relevant part of the total power dissipation, is mainly due to the charging and discharging of capacitances in the circuit. The golden formula for calculating dynamic power dissipation is P_d = C_L V² f. Power reduction can be achieved in various ways: reducing the output capacitance C_L, reducing the power supply voltage V, and reducing the switching activity and the clock frequency f (Marimuthu and Thangaraj, 2008).
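As a small numerical illustration of the formula, the Python sketch below evaluates P_d = C_L V² f at an operating point. The capacitance, voltage and frequency values are hypothetical and are not taken from the measurements in this paper.

```python
def dynamic_power_w(c_load_f, v_supply_v, f_clk_hz):
    """Dynamic power P_d = C_L * V^2 * f, in watts."""
    return c_load_f * v_supply_v ** 2 * f_clk_hz

# Hypothetical operating point: 10 pF switched capacitance, 1.2 V, 100 MHz.
base = dynamic_power_w(10e-12, 1.2, 100e6)

# Halving the clock frequency halves the dynamic power (linear in f).
half_frequency = dynamic_power_w(10e-12, 1.2, 50e6)

# Halving the supply voltage divides the dynamic power by four (quadratic in V).
half_voltage = dynamic_power_w(10e-12, 0.6, 100e6)
```

The quadratic dependence on V explains why supply voltage is the most effective lever, although on an FPGA the designer usually controls only the capacitance, activity and frequency terms.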

Dynamic consumption on FPGAs can be separated into three parts: datapath, synchronization, and off-chip power. The first component corresponds to the combinational blocks and the associated interconnection power; the second is the consumption of registers, clock lines, and buffers; and the third, off-chip power, is the fraction dissipated in the circuit output pads (Boemo et al., 1995).

III. POWER CONSUMPTION STUDY OF THE PROPOSED DESIGN

The first postulates considered in the optimization were: (i) the fastest circuits consume the least power (Genser et al., 2009; Todorovich et al., 2001), so the minimum clk period is used as a comparison variable, given that the minimum clk period is the inverse of the maximum operating frequency; and (ii) circuits have a power consumption proportional to the computational complexity and logical depth of the design (Biard and Noguet, 2008), so the use of hardware resources serves to estimate and compare the power consumption of the designs.

To evaluate the power consumption of the proposed model, the design equations presented above were considered. We estimate the power consumption from the computational complexity, expressed in gate operations. The estimate for this design was compared with the results reported for other proposed optimizations (Biard and Noguet, 2008), as shown in Table 1. Here, m is the number of bits of the multiplier and p is the number of bits set to one in the irreducible polynomial.

Table 1. Estimation of Power Consumption for Galois Field Multipliers

Sutter and Boemo (2007) present a table comparing various combinational and sequential circuits in terms of hardware resources, circuit latency and energy consumption. Against that comparison, the consumption values estimated in this research are favorable, as shown in Table 2.

Table 2. Area, Delay and Power Consumption for 8-bit GF Combinational Multipliers

For the improved version of the RS encoder, the LUT-based finite field multiplications are replaced by the parallel LFSR scheme, which provides a combinational circuit described with the concatenation operator in VHDL. This alternative allows a reduction in hardware resources (CLBs) and power consumption. Furthermore, the proposed design completes the RS encoding algorithm in 255 cycles, yielding a faster RS encoder.

After the theoretical analysis, we obtained the power consumption, in mW, of the designed modules using the tool from the Xilinx IDE.

IV. RESULTS

Here, we present the power consumption for each of the parameters involved in the dynamic consumption equation associated with the design, and compare the results with respect to the consumption reduction techniques used.

Figure 2 shows the results for the design of the finite field multipliers, based on the technique of reordering the signals.

Figure 2. Power consumption for the multiplier

First, the power consumption associated with the circuit logic was 0.04 mW in both cases, while the signal power component varied according to the order of the operands.

The case B(x) mod P(x) * A(x) presents a signal power component of 0.38 mW, and the case A(x) mod P(x) * B(x) presents a signal power component of 0.35 mW. The optimization is achieved through the technique of reordering the signals, and the results were checked using the commutative property of multiplication. Moreover, no consumption associated with the clock signal appears, because the multiplier is concurrent, and no IO power consumption appears, because the multiplier is an internal component whose signals are not routed to the external pins of the FPGA. We find that the order of the inputs has an effect on power consumption, so this optimization technique should be considered when designing the modules of the encoder.

From these values, we obtain a 7.89% saving in the signal power consumption of the multiplier design. Once the VHDL multiplier was optimized, the order A(x) mod P(x) * B(x) was adopted for the design. The effect of this power reduction is weighted by the number of optimized components, resulting in a 39.66% decrease in the signal power component of the encoder.
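The 7.89% figure follows directly from the two signal power values reported above (0.38 mW and 0.35 mW). A minimal Python check, with a hypothetical helper name, is:

```python
def percent_saving(before_mw, after_mw):
    """Relative reduction, in percent, when going from before_mw to after_mw."""
    return (before_mw - after_mw) / before_mw * 100.0

# Signal power of the multiplier for the two operand orders reported in the text.
signal_b_mod_p_times_a = 0.38   # mW, B(x) mod P(x) * A(x)
signal_a_mod_p_times_b = 0.35   # mW, A(x) mod P(x) * B(x)

saving = percent_saving(signal_b_mod_p_times_a, signal_a_mod_p_times_b)
print(f"{saving:.2f}%")   # prints 7.89%
```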

In a first approximation, the outputs of the multipliers were declared as input/output pins in the VHDL entity in order to observe them through simulation. The dynamic power of the design was on the order of 36.49 mW for the RS(255,223) encoder, of which the IO power component was on the order of 33.21 mW. Given that the products can be handled as internal signals, they were removed from the pins of the device, which reduced the reported dynamic power to 18.89 mW, corresponding to a saving of about 50% in the power consumption of the design.

It is important to note that we tested renaming or reassigning signals and internal signals, and exchanging concatenation assignments between external and internal components, in order to study the effect of these syntax changes on the power consumption of the implementation. The reports did not change, owing to the optimization that the design tool performs during synthesis.

The power consumption of the optimized encoders under the concurrent multiplier model, with n = 255 and k taking the values 247, 239 and 223, is illustrated in Fig. 3.

Figure 3. Power consumption for the design of the Reed Solomon encoders

For the three encoders, the logic circuit power was 0.04 mW in all cases, the power associated with the signals was 0.42, 0.32 and 0.73 mW, respectively, and the power associated with the clock signal was 0.95, 1.19 and 1.43 mW, respectively. This allows us to conclude that the higher the redundancy ratio, the greater the power consumption, because the clock signal and the signals of the LFSR (Linear Feedback Shift Register) that generates the redundancy symbols will be greater.

From the graphs presented in Figs. 2 and 3, we see that power consumption optimization of the design can be achieved through the Signal and Logic circuit components. This is mainly because the number of inputs and outputs is a fixed parameter of the encoder, and the clock is needed to synchronize the signals with a view to eliminating glitches. From the above, the power consumption values of the circuit elements that can be tuned through design are considered satisfactory when they are below 1 mW, which leads to the conclusion that the optimization of the design is within the margin expected for the proposed model of efficient encoders on reconfigurable systems.

Figure 4 shows the power consumption of each design, comparing the sequential model with the parallel model generated here, for the RS(7,3) case study using LFCS (Sandoval, 2012).

Figure 4. Power Consumption Parallel and Sequential Model Encoder

The optimization can be observed in the power consumption of the parallel model. The logic circuit power is 0.05 mW for both designs, and the power associated with the signals is 0.33 mW in the parallel design vs. 0.13 mW in the sequential design. However, the clock consumes 0.48 mW in the sequential design, whereas the clock component does not apply to the parallel design, giving a dynamic power saving of 42.42%. Note that the dynamic power consumption has been calculated without considering the power dissipated in the input and output pins, because this design is a component of a communication system whose IO is not implemented on the pins of the device; if it were, the number of I/O would be higher for the parallel design.
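The 42.42% saving can be reproduced by summing the logic, signal and clock components reported above for each model and excluding IO power. The Python sketch below does this; the dictionary layout and names are illustrative choices.

```python
# Component values in mW as reported in the text for the RS(7,3) case study.
sequential_mw = {"logic": 0.05, "signals": 0.13, "clock": 0.48}
parallel_mw = {"logic": 0.05, "signals": 0.33, "clock": 0.0}  # no clock component

def total_dynamic_mw(components):
    """Dynamic power without IO: sum of logic, signal and clock components."""
    return sum(components.values())

def percent_saving(before_mw, after_mw):
    """Relative reduction, in percent, when going from before_mw to after_mw."""
    return (before_mw - after_mw) / before_mw * 100.0

seq_total = total_dynamic_mw(sequential_mw)   # about 0.66 mW
par_total = total_dynamic_mw(parallel_mw)     # about 0.38 mW
saving = percent_saving(seq_total, par_total)
print(f"{saving:.2f}%")   # prints 42.42%
```

The parallel model spends more on signals (0.33 mW vs. 0.13 mW) but removes the 0.48 mW clock component, which is what produces the net saving.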

Figure 5 compares the power consumption of the particularized design with that of the optimized design, which handles generalized parameters for the multipliers and the coefficients of the code; the optimized design yields a decrease in power consumption. The case study is RS(255,223).

Figure 5. Power consumption between the particularized design and the optimized design

V. CONCLUSIONS

We have developed a fast finite field multiplier based on a new concept, with an improved internal structure. One of the important features of the proposed multiplier is its higher throughput. The circuit has been designed for parallel multiplication in a combinational way, all in one step. After analyzing the resources used by the FPGA in the implementation, we found that the number of gate levels is minimal, and we can conclude, in terms of performance-based design criteria, that the design is "efficient and low-cost due to the small number of register and logic gates required for its implementation, as well as for the small number of levels to be traversed in the circuit's critical execution path" (De Alba et al., 2007). These results have been compared with previous designs, and the proposal has proved to be an efficient solution in terms of multiplier and Reed Solomon encoder performance.

The simulation produced the expected results for the 8-bit multiplier over the Galois field. This component was used to build the RS(255,k) encoder, on which the tests were conducted, and the coding results obtained corresponded to the RS code. The two most significant contributions are that the programmed components have proved to be efficient modules, and that the VHDL code was generated for integrating the proposed code structure with parallel concatenation or adaptive functions that can be implemented on FPGA devices, together with a hybrid encoder based on the proposed fast finite field multiplier. In addition, a parallelized RS(7,3) encoder served to verify the efficiency of the parallel model developed from the LFSR.

Finally, it is important to note that the power consumption optimization techniques studied (Sutter, 2005) have been applied as follows: an enable signal for the generator of the coding redundancy symbols; parallelization to reduce the consumption associated with the clock signal; optimization of size and speed during the design cycle to indirectly reduce power (Todorovich et al., 2000); and reorganization of the signals to achieve a better distribution and design performance.

REFERENCES
1. Allen, J., Energy Efficient Adaptive Reed-Solomon Decoding System, University of Massachusetts (2008).
2. Amaar, A., I. Ashour and M. Shiple, "Design and Implementation A Compact AES Architecture for FPGA Technology," World Academy of Science, Engineering and Technology, 59, 8-11 (2011).
3. Biard, L. and D. Noguet, "Reed-Solomon Codes for Low Power Communications," Journal of Communications, 3, 13-21 (2008).
4. Boemo, E., G. Gonzalez de Rivera, S. Lopez-Buedo and J. Meneses, "Some Notes on Power Management on FPGAs," Lecture Notes in Computer Science, Springer-Verlag Berlin, 975, 149-157 (1995).
5. Chih, Lung-Shih, Soft IP Generator of Reed-Solomon Codec for Communication Systems, MSc. Thesis, China (2000).
6. de Alba, M., A. Andrade, J. González, J. Gómez-Tagle and A.D. García, "FPGA design of an efficient and low-cost smart phone interrupt controller," Lat. Am. Appl. Res., 37, 59-63 (2007).
7. Deschamps, J-P. and G. Sutter, "Comparison of FPGA implementation of the mod M reduction," Lat. Am. Appl. Res., 37, 93-97 (2007).
8. Genser, A., C. Bachmann, C. Steger, J. Hulzink and M. Berekovic, "Low-Power ASIP Architecture Exploration and Optimization for Reed-Solomon Processing," Proc. of ASAP, 177-182 (2009).
9. Halbutogullari, A. and C.K. Koc, "Mastrovito Multiplier for General Irreducible Polynomials," IEEE Transactions on Computers, 5, 503-518 (2000).
10. Kim, C., S. Oh and J. Lim, "A new hardware architecture for operations In GF (2)," IEEE Transactions on Computers, 51, 90-92 (2002).
11. Liberatori, M.C. and J.C. Bonadero, "AES-128 cipher: Minimum area, low cost FPGA implementation," Lat. Am. Appl. Res., 37, 71-77 (2007).
12. Marimuthu, C.N. and P. Thangaraj, "Low Power High Performance Multiplier," Proc. ICGST-PDCS, 8, 31-38 (2008).
13. Moon, T., Error Correction Coding: Mathematical, Methods and Algorithms, Wiley (2005).
14. Sandoval, C. and A. Fedón, "Codificador y decodificador Reed-Solomon programados a través de hardware reconfigurable," Revista Ingeniería y Universidad, 11, 17-32 (2007).
15. Sandoval, C. and A. Fedón, "Programación VHDL de algoritmos de codificación para dispositivos de Hardware Reconfigurable," Revista Internacional de Métodos Numéricos para Cálculo y Diseño en Ingeniería, 24, 3-11 (2008).
16. Sandoval-Ruiz, C., "Codificador RS(n,k) basado en LFCS: caso de estudio RS(7,3)," Revista Facultad de Ingeniería Universidad de Antioquia, 64, 68-78 (2012).
17. Sandoval Ruiz, C. and A. Fedón, "Codificador RS(255,k) en hardware reconfigurable orientado a radio cognitivo," Ingeniería y Universidad, 17, 77-92 (2013).
18. Sutter, G., Aportes a la Reducción de Consumo en FPGAs, Tesis Doctoral. Universidad Autónoma de Madrid (2005).
19. Sutter, G. and E. Boemo, "Experiments in low power FPGA design," Lat. Am. Appl. Res., 37, 99-104 (2007).
20. Sutter G., E. Todorovich, S. Lopez-Buedo and E. Boemo, "Low-Power FSMs in FPGA: Encoding Alternatives," Lecture Notes in Computer Science, Springer-Verlag, Berlin, 2451, 363-370 (2002).
21. Todorovich, E., G. Sutter, N. Acosta, E. Boemo and S. López-Buedo, "End-user low-power alternatives at topological and physical levels. Some examples on FPGAs," XV Conference on Design of Circuits and Integrated Systems, 640-644 (2000).
22. Todorovich, E., G. Sutter, N. Acosta, E. Boemo and S. Lopez-Buedo, "Relación entre Velocidad y Consumo en FPGAs," Proc. Iberchip, Montevideo, Uruguay (2001).

Received: October 25, 2011
Accepted: June 2, 2013
Recommended by Subject Editor: José Guivant