SciELO RSS <![CDATA[Latin American applied research]]> vol. 37 num. 1 lang. es

<![CDATA[Special Issue on Programmable Logic]]>

<![CDATA[Cryptographic applications in FPGA]]> This paper describes circuits for executing the most complex operations of public-key cryptography and gives estimates of their execution time on field-programmable devices. The following operations are considered: mod n exponentiation, mod p division, mod f(x) multiplication of polynomials, mod f(x) division of polynomials, and point multiplication over an elliptic curve.

<![CDATA[Forward and inverse 2-D DCT architectures targeting HDTV for H.264/AVC video compression standard]]> This paper presents the architecture and the VHDL design of the integer two-dimensional discrete cosine transform (2-D DCT) used in H.264/AVC codecs. The forward and inverse 2-D DCT architectures were designed, and their synthesis results mapped to Altera FPGAs are presented. The 2-D DCT calculation exploits the separability property, so that each 2-D DCT architecture is divided into two 1-D DCT calculations joined through a transpose buffer. The 1-D DCT transforms implemented and described here are multiplierless: they use optimized shift-add operations instead. The architectures have a dedicated pipeline, optimized to process one input sample per clock cycle. They can meet the H.264/AVC encoder or decoder requirements for High Definition Digital Television (HDTV), with 1920x1080 pixels per frame at 30 frames per second.

<![CDATA[Sum-subtract fixed point LDPC decoder]]> This paper presents a low-complexity logarithmic decoder for an LDPC code. The performance of this decoding algorithm is similar to that of the original decoding algorithm introduced by D. J. C. MacKay and R. M. Neal.
It is a simplified algorithm that can be easily implemented on programmable logic such as FPGA devices because it uses only additions and subtractions, avoiding quotients, products, and floating-point arithmetic. The algorithm yields a very low complexity programmable logic implementation of an LDPC decoder with excellent BER performance.

<![CDATA[Real-time disparity map extraction in a dual head stereo vision system]]> This paper describes the design of an algorithm for constructing dense disparity maps from the image streams of two CMOS camera sensors. The proposed algorithm extracts information from the images based on correlation and uses the epipolar constraint. For real-time performance, the processing structure of the algorithm was built for implementation on programmable logic, using pipelined structures and condensed logic blocks.

<![CDATA[An FPGA-based system for the measurement of frequency noise and resolution of QCM sensors]]> The use of quartz crystal microbalance (QCM) oscillators as high-accuracy mass sensors is limited by the frequency noise present in the circuit. This work deals with the design and implementation of an FPGA-based system for the real-time measurement of the frequency noise and resolution of QCM sensors. This reconfigurable system integrates the frequency measurement and the mass resolution computation in a single FPGA chip. Parallel processing, pipeline stages, and prediction techniques are combined to accelerate the computations while keeping hardware complexity low. In this way, a reconfigurable, low-cost QCM sensor system is obtained that works as a standalone measurement platform with communication capabilities. The implemented system was validated with a Xilinx Virtex-4 FPGA and a QCM sensor operating in damping media for electrochemical applications.
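The correlation search under the epipolar constraint, described in the stereo vision entry above, can be illustrated in software. The following is a minimal sketch and not the authors' hardware design. It assumes rectified images, so that each epipolar line is an image row, and it uses the sum of absolute differences (SAD) as the matching cost, since the abstract does not name the correlation measure. The function name `disparity_row` and its parameters are hypothetical.

```python
def disparity_row(left_row, right_row, max_disp, half_win):
    """Return per-pixel disparities for one pair of rectified scanlines.

    Assumptions (not from the source): images are rectified, the matching
    cost is SAD, and border pixels without a full window get disparity 0.
    """
    n = len(left_row)
    out = [0] * n
    for x in range(half_win, n - half_win):
        best_cost, best_d = None, 0
        # The candidate match in the right image lies at x - d on the same row.
        for d in range(max_disp + 1):
            if x - d - half_win < 0:
                break  # window would fall off the left edge
            cost = 0
            for k in range(-half_win, half_win + 1):
                cost += abs(left_row[x + k] - right_row[x - d + k])
            # Keep the smallest cost; ties go to the smaller disparity.
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        out[x] = best_d
    return out
```

In hardware, the inner window sum and the loop over candidate disparities are the natural targets for pipelining and parallel evaluation, which is the kind of structure the abstract refers to.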
<![CDATA[Analysis and implementation of localization and mapping algorithms for mobile robots based on reconfigurable computing]]> Localization and mapping are fundamental problems in mobile robotics that have received considerable attention from the scientific community in the last ten years. Most of the work in this area is developed on personal computers, and it is still a challenge to execute these algorithms on embedded systems. This paper describes the analysis and embedded implementation of particle filter and occupancy grid algorithms, used for localization and mapping respectively. Experimental results and performance analysis were obtained using the Altera Nios II soft-core processor running on Stratix II FPGA devices.

<![CDATA[uRT51: An embedded real-time processor implemented on FPGA devices]]> In this paper we describe and evaluate the main features of the uRT51 processor, which was designed for embedded real-time control applications. Its architecture incorporates the specific functions of a real-time system in hardware. It was described in synthesizable VHDL and implemented on FPGA devices. We describe how the uRT51 processor supports time, events, tasks, and priorities. The performance of the processor is evaluated using a control application as a case study. The experiments show that the scheduling features of the uRT51 processor outperform those of a traditional RTOS-based real-time system.

<![CDATA[A Verilog HDL digital architecture for delay calculation]]> A method for calculating the delay between two digital signals with central frequencies in the range [20, 300] Hz is presented. The method performs a delay calculation in order to determine the bearing angle of a sound source. Computing accuracy is tested against a previous implementation of the Cross Correlation Derivative method.
A Verilog RTL model of the method has been tested on a Xilinx® FPGA in order to evaluate its real performance. Simulations of an ASIC design on a standard CMOS technology predict roughly 25 times lower power per delay stage than previous implementations.

<![CDATA[Flexible FPGA interface for three-phase power modules]]> This article proposes the development of a flexible interface for the control of power inverters. An FPGA-based system enables an implementation adaptable to any power module, with no need to modify the existing hardware. The interface identifies and indicates the type of fault encountered in these modules, generates switching signals that account for dead time, and sends a protection signal that inhibits the module's operation.

<![CDATA[An open-source tool for SystemC to Verilog automatic translation]]> As the complexity of electronic systems increases, new ways of describing these systems are proposed. One current trend involves system-level languages that allow the whole system to be described at a higher level of abstraction. This type of methodology helps a designer obtain an appropriate Hw-Sw partition, where the Sw is compiled to the target platform and the Hw is refined to a lower level of abstraction so that it can be synthesized. This last step usually requires a translation tool that converts a description in a system-level modeling language into an equivalent one in a standard hardware description language, usually Verilog or VHDL. This work presents a tool that generates, from a SystemC RTL description, equivalent Verilog code ready to be synthesized by any standard Verilog synthesis tool.

<![CDATA[FPGA design of an efficient and low-cost smart phone interrupt controller]]> In this work we have designed and implemented an efficient platform-level interrupt controller for a PXA270 microprocessor-based smart phone.
Although current hardware development boards include this type of controller, for specific applications most of them are costly and include too many interrupt sources, which are wasted in a particular design. For this reason we designed our own interrupt controller, which detects interrupt sources coming from the different devices that request microprocessor service. The developed interrupt controller is efficient and low-cost because of the small number of registers and logic gates required for its implementation and the small number of levels traversed in the circuit's critical path.

<![CDATA[Functional verification: approaches and challenges]]> Functional verification (FV) is paramount within the hardware design cycle. With so many new techniques available today to help with FV, which techniques should we really use? The answer is not straightforward and is often confusing and costly. The tools and techniques to be used in a project have to be decided early in the design cycle to get the best value from these new verification methods. This paper gives a brief overview of FV, establishes the difference between verification and validation, describes the bottlenecks that appear in the verification process, examines the challenges in FV, and presents the current FV technologies and trends.

<![CDATA[AES-128 cipher: Minimum area, low cost FPGA implementation]]> The Rijndael cipher, designed by Joan Daemen and Vincent Rijmen and recently selected as the official Advanced Encryption Standard (AES), is well suited for hardware use. Its implementation admits several trade-offs between area and speed. This paper presents an 8-bit FPGA implementation of the AES cipher with a 128-bit block and a 128-bit key. The selected FPGA family is Altera Flex 10K.
The cipher operates at 25 MHz and takes 286 clock cycles for encryption or decryption, resulting in a throughput of about 11 Mbps (128 bits × 25 MHz / 286 cycles ≈ 11.2 Mbps). The synthesized design uses 957 logic cells and 6528 memory bits. The design target was optimization of area and cost.

<![CDATA[A portable hardware design of a FFT algorithm]]> In this paper, we propose a portable hardware design that implements a Fast Fourier Transform and is intended for reuse as a core. The number of samples and the data bit width are parameterized. The module uses a radix-2 decimation-in-time algorithm for n-point transforms. Structural modelling in VHDL is used to describe, simulate, and implement the design. The resulting design is technology independent and portable among different EDA tools. The system has been synthesized with Quartus II from Altera, and the performance results are presented.

<![CDATA[A fixed-point implementation of the expanded hyperbolic CORDIC algorithm]]> The original hyperbolic CORDIC (Coordinate Rotation Digital Computer) algorithm (Walther, 1971) limits the domain of its inputs, which renders the algorithm useless for applications that need a greater range of the function. To address this problem, Hu et al. (1991) proposed a scheme that adds iterations to the original hyperbolic CORDIC algorithm and allows an efficient mapping of the algorithm onto hardware. A fixed-point implementation of the hyperbolic CORDIC algorithm with the expansion scheme proposed by Hu et al. (1991) is presented. Three architectures are proposed: a low-cost iterative version, a fully pipelined version, and a bit-serial iterative version. The architectures were described in VHDL and targeted to a Stratix FPGA for testing. Various standard numerical formats for the inputs are analyzed for each hyperbolic function directly obtained: sinh, cosh, tanh⁻¹, and exp.
For each numerical format and each hyperbolic function, an error analysis is performed.

<![CDATA[Comparison of FPGA implementation of the mod M reduction]]> Several algorithms for computing x mod m are presented, among others the reduction mod B^k - a, the pre-computation of B^(i·k) mod m, a generalized version of the Barrett algorithm, and a modified version of the same Barrett algorithm. These four algorithms, as well as the classical integer non-restoring division algorithm, have been synthesized and implemented on xc3s4000 devices.

<![CDATA[Experiments in low power FPGA design]]> This paper summarizes the utility of some low-power design (LPD) methods, based on architectural and implementation modifications, for FPGA-based systems. Power consumption is becoming one of the major design trade-offs in today's electronics. In this work, the contribution of spurious transitions to overall consumption is demonstrated, and the main strategies for reducing it are analyzed. Empirical results are presented to show the effectiveness of pipelining and sequentialization as low-power design methodologies. The possibilities of power management techniques are explained and quantified. Algorithm-level and finite state machine alternatives are also discussed and measured.
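For the mod m reduction entry above, the idea behind Barrett reduction can be sketched in a few lines. This is a simplified textbook variant in base 2, not the generalized or modified versions the paper compares. The function names `barrett_setup` and `barrett_reduce` are hypothetical, and the sketch assumes 0 <= x < 2^(2k), where k is the bit length of m.

```python
def barrett_setup(m):
    """Precompute the Barrett constant mu = floor(2^(2k) / m) for modulus m."""
    if m <= 0:
        raise ValueError("modulus must be positive")
    k = m.bit_length()
    mu = (1 << (2 * k)) // m  # the only division, done once per modulus
    return k, mu


def barrett_reduce(x, m, k, mu):
    """Return x mod m using a multiplication, a shift, and subtractions."""
    if not 0 <= x < (1 << (2 * k)):
        raise ValueError("x out of range for this sketch")
    # Estimate the quotient; it never exceeds the true floor(x / m).
    q = (x * mu) >> (2 * k)
    r = x - q * m
    # The estimate is slightly low at worst, so correct by subtraction.
    while r >= m:
        r -= m
    return r
```

For example, after `k, mu = barrett_setup(97)`, the call `barrett_reduce(1234, 97, k, mu)` gives the same result as `1234 % 97`. The appeal for hardware is that the per-operand work avoids division entirely, at the cost of storing the precomputed constant.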