Latin American applied research
version ISSN 0327-0793
Lat. Am. appl. res. v.37 n.1 Bahía Blanca Jan. 2007
An FPGA-based system for the measurement of frequency noise and resolution of QCM sensors
M. J. Moure, P. Rodiz, M. D. Valdéz, L. Rodriguez-Pardo and J. Fariña
Dto. de Tecnología Electrónica-Instituto de Electrónica Aplicada, Universidad de Vigo, Vigo, España
Abstract The use of quartz crystal microbalance (QCM) oscillators as high-accuracy mass sensors is limited by the frequency noise present in the circuit. This work deals with the design and implementation of an FPGA-based system for the real-time measurement of the frequency noise and resolution of QCM sensors. This reconfigurable system integrates the frequency measurement and the mass resolution computation in a single FPGA chip. Parallel processing, pipeline stages and prediction techniques are combined to accelerate the computations while keeping the hardware complexity low. The result is a reconfigurable, low-cost QCM sensor system that works as a standalone measurement platform with communication capabilities. The implemented system was validated with a Xilinx Virtex-4 FPGA and a QCM sensor operating in damping media for electrochemical applications.
Keywords FPGA. QCM. SoC. DSP. Frequency Noise. Mass Resolution. Allan Deviation.
I. INTRODUCTION
A QCM sensor is an acoustic resonator used as a high-accuracy microbalance to measure mass changes in the nanogram range on the basis of frequency variations. The sensor consists of an oscillator circuit containing a thin disk of AT-cut quartz crystal with circular metallic electrodes on both sides (Fig. 1). When an alternating voltage is applied to the electrodes, the resonator is excited into mechanical oscillation through the piezoelectric effect. The resonance frequency of the quartz crystal is sensitive to any mass change at its surface (Sauerbrey, 1959; Granstaff and Martin, 1994).
Figure 1. QCM device structure.
To characterize the behavior of QCM sensors, determining their experimental sensitivity is not enough; it is also essential to study the frequency fluctuations in order to establish the sensor resolution (Rodríguez-Pardo et al., 2004). This is fundamental for oscillators in damping media, because the noise level rises with the strong decline of the quality factor of the resonator, which reduces the ability to resolve small changes in the measurand (Vig and Walls, 2000). To normalize frequency stability measurements in the time domain, the IEEE has proposed a two-sample variance without dead time, called the Allan variance (IEEE Std. 1139, 1999), whose expression is the following:
The oscillator detection limit, i.e. the smallest frequency deviation that can be detected in the presence of noise, is equal to:
Finally, the mass resolution can be obtained from the detection limit and the sensitivity as:
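Assuming the standard IEEE Std. 1139 form of the two-sample deviation, and taking the mass resolution as the nominal frequency over the Sauerbrey coefficient multiplied by the Allan deviation (as stated later for Eq. 3), Eqs. 1 to 3 can be sketched as follows. The exact typesetting, the summation index and the absence of any extra scaling factor in Eq. 2 are assumptions.

```latex
\begin{align}
\sigma_y(\tau) &= \sqrt{\frac{1}{2\,(m-1)\,f_0^{2}} \sum_{n=1}^{m-1} \left(f_{n+1}-f_n\right)^{2}} \tag{1} \\
\Delta f(\tau) &= \sigma_y(\tau)\, f_0 \tag{2} \\
\Delta m(\tau) &= \frac{\Delta f(\tau)}{k} = \frac{f_0}{k}\,\sigma_y(\tau) \tag{3}
\end{align}
```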
In the above equations, fn(τ) (hereafter fn) is the n-th sample of the average frequency calculated over a time interval τ starting at instant tn, f0 is the nominal frequency of the sensor, m is the number of samples and k = 2.26·10⁻⁶·f0² (Hz g⁻¹ cm²) is the mass sensitivity coefficient, known as the Sauerbrey coefficient (Sauerbrey, 1959).
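As a minimal software sketch of these three quantities, assuming the standard two-sample form of the Allan deviation and a detection limit equal to the deviation multiplied by f0, the direct computation could look as follows. The function names and the sample values are hypothetical and are not taken from the implemented system.

```python
import math

def allan_deviation(freqs, f0):
    """Two-sample (Allan) deviation of average-frequency samples, relative to f0."""
    m = len(freqs)
    if m < 2:
        raise ValueError("at least two frequency samples are needed")
    # Sum of squared differences between consecutive average-frequency samples.
    sq_sum = sum((freqs[n + 1] - freqs[n]) ** 2 for n in range(m - 1))
    return math.sqrt(sq_sum / (2.0 * (m - 1))) / f0

def detection_limit(sigma_y, f0):
    """Smallest detectable frequency deviation in Hz (assumed: sigma_y * f0)."""
    return sigma_y * f0

def mass_resolution(sigma_y, f0):
    """Mass resolution in g/cm^2 using the Sauerbrey coefficient k."""
    k = 2.26e-6 * f0 ** 2  # Hz g^-1 cm^2
    return detection_limit(sigma_y, f0) / k

# Illustrative data only: a 9 MHz sensor with a few hertz of fluctuation.
f0 = 9e6
samples = [f0 + d for d in (0.0, 1.0, -1.0, 2.0, 0.0)]
sigma = allan_deviation(samples, f0)
```

The cost of this direct form grows with both the number of samples and the number of averaging times analyzed, which is the difficulty the hardware approach addresses.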
In practice, implementing a real-time system that estimates the Allan deviation directly from Eq. 1 is not simple because of the number and complexity of the calculations required when a large number of samples (m) is analyzed. Several real-time implementations of the Allan deviation measurement have been proposed in the past. Kuboki and Ohtsu (1990) combined a personal computer with an ad hoc high-speed data acquisition card to estimate the Allan deviation. Mingfu et al. (2000) proposed a software algorithm that reduces the computational complexity by using a recursive formula. In this paper we propose the implementation of a SoC ("System on Chip") that combines software and hardware parts to perform the complete Allan deviation calculation and the mass resolution measurement using only the logic resources of one FPGA. The main advantage of using FPGAs for this application is that the measurement hardware can be reconfigured according to the specific characteristics of the QCM sensor or to the environmental conditions. Moreover, the logic resources of the FPGA can be reconfigured for other data processing tasks once the noise measurement of the calibration stage is finished.
A Virtex-4 FPGA from Xilinx was chosen as the hardware platform because its architecture is well suited to SoC and DSP ("Digital Signal Processing") applications. The developed measurement system combines the 32-bit embedded MicroBlaze soft processor with a specific hardware coprocessor designed to accelerate the computations required for real-time processing. This paper proposes a data flow based on techniques such as recursive formulas and jump prediction in order to reduce the computations and memory requirements of the coprocessor. The coprocessor also uses a pipeline architecture and time-shared modules in order to achieve a good balance between speed and area.
The rest of this paper is structured as follows: Section II describes the system architecture (including the hardware and the software parts), Section III describes the implementation, Section IV analyzes the performance of the system and Section V presents the conclusions.
II. SYSTEM ARCHITECTURE
Figure 2 shows the basic diagram of the implemented system. The first stage is a frequency meter that samples the output of the QCM sensor, which in practice ranges from 5 to 30 MHz. The second and third stages are devoted to the Allan deviation and resolution computations. The last stage provides the sensor system with communication capabilities, either to show the measurement results on an OLED display or to send processed data to an external system through an RS-232 link.
Figure 2. Block diagram of the reconfigurable system for QCM applications.
The proposed system performs the complete Allan deviation calculation using only the logic resources of one FPGA, which avoids the need for external chips such as memories. The complete DSP dataflow was first analyzed in order to determine which functions must be accelerated in hardware (hardware part) and which functions can be implemented with sequential algorithms (software part). The hardware part was designed specifically to achieve the best performance from the programmable logic technology (Application Specific DSP Hardware in Fig. 2). In this way, the digital signal processing elements included in the FPGA, such as the hardware multiplier blocks, are used to implement fast operations that calculate and accumulate frequency values. On the other hand, the software part runs on an embedded processor built inside the FPGA and uses standard high-level functions. This processor can perform floating-point operations in software, which suits this application, since the complete Allan deviation formula includes at least one square root and one division to present the final results (Eqs. 1, 2 and 3).
The time-critical functions were implemented in dedicated hardware (Allan Deviation Coprocessor), which is described in Section A. The software part is based on the integration of a soft microprocessor in the FPGA and is described in Section B.
A. Allan Deviation Coprocessor
Figure 3 shows the proposed dataflow for the calculation of the Allan deviation as a function of τ. This structure must be replicated for each time interval (τ) analyzed, so its complexity can be high. The proposed DSP system was designed with the objective of saving hardware resources compared to the direct implementation of the Allan formula (Eq. 1). First of all, the differences between consecutive frequency samples are taken as the input signal instead of the full frequency samples. This can be done because the frequency variations due to noise (less than 1 MHz) are very small compared to the nominal frequency f0 of a QCM sensor (tens of MHz). In this way, the number of memory cells required is significantly reduced.
Figure 3. The Digital signal dataflow proposed for the Allan Deviation computation.
Let the first-difference series be:
The first-order Allan deviation (τ = 1) can be easily calculated by averaging the squares of all the samples of this signal:
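Using d_n as a hypothetical name for the first-difference series, and assuming that Eq. 1 has the standard two-sample form, these two expressions would read as follows. The symbol and the index range are assumptions.

```latex
\begin{align}
d_n &= f_{n+1} - f_n, \qquad n = 1, \dots, m-1 \tag{4} \\
\sigma_y(1) &= \sqrt{\frac{1}{2\,(m-1)\,f_0^{2}} \sum_{n=1}^{m-1} d_n^{2}} \tag{5}
\end{align}
```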
Secondly, we simplify the calculation for longer integration times by using recursive expressions. In this way, the Allan deviation as a function of τ is expressed as:
The values are obtained from the following recursive formulas:
With these recursive formulas, each input sample is processed with only three additions and one multiplication by two. From the point of view of memory usage, it is necessary to store the values of for 2(τmax-2) and for (τmax-2) time intervals. The implementation of these recursive formulas constitutes the filter block represented in Fig. 3.
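The idea that first differences alone are enough for every averaging time can be illustrated with a short sketch. It does not reproduce the recursive formulas above. Instead it uses an equivalent direct form, assumed here for illustration: the difference between two adjacent τ-sample averages equals a triangular-weighted sum of 2τ-1 first differences divided by τ, and one output is kept every τ inputs (decimation). All names are hypothetical.

```python
import math

def allan_direct(freqs, tau, f0):
    """Reference: differences of adjacent, non-overlapping tau-sample averages."""
    sums = [sum(freqs[i:i + tau]) for i in range(0, len(freqs) - tau + 1, tau)]
    sq = sum(((b - a) / tau) ** 2 for a, b in zip(sums, sums[1:]))
    return math.sqrt(sq / (2.0 * (len(sums) - 1))) / f0

def allan_from_differences(diffs, tau, f0):
    """Same quantity computed only from first differences diffs[n] = f[n+1] - f[n]."""
    # Triangular weights 1, 2, ..., tau, ..., 2, 1 (length 2*tau - 1).
    weights = [min(j + 1, 2 * tau - 1 - j) for j in range(2 * tau - 1)]
    sq, count, start = 0.0, 0, 0
    while start + len(weights) <= len(diffs):
        s = sum(w * diffs[start + j] for j, w in enumerate(weights))
        sq += (s / tau) ** 2
        count += 1
        start += tau  # decimation: keep one filter output every tau inputs
    return math.sqrt(sq / (2.0 * count)) / f0

# Illustrative integer frequency counts around 9 MHz (not measured data).
f0 = 9_000_000
freqs = [f0 + (7 * n * n + 3 * n) % 11 - 5 for n in range(40)]
diffs = [b - a for a, b in zip(freqs, freqs[1:])]
```

The differences fit in a few bits, whereas the full samples need enough bits to represent f0, which is where the memory saving comes from.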
The filter architecture is the performance bottleneck of the system, so special attention was paid to speed optimization during the design process. It was implemented by multiplexing the hardware of one filter among all the different τ branches. That is, the actual hardware can only calculate one frequency variation value at a time, but it works faster than the remaining parts of the circuit, so it can provide the values for all branches consecutively. Real-time processing is assured because τ takes values from one to tens of seconds in typical QCM applications (Fig. 4 corresponds to τ values ranging from 1 to 38 s). This architecture makes it possible to use Eqs. 7 to 9, where frequency differences of a given order are calculated from the frequency differences of preceding orders.
Figure 4. Simplified scheme of the internal pipeline structure of the Allan Deviation Coprocessor.
As can be seen from Fig. 3, the output of the filter is decimated, so it would be a waste to spend time and resources calculating values that will be discarded. Which values are going to be decimated is predicted at the first stage of the filter. The pipeline can therefore avoid calculating results beyond the last non-decimated frequency difference value for each frequency sample at the input. As with any jump prediction technique, this leads to a significant speed improvement in the pipeline, since no useless calculations are performed. A simplified block diagram of the pipeline architecture of the coprocessor is represented in Fig. 4.
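The saving obtained by this prediction can be estimated with a small model. It assumes, purely for illustration, that branch τ keeps one output every τ input samples (at indices that are multiples of τ) and that a branch of order τ needs all lower orders to be computed first, so the pipeline runs up to the highest order that is kept for each input. The function names are hypothetical.

```python
def highest_order_needed(n, tau_max):
    """Highest tau whose decimated output is kept at input index n.

    Assumption: the branch of order tau keeps its output when n % tau == 0.
    Order 1 is always kept, so the result is at least 1.
    """
    for tau in range(tau_max, 0, -1):
        if n % tau == 0:
            return tau
    return 1  # not reached for tau_max >= 1; kept for clarity

def pipeline_work(num_samples, tau_max):
    """Number of filter evaluations when stopping at the last kept order."""
    return sum(highest_order_needed(n, tau_max) for n in range(1, num_samples + 1))

def naive_work(num_samples, tau_max):
    """Number of filter evaluations when every order is computed for every sample."""
    return num_samples * tau_max
```

For instance, under this model an input whose index is a prime larger than tau_max requires only the first-order evaluation, while the naive approach would compute all tau_max orders.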
The accumulators for all the τ values are built with a dual-port block RAM in the FPGA. The memory has one position per τ value and is addressed by that value. The behavior of the block is simple: when a new value arrives it is squared, and the previous accumulator value for the order of the input is read from the memory. Then both values are added and stored back in the same memory position. A secondary block memory stores the counter values for all the branches, and the corresponding counter is incremented when new valid data arrives.
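The read, square, add and write-back behavior described above can be modeled in a few lines. This is a behavioral sketch with hypothetical names, in which two Python lists stand in for the two block memories.

```python
class BranchAccumulators:
    """Behavioral model of the per-tau accumulator and counter memories."""

    def __init__(self, tau_max):
        # One position per tau value; index 0 is unused so that tau addresses directly.
        self.sum_sq = [0] * (tau_max + 1)
        self.count = [0] * (tau_max + 1)

    def push(self, tau, value):
        """Accumulate the square of a new valid value for the branch of order tau."""
        previous = self.sum_sq[tau]               # read the stored accumulator
        self.sum_sq[tau] = previous + value * value  # add the square and write back
        self.count[tau] += 1                      # one more valid sample for this branch
```

A typical use would be `acc = BranchAccumulators(38)` followed by `acc.push(tau, value)` each time the filter delivers a non-decimated value.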
All of the operations are implemented using fixed-point arithmetic, because floating-point operations in FPGAs demand a large amount of logic resources and result in slower implementations. In order to improve the fixed-point precision, the input stage of Fig. 4 can be programmed with a constant that scales the output value. In this way, the precision of the following stages can be increased.
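The effect of such a scaling constant can be seen in a small numeric sketch. It assumes a power-of-two constant and uses a truncating integer division by τ as a stand-in for any precision-losing fixed-point step; neither detail is taken from the actual coprocessor, and the names are hypothetical.

```python
def average_difference_fixed(weighted_sum, tau, scale_bits=0):
    """Truncating integer division by tau after scaling the input by 2**scale_bits."""
    return (weighted_sum << scale_bits) // tau

def squared_as_float(fixed_value, scale_bits=0):
    """Square in integer arithmetic, then undo the scaling in floating point."""
    return (fixed_value * fixed_value) / float(1 << (2 * scale_bits))

# Example with a non-negative input: 7 / 3 squared.
true_value = (7 / 3) ** 2
coarse = squared_as_float(average_difference_fixed(7, 3))        # no scaling
fine = squared_as_float(average_difference_fixed(7, 3, 8), 8)    # scaled by 2**8
```

Without scaling the fractional part is lost before squaring; with the input scaled by 2**8 most of it survives, and the scaling is removed once in software at the end.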
B. Embedded Processor
Multiplication, division and square root operations are performed with floating-point arithmetic on an embedded microprocessor. There are two important reasons for this choice. On the one hand, a hardware implementation of the square root and division using parallel processing algorithms demands an unacceptable amount of FPGA resources. On the other hand, although these complex operations require many clock cycles in a sequential embedded processor, they are executed only once, when data filtering and accumulation have finished. The microprocessor can also be devoted to other tasks, such as the calculation of the QCM sensor mass resolution and higher-level control and monitoring tasks.
The main task of the embedded microprocessor of Fig. 3 is to perform the last calculations needed to obtain the Allan deviation for all the considered τ values. First the processor converts the data obtained from the coprocessor from integers to floating point, and then it performs the division and the square root on the result with a full software implementation of double-precision arithmetic. This takes some time, since the operations have to be carried out for all the considered τ values, but the calculation delay is not a problem because the calibration process has already finished.
Once the values of the Allan deviation for all the considered τ are obtained, the maximum of the function is taken as the worst-case scenario that limits sensor performance. The mass resolution of the QCM system can be easily estimated as the quotient of the nominal frequency over the Sauerbrey coefficient multiplied by the maximum of the Allan deviation function (Eq. 3). These last operations are also included in the software written for the system, together with some code that provides interactivity with the user.
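The software stage described in this section can be sketched as follows. The sketch assumes that the coprocessor delivers, for each τ, an integer sum of squared frequency differences in Hz² and the matching sample count, and that no input scaling was applied. The function name and the data layout are hypothetical.

```python
import math

def finalize(sum_sq, count, f0):
    """Turn integer accumulators into Allan deviations and a mass resolution.

    sum_sq[tau] and count[tau] are the per-branch accumulator and counter
    values (index 0 unused). Returns (sigma_by_tau, worst_sigma, mass_resolution).
    """
    sigma_by_tau = {}
    for tau in range(1, len(sum_sq)):
        if count[tau] == 0:
            continue  # branch without valid data
        # Cast to floating point, then divide and take the square root.
        variance_hz2 = float(sum_sq[tau]) / (2.0 * float(count[tau]))
        sigma_by_tau[tau] = math.sqrt(variance_hz2) / f0
    worst_sigma = max(sigma_by_tau.values())  # worst case over all tau
    k = 2.26e-6 * f0 ** 2                     # Sauerbrey coefficient, Hz g^-1 cm^2
    return sigma_by_tau, worst_sigma, f0 * worst_sigma / k

# Illustrative accumulator contents for two branches (not measured data).
sigmas, worst, resolution = finalize([0, 18, 8], [0, 4, 1], 9e6)
```

Because this runs once per calibration, the cost of the software floating-point routines does not affect real-time operation.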
III. SYSTEM IMPLEMENTATION
The design and implementation of the Allan Deviation Coprocessor were performed using the Xilinx design environments ISE Foundation and System Generator for DSP. The design was synthesized for the Virtex-4 family of Xilinx FPGAs, but it was described in VHDL in order to ensure portability to other devices. The MicroBlaze core was integrated in the FPGA as an embedded processor. This soft core is based on a 32-bit Harvard RISC microprocessor architecture with an instruction set optimized for embedded applications. The microprocessor interconnection architecture is shown in Fig. 5. The MicroBlaze soft microprocessor constitutes the main core of the system, and the Allan Deviation Coprocessor is connected as an embedded peripheral. The On-chip Peripheral Bus (OPB) is a fully synchronous bus that provides separate 32-bit address and up to 32-bit data buses. It is the bus commonly used in embedded systems implemented with Xilinx FPGAs. In the system described here, this is the main bus used by the processor to communicate with ordinary peripherals such as a serial interface or an OLED matrix display.
Figure 5. Module interconnection scheme of the QCM reconfigurable system.
The Local Memory Bus (LMB) module is a fast, local bus for connecting the MicroBlaze instruction and data ports to high-speed peripherals, primarily on-chip block RAM (BRAM). The memory requirements to test the system are high, since double precision floating-point routines require a lot of space, so the maximum addressable amount of memory was used (64 Kbytes).
The Fast Simplex Link (FSL) is intended to connect high-speed DSP elements to an embedded system based on MicroBlaze. The bus structure is essentially a dual-port FIFO, whose two ports can be configured to run with the same or with different clock signals. Because of its simplicity, it was chosen to interconnect the input of the Allan Deviation Coprocessor to the MicroBlaze processor and to the output of the Frequency Meter.
The connection of the output of the Allan Deviation Coprocessor to MicroBlaze is slightly more involved. In order to keep the interconnection scheme simple, the standard behavior of the FSL was modified. The dual-port RAM that can be seen in Fig. 5 is randomly accessible from the coprocessor and sequentially accessible from the processor. New values to be stored in the results memory arrive constantly and out of sequence, so the random access allows the peripheral to reconstruct the sequence while the values remain available to the master processor at all times.
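The access pattern of this results memory can be modeled behaviorally. In the sketch below, a Python list stands in for the dual-port RAM; the method names and the wrap-around of the read pointer are assumptions made for illustration, not details of the modified FSL.

```python
class ResultsMemory:
    """Random-access writes on one port, sequential reads on the other."""

    def __init__(self, size):
        self._cells = [0] * size
        self._read_ptr = 0

    def write(self, address, value):
        """Coprocessor side: store a result at any address, in any order."""
        self._cells[address] = value

    def read_next(self):
        """Processor side: read the cells in address order, FIFO style."""
        value = self._cells[self._read_ptr]
        # Assumed behavior: the pointer wraps so the memory can be read again.
        self._read_ptr = (self._read_ptr + 1) % len(self._cells)
        return value

# Results arrive out of sequence but are read back in order.
mem = ResultsMemory(3)
mem.write(2, 30)
mem.write(0, 10)
mem.write(1, 20)
ordered = [mem.read_next() for _ in range(3)]
```

From the processor side the peripheral still looks like a simple sequential link, which keeps the interface code unchanged.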
The Xilinx Embedded Development Kit (EDK) design tool was used in this project to define and configure the software part of the measurement system. The Standalone Board Support Package (BSP), included in the Xilinx EDK, is a bare-bones kernel. It provides a very thin interface to the hardware and offers the minimal functionality required by control applications. Typical functions offered by the standalone BSP include setting up the interrupt system and the exception system, configuring caches and other hardware-specific functions. These functions, together with the floating-point library also provided with the development tools, make up most of the system software.
IV. SYSTEM PERFORMANCE
To validate the system performance, a test platform made up of a 9 MHz QCM sensor in a homogeneous liquid (distilled water) and a development board based on the Virtex-4 LX 25 FPGA was used. It was also verified that the design can be compiled for cheaper Xilinx devices such as the Spartan-3 family.
An example of the results obtained for the Allan deviation can be seen in Figs. 6 and 7. They were calculated by varying the averaging time (τ) from 1 s to 38 s, using 1000 samples (Fig. 6). The curve calculated by the reconfigurable system (Fig. 7) matches the values derived from the direct application of the definition formula for small values of τ, but the precision is lower for longer integration times. This is due to the quantization error introduced by fixed-point arithmetic in the filters, and it disappears once a large number of samples has been accumulated. Therefore, the implemented system requires a larger number of frequency samples than the direct evaluation of the Allan deviation formula.
Figure 6. Frequency input samples for the test platform.
Figure 7. Allan Deviation results obtained from the reconfigurable coprocessor.
Figure 8 shows the numeric and graphical representation of the computations performed by the reconfigurable system using an OLED display as a peripheral. The data input corresponds to the samples of Fig. 6.
Figure 8. Results of the reconfigurable system represented in an OLED display using a numerical format (a) and a graphical function (b).
The resource utilization of the system can be observed in Tables 1 and 2. More than half of the resources used by the full test platform are consumed by the Allan Deviation Coprocessor (Table 1). This, added to the fact that in a QCM-based system the Allan deviation peripheral is only needed during the calibration stage, makes it possible to use partial reconfiguration to reassign the coprocessor resources to other functions.
Table 1. The resource utilization and maximum frequency for the Allan coprocessor.
Table 2. The resource utilization and maximum frequency for the whole system.
The architecture of the coprocessor is simple so it is possible to translate its functionality into other target FPGA devices with different technologies. The connection of the peripheral to the microprocessor can be easily modified so it can be also adapted to another processor interface. In fact the current interface is just a memory with random access for writing and serial access for reading.
A reconfigurable SoC for estimating the mass resolution of QCM sensors was implemented and satisfactorily tested. It works at high speed, faster than needed for the prospective applications of the sensor. It also optimizes the amount of hardware resources required inside the programmable device by exploiting the concurrency of the calculations at the algorithm level and by using pipelining and jump prediction techniques. No external chips are required for the data processing, and the reconfigurable system works as autonomous measurement equipment.
REFERENCES
1. Granstaff, V.E. and S.J. Martin, "Characterization of a thickness-shear mode quartz resonator with multiple nonpiezoelectric layers", J. Appl. Phys., 75, 1319-1329 (1994).
2. IEEE Standard 1139, IEEE Standard Definitions of Physical Quantities for Fundamental Frequency and Time Metrology - Random Instabilities, (1999).
3. Kuboki, K. and M. Ohtsu, "An Allan variance real-time processing system for frequency stability measurements of semiconductor lasers", IEEE Transactions on Instrumentation and Measurement, 39, 637-641 (1990).
4. Mingfu, L., P. Hsin-Min and L. Chia-Shu, "Fast computation of time deviation and modified Allan deviation for telecommunications clock stability characterization", Proc. IEEE International Symposium on Parallel Architectures and Networks, Dallas, USA, 156-160 (2000).
5. Rodriguez-Pardo, L., J. Fariña, C. Gabrielli, H. Perrot and R. Brendel, "Resolution in quartz crystal oscillator circuits for high sensitivity microbalance sensors in damping media", Sensors and Actuators B, 103, 318-324 (2004).
6. Sauerbrey, G., "Verwendung von Schwingquarzen zur Wägung dünner Schichten und zur Mikrowägung", Z. Phys., 155, 206 (1959).
7. Vig, J.R. and F.L. Walls, "A review of sensor sensitivity and stability", Proc. IEEE/EIA International Frequency Control Symposium and Exhibition, 30-33 (2000).
Received: April 14, 2006.
Accepted: September 8, 2006.
Recommended by Special Issue Editors Hilda Larrondo, Gustavo Sutter.