Compresión de señales ECG sobre FPGA utilizando un esquema modificado de convolución de la Transformada Wavelet Discreta
Dora M. Ballesteros^{1} Diana Marcela Moreno^{1} Andrés E. Gaona^{2}
^{1}Telecommunications Engineering Department. University Militar Nueva Granada. Kra. 11 No. 10180. Bogota, Colombia. Email: dora.ballesteros@unimilitar.edu.co; diana.moreno.enciso@gmail.com
^{2}Electronic Engineering Department. University Distrital Francisco Jose de Caldas. Carrera 7 No. 4053. Bogota, Colombia. Email: aegaona@udistrital.edu.co
RESUMEN
Este documento presenta el diseño basado en FPGA para la compresión de señales ECG utilizando la Transformada Wavelet Discreta y un método de codificación sin pérdida de información. A diferencia de los trabajos clásicos para modo offline, el trabajo actual permite la compresión en tiempo real de la señal ECG por medio de la reducción de la información redundante. Se propone un modelo para el esquema de convolución en formato punto fijo, el cual tiene buen desempeño en relación a la tasa de salida, la latencia del sistema, la máxima frecuencia de operación y la calidad de la señal comprimida. La arquitectura propuesta, la cuantización utilizada y el método de codificación proporcionan un PRD que es apto para el análisis clínico.
Palabras clave: Señal ECG, transformada Wavelet Discreta, relación de compresión, esquema de convolución eficiente, codificación sin pérdida de información.
ABSTRACT
This paper presents FPGA design of ECG compression by using the Discrete Wavelet Transform (DWT) and one lossless encoding method. Unlike the classical works based on offline mode, the current work allows the realtime processing of the ECG signal to reduce the redundant information. A model is developed for a fixedpoint convolution scheme which has a good performance in relation to the throughput, the latency, the maximum frequency of operation and the quality of the compressed signal. The quantization of the coefficients of the filters and the selected fixedthreshold give a low error in relation to clinical applications.
Keywords: ECG signal, Discrete Wavelet Transform, compression ratio, efficient convolution scheme, quality score.
INTRODUCTION
Discrete Wavelet Transform (DWT) has been used in the last years in applications of signal processing like denoising, compression and coding. Methods for both offline and online mode have been proposed. In the first, the information is processed framebyframe; in the second, it is processed samplebysample.
In denoising and compression methods, the DWT is accompanied by a thresholding stage to reduce the redundant information [1]. The threshold controls the compression ratio (CR) of the system. The threshold higher, the compression ratio higher. The limit of the threshold is related to the desired PercentRootMeanSquareDifference (PRD) of the compressed (or filtered) signal [2]. When the input is a biomedical signal such as the ECG signal, the PRD should be lower than 4% to guarantee that clinical useful information is kept [3]. The selection of the threshold can be due to the energy packing efficiency [4], fixed percentage [5], [6] or universal threshold [7].
In offline mode, the algorithms can reach compression ratio up to 20:1 with a PRD lower than 10% [8][10]. However these methods are not suitable for realtime implantation neither for standalone applications.
Because in portable devices it is desirable the realtime processing, the design of new methods or the adaptation of the known methods allows the transmission or storage of the input signal, samplebysample. The problem is design a realtime architecture for the compression of ECG signal with low latency, low error of quantization, low losing of information and high compression ratio.
For the hardware realization of the DWT, the classical schemes are the based on the convolution and the lifting scheme [11][13]. The convolution scheme demands massive operations, which implies hardware consumption, and it is not efficient because the half of data is eliminated in the subsampling process; while the lifting scheme reduces the operations in three steps: split, prediction and update. The disadvantage of the lifting scheme is that the lifting coefficients are not integer; therefore, the scheme requires floatpoint multipliers and float point adders [14]. Although some modifications have been proposed, the use of floatpoint modules are necessary in most of the schemes based on the lifting design. To overcome this restriction, we propose a scheme which takes advantage of both the convolution and lifting schemes. The output of the each filter is calculated by a convolution process, but, a split step is added in our proposal. In our scheme, the modules (adder, multiplier) work in integer format and the system only calculates the outputs that are not eliminated in the subsampling process. Summarizing, we propose an integertointeger wavelet transform scheme which reduces the hardware resources of the convolution one and it does not use floatpoint modules such as the lifting scheme.
Finally, one lossless encoding method is added to the architecture of compression of the ECG signal. According to [15], Huffman encoding and Runlength (RL) encoding provides similar results of CR and PRD, but RL is more suitable for realtime applications. Because it is a lossless encoding method, the PRD is only due to the quantization error and the thresholding process. If an adequate threshold is selected and a low error of quantization is used, the compressed signal should be closer to the ECG.
ARCHITECTURE OF THE DISCRETE WAVELET TRANSFORM
The two classical schemes to perform the Discrete Wavelet Transform are the convolution (or filter bank) and lifting scheme.
Convolution scheme
It is based on two FIR filters and one subsampling process. The detail and coarse coefficients for one level of decomposition are obtained according to Figure 1.
Figure 1. Convolution scheme of DWT.
In the above Figure, h_{1} and h_{0} are the impulse response of the high pass filter and low pass filter, respectively; x[n] is the input signal; d_{1} and c_{1} are the detail and coarse coefficients of the first level. The symbol means subsampling by 2, dropping sampling with odd indexes [16]. In other words, after the convolution process between x[n] and [h_{1} h_{0}], the odd samples of the outputs are eliminated. In this scheme, half of all operations are wasted, because only the halves of the data are used.
Lifting scheme
This scheme is based on three processes: split the input data, prediction and updating. The block diagram is presented in Figure 2.
Figure 2. Lifting scheme of DWT.
The input data are split in two parts, even and odd samples; the prediction step produces the detail coefficients and the update step generates the coarse representation of the input signal. This scheme has been used with biorthogonal filters like 9/7 DWT. In that case, six constants are included in the architecture: ?, ?, ?, ?, k, 1/k. The disadvantage is that the constants are not integer and they are represented by 18 bits in fixedpoint format, 2 for the integer part and 16 for the right side of the point [14]; or by 10 bits in fixedpoint format, 8 for the right side of the point [17]. Then, the arithmetic operations are in float point.
Efficient convolution scheme
The aims of this scheme are reducing the operations of the classical convolution scheme and avoiding operations with floatpoint format of the lifting scheme. Unlike the lifting scheme which splits the input data, our scheme splits the clock signal and the filtering is calculated in alternate clock cycles. The coarse Cc_{1}) and detail Cd_{1}) coefficients are calculated according to:

(1) 

(2) 
Where M is the length of the FIR filters, [h_{1} h_{0}] are the impulse response of the low and high pass filters, respectively; and x[n] is the input signal. According to eq. (1), (2), the coarse coefficients are calculated in the even cycles of the clock signal; while the detail coefficients in the odd cycles. Since the detail coefficients are obtained one cycle after of the even positions, it is necessary to include an additional delay in its mathematic formula. Then, the even values of the detail coefficients are obtained by:

(3) 
With this approach, only the halves of the operations are calculated and the throughput of the system is the double of the classical convolution scheme. On the other hand, all of the hardware modules (adder, multiplier) can operate with integer data if [h_{0} h_{1}] are encoded in an integer binary format.
HARDWARE IMPLEMENTATION
We implemented an 8bit integertointeger efficient convolution scheme of the DWT, family sym4, using a FPGA of Xilinx. The dwt block includes the following modules: div_2, bank of register, coefficients and multiplier/adder. Additionally, the thresholding and encoding process are added to the compression scheme of the ECG signal. The high level description is written in VHDL code and it is simulated on ModelSim. Finally, the code is synthetized using a Spartan3E100, and validated with real ECG signals. The general architecture is illustrated in Figure 3.
Figure 3. Architecture of the proposed scheme.
Div_2: this block divides the frequency of the clock signal by 2. The output has the half of the frequency of the clock signal.
Figure 4. clk & div_2.
Bank of register: it calculates the eight delays of the input signal by flipflops Dtype. The output is updated each cycle of the clock signal.
Coefficients: according to the value of div_2, the coefficients of the low pass filter or the high pass filter are selected. If div_2='1' it selects h_{0}, but if div_2='0' it selects h_{1}. The binary representation of h0 is presented in Table 1.
Table 1. Binary representation of sym4.
In a similar way, the coefficients of the high pass filter are encoded with 7bits.
Multiplier/Adder: this block computes the convolution between x[n] and the impulse response of the FIR filters. If div_2='1'then it works as a low pass filter; while if div_2='0' it works as a high pass filter. Because [h_{0} h_{1}] are unsigned, the equations (1) and (2) are transformed, for the case of sym4, as:

(4) 

(5) 
Since x[n] is encoded with 8 bits and [h_{0} h_{1}] is encoded with 7 bits, the output of the convolution is represented by 16 bits. The coefficients c_{1} and d_{1} correspond to the 8 most significant bits of the output (the 8 LSBs are ignored); it means the output is divided by 256. The circuit of this block is presented in Figure 5.
Figure 5. Multiplier/adder block.
Thresholding: it sets to zero the coefficients that are lower in magnitude than the threshold. It follows the hard rule, defined as:

(6) 
Where y is the input (c_{1} or d_{1}), th is the threshold and f(y) is the thresholded coefficient. The threshold uses in this work is the proposed in [15].
According to the thresholdingencoding scheme presented in [18], three flags are calculated: b1, b2 and b3. The meaning is presented in Table 2.
Table 2. Thresholding flags.
Encoding: it is based on the runlength encoding method. The runlength is a lossless encoding method that takes advantage of the consecutive repetitions of a specific number [19]. Because the thresholding step sets to zero a large number of coefficients, the runlength scheme represents the data by the zero follows by the total of repetitions. If the coefficient is different to zero, the encoded data is equal to the wavelet coefficient. In our architecture, the output of the system is data and row; data is the encoded wavelet coefficient and row is the position into the runlength code. According to the value of the flags b_{1}, b_{2}, b_{3}, the row is updated. (Table 3).
Table 3. Output of the encoding block.
Every time that f(x)=0 and b3='1', the counter increases its value; its mean the flag count account the total of consecutive zeros. When a new data different of zero appear, the last value of the count is written in the runlength code, follows by the new data, f(x).
RESULTS
In this section we present some results related to performance of the proposed model. The quality of the hardware architecture and the compression algorithm are measured. First, the work is analyzed in terms of the metrics of hardware. Second, the CR and PRD are measured.
Performance of the Hardware Architecture: the FPGA Spartan3E100 (BASYS2 board) of Xilinx is programmed with the VHDL code. Additionally an A/D and D/A blocks are connected to the FPGA for the hardware validation of the compression scheme. Four works of hardware realizations of the DWT have been selected with the purpose of comparing the performance of the algorithm. Two of them correspond to convolution scheme and the others to lifting scheme. In Table 4, the metrics are shown.
In Table 4, Scheme corresponds to the based on convolution (conv), lifting (lif) or modified convolution (mc); Mode is offline if the data is processed framebyframe or realtime if it is processed samplebysample; Base corresponds to biorthogonal (Bior) or Orthogonal (Orth); Format of quantized data is fixedpoint (FP) or Integer (Int); Error of quantization is the produced by the quantization of the coefficients of the FIR filters (it is measured for an input signal equal to a constant); Maximum Delay is the time that the DWT block takes to calculate the detail and approximation coefficient (it is obtained from the synthesized tool); while Latency is the times of cycles of the clock signal to obtain the output from a specific input (it is tested by the simulation on ModelSim).
Table 4. Comparison to related works: hardware metrics of the DWT block.
According to Table 4, our design has the lowest error of quantization, which is desirable to obtain a low value of PRD. On the other hand, the latency of our work allows that the answer of the system will be faster than the answer of the other works. Finally, the proposed model can work with signals with higher bandwidth (such as the speech signals) than the signals in the convolution scheme, because the maximum delay is lower.
Unlike some works whose eliminated completely the detail coefficients, our work kept the coefficients higher than a fixedthreshold. In Figures 6 and 7, we present an example of one ECG signal from the Fluke PS420 Multiparameter Patient Simulator as the input of the system. It was configured with 60 beats per minute (bpm). Additionally, the coarse and detail coefficients are calculated.
Figure 6. Example of ECG signal (channel 1) and its coarse coefficients (channel 2).
Figure 7. Example of ECG signal (channel 1) and its detail coefficients (channel 2).
According to Figure 6, the highest amplitude of the coarse coefficients is the quarter of the highest of the ECG signal, for two reasons: first, the filters [h_{0} h_{1}] were multiplied by 125 but their outputs were divided by 256; it implies the half of the amplitude; second, while the input signal is 8bits in unsigned format, the wavelet coefficients [c_{1} d_{1}] are in 8bits signed format (7bits for the amplitude). Additionally, it is notice that not all the detail coefficients are set to zero. The results are in agreement with theoretical results.
Performance of the Compression Model: the second group of metrics is related to the CR and the PRD. The compression ratio measures the quantity of wavelet coefficients of the input versus the quantity of the encoded wavelet coefficients, according to the following equation:

(7) 
Where ^encoded (C_{1}) encoded (d_{1})] are the encoded coarse and detail coefficients, respectively. This value is strongly related to the value of the fixedthreshold. In Figure 8, the compression ratio & the value of the threshold (Th) is presented, for the levels of decomposition, N=1, 2, 3, 4. Because the binary amplitude of the coarse and detail coefficients is [63 63], the highest threshold (10) corresponds approximately to the 15% of the highest wavelet coefficient.
Figure 8. CR & Th.
The data are obtained from the hardware results using the Fluke PS420 Multiparameter Patient Simulator with bpm=60, 90, 120. The average is plotted in each case.
The quality of the compressed signal is measured with the PercentRootMeanSquareDifference (PRD), according to:

(8) 
Where X_{i} is the original signal from the ECG record, X_{i} is the compressed signal and L is the length of the signals. In Figure 9, the performance of the compression model related to the PRD is presented.
According to the Figures 8 and 9, the compression ratio of the proposed system is up to 8 for a threshold of 10. It could be slightly better if the PRD is in the limit of 4%. Nevertheless it is evident that if PRD increases, then CR increases too.
Figure 9. PRD & Th.
Because the PRD in the entire works is not ever in the same range, a parameter that helps to compare the tradeoff between the CR and the PRD is the Quality Score (QS) [25]. This is the relation between the CR and the PRD, represented as:

(9) 
The higher QS, the hiquer relationship between the CR and the PRD. In Figure 10, the QS for the four levels of decomposition is presented.
Figure 10. QS & Th.
Now, we compare our work to others algorithms of compression of the ECG signals. The results are presented in Table 5.
According to Table 5, our systems has better CR than [26], [28], but lower than the others. Nevertheless, our proposal can be work in realtime without units of preprocessing or postprocessing. The works that used Huffman encoding are not suitable for samplebysample mode, because they need a prior knowledge of the data. This is the main difference between our proposal and those in the literature.
Table 5. Comparison to related works: compression model.
CONCLUSIONS
This paper describes a modified scheme of the convolution one which has the same throughput of the lifting scheme, because only the even wavelet coefficients are calculated. The maximum frequency of operation is higher than in the convolution scheme and is similar than in the lifting scheme. Because our architecture not needs external memories, the system works in samplebysample mode. The low error of quantization helps to keep the quality of the signal, because the experimental values (wavelet coefficients) are similar than the theoretical values. Comparing to others compression models, our proposal has similar results in relation to the compression ratio, but the QS could be better. Nevertheless, the PRD satisfied the requirements of clinical applications. This work may improvement with a variable quantization of the wavelet coefficients.
ACKNOWLEDGEMENT
This work was supported in part by University Military Nueva Granada under Grant ING290 and ING641.
Preliminary version of this paper was presented at the 2010 IEEE ANDESCON, Bogota, Colombia.
REFERENCES
[1] D. Donoho. "DeNoising by SoftThresholding". IEEE Transactions On Information Theory. Vol. 41, No 3, pp. 613627. May, 1995. ISSN 00189448. DOI: 10.1109/18.382009.
[2] J. Chen and Sh. Itoh. "A Wavelet TransformBased ECG Compression Method Guaranteeing Desired Signal Quality". IEEE Transactions on Biomedical Engineering. Vol. 45, Issue 12, pp. 14141419. December, 1998. ISSN 00189294. DOI: 10.1109/10.730435.
[3] M.L. Hilton. "Wavelet and Wavelet Packet Compression of Electrocardiograms". IEEE Transactions on Biomedical Engineering. Vol. 44, Issue 5, pp. 394402. May, 1997. ISSN 00189294. DOI: 10.1109/10.568915.
[4] M. Sharafat Hossain, T. Aziz and M.A. Haque. "ECG Compression Using Multilevel Thresholding of Wavelet Coefficients". International Conference on Intelligent Sensors, Sensor Networks and Information Processing, pp. 321326. 2008.
[5] R. Benzid, E Marir, A. Boussaad, M. Benyoucef and D. Arar. "Fixed percentage of wavelet coefficients to be zeroed for ECG compression". Electronics Letters. Vol. 39, Issue 11, pp. 830831. May, 2003. ISSN 00135194. DOI: 10.1049/el:20030560.
[6] D.M. Ballesteros, A.E. Gaona and L.F. Pedraza. "Discrete Wavelet Transform in Compression and Filtering of Biomedical Signals, Discrete Wavelet TransformsBiomedical Applications". Hannu Olkkonen (Ed.). InTech, pp. 1732, September, 2011. ISBN: 9789533076546.
[7] M. Kania, M. Fereniec and R. Maniewski. "Wavelet Denoising for Multilead High Resolution ECG Signals". Measurement Science Review. Vol. 7, Section 2, Issue 4, pp. 3033. 2007.
[8] L. HsiehWei, H. KingChu, W. TsungChing and K. ChengTung. "A modifiedRun length Coding for Realization of WaveletBased ECG Data Compression System". 6th International Conference on Networked Computing (INC), 2010, pp. 14. May, 2010.
[9] Ch.T. Ku, K.Ch. Hung, T.Ch. Wu and H.Sh. Wang. "WaveletBased ECG Data Compression System with Linear Quality Control Scheme". IEEE Transactions on Biomedical Engineering. Vol. 57, Issue 6, pp. 13991409. June, 2010. ISSN 00189294. DOI: 10.1109/TBME.2009.2037605.
[10] B.A. Rajoub. "An Efficient Coding Algorithm for the Compression of ECG Signals Using the Wavelet Transform". IEEE Transactions on Biomedical Engineering. Vol. 49, Issue 4, pp. 355362. April, 2002. ISSN 00189448. DOI: 10.1109/10.991163.
[11] K.A. Kotteri, S. Barua, A.E. Bell and J.E. Carletta. "A Comparison of Hardware Implementations of the Biorthogonal 9/7 DWT: Convolution versus Lifting". IEEE Transactions on Circuits and Systems. Vol. 52, Issue 5, pp. 256260. May, 2005. ISSN: 15497747. DOI: 10.1109/TCSII.2005.843496.
[12] W. Sweldens. "The Lifting Scheme: A CustomDesign Construction of Biorthogonal Wavelets". Applied and Computational Harmonic Analysis. Vol. 3, Issue 2, pp. 186200, April, 1996. DOI: 10.1006/acha.1996.0015.
[13] P. Longa, A. Miri and M. Bolic. "A flexible design of Filterbank Architectures for Discrete Wavelet Transforms". 32TH IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 14411444. Hawai, USA. 2007. ISSN: 15206149. DOI: 10.1109/ICASSP.2007.367118.
[14] W. Wang, Z. Du and Y. Zeng. "HighSpeed FPGA Implementation for DWT of Lifting Scheme". IEEE 5th International Conference on Wireless Communications Networking and Mobile Computing, pp. 14. 2009. ISBN: 9781424436927. DOI: 10.1109/WICOM.2009.5302003.
[15] D. Ballesteros and A. Gaona. "Multiresolution analysis and lossless encoders in the compression of electrocardiographic signals". Visión Electrónica: Algo Más Que Un Estado Sólido. Vol. 4, fasc.1, pp. 511. 2010.
[16] M. Vetterli and C. Herley. "Wavelets and Filter Banks: Theory and Design". IEEE Transactions on Signal Processing. Vol. 40, Issue 9, pp. 22072232. September, 1992. ISSN: 1053587X. DOI: 10.1109/78.157221.
[17] S.V. Silva and S. Bampi. "Area and Throughput TradeOffs in the Design of Pipelined Discrete Wavelet Transform Architectures". IEEE Design, Automation and Test in Europe Conference and Exhibition. 2005. ISSN: 15301591. DOI: 10.1109/DATE.2005.66.
[18] D. Ballesteros, D. Moreno and A. Gaona. "Compression of biomedical signals on FPGA by DWT and runlength". IEEE ANDESCON 2010. Bogota, Colombia. ISBN: 9781424467402. DOI: 10.1109/ANDESCON.2010.5633621.
[19] S.W. Smith. "Digital Signal Processing: A practical guide for engineers and scientists". Elsevier Science. Newnes. 2003. ISBN: 075067444X.
[20] K.A. Kotteri, S. Barua, A.E. Bell and J.E. Carletta. "A comparison of Hardware Implementations of the Biorthogonal 9/7 DWT convolution versus lifting". IEEE Transactions on Circuits and Systems II: Express Briefs. Vol. 52, Issue 5, pp. 256260. May, 2005. ISSN 15497747 DOI: 10.1109/TCSII.2005.843496.
[21] K.A. Kotteri, A.E. Bell, J.E. Carletta. "Design of Multiplierless, HighPerformance, Wavelet Filter Banks With Image Compression Applications". IEEE Transactions on Circuits and SystemsI: Regular Papers. Vol. 51, Issue 3, pp. 483494. March, 2004. ISSN 15498328. DOI: 10.1109/TCSI.2003.820234.
[22] K.A. Kotteri, A.E. Bell and J.E. Carletta. "Polyphase Structures for Multiplierless Biorthogonal Filter Banks". IEEE International Conference on Acoustics, Speech and Signal Processing, 2004. DOI: 10.1109/ICASSP.2004.1327081.
[23] W. Wang, Z. Du and Y. Zeng. "HighSpeed FPGA Implementation for DWT of Lifting Scheme". International Conference on Wireless Communication, Networking and Mobile Computing. September, 2009. DOI: 10.1109/WICOM.2009.5302003.
[24] S.V. Silva and S. Bampi. "Area and Throughput Tradeoffs in the design of Pipelined Discrete 16 Wavelet Transform Architectures". Design, Automation and Test in Europe Conference and Exhibition. 2005.
[25] S. Lee, J. Kim and M. Lee. "A RealTime ECG Data Compression and Transmission Algorithm for an eHealth Device". IEEE Transactions on Biomedical Engineering. Vol. 58, Issue 9, pp. 24482455. September, 2011. ISSN 00189294. DOI: 10.1109/TBME.2011.2156794.
[26] B. Furht and A. Perez. "An Adaptive RealTime ECG Compression Algorithm With Variable Threshold". IEEE Transactions on Biomedical Engineering. Vol. 35, Issue 6, June, 1988. ISSN 00189294. DOI: 10.1109/10.2121.
[27] B.A. Rajoub. "An Efficient Coding Algorithm for the Compression of ECG Signals Using the Wavelet Transform". IEEE Transactions on Biomedical Engineering. Vol. 49, Issue 4, pp. 355362. April, 2002.
[28] N.A. Elneel and D. Schroeder. "HardwareBased Data Compression for Efficient ECG Signal Transmission over a Wireless Sensor Network". Workshop Selbstorganisierende Sensorund Datenfunknetze, Hamburg, Germany. October, 2009.
[29] H. Kim, Y. Kim and H.J. Yoo. "A Low Cost Quadratic Level ECG Compression Algorithm and Its Hardware Optimization for Body Sensor Network System". 30th Annual International IEEE EMBS Conference Vancouver. British Columbia, Canada. August 2024, 2008.
Received: October 3, 2011 Accepted: March 15, 2012