# Efficient Time and Power of 9/7 Coefficient based 2-D Discrete Wavelet Transform using Vedic Math's

# Roshni Singh, Prof. Sunny Jain

M. Tech. Scholar, Associate Professor
Department of Electronics and Communication
LNCT, Bhopal

Abstract— The 1-D and 2-D DWT architectures reported in the literature include row-column, parallel filter, folded, flipping, and recursive structures. These architectures differ in their computational and hardware requirements, in the memory needed to store the input image, and in the intermediate coefficients. The main objective of this research work is to derive efficient VLSI architectures for the hardware implementation of the 9/7 DWT, using a complex multiplier and improving the speed and hardware complexity of existing designs.

#### Keywords: - 1-D DWT, 2-D DWT, Complex Multiplier

#### I. INTRODUCTION

Recently, there have been significant efforts by industry and researchers to develop computed tomography (CT) imaging systems in terms of new processing and reconstruction algorithms. CT scanners have received considerable attention aimed at improving both hardware and software so that high-resolution CT images can be produced. The quality and speed with which these images are generated is due in part to improvements in efficient image reconstruction algorithms [1].

This field is still evolving, and new algorithms are now used that adapt to a variety of problems, such as modeling and handling projection noise, low detector counts, non-uniform sensor arrangements, and scatter. The two major classes of image reconstruction algorithms are analytical and iterative. Analytical methods are computationally efficient and fast, but they rely on several assumptions about the scanner geometry and raw data, such as continuity of the projections and noiseless measurements [2, 3].

To achieve better image quality from the same raw data, more realistic assumptions about scanner geometry and noise statistics must be made. This is done in the more computationally complex iterative reconstruction methods. Such methods may result in longer reconstruction times, but also in substantially less image noise from the same raw data, through more complex modeling of the detector response and the statistical behavior of the measurements. In this respect, iterative reconstruction is considerably more effective than analytical reconstruction, and it is the approach considered in this study. Nowadays, iterative reconstruction plays a major role in computed tomography in improving image quality and reducing motion artifacts. Consequently, a great deal of research work has been done to improve the reconstructed image in both visual and error analysis [4].

The Discrete Wavelet Transform (DWT) is one of the most effective techniques in the field of image compression and image coding. Joint Photographic Experts Group (JPEG) is the primary standard technique for image compression. The coding efficiency and image quality of the DWT are better than those of the conventional Discrete Cosine Transform (DCT). JPEG 2000 specifies an irreversible form of the DWT for efficient image compression. Digital images are a main requirement for both real-time applications and research. The demand for image compression is high because of the traffic generated by multimedia sources. The one-dimensional and two-dimensional DWT is the key function in image processing. Multi-resolution signal analysis is achieved in both the time and frequency domains in the DWT. The DWT is widely used for image compression in JPEG 2000 because of its time and frequency characteristics [5].

Image reconstruction is defined as the process of bringing two-dimensional images into a computer by analyzing the shape of the image. It is mainly used in applications such as medicine, robotics, and gaming. In the DWT, a set of wavelet functions is used for compression, noise reduction, and reconstruction. In general, all communication channels carry random noise, and these channels are also affected by poor connections at the source of the channel. Image reconstruction is performed by up-sampling followed by digital filtering [6].

The multi-resolution wavelet transform is the conventional approach to reconstruction. The main drawback of the conventional approach is the high hardware requirement for storing the intermediate values. The computational delay of the filter is also large. To overcome these problems, the multi-band wavelet transform is mainly used for the image reconstruction process. By using the proposed multiband wavelet transform, the frequency overlapping of the circuit is reduced. Summation filters are mainly used to build the reconstruction block. Image contrast and intensity are better with the multiband wavelet transform than with the conventional multiresolution wavelet transform [7, 8].

#### II. VEDIC MULTIPLIER

The work, Vedic Mathematics or 'Sixteen Simple Mathematical Formulae from the Vedas', was written by His Holiness Jagadguru Sankaracharya Sri Bharati Tirthaji Maharaja of Govardhana Matha, Puri (1884 - 1960). The very word 'Veda' has a derivational meaning, i.e. the fountainhead and illimitable storehouse of all knowledge.

# **URDHVA TIRYAKBHYAM SUTRA:-**

This sutra has been selected for use in the present work because it gives a general formula that is applicable to all cases of multiplication (large-bit multiplication, small-bit multiplication, and special multiplication) and is also very useful in the division of a large number by another large number, for example division of a 15-digit number by a 5-digit number. The mathematical principle involved is explained as follows.

Suppose we need to multiply (ax+b) by (cx+d). The product is  $acx^2 + x(ad+bc) + bd$ . This can be obtained as follows:

Step 1: The coefficient of  $x^2$  is obtained by the vertical multiplication of a and c.

Step 2: The coefficient of x is obtained by the crosswise multiplication of a and d and of b and c, followed by the addition of the two products.

Step 3: The independent term is arrived at by the vertical multiplication of the absolute terms b and d.
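The three steps above can be illustrated with a short Python sketch for two-digit decimal numbers, taking x = 10. The function name `urdhva_two_digit` and the `base` parameter are hypothetical names chosen for this illustration and do not come from the paper.

```python
def urdhva_two_digit(p: int, q: int, base: int = 10) -> int:
    """Multiply p = a*base + b by q = c*base + d using the
    vertical and crosswise pattern (illustrative sketch only)."""
    a, b = divmod(p, base)   # p = a*x + b with x = base
    c, d = divmod(q, base)   # q = c*x + d
    vertical_high = a * c            # Step 1: coefficient of x^2
    crosswise = a * d + b * c        # Step 2: coefficient of x
    vertical_low = b * d             # Step 3: independent term
    # Recombining with positional weights absorbs all carries.
    return vertical_high * base**2 + crosswise * base + vertical_low


print(urdhva_two_digit(23, 14))  # 2*1 = 2, 2*4 + 3*1 = 11, 3*4 = 12 -> 322
```

For 23 × 14 the three partial results are 2, 11, and 12; weighting them by 100, 10, and 1 gives 200 + 110 + 12 = 322.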

# Example of 8×8 bit Multiplier:-

Let A be the 8-bit multiplicand and B be the 8-bit multiplier. These can be further divided into 4-bit terms as shown below:

$$A = \underbrace{A_7 A_6 A_5 A_4}_{X_1} \qquad \underbrace{A_3 A_2 A_1 A_0}_{X_0}$$

$$B = \underbrace{B_7 B_6 B_5 B_4}_{Y_1} \qquad \underbrace{B_3 B_2 B_1 B_0}_{Y_0}$$

So,

 $A = X_1 X_0$  (8-bit Multiplicand)

 $B = Y_1 Y_0$  (8-bit Multiplier)

where X1, X0, Y1, and Y0 are each 4 bits wide. Multiplying, we get a 16-bit product, which is further divided into four 4-bit terms F, E, D, and C, as shown in Figure 1.

$$X_1 X_0 \times Y_1 Y_0 = F E D C$$
 
$$CP = X_0 \times Y_0 = C$$
 
$$CP = X_1 \times Y_0 + X_0 \times Y_1 = D$$
 
$$CP = X_1 \times Y_1 = F E$$

where F is the carry of the product X1 × Y1 and CP is the cross product.

Note:

- Each multiplication operation is an embedded parallel 4 × 4 multiply module.
- The carry generated in each multiplication module is propagated to the next module.
- This multiplier architecture has the advantage of minimal gate delays and improved regularity of structure.
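The decomposition above can be modelled in software as follows. This is a behavioural sketch only, not the authors' hardware description: `mul4` is a hypothetical stand-in for one 4 × 4 multiply module, and the way carries are passed from C to D and from D to FE is one plausible reading of the notes above.

```python
def mul4(x: int, y: int) -> int:
    """Stand-in for one 4x4 multiply module (result fits in 8 bits)."""
    assert 0 <= x < 16 and 0 <= y < 16
    return x * y


def vedic_mul8(a: int, b: int) -> int:
    """8x8 multiply built from four 4x4 products (behavioural model)."""
    assert 0 <= a < 256 and 0 <= b < 256
    x1, x0 = a >> 4, a & 0xF          # A = X1 X0
    y1, y0 = b >> 4, b & 0xF          # B = Y1 Y0
    low = mul4(x0, y0)                # vertical product of low halves
    cross = mul4(x1, y0) + mul4(x0, y1)  # crosswise products
    high = mul4(x1, y1)               # vertical product of high halves
    c = low & 0xF                     # C: lowest 4 bits
    s = cross + (low >> 4)            # add carry out of the C stage
    d = s & 0xF                       # D: next 4 bits
    fe = high + (s >> 4)              # FE: upper 8 bits, with carry from D
    return (fe << 8) | (d << 4) | c


print(vedic_mul8(0xB7, 0x5C) == 0xB7 * 0x5C)  # True
```

The four calls to `mul4` are independent of one another, which is what allows the corresponding hardware modules to operate in parallel before the carries are combined.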

The procedure is further explained with the help of examples. Two-digit and three-digit multiplication examples are explained using decimal numbers, and the multiplication process is shown with the help of lines. The digits on either side of the line are multiplied, the result is added to the previous carry, and the process is continued.



Figure 1: Block Diagram of 8-bit Vedic Multiplier

#### III. DISCRETE WAVELET TRANSFORM

The Multi-Resolution Analysis (MRA) capability and time-scale domain characteristics of the Discrete Wavelet Transform (DWT) have established it as a powerful tool for numerous applications, such as signal analysis, image compression, and numerical analysis, as stated by Mallat (1989). This has led various research groups to develop algorithms and hardware architectures to implement the DWT.



Figure 2: Three level decomposition of an image

In the conventional convolution method for the DWT, a pair of Finite Impulse Response (FIR) filters is applied in parallel to derive the high-pass and low-pass filter coefficients. Mallat's pyramid algorithm can be used to represent the wavelet coefficients of an image in several spatial orientations.

The architectures are mostly folded, and can be broadly classified into serial and parallel architectures, as discussed in [7]. The architecture discussed there implements a filter bank structure efficiently, using digit-serial pipelining. This architecture forms the basis for the hardware implementation of sub-band decomposition, using the convolutional DWT for JPEG 2000. A general structure in which the DWT decomposes the input image is shown in Figure 2.

Each decomposition level shown in Figure 2 involves two stages: stage 1 performs horizontal filtering, and stage 2 performs vertical filtering. In the first-level decomposition, the size of the input image is N×N, and the outputs are the three sub-bands LH, HL, and HH, of size N/2×N/2. In the second-level decomposition, the input is the LL band, and the outputs are the three sub-bands LLLH, LLHL, and LLHH, of size N/4×N/4. The multi-level 2-D DWT can be extended in an analogous manner. The arithmetic computation of the DWT can be expressed as basic filter convolution and down-sampling. The 2-D DWT is defined as:

$$x_{LL}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} h(i_1)\,h(i_2)\,x_{LL}^{J-1}(2n_1 - i_1,\ 2n_2 - i_2)$$
 (1)

$$x_{LH}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} h(i_1)\,g(i_2)\,x_{LL}^{J-1}(2n_1 - i_1,\ 2n_2 - i_2)$$
 (2)

$$x_{HL}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} g(i_1)\,h(i_2)\,x_{LL}^{J-1}(2n_1 - i_1,\ 2n_2 - i_2)$$
 (3)

$$x_{HH}^{J}(n_1, n_2) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} g(i_1)\,g(i_2)\,x_{LL}^{J-1}(2n_1 - i_1,\ 2n_2 - i_2)$$
 (4)

where XLL(n1, n2) is the input image, J is the 2-D DWT level, K is the filter length, h(n) is the impulse response of the low-pass filter, and g(n) is the impulse response of the high-pass filter. The input sequence X(n) in Figure 1.2 is convolved with the quadrature mirror filters H(z) and G(z), and the outputs obtained are decimated by a factor of two. In down-sampling, alternate samples of the output sequences from the low-pass filter and high-pass filter are dropped. This halves the time resolution and, conversely, doubles the frequency resolution. During the inverse transform computation, YL(n) and YH(n) are first up-sampled by inserting zeros between every two samples, and then filtered by low-pass and high-pass filters respectively.
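The four sub-band sums of the 2-D DWT can be evaluated directly, as in the following Python sketch of one decomposition level. Several details are assumptions made for illustration: the function name `dwt2_level` is hypothetical, samples outside the image are taken by periodic wrap-around (the paper does not state a boundary rule), and two-tap averaging and differencing filters are used as placeholders instead of the 9/7 coefficients so that the output is easy to check by hand.

```python
def dwt2_level(x, h, g):
    """One 2-D DWT level by direct evaluation of the double sums.

    x    : N x N image as a list of lists, N even
    h, g : low-pass and high-pass filter taps
    Out-of-range indices wrap around (periodic extension, an assumption).
    Returns the sub-bands (LL, LH, HL, HH), each of size N/2 x N/2.
    """
    n = len(x)
    half = n // 2

    def band(f1, f2):
        out = [[0.0] * half for _ in range(half)]
        for n1 in range(half):
            for n2 in range(half):
                acc = 0.0
                for i1, c1 in enumerate(f1):
                    for i2, c2 in enumerate(f2):
                        # Filtering and down-sampling by two in one step.
                        acc += c1 * c2 * x[(2 * n1 - i1) % n][(2 * n2 - i2) % n]
                out[n1][n2] = acc
        return out

    return band(h, h), band(h, g), band(g, h), band(g, g)


# Placeholder two-tap filters (not the 9/7 filter).
h_demo = [0.5, 0.5]
g_demo = [0.5, -0.5]
flat = [[7.0] * 4 for _ in range(4)]
ll, lh, hl, hh = dwt2_level(flat, h_demo, g_demo)
print(ll)  # [[7.0, 7.0], [7.0, 7.0]]
```

For a constant image, all of the energy ends up in the LL band and the three detail bands are zero, which is the behaviour expected of any low-pass/high-pass pair whose taps sum to 1 and 0 respectively.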

#### IV. PROPOSED METHODOLOGY

In the Discrete Wavelet Transform, biorthogonal wavelets are implemented using the lifting scheme, which is constructed in the spatial domain. In the lifting scheme, three main steps are performed: split, predict, and update. The input image samples  $\mathbf{x}(\mathbf{n})$  are divided into odd and even samples in the split block. A filter is required for the odd and even samples to suppress unwanted signal components. The lifting steps that are performed depend on the type of filter.

A scaling step is used to obtain the low-pass sub-bands from the odd and even samples. In the lifting scheme, the filter implementation is converted into matrix multiplications. Image compression is performed efficiently using the lifting scheme, and hardware utilization is greatly reduced by using these filters.
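The split, predict, update, and scaling steps described above can be sketched in Python as follows. This is a minimal illustration, not the proposed architecture. The five lifting constants are the values commonly quoted for the 9/7 filter and are an assumption here, since the paper does not list them; the function names, the periodic boundary handling, and the direction of the final scaling (conventions differ between references) are likewise illustrative choices.

```python
# Lifting constants commonly quoted for the 9/7 filter
# (assumed values; they are not given in the paper).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398


def lift97_forward(x):
    """One 1-D level: split, two predict/update pairs, then scaling.
    Neighbours wrap around at the borders (periodic, an assumption)."""
    assert len(x) % 2 == 0 and len(x) >= 2
    even, odd = list(x[0::2]), list(x[1::2])        # split
    m = len(even)
    for p_coeff, u_coeff in ((ALPHA, BETA), (GAMMA, DELTA)):
        for i in range(m):                           # predict: odd from even
            odd[i] += p_coeff * (even[i] + even[(i + 1) % m])
        for i in range(m):                           # update: even from odd
            even[i] += u_coeff * (odd[i - 1] + odd[i])
    low = [e * K for e in even]                      # scaling step
    high = [o / K for o in odd]
    return low, high


def lift97_inverse(low, high):
    """Undo lift97_forward by running the steps backwards with flipped signs."""
    even = [v / K for v in low]
    odd = [v * K for v in high]
    m = len(even)
    for p_coeff, u_coeff in ((GAMMA, DELTA), (ALPHA, BETA)):
        for i in range(m):
            even[i] -= u_coeff * (odd[i - 1] + odd[i])
        for i in range(m):
            odd[i] -= p_coeff * (even[i] + even[(i + 1) % m])
    x = [0.0] * (2 * m)
    x[0::2], x[1::2] = even, odd                     # merge
    return x


samples = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
low, high = lift97_forward(samples)
restored = lift97_inverse(low, high)
print(max(abs(a - b) for a, b in zip(samples, restored)) < 1e-9)  # True
```

Because every lifting step only adds a function of one half of the samples to the other half, each step can be undone exactly by subtracting the same quantity, so the inverse exists regardless of the particular constants used.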



Figure 3: Multiplier Based 1-D DWT using 9/7 Filter Coefficient

Inner-product computation can be expressed using a complex multiplier. The DWT formulation based on the convolution scheme can be expressed as an inner product, whereas the 1-D DWT formulation given in (1) - (2) cannot be expressed as an inner product.



Figure 4: Block Diagram of 5/3 & 9/7 2-D DWT using CSD Technique

Although the convolution-based DWT demands more arithmetic resources than the lifting-based DWT, the convolution-based DWT is considered here in order to take advantage of the CM-based design. The CM formulation of the convolution-based DWT using the 9/7 biorthogonal filter is presented here.

According to (5) and (6), the 9/7 wavelet filter computation in convolution form is expressed as

$$Y_{L} = \sum_{i=0}^{8} h(i) X_{n}(i)$$
 (5)

$$Y_{H} = \sum_{i=0}^{6} g(i) X_{n}(i)$$
 (6)

Here  $\{h(i)\}$  and  $\{g(i)\}$  are the low-pass and high-pass filter coefficients of the 9/7 wavelet filter,  $Y_H$  is the high-pass filter output, and  $Y_L$  is the low-pass filter output. For the 2-D DWT:

-  $Y_{LL}$  is the low-low output of the 2-D DWT
-  $Y_{LH}$  is the low-high output of the 2-D DWT
-  $Y_{HL}$  is the high-low output of the 2-D DWT
-  $Y_{HH}$  is the high-high output of the 2-D DWT
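Equations (5) and (6) are plain inner products between a coefficient vector and a window of input samples, as the following Python sketch shows. The numerical taps are the values commonly quoted for the 9/7 analysis filters and are an assumption, since the paper does not list them; sign and normalization conventions differ between references. The names `H97`, `G97`, and `inner_product` are hypothetical.

```python
# Commonly quoted 9/7 analysis filter taps (assumed; not listed in the paper).
H97 = [0.026748757411, -0.016864118443, -0.078223266529,
       0.266864118443, 0.602949018236, 0.266864118443,
       -0.078223266529, -0.016864118443, 0.026748757411]   # 9 low-pass taps
G97 = [0.091271763114, -0.057543526229, -0.591271763114,
       1.115087052457, -0.591271763114, -0.057543526229,
       0.091271763114]                                      # 7 high-pass taps


def inner_product(coeffs, window):
    """Sum of coeffs[i] * window[i], as in equations (5) and (6)."""
    assert len(coeffs) == len(window)
    return sum(c * x for c, x in zip(coeffs, window))


# X_n is a window of input samples; a flat window is used for checking.
y_l = inner_product(H97, [5.0] * 9)   # low-pass output Y_L
y_h = inner_product(G97, [5.0] * 7)   # high-pass output Y_H
print(round(y_l, 6), round(abs(y_h), 6))  # 5.0 0.0
```

With this normalization the low-pass taps sum to one and the high-pass taps sum to zero, so a flat input passes through the low-pass branch unchanged and is rejected by the high-pass branch. In a hardware realization, each product in the sum would be formed by a multiplier block such as the Vedic multiplier of Section II.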

#### V. DESIGN STEPS INVOLVED IN VHDL

- Design each sub-module of the architecture.
- Design the combinational sub-modules, such as the KSA and the Vedic multiplier, using logic gates.
- Design the sequential sub-modules, such as the D flip-flop, the low-pass filter, and the high-pass filter, using flip-flops, the KSA, and the Vedic multiplier.
- Interconnect all of these sub-modules by component instantiation, using the structural style of modeling in the Very High Speed Integrated Circuit Hardware Description Language (VHDL).
- Design the data path, which handles all operations.
- Save the model design files with the \*.v extension under the Xilinx integrated environment, and then carry out timing simulation and verification of the design files.

#### VI. SIMULATION RESULT

As shown in Table 1 and Table 2, the delay results are obtained for the proposed complex multiplier architecture and the previous architecture. From the analysis of the results, it is found that the complex multiplier architecture gives superior performance compared with the previous architecture.



Figure 5: View Technology Schematic of 9/7 1-D DWT

From the above graphical representation, it can be inferred that the proposed architecture gives the best performance for the 32-bit complex multiplier.



Figure 6: Register Transfer Level of 9/7 1-D DWT

Final Register Report

| Item       | Count |
|------------|-------|
| Registers  | 512   |
| Flip-Flops | 512   |

Device utilization summary (Selected Device: 7vh290thcg1155-2)

| Resource                           | Used  | Available | Utilization |
|------------------------------------|-------|-----------|-------------|
| Slice Logic Utilization:           |       |           |             |
| Number of Slice Registers          | 512   | 437600    | 0%          |
| Number of Slice LUTs               | 71745 | 218800    | 32%         |
| Number used as Logic               | 71745 | 218800    | 32%         |
| Slice Logic Distribution:          |       |           |             |
| Number of LUT Flip Flop pairs used | 72073 |           |             |
| Number with an unused Flip Flop    | 71561 | 72073     | 99%         |
| Number with an unused LUT          | 328   | 72073     | 0%          |
| Number of fully used LUT-FF pairs  | 184   | 72073     | 0%          |
| Number of unique control sets      | 1     |           |             |
| IO Utilization:                    |       |           |             |
| Number of IOs                      | 1345  |           |             |
| Number of bonded IOBs              | 1345  | 300       | 448% (*)    |
| Specific Feature Utilization:      |       |           |             |
| Number of BUFG/BUFGCTRLs           | 1     | 32        | 3%          |

Timing Summary (Speed Grade: -2)

| Timing parameter                         | Value                                      |
|------------------------------------------|--------------------------------------------|
| Minimum period                           | 0.771 ns (Maximum Frequency: 1297.169 MHz) |
| Minimum input arrival time before clock  | 0.517 ns                                   |
| Maximum output required time after clock | 22.852 ns                                  |
| Maximum combinational path delay         | 22.561 ns                                  |

Figure 7: Device Utilization Summary of 9/7 1-D DWT

Table II: Device Utilization Summary for 9/7 1-D & 2-D Discrete Wavelet Transform

| Parameter                        | 1-D DWT      | 2-D DWT    |
|----------------------------------|--------------|------------|
| Slice Registers                  | 512          | 1536       |
| Slice LUTs                       | 71745        | 217363     |
| Flip Flops                       | 71561        | 217363     |
| Number of IOs                    | 1345         | 1601       |
| Minimum period                   | 0.771 ns     | 12.683 ns  |
| Maximum Frequency                | 1297.169 MHz | 78.847 MHz |
| Maximum Combinational Path Delay | 22.561 ns    | 27.563 ns  |
| Net Power (nW)                   | 186534.37    | 334397.38  |

#### VII. CONCLUSION

The results of the present work establish that the Vedic parallel overlay architecture, when superimposed on the Karatsuba algorithm, improves the overall performance of the algorithm. Owing to factors such as timing efficiency, speed, and smaller area, the proposed Vedic parallel overlay architecture can be implemented in Arithmetic and Logic Units, replacing the conventional multiplier architecture. The design is technology independent and can easily be converted from one technology to another. In addition, because of the vertical and crosswise structure, the design layout is simple and regular.

#### REFERENCES

- [1] Syeda Eima Iftikhar Gardezi, Fatima Aziz, Sadaf Javed, Ch. Jabbar Younis, Mehboob Alam, Yehia Massoud, "Design and VLSI Implementation of CSD based DA Architecture for 5/3 DWT", 978-1-5386-7729-2/19/\$31.00 ©2019 IEEE.
- [2] Mohamed Asan Basiri M and Noor Mahammad Sk, "An Efficient VLSI Architecture for Convolution Based DWT Using MAC", 31st International Conference on VLSI Design and 2018 17th International Conference on Embedded Systems, IEEE 2018.
- [3] Anirban Chakraborty, Debolina Chakraborty and Ayan Banerjee, "A Memory Efficient, High Throughput and Fastest 1D/3D VLSI Architecture for Reconfigurable 9/7 & 5/3 DWT Filters", International Conference on Current Trends in Computer, Electrical, Electronics and Communication (ICCTCEEC-2017).
- [4] Rakesh Biswas, Siddarth Reddy Malreddy and Swapna Banerjee, "A High Precision-Low Area Unified Architecture for Lossy and Lossless 3D Multi-Level Discrete Wavelet Transform", Transactions on Circuits and Systems for Video Technology, Vol. 45, No. 5, May 2017.
- [5] Satish S Bhairannawar, Rajath Kumar, "FPGA Implementation of Face Recognition System using Efficient 5/3 2D-Lifting Scheme", 2016 International Conference on VLSI Systems, Architectures, Technology and Applications (VLSI-SATA).
- [6] Maurizio Martina, Guido Masera, Massimo Ruo Roch, and Gianluca Piccinini, "Result-Biased Distributed-Arithmetic-Based Filter Architectures for Approximately Computing the DWT", IEEE Transactions on Circuits and Systems—I: Regular Papers, Vol. 62, No. 8, August 2015.
- [7] S.G. Mallat, "A Theory for Multiresolution Signal Decomposition: The Wavelet Representation", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, July 1989, pp. 674-693.
- [8] M. Alam, C. A. Rahman, and G. Jullian, "Efficient distributed arithmetic based DWT architectures for multimedia applications," in Proc. IEEE Workshop on SoC for real-time applications, pp. 333-336, 2003.
- [9] X. Cao, Q. Xie, C. Peng, Q. Wang and D. Yu, "An efficient VLSI implementation of distributed architecture for DWT," in Proc. IEEE Workshop on Multimedia and Signal Process., pp. 364-367, 2006.
- [10] Senthil singh C and Manikandan. M, "Design and Implementation of an FPGA-Based Real-Time Very Low Resolution Face Recognition System", International Journal of Advanced Information Science and Technology, Vol. 7, No. 7, pp. 59-65, November 2012.
- [11] Archana Chidanandan and Magdy Bayoumi, "Area-Efficient MDA Architecture for the 1-D DCT/IDCT," ICASSP 2006.
- [12] M. Martina, and G. Masera, "Low-complexity, efficient 9/7 wavelet filters VLSI implementation," IEEE Trans. on Circuits and Syst. II, Express Brief vol. 53, no. 11, pp. 1289-1293, Nov. 2006.
- [13] M. Martina, and G. Masera, "Multiplierless, folded 9/7-5/3 wavelet VLSI architecture," IEEE Trans. on Circuits and syst. II, Express Brief vol. 54, no. 9, pp. 770-774, Sep. 2007.