# AN IMPLEMENTATION OF A LOW POWER DELTA-SIGMA A/DC WITH A MULTI-BIT QUANTIZER ON SILICON-ON-SAPPHIRE

By

#### **CHIA-MING LIU**

Bachelor of Science, Oklahoma State University, Stillwater, Oklahoma, 1994

Master of Science, Oklahoma State University, Stillwater, Oklahoma, 1996

Submitted to the Faculty of the Graduate College of the Oklahoma State University in partial fulfillment of the requirements for the Degree of DOCTOR OF PHILOSOPHY, August 2002


#### ACKNOWLEDGMENTS

The completion of this dissertation marks the end of my formal education. Throughout my life, I have received support that was given unconditionally and lovingly. I take this opportunity to express my sincere gratitude to those who have supported and loved me all along.

To begin, I thank my parents, grandparents, aunts, and uncles for their unconditional love and support throughout my life. Their concern and caring have always reached the deepest part of my heart. After ten years in the U.S., I could not have reached this milestone without my aunt Kay and uncle Giin-Fa, who have taken care of me for all these years; their love and care helped me adjust to my new life, and I express my sincere gratitude to them. Thanks also to my cousins Bernard and Edwin, who walked with me through my loneliness from time to time.

In particular, I would like to express my deep appreciation to my wife and best friend, Rwei-Hsiang Kau, who has accompanied and mentally stimulated me in many aspects of my life over these years. Her encouragement and support were the driving force behind the write-up of this dissertation.

Over five and a half years in the Advanced Analog VLSI lab and many summer trips to San Diego, many of my lab partners came and graduated. I have not forgotten the friendship, encouragement, and technical assistance they all gave me. To my friends Zheng, Wen, Jeff, Venu, Prasanna, Dave, Derek, Steve, Reghu, Zhu, Narendra, and Ira: I appreciate and thank you.

I would also like to thank the faculty of the ECEN Department at OSU, especially Dr. Louis G. Johnson, Dr. Keith A. Teague, and Dr. Jack Cartinghour of Electrical Engineering Technology, for serving on my Graduate Committee. I am also indebted to Ms. Rea Maltsberger, who has helped me throughout most of my years at OSU and whose smile always inspires me. In addition, I am grateful to the Space and Naval Warfare (SPAWAR) Systems Center (formerly NRaD), San Diego, CA, for its support of this project.

Last but foremost, I would like to thank my advisor, Dr. Chris Hutchens. I appreciate all of the opportunities and assistance he has provided me over the past five and a half years. His guidance and advice have greatly improved my chances for success at OSU and in my subsequent career and life. Thank you!

## TABLE OF CONTENTS

| Chapter | Title |
|---------|-------|
| 1. | INTRODUCTION |
|    | 1.1 Objective |
|    | 1.2 Organization |
| 2. | DELTA-SIGMA MODULATOR OVERVIEW |
|    | 2.1 Types of A/DCs |
|    | 2.1.1 Nyquist A/DC |
|    | 2.1.2 Oversampling A/DC |
|    | 2.1.3 $\Delta$-$\Sigma$ A/DCs |
|    | 2.2 Architectures of $\Delta$-$\Sigma$ A/DCs |
|    | 2.2.1 Single loop modulator |
|    | 2.2.2 MASH modulator |
|    | 2.3 Power dissipation consideration |
| 3. | MODULATOR DESIGN AND IMPLEMENTATION |
|    | 3.1 Low power processes, Bulk or SOI? |
|    | 3.2 Architectural configuration and power estimation of a $\Delta$-$\Sigma$ modulator |
|    | 3.3 Modulator stability |
|    | 3.4 Crucial building blocks of a $\Delta$-$\Sigma$ modulator |
|    | 3.4.1 Integrator and OTA |
|    | 3.4.2 Dynamic common mode feedback |
|    | 3.4.3 Quantizer |
|    | 3.4.4 Serial D/AC |
|    | 3.5 Device |
|    | 3.6 Layout |
| 4. | DECIMATION FILTER OVERVIEW |
|    | 4.1 Moving average filter |
|    | 4.2 Comb (Sinc) filter |
|    | 4.3 Half-band filter |
|    | 4.4 Two-path filter |
| 5. | TWO-PATH FILTER DESIGN AND IMPLEMENTATION |
|    | 5.1 Filter in cascade |
|    | 5.2 Finite impulse response filter |
|    | 5.3 Decimation filter |
|    | 5.3.1 Data multiplexer |
|    | 5.3.2 Two-path filter |
|    | 5.4 Overall filter response |
| 6. | MEASUREMENT RESULTS |
|    | 6.1 Discrete Fourier Transform |
|    | 6.2 Modulator loop |
|    | 6.2.1 SNR and missing code |
|    | 6.2.2 SFDR (Two-tone test) |
|    | 6.2.3 INL and DNL |
|    | 6.3 Two-path decimation filter |
|    | 6.4 Power dissipation measurement |
| 7. | CONCLUSIONS |
|    | 7.1 Discussion |
|    | 7.1.1 Leaky integrator |
|    | 7.1.2 Noisy transistors and current feedback sources |
|    | 7.1.3 Floating body of the transistors |
|    | 7.1.3.1 Kink effect and pass-gate leakage |
|    | 7.1.3.2 History dependence |
|    | 7.2 Suggestion |
|    | REFERENCES |
|    | APPENDIX A: CHIP STATISTICS AND LAYOUT AND DIE PHOTOGRAPHS |
|    | VITA |

## LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| 2.1 | Block diagram of a general A/DC |
| 2.2 | Noise spectrum comparison of A/DCs |
| 2.3 | The 2<sup>nd</sup> order single loop modulator |
| 2.4 | Block diagram of VCO-DS-ADC |
| 2.5 | The 2<sup>nd</sup> order MASH modulator |
| 2.6 | The 2-1-1, 4-bit cascaded multi-bit delta-sigma modulator |
| 3.1 | Processes of Bulk and SOI (Courtesy of IBM Corp) |
| 3.1a | Drain/source to substrate capacitance ($C_{db}/C_{sb}$) of Bulk (left) and SOI (right) (Courtesy of IBM Corp) |
| 3.2 | Comparison of Bulk and SOI power dissipation (Courtesy of IBM Corp) |
| 3.3 | A general form of a single loop $\Delta$-$\Sigma$ modulator |
| 3.4 | A clock buffer driver for analog switches |
| 3.5 | Oversampling ratio in various loop configurations |
| 3.6 | Power dissipation in various loop configurations |
| 3.7 | Combination plot of Figures 3.5 and 3.6 |
| 3.8 | Z-domain of the 2<sup>nd</sup> order $\Delta$-$\Sigma$ modulator |
| 3.9 | Block diagram of a $\Delta$-$\Sigma$ loop |
| 3.10 | Root locus of the modulator NTF |
| 3.11 | Root locus of the modulator NTF at various conditions |
| 3.12 | Frequency responses of the modulator STF and NTF |
| 3.13 | Frequency responses of the modulator NTF at various parameter variations |
| 3.13a | A section zoom-in of the Figure 3.13 |
| 3.13b | A section zoom-in of the Figure 3.13 |
| 3.14 | Frequency responses of the modulator STF at multiple $\alpha$ parameter variations |
| 3.15 | Frequency responses of the modulator STF at multiple $\beta$ parameter variations |
| 3.16 | Output spectrum of the modulator |
| 3.16a | A section zoom-in of the Figure 3.16 |
| 3.17 | Switched-capacitor implementation of a 2<sup>nd</sup> order $\Delta$-$\Sigma$ modulator |
| 3.18 | Forward and backward Euler and Trapezoid integrations |
| 3.19 | Switched-capacitor implementation of a fully differential integrator |
| 3.20 | Switched-capacitor implementation of an integrator |
| 3.21 | A fully differential folded cascode OTA without the CMFB circuit |
| 3.22 | Architecture of the switched-capacitor CMFB circuit |
| 3.23 | Circuit diagram of the CMFB loop |
| 3.24 | Equivalent circuit of the switched-capacitor section of the CMFB circuit |
| 3.25 | Circuit diagram of the low power regenerative quantizer |
| 3.26 | Circuit diagram of the parallel-to-serial current feedback D/AC |
| 3.27 | Circuit diagram of the parallel-to-serial shift register |
| 3.28 | Circuit schematics of testing a single transistor |
| 3.29 | The 1<sup>st</sup> derivative of the simulated $I_D$ ($g_m$ vs. $V_{ov}$) |
| 3.30 | The 2<sup>nd</sup> derivative of the simulated $I_D$ ($\beta$ vs. $V_{ov}$) |
| 3.31 | Layout pattern of a common-centroid |
| 3.32 | Layout pattern of an interdigitate |
| 4.1 | Block diagram of a general $\Delta$-$\Sigma$ A/DC |
| 4.2 | Frequency responses of an A/DC output and a decimation filter |
| 4.3 | Block diagram of a general moving average filter |
| 4.4 | Block diagram of a general Sinc filter |
| 4.5 | Design criteria of a half-band filter |
| 4.6 | Poly-phase implementation of a half-band filter |
| 4.7 | Block diagram of a general two-path filter |
| 4.8 | Block diagrams of various Sinc filters |
| 4.9 | Comparison of filter power dissipations |
| 5.1 | Block diagram of a filter in (a) cascade and (b) parallel |
| 5.2 | Block diagram of a general decimation filter |
| 5.3 | Circuit diagram of the 1<sup>st</sup> order FIR |
| 5.4 | Frequency response of the 1<sup>st</sup> order FIR |
| 5.5 | Architecture of the 2<sup>nd</sup> order FIR |
| 5.6 | A triangular window as the result of convolution of two rectangular windows |
| 5.7 | Frequency response of the 2<sup>nd</sup> order FIR |
| 5.8 | Decimation processes with and without filtering |
| 5.9 | Single path filter approach (data filtered BEFORE frequency decimation) |
| 5.9a | Two-path filter approach (data filtered BEFORE frequency decimation) |
| 5.9b | Two-path filter approach (data filtered AFTER frequency decimation) |
| 5.10 | Circuit diagram of the data multiplexer (Dmux) |
| 5.11 | Timing diagram of the Dmux |
| 5.12 | Circuit diagram of the improved data multiplexer |
| 5.13 | Block diagram of the 1<sup>st</sup> order two-path filter |
| 5.14 | Block diagram of the 2<sup>nd</sup> order two-path filter |
| 5.15 | Block diagram of the 3<sup>rd</sup> order two-path filter |
| 5.16 | Block diagram of the 4<sup>th</sup> order two-path filter |
| 5.16a | Low power version of the 4<sup>th</sup> order two-path filter |
| 5.17 | Architecture of the 64 times decimation filter |
| 5.18 | Frequency response of the filter and the NTF of the modulator |
| 5.19 | Frequency response of the 1<sup>st</sup> stage filter output BEFORE decimation |
| 5.20 | Frequency response of the 1<sup>st</sup> stage filter output AFTER decimation |
| 5.21 | Aliasing effect to the noise floor |
| 5.22 | Frequency response of the 2<sup>nd</sup> stage filter output BEFORE decimation |
| 5.23 | Frequency response of the 2<sup>nd</sup> stage filter output |
| 5.24 | Frequency response of the last stage filter output |
| 5.25 | Frequency response of each stage at pass-band |
| 5.26 | Frequency response of signal vs noise floor |
| 6.1 | Signal is captured in a complete cycle |
| 6.2 | Signal is captured in incomplete cycle |
| 6.3 | Comparison of window functions |
| 6.4 | Gain and offset errors of an A/DC |
| 6.5 | Frequency response of the AFE output |
| 6.5a | A section zoom-in of the Figure 6.5 |
| 6.6 | The AFE output in time domain |
| 6.7 | Intermodulation of two signals |
| 6.8 | Graphical interpolation of SFDR |
| 6.9 | Frequency response of the DBE output |
| 6.10 | The DBE output in time domain |
| 6.10a | A section zoom-in of the Figure 6.10 |
| 6.11 | Impulse response of the DBE |
| 6.12 | Re-plot the power dissipation per bit of the Figure 3.7 with $K_Q = 0.027$ |
| 6.13 | Power dissipation and projection of the DBE |
| 7.1 | Simulation of the NTF with the gains of OTAs at 128 and 512 |
| 7.2 | Projected gain of the OTA using the NMOS cascode measurement |
| 7.3 | Projected gain of the OTA using the PMOS cascode measurement |
| 7.4 | Measured leakage current of low $V_{th}$ transistors |
| 7.5 | Circuit elements determine body bias |
| 7.6 | Two possible layouts of the H-gate |
| 7.7 | Layout of the BTS gate |
| A.1 | Screen capture of the chip layout |
| A.2 | Die photograph of the 18-bit modulator (upper left) |
| A.3 | Die photograph of the 16-bit modulator (upper right) |
| A.4 | Die photograph of the decimation filter (lower left) |
| A.5 | Die photograph of the pad driver (lower right) |

## LIST OF TABLES

| Table | Caption |
|-------|---------|
| 3.1 | Power dissipation cost of improving resolution |
| 3.2 | The effects of the loop coefficients |
| 3.3 | Clock phase assignment of the modulator |
| 3.4 | Design parameters vs. OTA parameters |
| 5.1 | Aliasing effect to the noise floor |
| 6.1 | Window functions and characteristics |
| 6.2 | General guideline to select applied windows |
| 6.3 | Power dissipation of the components in AFE |
| A.1 | Transistor count and sizes of the components |

## NOMENCLATURE

| Symbol          | Description                      |
|-----------------|----------------------------------|
| AAF             | Antialiasing filter              |
| A/DC            | Analog to digital converter      |
| AFE             | Analog front end                 |
| $A_0$           | Open loop DC gain                |
| B               | Number of quantizer bits         |
| BW              | Bandwidth                        |
| $C_F$           | Feedback/Integration capacitance |
| C <sub>gs</sub> | Gate to source capacitance       |
| $C_{db}$        | Drain to body capacitance        |
| $C_L$           | Load capacitance                 |
| CMFB            | Common mode feedback             |
| C <sub>OX</sub> | Gate oxide capacitance           |
| CPU             | Central processing unit          |
| Cs              | Sampling capacitance             |
| $C_{sb}$        | Source to body capacitance       |
| D/AC            | Digital to analog converter      |
| DBE             | Digital back end                 |
| Δ               | Quantization step size           |
| DFF             | D type flip-flop                 |
| DMUX            | Data multiplexer                 |
| DNL             | Differential non-linearity       |
| DR              | Dynamic range                    |
| DSP             | Digital signal processor         |
| $\mathbf{f}_0$  | Base-band frequency              |
| $F_{in}$        | Input frequency                  |
| $f_c$           | Corner frequency                 |
| FIR                       | Finite impulse response                                     |
| FFT                       | Fast Fourier Transform                                      |
| $\mathbf{f}_{M}$          | Full power bandwidth                                        |
| $\mathbf{f}_{max}$        | Power gain frequency                                        |
| $\mathbf{f}_n$            | Nyquist frequency $(2*f_0)$                                 |
| $\mathbf{f}_{s}$          | Sampling frequency                                          |
| $\mathbf{f}_{\mathrm{T}}$ | Current gain frequency                                      |
| $\mathbf{f}_{TA}$         | Analog current gain frequency                               |
| $f_{TD}$                  | Digital current gain frequency                              |
| GBP                       | Gain bandwidth product                                      |
| g <sub>m</sub>            | Mutual transconductance                                     |
| g <sub>ds</sub>           | Drain to source conductance                                 |
| IIR                       | Infinite impulse response                                   |
| INL                       | Integral non-linearity                                      |
| $I_{\text{tail}}$         | Tail current                                                |
| k                         | Boltzmann's constant ($1.38 \times 10^{-23}$ J/K)           |
| Ksps                      | Kilo samples per second                                     |
| L                         | Channel length                                              |
| L <sub>eff</sub>          | Effective channel length                                    |
| LSB                       | Least significant bit                                       |
| M                         | Down-sampling rate                                          |
| m                         | Current source/sink number (Leg)                            |
| MSB                       | Most significant bit                                        |
| n                         | Number of bits in SNR                                       |
| MOSFET                    | Metal oxide semiconductor field effect transistor           |
| NTF                       | Noise transfer function                                     |
| O                         | $\Delta$-$\Sigma$ loop order                                |
| OSR                       | Oversampling ratio                                          |
| OTA                       | Operational transconductance amplifier                      |
| R <sub>in</sub>           | Input resistance                                            |
|                           |                                                             |

| R <sub>on</sub>        | Transistor on resistance                  |
|------------------------|-------------------------------------------|
| RMS                    | Root mean square                          |
| SC                     | Switched capacitor                        |
| SFDR                   | Spurious free dynamic range               |
| SNR                    | Signal to noise ratio                     |
| SOI                    | Silicon on insulator                      |
| SOS                    | Silicon on sapphire                       |
| SR                     | Slew Rate                                 |
| $\mathbf{P}_{analog}$  | Analog power dissipation                  |
| P <sub>avg</sub>       | Average noise power                       |
| $\mathbf{P}_{digital}$ | Digital power dissipation                 |
| PSRR                   | Power supply rejection ratio              |
| q                      | Electron charge $(1.6 \times 10^{-19} C)$ |
| q <sub>e</sub>         | Quantization error                        |
| τ                      | Settling time constant                    |
| t <sub>f</sub>         | Fall time                                 |
| t <sub>ox</sub>        | Oxide layer thickness                     |
| t <sub>r</sub>         | Rise time                                 |
| t <sub>s</sub>         | Sampling period                           |
| V <sub>BS</sub>        | Source to body voltage                    |
| V <sub>DS</sub>        | Drain to source voltage                   |
| V <sub>GS</sub>        | Gate to source voltage                    |
| $V_{FS}$               | Full scale voltage                        |
| $V_{G}$                | Gate voltage                              |
| V <sub>GS</sub>        | Gate to source voltage                    |
| V <sub>in</sub>        | Input voltage                             |
| V <sub>n</sub>         | Noise voltage                             |
| $V_{ref}$              | Accurate reference voltage                |
| V <sub>os</sub>        | Offset voltage                            |
| V <sub>out</sub>       | Output voltage                            |
| V <sub>OV</sub>        | Overdrive voltage ( $V_{GS}$ - $V_{th}$ ) |
|                        |                                           |

xvi

| $V_{T}$          | Thermal voltage         |
|------------------|-------------------------|
| $V_{\text{th}}$  | Threshold voltage       |
| $V_{VG}$         | Virtual ground voltage  |
| W                | Channel width           |
| W <sub>eff</sub> | Effective channel width |

#### CHAPTER 1

#### INTRODUCTION

The analog to digital converter (A/DC) and digital to analog converter (D/AC) are two essential components connecting the real world and the digital domain. Owing to the high success rate of digital circuit implementations, and because advances in integrated circuit (IC) fabrication processes focus mainly on digital circuits, many analog circuits (e.g. filters) are now replaced by their digital equivalents. As a result, the converters have moved very close to the inputs and outputs of systems, i.e. antennas, speakers, and sensory devices. The concept behind this approach is to 1) digitize the signal as early as possible and 2) retain the signal in the digital domain as long as possible, because digital circuits have superior noise immunity. Potentially more significant is that digital circuits are far more amenable to design automation.

Resolution and power dissipation are two important issues in A/DC design. Because sensor outputs are small in amplitude, high resolution converters are required to interpret the information precisely. However, the poor component matching and reduced power supply levels of advanced process technologies hinder many high precision A/DC implementations and add to the design challenge. Although many error correction techniques have been developed to implement precision Nyquist rate A/DCs, it is still very difficult to achieve more than 12 bits of accuracy. In contrast, oversampled $\Delta$-$\Sigma$ A/DCs are well suited for implementation in VLSI technology because they efficiently exchange speed for resolution and are highly tolerant of component mismatches and circuit non-idealities.

Power dissipation is the second issue in A/DC design. As a result of low power CMOS, many sensory or telecommunication devices are portable and battery powered. Battery capacity improves slowly, so low power converters are necessary to extend battery life. Processes and system architectures are critical for power efficient product design. Today's advanced thin-film SOS/SOI CMOS VLSI processes, with lower parasitic capacitance, smaller feature size, and an insulating substrate, encourage such converters and related circuits to be fabricated on one chip to lower power dissipation.

In today's standard digital CMOS processes, low power, high resolution $\Delta$-$\Sigma$ A/DCs have gained a unique role in cost effective mixed mode IC applications. However, power supply scaling and the high demand for lower power consumption still pose great challenges for $\Delta$-$\Sigma$ A/DC research.

#### 1.1 Objective

The objective of this project is to build high resolution, low power A/DCs. The selected delta-sigma architecture consists of the modulator, or analog front end (AFE), and the decimation filter, or digital back end (DBE). The decimation filter cannot improve the resolution, and can degrade it if not carefully designed. Therefore, the project requirements listed below are specified for the modulators addressed in this thesis.

- 18-bit, 1 Ksps @ 1 mW
- 16-bit, 2 Ksps @ 0.5 mW and 10 Ksps @ 2 mW

The decimation filter in this dissertation is designed for 64/32 times decimation, to accommodate the oversampled modulators above and to interpolate the resolution of the coarse modulator output.

#### 1.2 Organization

Chapter 1 introduces the background and the purpose for this study.

Chapter 2 reviews the analog to digital conversion processes and the available low power $\Delta$-$\Sigma$ A/DC architectures.

Chapter 3 discusses the modulator and its implementation in detail, including the low power design strategy, system stability, resolution improvement, and circuit considerations.

Chapter 4 reviews the decimation processes and the available decimation filter architectures. The two-path decimation filter is introduced.

Chapter 5 discusses the low power strategies of the decimation filter and the filter implementation, including the construction of the two-path filter and the data multiplexer.

Chapter 6 introduces basic A/DC characterization and sampling methods to improve the results of the Fast Fourier Transform (FFT).

Chapter 7 summarizes the results of this study, draws conclusions, and proposes some improvements.

#### CHAPTER 2

#### DELTA-SIGMA MODULATOR OVERVIEW

Analog to digital conversion is the process by which analog values are mapped to their digital counterparts. From the viewpoint of digital algorithms, the number of bits of the processor limits the precision. From the viewpoint of a circuit implementation without a CPU or DSP, the chip fabrication technology and the circuit architecture are the limitations. Improving the fabrication process is beyond the scope of circuit designers. However, understanding the technology is key to the designer's ability to overcome the imperfections of fabrication by creating better architectures and techniques. The oversampling technique, for example, takes advantage of process advances (increasing $f_T$) that are aimed solely at digital circuits.

#### 2.1 Types of A/DCs

Based on the sampling frequency, A/DCs can be categorized into two groups: Nyquist and oversampling. There is no specific guideline for choosing a certain type of A/DC; the selection is left to the designer. In general, for applications with low resolution and wide input bandwidth, the Nyquist A/DC is a good candidate. For the requirement of high resolution and low bandwidth, oversampled A/DCs have advantages. Both types of A/DCs have the same conversion errors (or noises) that designers strive to eliminate. In the following sections, three commonly used A/DCs (Nyquist, oversampling, and delta-sigma) will be introduced along with their techniques to improve the accuracy.

2.1.1 Nyquist A/DC

Figure 2.1 shows the general form of an A/DC. The sampled input value, x(kT), is rounded to the nearest level during the quantization. Since it is not the exact conversion, the quantized output can be described as

$$y(kT) = x(kT) + e(kT)$$
(2.1)



Figure 2.1 Block diagram of a general A/DC.

The quantization noise, e(kT), is dependent on the amplitude of x(kT) and lies in the interval between $-\Delta/2$ and $+\Delta/2$. Note that the quantization step size [1] is defined as

$$\Delta = \frac{V_{FS}}{2^B} \tag{2.2}$$

where $V_{FS}$ is the full-scale input range and $B$ is the number of bits of the quantizer. The noise power (variance) [1-2] can be found as

$$\sigma^{2} = E[e^{2}] = \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} e^{2} de = \frac{\Delta^{2}}{12}$$
(2.3)
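As a rough numerical check of Equation 2.3, the following Python sketch quantizes uniformly distributed samples and compares the measured mean-square error with $\Delta^2/12$. The full-scale range of 2, the 6-bit resolution, and the function names are illustrative assumptions and are not values taken from this work.

```python
import random

def quantize(x, step):
    """Round x to the nearest multiple of the step size."""
    return step * round(x / step)

def error_power(step, n_samples=200_000, seed=0):
    """Estimate E[e^2] for inputs spread uniformly over many quantization steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.uniform(-1.0, 1.0)
        e = quantize(x, step) - x      # quantization error of this sample
        total += e * e
    return total / n_samples

step = 2.0 / 2**6                      # assumed V_FS = 2 and B = 6
measured = error_power(step)
predicted = step**2 / 12               # Equation 2.3
print(measured, predicted)
```

The two printed values should agree to within the statistical scatter of the estimate.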

where E denotes statistical expectation. Since the noise power spectrum is spread uniformly over the frequency range, the level of the noise power spectral density of a Nyquist A/DC can be expressed as

$$N_{Nyquist}(f) = \frac{\Delta^2}{12} \frac{1}{f_n}$$
(2.4)

Assuming that the input signal is a sinusoidal wave and its maximum peak value without clipping is $2^{B}\Delta/2$, the signal power, $P_{signal}$, is equal to

$$P_{signal} = \left(\frac{\Delta 2^{B}}{2\sqrt{2}}\right)^{2} = \frac{\Delta^{2} 2^{2B}}{8}$$
(2.5)

The noise power of a Nyquist A/DC is

$$P_{noise} = \frac{\Delta^2}{12} \tag{2.6}$$

and the maximum *SNR* can be written as

$$SNR_{Nyquist} = 10\log\frac{P_{signal}}{P_{noise}} = 10\log\left(\frac{\Delta^2 2^{2B}/8}{\Delta^2/12}\right) = 10\log\left(\frac{3}{2}2^{2B}\right) = 6.02B + 1.76 \quad (2.7)$$

This states that the resolution of the Nyquist A/DC depends solely on the number of bits of the quantizer. Each additional quantizer bit increases the SNR by approximately 6*dB*.

The resolution of the Nyquist A/DC is determined by the number of comparators in the quantizer, and matching between any two of them is critical. Current technology of MOS comparators without auto-zeroing permits a minimum comparison of roughly 10mV due to the comparator's inherent offset voltage [2]. Thus, an implementation beyond 10- to 12-bit of resolution is quite difficult without using special calibration techniques, such as laser trimming and auto-calibration. Note that the purpose of presenting the Nyquist A/DC is to describe the quantization error and to serve as a reference for the ensuing discussions.

2.1.2 Oversampling A/DC

An increase in the sampling frequency can improve the resolution over the Nyquist A/DC. Equation 2.4 shows that the noise power spectral density is a function of the sampling frequency ($f_s = OSR \cdot f_n$). Thus, the oversampled A/DC's noise power spectral density and noise power can be written as

$$N_{Oversampling}(f) = \frac{\Delta^2}{12} \frac{1}{f_n} \frac{1}{OSR}$$
(2.8)

$$P_{noise} = \frac{\Delta^2}{12} \left( \frac{1}{OSR} \right) \tag{2.9}$$

Assuming the same input as Equation 2.5, the *SNR* of the oversampling A/DC can be written as

$$\begin{aligned} SNR_{Oversampling} &= 10\log\frac{P_{signal}}{P_{noise}} = 10\log\left(\frac{\Delta^2 2^{2B}/8}{\Delta^2/(12 \cdot OSR)}\right) \\ &= 10\log(OSR) + 6.02B + 1.76 \end{aligned} \tag{2.10}$$

Equation 2.10 states that the resolution is a function of the sampling frequency: doubling the OSR increases the SNR by 3dB, so quadrupling the OSR provides an additional bit of resolution.
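The 3dB per octave relationship can be illustrated with a short Python sketch of Equation 2.10. The 8-bit quantizer and the function name are arbitrary choices made only for illustration.

```python
import math

def snr_oversampling_db(bits, osr):
    """Peak SNR of Equation 2.10 in dB; osr = 1 gives the Nyquist case of Equation 2.7."""
    return 10 * math.log10(osr) + 6.02 * bits + 1.76

base = snr_oversampling_db(bits=8, osr=1)
gain_2x = snr_oversampling_db(bits=8, osr=2) - base   # about 3 dB
gain_4x = snr_oversampling_db(bits=8, osr=4) - base   # about 6 dB, roughly one extra bit
print(base, gain_2x, gain_4x)
```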

2.1.3 Δ-Σ A/DCs

The resolution of the oversampling A/DC can be improved significantly by applying the noise shaping techniques (delta-sigma). The principle of the technique is to delay (or low pass) the signal and high pass the noise. The noise power density and noise power [3-4] are expressed as

$$\left|N_{\Delta-\Sigma}(f)\right| = \left(2\sin\frac{\pi f}{f_s}\right)^{O} \tag{2.11}$$

$$P_{noise} = \left(\frac{\Delta^2}{12}\right) \left(\frac{\pi^{2O}}{2O+1}\right) \left(\frac{1}{OSR}\right)^{2O+1}$$
(2.12)

where $O$ is the order of the modulator. The SNR of the $\Delta$-$\Sigma$ A/DC is written as follows

$$SNR_{\Delta-\Sigma} = 10\log\left(\frac{3}{2}\frac{(2O+1)}{\pi^{2O}}OSR^{2O+1}2^{2B}\right)$$

$$= (20O+10)\log OSR + 10\log(2O+1) + 6.02B + 1.76 - 9.94O$$
(2.13)

The attraction of the  $\Delta$ - $\Sigma$  A/DC is that the modulator order has a dramatic effect in improving resolution. The resolution is increased approximately 1.5-bit/octave for the 1<sup>st</sup> order system, 2.5-bit/octave for the 2<sup>nd</sup> order, 3.5-bit/octave for the 3<sup>rd</sup> order, etc.
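The bit/octave figures quoted above follow from the OSR term of Equation 2.13. A minimal Python sketch, assuming 6.02 dB per bit as in Equation 2.7 (the function name is illustrative):

```python
import math

def bits_per_octave(order):
    """Resolution gained per doubling of the OSR, from the OSR term of Equation 2.13."""
    db_per_octave = (20 * order + 10) * math.log10(2)
    return db_per_octave / 6.02

gains = {order: round(bits_per_octave(order), 2) for order in (1, 2, 3)}
print(gains)
```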

A qualitative view of three A/DC quantization noise powers is shown in Figure 2.2. The figure demonstrates that the in-band noise power of a noise shaping A/DC is much lower than the other techniques. For the out-of-band noise power, the attenuation is done by post digital decimation filter. Therefore,  $\Delta-\Sigma$  A/DCs are very suitable for applications that require high resolution.



Figure 2.2 Noise spectrum comparison of A/DCs

#### 2.2 Architectures of $\Delta$ - $\Sigma$ A/DCs

The $\Delta$-$\Sigma$ A/DC shows a decisive advantage in the improvement of the resolution. For applications requiring high resolution, the $\Delta$-$\Sigma$ A/DC is the first choice. Over the years, the concept of delta-sigma has been developed and implemented in various forms in order to achieve higher resolution. Recently, due to the great demand in the telecommunication area, mature $\Delta$-$\Sigma$ architectures are being modified to minimize power dissipation. The two types of architectures most frequently demonstrated in recent publications for low power and high resolution are single loop and multi-stage noise shaping (MASH). In the following sections, both architectures will be introduced and reviewed. Architectural considerations with regard to power dissipation will be discussed in the last section.

2.2.1 Single loop modulator

The single loop is the most basic architecture of the $\Delta$-$\Sigma$ modulator. All other modulators are derivatives with minor alterations. The advantage of this modulator is that it is always stable if the order is less than three [5]. Furthermore, the system is less sensitive to component imperfections, such as mismatch of the capacitors of the integrators. Figure 2.3 shows an example of the 2<sup>nd</sup> order system.



Figure 2.3 The 2<sup>nd</sup> order single loop modulator.

The meaning of the delta-sigma is embedded inside the architecture; accumulate (sigma) the differences (delta) between input and output signals. Theoretically, the output is equal to the input in the long run. In reality, perfect conversion is still the goal of the ongoing research. The transfer function of the modulator is

$$Y(z) = X(z)z^{-1} + E(z)(1-z^{-1})^2$$
(2.14)

From the equation, the output of the modulator is the combination of the delayed input signal and the high-passed quantization noise. An increase in the order of the modulator affects the slope of the high pass filter. As the order increases, the slope steepens and the noise floor of the pass-band lowers for the system with optimal pole-zero placement.
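The behavior described by Equation 2.14 can be sketched in Python. The code below uses an error-feedback form that reproduces the same input-output relation; it is not the integrator structure of Figure 2.3, and the 1-bit quantizer with output levels of +1 and -1 and the DC input of 0.3 are assumptions chosen for illustration.

```python
def second_order_modulator(samples):
    """Error-feedback form of Y(z) = X(z) z^-1 + E(z) (1 - z^-1)^2.

    Assumes a 1-bit quantizer with output levels +1 and -1.
    """
    x_prev = 0.0          # x[n-1]
    e1 = e2 = 0.0         # e[n-1] and e[n-2]
    out = []
    for x in samples:
        v = x_prev - 2.0 * e1 + e2      # quantizer input
        y = 1.0 if v >= 0.0 else -1.0   # 1-bit quantizer decision
        e = y - v                       # quantization error e[n]
        out.append(y)
        x_prev, e2, e1 = x, e1, e       # shift the delay elements
    return out

n = 4096
dc_input = 0.3
bit_stream = second_order_modulator([dc_input] * n)
average = sum(bit_stream) / n
print(average)
```

Over the 4096 samples the average of the output bit stream stays close to the DC input, which reflects the statement that the output equals the input in the long run.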

Besides increasing the sampling frequency and the modulator order, increasing the number of bits of the quantizer reduces the quantization step size, resulting in SNR improvement. From Equation 2.13, the resolution improvement is directly proportional to the number of bits of the quantizer. However, the drawbacks of this approach are the requirement of a multi-bit D/AC that is non-linear by nature, and the limited number of quantization bits, which is the same limitation as in the classical Nyquist A/DC.

The linearity of the modulator is dominated by the linearity of the D/AC, since the latter is in the feedback path. Non-linearity in the multi-bit D/AC will cause harmonic distortion and an increase in base-band noise due to intermodulation of high frequency noise [4]. Dynamic element matching (DEM) techniques can reduce the D/AC noise [6]. A DEM D/AC consists of an array of coarse D/ACs (D/AC cells) and the output is the sum of randomly selected D/AC cells. This technique reduces the noise power of the D/AC by assuming that the element matching error is a random, noise-like structure. By reducing the correlation among successive samples of D/AC noise, harmonic distortion is reduced. The implementation of an 8-bit D/AC with this technique has demonstrated the ability to achieve 90*dB* of SFDR or greater [6].

One paper (VCO-DS-ADC) [7] has developed the alternative approach of replacing the quantizer with the combination of a voltage controlled oscillator (VCO) and a counter, and a 1-bit feedback D/AC. Figure 2.4 redraws the diagram of the paper.



Figure 2.4 Block diagram of VCO-DS-ADC.

The VCO senses the voltage variation of the integrator output, converts it to a relative phase variation, and adjusts the frequency of the oscillator accordingly. With this approach, the number of quantization bits is limited not by component matching but by the resolution and linearity of the VCO, which can be improved by better architecture and process advancement. Moreover, the 1-bit D/AC architecture ensures no D/AC non-linearity, and the details of this benefit will be discussed in Chapter 3.

Increasing the number of bits of the quantizer is becoming a common way to lower the power dissipation. In the next section, a similar multi-bit quantizer approach to reducing the power dissipation is demonstrated in an example from the references.

2.2.2 MASH modulator

The concept of this approach is to build a higher order modulator using the lower order  $\Delta$ - $\Sigma$  modulators as building blocks in cascade. Theoretically, the overall system is stable since the lower order modulators are stable [3]. Moreover, cascading more stages can improve the performance. However, at some point the improvement is limited by the uncanceled noise from the 1<sup>st</sup> stage. Beyond this point, no gain in performance will be realized by adding more stages.

An example of the  $2^{nd}$  order modulator obtained by cascading two first order modulators is shown in Figure 2.5. Note that the  $1^{st}$  stage quantization error  $e_1[n]$  is treated as the input signal of the second modulator. The final digital output y[n] is the difference of the first stage's delayed output  $y_1[n]$  and second stage's differentiated output  $y_2[n]$ . The output equation is written as

$$Y(z) = Y_1(z)z^{-1} - Y_2(z)(1 - z^{-1})$$
(2.15)

where

$$Y_1(z) = X(z)z^{-1} + E_1(z)(1 - z^{-1})$$
(2.15a)

$$Y_2(z) = E_1(z)z^{-1} + E_2(z)(1 - z^{-1})$$
(2.15b)

Substituting (2.15a) and (2.15b) into (2.15) and rearranging the equation gives

$$Y(z) = X(z)z^{-2} - E_2(z)(1 - z^{-1})^2$$
(2.16)

The quantization error is shaped by the $2^{nd}$ order high pass filter and the signal is delayed, which gives the same performance as the $2^{nd}$ order single loop architecture (Equation 2.14) except for an additional delay.



Figure 2.5 The 2<sup>nd</sup> order MASH modulator.

The advantage of using the cascade architecture is shown in Reference [8-9]. For the modulator operated at a typical OSR, the quantization noise spectral density is a smooth continuous function of frequency and is independent of the input signal level. From Equation 2.16, increasing the order of each stage can further reduce the noise. However, in order to cancel the quantization noise of the first loop completely, the gain of the 2<sup>nd</sup> loop must be equal to the gain of the 1<sup>st</sup> loop. Due to the potential capacitor mismatch resulting in the gain error, the implementation of the stage with higher order is limited.

To illustrate the severity of the mismatch, the integrator gain of the 1<sup>st</sup> stage is assumed to be  $\delta$  and the integrator gain of the 2<sup>nd</sup> stage remains one. Equation 2.15a is rewritten as

$$Y_{1}(z) = X(z)\frac{\delta \cdot z^{-1}}{1 + (\delta - 1)z^{-1}} + E_{1}(z)\frac{(1 - z^{-1})}{1 + (\delta - 1)z^{-1}}$$
(2.17)

and the final output is

$$Y(z) = X(z) \frac{\delta \cdot z^{-2}}{1 + (\delta - 1)z^{-1}} - E_2(z) \frac{(1 - z^{-1}) + (\delta - 2)(1 - z^{-1})z^{-1}}{1 + (\delta - 1)z^{-1}} - [E_1(z) - E_2(z)] \frac{(\delta - 1)(1 - z^{-1})z^{-2}}{1 + (\delta - 1)z^{-1}}$$
(2.18)

The equation shows that gain mismatch not only changes the transfer functions of the signal and noise terms but also adds an additional noise term.
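Equations 2.16 and 2.18 can be checked numerically by evaluating the two-stage combination of Equation 2.15 at an arbitrary point in the z-plane. The Python sketch below is only an algebraic check; the test point, the amplitudes, and the 2% gain error are hypothetical values.

```python
def mash_output(x, e1, e2, z, delta=1.0):
    """Combine two first-order stages as in Equation 2.15.

    delta is the integrator gain of the 1st stage (1.0 means no mismatch).
    """
    zi = 1.0 / z                                          # z^-1
    den = 1.0 + (delta - 1.0) * zi
    y1 = x * delta * zi / den + e1 * (1.0 - zi) / den     # Equation 2.17
    y2 = e1 * zi + e2 * (1.0 - zi)                        # Equation 2.15b
    return y1 * zi - y2 * (1.0 - zi)                      # Equation 2.15

def eq_2_18(x, e1, e2, z, delta):
    """Closed form of Equation 2.18."""
    zi = 1.0 / z
    den = 1.0 + (delta - 1.0) * zi
    signal = x * delta * zi**2 / den
    shaped = e2 * ((1.0 - zi) + (delta - 2.0) * (1.0 - zi) * zi) / den
    leaked = (e1 - e2) * (delta - 1.0) * (1.0 - zi) * zi**2 / den
    return signal - shaped - leaked

# Hypothetical test point and amplitudes
z = complex(0.6, 0.8)
x, e1, e2 = 0.7, 0.25, -0.4

ideal = mash_output(x, e1, e2, z)                         # delta = 1
eq_2_16 = x * (1.0 / z)**2 - e2 * (1.0 - 1.0 / z)**2      # Equation 2.16
mismatched = mash_output(x, e1, e2, z, delta=0.98)
print(abs(ideal - eq_2_16), abs(mismatched - eq_2_18(x, e1, e2, z, 0.98)))
```

With $\delta = 1$ the result reduces to Equation 2.16, and with $\delta = 0.98$ the directly combined output agrees with the closed form of Equation 2.18, including the leaked $E_1$ term.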

Reference [10] shows the improvement by placing interstage gains. Figure 2.6 is reproduced from the paper. The 4-bit quantizers used at the 1<sup>st</sup> and 2<sup>nd</sup> stage do not directly reduce theoretical quantization noise (TQN) due to the noise cancellation [10].

However, the smaller quantization error extracted from the preceding stage allows insertion of an interstage gain to utilize the full dynamic range of the next stage. The quantization noise is reduced as

$$N_{TQN}(z) = E_3 (1 - z^{-1})^4 / (G_{int1} G_{int2})$$
(2.19)



Figure 2.6 The 2-1-1, 4-bit cascaded multi-bit delta-sigma modulator.

The reduction of quantization noise leakage (QNL), which results from the gain and pole errors of the integrators due to the finite gains of the OTAs, is the other benefit of using 4-bit quantizers in the first two stages. In addition, optimizing the interstage gains and using high gain OTAs can reduce the QNL further, as shown by

$$N_{QNL}(z) \approx 6E_1(1-z^{-1}) / A_{OL} + (9+2G_{\text{int}\,1})E_1(1-z^{-1})^2 / A_{OL}$$
(2.20)

where  $A_{OL}$  is the open loop gain of the OTA. Equation 2.20 suggests that  $G_{int1}$  should be as small as possible while Equation 2.19 suggests that  $G_{int2}$  should be as large as possible without overloading the final stage.

The advantage of the MASH A/DCs is that no stability issues exist as a result of the higher order architecture as long as the stages in cascade are all stable. The drawback is the requirement of precise matching in order to eliminate the QNL of the 1<sup>st</sup> stage, which remains a challenge in the fabrication process. As a result, the MASH architecture is still less attractive than the single loop, which has a potentially higher success rate in fabrication, even though a stability issue exists in the high order (beyond 3<sup>rd</sup>) single loop architecture.

#### 2.3 Power dissipation consideration

The growing trend in reducing the power dissipation is to increase the number of quantization bits. Both examples of single loop and MASH architectures show the same approach. An increase in the number of quantization bits allows the modulator to reduce the oversampling rate or order. In addition, power dissipation is reduced since the comparators of the quantizer are essentially all digital circuits and offer less complexity, consuming less power relative to integrators.

In Chapter 3, the modulator order, the oversampling ratio, and the number of bits of the quantizer of the modulator will be explored to achieve improved resolution and power dissipation. The number of bits of the modulator will play an important role in the reduction of power dissipation.

#### CHAPTER 3

#### MODULATOR DESIGN AND IMPLEMENTATION

This chapter describes the design concepts and circuit implementations of low power  $\Delta - \Sigma$  Modulators. Since the design objective of this project is to minimize the power consumption, the design flow and strategies will focus on the four most significant aspects of a design: process, architectures, circuits, and devices. Their effects on the power dissipation are itemized as the order increases. In addition, reducing the power supply voltage decreases the power consumption but deteriorates the potential system's dynamic range. Various techniques will be deployed to maintain the SNR while minimizing power dissipation. Moreover, layout techniques are briefly described with regard to overcoming some process variations.

#### 3.1 Low power processes, Bulk or SOI?

The process selection is the first step toward the successful low power circuit designs. Large device leakage current and coupling capacitance, or costly process features, such as GaAs, are not suitable for this project. In this case, there are only two choices: Bulk or SOI.

Both processes share virtually the same process procedures (Figure 3.1), except that an additional oxidation of the wafer at the beginning results in the formation of an insulation layer in SOI processes. This exception is the major advantage and improvement of SOI over Bulk. Figure 3.1a shows no parasitic capacitances between drain/source and substrate for SOI. Since $C_{db}$ is a major contributor to the transistor's output capacitance, its reduction will significantly reduce digital circuit power dissipation, according to Equation 3.1. Lowering digital power provides more design freedom to the analog circuit designer under a power budget constraint. In addition, the reduction of $C_{db}$ and $C_{sb}$ reduces the transistor leakage.



Figure 3.1 Processes of Bulk and SOI. (Courtesy of IBM Corp.)



Figure 3.1a Drain/source to substrate capacitance (C<sub>db</sub>/C<sub>sb</sub>) of Bulk (left) and SOI (right).

(Courtesy of IBM Corp.)

$$P_{digital} = CV^2 f \tag{3.1}$$

Figure 3.2 shows the comparison of digital power dissipation of Bulk and SOI. At the same operating frequencies, the power consumption of SOI is 2~3 times lower than that of Bulk.

Based on the information above, SOI technology is considered the best candidate for the low power digital circuit implementation. In this dissertation, the designed circuits are all fabricated on the Peregrine SOS process to minimize the power consumption. Note that the only significant difference between SOI and SOS is their substrates: SiO<sub>2</sub>/Silicon for SOI and Sapphire for SOS.



Figure 3.2 Comparison of Bulk and SOI power dissipation.

(Courtesy of IBM Corp.)

Analog power dissipation is governed by

$$P_{analog} = IV \tag{3.2}$$

The constant currents, which are used to bias the transistors in the saturation (active) region, are the primary source of the power dissipation. A wide-band OTA, for example, requires large currents since its bandwidth is proportional to the bias current. High mobility and thick gate oxide processes can help to reduce the power. However, the trend of CMOS technology is to scale down the gate oxide along with other parameters, which makes low power analog circuit design more challenging than ever.

#### 3.2 Architectural configuration and power estimation of a $\Delta - \Sigma$ modulator

In Chapter 2, the single loop $\Delta - \Sigma$ modulator has been demonstrated with its benefits and high success rate in fabrication. In this section, a low power configuration of the modulator will be developed from the architecture shown in Figure 3.3.



Figure 3.3 A general form of a single loop  $\Delta$ - $\Sigma$  modulator.

The modulator includes 4 major components: clock driver, integrator, quantizer, and D/AC. A clock driver as shown in Figure 3.4 is a cascade of inverters. The purpose of this arrangement is to isolate the large load from the driving signal so that the signal quality does not degrade as the load increases. The number of stages depends on the ratio of the load and input capacitance and on the scaling factor. The factor, $e \cong 2.7183$, is an optimal number [11] but here it is rounded to 3 for simplicity.



Figure 3.4 A clock buffer driver for analog switches.
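A common sizing rule for such a tapered buffer, assumed here rather than taken from this work, is $N = \lceil \ln(C_{load}/C_{in}) / \ln k \rceil$ stages for a scaling factor $k$. A minimal Python sketch with a hypothetical 100:1 capacitance ratio and illustrative function names:

```python
import math

def buffer_stages(c_load, c_in, taper=3.0):
    """Number of inverter stages when each stage is `taper` times larger than the previous one."""
    if c_load <= c_in:
        return 1
    return max(1, math.ceil(math.log(c_load / c_in) / math.log(taper)))

def stage_sizes(n_stages, taper=3.0):
    """Relative inverter sizes 1, taper, taper^2, ..."""
    return [taper**k for k in range(n_stages)]

stages = buffer_stages(c_load=100.0, c_in=1.0)   # hypothetical 100:1 ratio
sizes = stage_sizes(stages)
print(stages, sizes)
```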

The power dissipation of the clock driver is considered digital power and is governed by Equation 3.1. It is small compared to the power of the integrator or quantizer. For example, assume that both the clock driver and the integrator run at 100*KHz*. Note that the power supply is assumed to be unity to simplify the calculation. The power dissipation of the clock driver is

$$P_{digital} = CV^2 f \cong 100 \cdot 10^{-15} \cdot (1)^2 \cdot 10^5 \propto 10^{-8}$$
(3.3)

where $100 \cdot 10^{-15}$ is the total capacitance of the clock driver, expressed as an equivalent unit inverter load in the *fF* range. For the OTA, the supply current is approximately 10*uA* and 7 to 8 such currents are required.

$$P_{analog} = IV \cong (7 \sim 8) \cdot 10^{-5} \cdot (1) \propto 10^{-4}$$
(3.4)

The ratio of analog to digital power is on the order of $10^4$ and therefore, the power of the clock driver can be ignored.
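The arithmetic behind Equations 3.3 and 3.4 can be reproduced with a few lines of Python. The function names are illustrative, and 7 bias branches are used for the OTA (the text quotes 7 to 8).

```python
def digital_power(capacitance, voltage, frequency):
    """Dynamic switching power, Equation 3.1: P = C V^2 f."""
    return capacitance * voltage**2 * frequency

def analog_power(bias_current, voltage, branches):
    """Static bias power of an OTA drawing `branches` copies of the bias current."""
    return branches * bias_current * voltage

p_clock = digital_power(100e-15, 1.0, 100e3)     # 100 fF, unity supply, 100 kHz
p_ota = analog_power(10e-6, 1.0, branches=7)     # 7 branches of 10 uA each
ratio = p_ota / p_clock
print(p_clock, p_ota, ratio)
```

With these inputs the clock driver dissipates about $10^{-8}$ and the OTA about $7 \cdot 10^{-5}$, a ratio of several thousand.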

A serial D/AC (details discussed later) is used in this project and consists of only three transistors. Compared to the rest of loop components, its power dissipation is so low as to be excluded in this discussion. Note that the power of the parallel-to-serial shift register used in conjunction with the serial D/AC, will be summed together with the quantizer's power.

After excluding the power of clock driver and D/AC, the total power dissipation of the modulator becomes only the summation of the integrator and quantizer. In the following discussion, the modulator order, OSR, and quantization bit will be explored based on the assumption above.

Oversampling is the first technique deployed to increase the SNR in the A/DC design. Without additional circuits, simply increasing the sampling frequency boosts the resolution of the A/DC, as shown in Equation 3.5

$$SNR_{oversampling} = 10\log(OSR) + 6.02B + 1.76$$
 (3.5)

For a 1-bit quantizer system to increase 1-bit of resolution, the OSR must be quadrupled. As a result, the power dissipation increases 4 times as well (Equation 3.1). The SNR/bit in Equation 3.6 can be used as a rough gauge for the power dissipation (proportional). The more precise estimation will be developed later. Note that this practice is power inefficient if the target pass-band is wide. Moreover, no fabricated circuit operates beyond  $f_{max}$ , which is the upper limit of the process.

$$SNR_{oversampling} / bit = 10\log(OSR) / 6 \cong 1.6\log(OSR)$$
(3.6)

The second technique is noise shaping. Equation 3.7 shows the SNR of a modulator loop based on the OSR, the modulator order O, and a B-bit quantizer.

$$SNR_{\Delta-\Sigma} = (20O+10)\log OSR + 10\log(2O+1) + 6.02B + 1.76 - 9.94O \tag{3.7}$$

For a 1<sup>st</sup> order modulator with a 1-bit quantizer, the SNR increases by 1.5-bit for every doubling of the sampling frequency; the increase is 2.5-bit for the $2^{nd}$ order and 3.5-bit for the $3^{rd}$ order, etc. The *SNR/bit* of the modulator is

$$SNR_{\Delta-\Sigma} / bit \cong (3.3O + 1.6) \log OSR + 1.6 \log(2O + 1) - 1.6O$$
 (3.8)

Note that as the order increases, not only do the SNR and power dissipation increase but so does the difficulty of maintaining loop stability [3].

Increasing the number of quantization bits is the most power efficient technique to boost the SNR since the quantizer's power dissipation is governed by (3.1) and is small. The capacitors used in the comparators are equivalent to 10 unit inverters. As a result, the power ratio of the integrator to the comparator is $\propto 10^3$. For every quantization bit increase, the resolution increases one bit, and its SNR/bit is

$$SNR_{quantizer} / bit \cong 2^B - 1 \tag{3.9}$$

However, a multi-bit quantizer system requires a multi-bit D/AC to convert the digital output back to the analog signal to form the feedback loop. The drawback of multi-bit D/ACs is their inherent non-linearity. Additional techniques and circuits are required to linearize the conversion process [11]. More importantly, the noise (or error) generated from the D/A process reduces the SNR and as such, requires additional attention to the design.

The summarized proportional power dissipation using the three methods to increase the SNR is listed in Table 3.1.

| Technique              | SNR/bit                                           |
|------------------------|---------------------------------------------------|
| Oversampling frequency | $1.6\log(OSR)$                                    |
| Loop order             | $(3.3O + 1.6) \log OSR + 1.6 \log(2O + 1) - 1.6O$ |
| Quantization bit       | $2^{B}-1$                                         |

Table 3.1 Power dissipation cost of improving resolution.

The most appealing technique to increase the SNR is to boost the order of the loop, but the power dissipation increase may be the largest. In the modulator design, the order is equal to the number of integrators and the core of an integrator is an OTA. The power consumption of an OTA is dictated by its gain and bandwidth requirement and can be written as

$$P_{analog} = mI_{BIAS} (V_{DD} - V_{SS}) \tag{3.10}$$

where $m = 7 \sim 8$ for a fully differential OTA. Thus, operating the OTA at high frequency is not the best way to boost the SNR if there are alternative digital circuits that can lower the operating frequency of the analog circuits and achieve the same goal. The multi-bit quantizer and D/AC approach is such a circuit and its bottleneck (D/AC non-linearity) can be solved through the use of an innovative serial D/AC, which will be explained in detail in Section 3.3.4.

To further illustrate this low power design concept, the example of 18-bit resolution is set for various modulator approaches. Note that the modulators beyond 4<sup>th</sup> order are not included due to the circuit complexity and stability issues. Figure 3.5 shows that the 1<sup>st</sup> order system requires very high oversampling ratio; approximately 32 and 64 times higher than the 2<sup>nd</sup> and 3<sup>rd</sup> order systems, respectively. Since the power is proportional to the operating frequency, the 1<sup>st</sup> order system is not a good candidate for high resolution requirement design.



Figure 3.5 Oversampling ratio in various loop configurations.

Increasing the number of quantization bits is more power efficient than increasing the order. However, the number of required comparators for the quantizer grows exponentially, which means there is a minimum where the modulator order and the number of quantization bits are in balance. For a 1<sup>st</sup> order system, the power dissipation is the sum of the integrator and quantizer powers. Note that the D/AC is a digital serial shift register dominated by digital power and as a result is insignificant. For a better representation and understanding of the integrator power, Equation 3.2 is rewritten as

$$P_{\text{integrator}} = IV = GBP \cdot C_L \frac{\Delta V}{2} V \tag{3.11}$$

where  $C_L$  is the effective load of the integrator and V is power supply voltage. Using integrator settling of 5 time constants (t<sub>s</sub>=5 $\tau$ ), Equation 3.11 is rearranged as follows

$$P_{\text{integrator}} = GBP \cdot C_L \frac{\Delta V}{2} V = 10\pi f_s C_L \Delta V V$$
(3.11a)

Including both digital power of the quantizer and analog power of the integrator, the total power of the modulator is

$$P_{\text{modulator}} \cong P_{\text{integrator}_{1}} + K_{\text{int}} \sum_{i=2}^{O} P_{\text{integrator}_{i}} + P_{quantizer}$$

$$= 10\pi f_{s} C_{L} \Delta VV + K_{\text{int}} \sum_{i=2}^{O} 10\pi f_{s} C_{L} \Delta VV + K C_{Q} V^{2} f 2^{B}$$
(3.12)

where  $K_{int}$  is the power weighting factor for the 2<sup>nd</sup> through n<sup>th</sup> integrators and K is a function of the flash comparator architecture. Normalizing the power of the 1<sup>st</sup> integrator to 1, rearranging the equation, and dividing by the number of bits of resolution give

$$P_{\text{modulator}} / bit \propto \left[ 1 + K_{\text{int}} (O-1) + K_{Q} K 2^{B} \right] / SNR \quad where \quad O = 2, 3, ..., n \quad (3.13)$$

where  $K_Q$  is the power weighting factor of comparators. The 1<sup>st</sup> integrator must be large enough to maintain the thermal noise floor. The 2<sup>nd</sup> and the subsequent integrators were selected at ½ of the power dissipation of the 1<sup>st</sup> integrator ( $K_{int} = 0.5$ ) since the 1<sup>st</sup> integrator dominates the noise floor and requires more power as a result of the capacitors requirement to suppress the thermal noise. As the SNR increases  $K_{int}$  can be reduced further.  $K_Q$  is defined as

$$K_{Q} = K \left( \frac{C_{Q}}{C_{L}} \right) \left( \frac{V}{10\pi\Delta V} \right) \approx 0.02$$
(3.14)

where $C_Q$ is the value of the quantizer capacitor that will be discussed later. Equation 3.13 is plotted in Figures 3.6 and 3.7 below, which demonstrate that an optimal power/bit architecture exists.



Figure 3.6 Power dissipation in various loop configurations.



Figure 3.7 Combination plot of Figure 3.5 and 3.6.

Assuming an 18-bit objective, Figure 3.7 reveals that $2^{nd}$ order systems are the most power efficient until the number of quantizer bits is increased beyond 4, at which point the $3^{rd}$ order systems are a better candidate. In conclusion, the $2^{nd}$ order $\Delta - \Sigma$ modulator with a 4-bit quantizer, running at an OSR of 64, was chosen to achieve 18-bit of resolution (Equation 3.15).

$$SNR_{\Delta-\Sigma} = (20O + 10) \log OSR + 10 \log (2O + 1) + 6.02B + 1.76 - 9.94O = 118.3dB \tag{3.15}$$
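The expression in Equation 3.15 can be evaluated with a short Python sketch. The function name is illustrative. Evaluated literally at O = 2, B = 4, and OSR = 64, the expression gives about 103.3dB; the quoted 118.3dB is reproduced when the OSR term is taken as 128, which may reflect a different OSR convention (for example, sampling rate relative to the signal bandwidth rather than the Nyquist rate). That reading is an assumption and is not stated in the text.

```python
import math

def snr_delta_sigma_db(order, osr, bits):
    """Peak SNR of a delta-sigma modulator per Equation 3.7, in dB."""
    return ((20 * order + 10) * math.log10(osr)
            + 10 * math.log10(2 * order + 1)
            + 6.02 * bits + 1.76 - 9.94 * order)

target_18_bit = 6.02 * 18 + 1.76                             # 18-bit target from Equation 2.7
at_osr_64 = snr_delta_sigma_db(order=2, osr=64, bits=4)
at_osr_128 = snr_delta_sigma_db(order=2, osr=128, bits=4)
print(target_18_bit, at_osr_64, at_osr_128)
```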

#### 3.3 Modulator stability

As described in the previous section, the 2<sup>nd</sup> order system is a very stable modulator, but not under all conditions. The loop can be guaranteed stable if the loop coefficients are chosen correctly. In addition, the coefficient setup must compensate for disturbances generated by process variation. Analysis of the modulator loop is performed in the Z-domain using Figure 3.8. From the transfer function of the modulator, the loop coefficients can be determined.



Figure 3.8 Z-domain of the  $2^{nd}$  order  $\Delta - \Sigma$  modulator.

In the analysis, a stable modulator for the desired base-band SNR is the primary consideration used to set the coefficients for the overall system design. To understand the system behavior, a linear approximation model of the modulator is shown in Figure 3.9.



Figure 3.9 Block diagram of a  $\Delta$ - $\Sigma$  loop.

The output of the linear system, $v_o(n)$, can be described as the combined response to both the input $v_i(n)$ and the quantization noise e(n).

$$V_o(z) = \frac{H(z)}{1 + H(z)} V_i(z) + \frac{1}{1 + H(z)} E(z)$$
(3.16)

where the first term on the right side of Equation 3.16 is the signal transfer function (STF) and the second term is the noise transfer function (NTF). By solving the loop equations of Figure 3.8, a similar expression can be found with detailed parameters (Equation 3.17).

$$V_{O}(z) = \frac{\alpha_{1}\alpha_{2}z}{(\beta_{2}+1)z^{2} + (\alpha_{2}\beta_{1} - \beta_{2} - 2)z + 1}V_{i}(z) + \frac{z^{2} - 2z + 1}{(\beta_{2}+1)z^{2} + (\alpha_{2}\beta_{1} - \beta_{2} - 2)z + 1}E(z)$$
(3.17)

$$H(z) = \frac{\beta_2 z^2 + (\alpha_2 \beta_1 - \beta_2) z}{z^2 - 2z + 1}$$
(3.18)

Equation 3.17 shows that the STF is simply a low pass function and NTF is a high pass function. Note that both transfer functions share the same poles (same denominator) and only one of them can dominate. Since the STF is simply a delay of signals, the design will focus on the NTF. The noise shaping of the modulator is determined by the NTF order, placement of the poles, and distributed zeros in the signal pass-band. With the correct approach, the quantization noise can be minimized dramatically.
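The link between the NTF term of Equation 3.16 and the noise term of Equation 3.17 can be written out explicitly. Adding 1 to $H(z)$ of Equation 3.18 gives

$$1 + H(z) = \frac{(\beta_2 + 1) z^2 + (\alpha_2 \beta_1 - \beta_2 - 2) z + 1}{z^2 - 2z + 1}$$

so that

$$NTF(z) = \frac{1}{1 + H(z)} = \frac{z^2 - 2z + 1}{(\beta_2 + 1) z^2 + (\alpha_2 \beta_1 - \beta_2 - 2) z + 1}$$

which is the coefficient of $E(z)$ in Equation 3.17. The numerator $z^2 - 2z + 1 = (z-1)^2$ places both zeros at $z = 1$.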

For the NTF design, all zeros of the function are placed at Z = 1 (i.e. DC) so that the converter can be used for various OSR [4]. In other words, the final design will not be restricted to a particular sampling frequency. In order to simplify the design and increase the success rate, a Butterworth high pass filter with maximum flatness of pass- and stop-band was chosen to place the system poles to achieve the A/DC specifications while maintaining the stability of the modulator loop. Moreover, the poles of the Butterworth filter are relatively low Q, and thus, the filter alignment tends to be less susceptible to oscillations caused by input signals that are at the same frequency as the poles [12].

There is an inverse relationship between the gain of the NTF and the loop stability; the higher the NTF gain, the more unstable the loop is. However, the gain of the NTF is proportional to the loop's SNR; the higher the gain, the better the loop's SNR. Reference [13] discusses these relationships in detail. For a 1-bit quantizer system, a peak frequency response gain of less than 2 for the NTF is necessary to ensure stability, (i.e.  $|NTF(e^{j\omega T})|_{max} < 2$ ). In practice,  $|NTF(e^{j\omega T})|_{max} \le 1.5$  is usually set to provide a reasonable input range (about 80% of the range of the quantizer). Simulation [13] shows that the NTF gain is proportional to the number of bits in the quantizer. For a system with a 4-bit quantizer,  $|NTF(e^{j\omega T})|_{max} \le 5$  can be set.

The technique used in [13] to increase the NTF gain and boost the SNR is to push the corner frequency away from the base-band,  $f_c >> f_0$ . The noise in the base-band falls well within 12*dB*/octave slope of a 2<sup>nd</sup> order filter transfer function. Taking into account both the increased gain and corner frequency of NTF implies that quantization noise in the base-band is suppressed by an additional 24*dB* for a converter with a 4-bit quantizer compared to a single bit quantizer. Therefore, the total reduction of the quantization noise in the base-band (from 1- to 4-bit) is approximately

$$(4-1)bit * 6dB / bit + 25dB = 43dB \tag{3.19}$$

After the NTF is matched to the proper filter architecture and the numerical values of both NTF(z) and H(z) are found, the coefficients can be solved. In practice, dynamic scaling of the coefficients takes precedence over the techniques discussed previously. The scaling ensures that the power levels of all nodes are equal, so that no node with a small signal level sees a large noise gain that could result in unstable operation. Dynamic scaling is achieved by proportionally scaling the loop coefficients (or the gain of each integrator) to avoid signal clipping, and the power is optimized by simulating the system [4]. The coefficients are determined through iterative simulation runs with different coefficient values. The final coefficients are selected as

$$\alpha_1 = \beta_1 = \frac{1}{2} \quad \text{and} \quad \alpha_2 = \beta_2 = 2$$
(3.20)

Using the MatLab<sup>®</sup> toolbox developed by Richard Schreier of Analog Devices Inc. [14], the 2<sup>nd</sup> order $\Delta$-$\Sigma$ modulator with a 4-bit quantizer is simulated at OSR = 64. Figure 3.10 shows that both poles are inside the unit circle, so the loop is stable. In Figure 3.11, the coefficients are varied within 20% to demonstrate the system's tolerance to process variation. The NTF magnitude, $|NTF(e^{j\omega T})|_{max} \leq 0.57$, reinforces the loop stability (Figure 3.12). Figure 3.13a shows no sign of instability within the variation. However, the noise floor rises ($\cong 2dB$) when $\alpha_2$ or $\beta_1$ is reduced by 20% (Figure 3.13b). On the STF side, Figure 3.14 shows that either an increase or a decrease of $\alpha_1$ attenuates the signal by about 2*dB*, and $\alpha_2$ can be used to adjust the width of the base-band; the BW increases as $\alpha_2$ increases. For $\beta$ variation (Figure 3.15), only a decrease of $\beta_1$ boosts the signal, by about 2*dB*. An increase of $\beta_1$ or $\beta_2$, or a decrease of $\beta_2$, attenuates the signal by 1.8*dB*. The FFT output reconfirms the design through time domain simulation (Figures 3.16 and 3.16a).
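The pole locations and the NTF peak quoted above can be cross-checked directly from Equation 3.17 with the coefficients of Equation 3.20. The following is a minimal sketch in plain Python, not the toolbox flow used to generate the figures; the function names (`loop_polynomials`, `polyval`, `poles`, `ntf_peak`) and the grid size are illustrative assumptions.

```python
import cmath
import math


def loop_polynomials(a1, a2, b1, b2):
    """Return (STF numerator, NTF numerator, common denominator) of Eq. 3.17.

    Each polynomial is a list of coefficients for z^2, z^1, z^0.
    """
    stf_num = [0.0, a1 * a2, 0.0]
    ntf_num = [1.0, -2.0, 1.0]
    den = [b2 + 1.0, a2 * b1 - b2 - 2.0, 1.0]
    return stf_num, ntf_num, den


def polyval(coeffs, z):
    """Evaluate a polynomial at z using Horner's rule."""
    acc = 0.0
    for c in coeffs:
        acc = acc * z + c
    return acc


def poles(den):
    """Roots of the quadratic denominator."""
    a, b, c = den
    disc = cmath.sqrt(b * b - 4.0 * a * c)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def ntf_peak(ntf_num, den, points=4096):
    """Largest |NTF| on a uniform grid over the upper unit circle (0..pi)."""
    peak = 0.0
    for k in range(points + 1):
        z = cmath.exp(1j * math.pi * k / points)
        peak = max(peak, abs(polyval(ntf_num, z) / polyval(den, z)))
    return peak


# Final coefficients from Equation 3.20.
stf_num, ntf_num, den = loop_polynomials(a1=0.5, a2=2.0, b1=0.5, b2=2.0)
pole_mags = [abs(p) for p in poles(den)]
peak = ntf_peak(ntf_num, den)
stf_dc = abs(polyval(stf_num, 1.0) / polyval(den, 1.0))

print("pole magnitudes:", pole_mags)
print("peak |NTF|:", peak)
print("|STF| at DC:", stf_dc)
```

With these coefficients the denominator becomes $3z^2 - 3z + 1$, so both poles have magnitude $1/\sqrt{3} \approx 0.577$, and the NTF magnitude peaks at $z = -1$ with a value of $4/7 \approx 0.57$, in agreement with the bound quoted for Figure 3.12.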



Figure 3.10 Root locus of the modulator NTF.



Figure 3.11 Root locus of the modulator NTF at various conditions.



Figure 3.12 Frequency responses of the modulator STF and NTF.



Figure 3.13 Frequency responses of the modulator NTF at various parameter variations.



Figure 3.13a A section zoom-in of the Figure 3.13.



Figure 3.13b A section zoom-in of the Figure 3.13.



Figure 3.14 Frequency responses of the modulator STF at multiple  $\alpha$  parameter variations.



Figure 3.15 Frequency responses of the modulator STF at multiple  $\beta$  parameter variations.



Figure 3.16 Output spectrum of the modulator.



Figure 3.16a A section zoom-in of the Figure 3.16.

Table 3.2 shows the summary of the effects. Note that 'X' means unchanged.

|          |        | $\alpha_1$ | $\beta_1$ | $\alpha_2$        | $\beta_2$ |
|----------|--------|------------|-----------|-------------------|-----------|
| Increase | Signal | -2 *dB*    | -1.8 *dB* | Base-band widens  | -1.8 *dB* |
|          | Noise  | X          | -1.8 *dB* | -1.8 *dB*         | X         |
| Decrease | Signal | -2 *dB*    | 2 *dB*    | Base-band shrinks | -1.8 *dB* |
|          | Noise  | X          | 2 *dB*    | 2 *dB*            | X         |

Table 3.2 The effects of the loop coefficients.

# 3.4 Crucial building blocks of a $\Delta\text{-}\Sigma$ modulator

The integrator, quantizer, and D/AC are the three crucial building blocks of the modulator. The quality of these block designs determines the success of the modulator. As a result, all the design techniques and considerations of these circuits are described in detail in this section. Figure 3.17 shows the switched-capacitor implementation of the $2^{nd}$ order single loop $\Delta$-$\Sigma$ modulator.



Figure 3.17 Switched-capacitor implementation of a  $2^{nd}$  order  $\Delta - \Sigma$  modulator.

This modulator feeds the 4-bit quantizer output back to the inputs of the two integrators through an innovative serial D/AC. The same OTA is used for both integrators, except that the first OTA is twice the size of the second one, since the second stage does not have the same stringent noise, slew rate, and gain requirements as the first stage.

The fully digitally controlled serial feedback D/AC is a novel addition to the $\Delta$-$\Sigma$ modulator. Moreover, the serial D/AC has no inherent non-linearity issue since it is a 1-bit D/AC. The shift register output is converted to a serial word used as the control signal to steer the reference currents back to both integrators in order to achieve negative feedback. The benefit of this approach is that the OSR of the modulator can be reduced compared to that of a 1-bit quantizer approach with the same SNR. Therefore, the power dissipation is lower. In addition, this robust serial D/AC approach takes advantage of the under-utilized digital BW, which exists between the integrator ($f_{analog}$) and the digital circuit ($f_{digital}$). A very important observation is that $f_s$ is limited by $f_{analog}$, and the ratio of $f_{digital}$ over $f_{analog}$ is set by the ratio of their overdrive voltages ($V_{OV-D}/V_{OV-A}$) and the square of the ratio of their channel lengths ($L_A/L_D$). For example, if the channel length ratio of analog over digital is 4 to 6 [4], as in the Peregrine SOS process, the existence of at least 16X more digital BW than analog BW is guaranteed. It is this resource that is exploited to operate the serial D/AC at $2^{B} \cdot f_s$.

For a better understanding of the concept, Equations 3.22 and 3.23 show approximations of the analog and digital operating frequencies in terms of overdrive voltage, channel length, and loading capacitance.

$$GBP = \frac{g_m}{C_L} = \frac{g_m}{nC_{GS}} \propto \left(\frac{V_{OV}}{L^2}\right)$$
(3.21)

$$f_{analog} = \frac{g_m}{2\pi C_{LA}} \approx \frac{V_{OV-A}}{2\pi C_{LA}L_A^2}$$
(3.22)

$$f_{digital} = \frac{1}{2.2RC_{LD}} \approx \frac{V_{OV-D}}{2.2C_{LD}L_{D}^{2}}$$
(3.23)

$$\frac{f_{digital}}{f_{analog}} = \left(\frac{C_{LA}}{C_{LD}}\right) \left(\frac{L_A}{L_D}\right)^2 \left(\frac{2.2}{2\pi}\right) \left(\frac{V_{OV-D}}{V_{OV-A}}\right)$$
(3.24)

Equation 3.24 demonstrates that the ratio of the operating frequencies is proportional to the overdrive voltage and inversely proportional to the square of the channel length. Assume worst case loading factors of 5 for analog and 6 for digital, $L_A = 2um$ and $L_D = 0.5um$, and $V_{OV-A} = 0.3V$ and $V_{OV-D} = 3.6V$. Equation 3.24a then shows that digital circuits can operate 56 times faster than their analog counterparts. The excess BW allows the modulator to be designed with a quantizer of fewer than 6 bits (Equation 3.25).

$$\frac{f_{digital}}{f_{analog}} = \left(\frac{5}{6}\right) \left(\frac{2}{0.5}\right)^2 \left(\frac{2.2}{2\pi}\right) \left(\frac{3.6}{0.3}\right) \approx 56$$
(3.24a)

$$B = \log_2 \left(\frac{f_{digital}}{f_{analog}}\right) = 5.8 \approx 5$$
(3.25)
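The arithmetic of Equations 3.24a and 3.25 can be reproduced with a few lines of Python. This is only a sketch that evaluates Equation 3.24 exactly as written above; the function and argument names are illustrative assumptions, and only the ratios of the loading factors, channel lengths, and overdrive voltages matter.

```python
import math


def digital_to_analog_ratio(c_la, c_ld, l_a, l_d, vov_a, vov_d):
    """Evaluate Equation 3.24 exactly as written in the text."""
    return (c_la / c_ld) * (l_a / l_d) ** 2 * (2.2 / (2.0 * math.pi)) * (vov_d / vov_a)


def max_quantizer_bits(ratio):
    """Largest whole number of bits B with 2**B <= ratio (Equation 3.25)."""
    return int(math.floor(math.log2(ratio)))


# Values quoted in the text: loading 5 and 6, L in um, overdrive in V.
ratio = digital_to_analog_ratio(c_la=5.0, c_ld=6.0, l_a=2.0, l_d=0.5,
                                vov_a=0.3, vov_d=3.6)
bits = max_quantizer_bits(ratio)

print(f"f_digital / f_analog = {ratio:.1f}")
print(f"log2(ratio) = {math.log2(ratio):.2f}, usable quantizer bits = {bits}")
```

Rounding $\log_2$ of the ratio down rather than to the nearest integer reflects that the serial D/AC clock must fit inside the available digital bandwidth.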

Since the power of this D/AC is digital in origin, the added power cost is minimal. By increasing the number of quantization bits, the modulator can be operated at a lower frequency. In other words, the number of quantization bits for a given process can be extended by exploiting this unused digital BW. Thus, implementing a serial D/AC in a multi-bit modulator is an excellent choice for improving the SNR of the A/DC within a low power budget.

3.4.1 Integrator and OTA

As shown in Figure 3.18, the modulator is implemented using two types of integrators: forward and backward Euler. The first integrator $(\frac{z^{-1}}{1-z^{-1}})$ is a backward Euler and it overestimates the function. The second integrator $(\frac{1}{1-z^{-1}})$ is a forward Euler and it underestimates the function. By combining both integrations, the output of the second integrator should be virtually the same as the trapezoidal integration $(\frac{1+z^{-1}}{1-z^{-1}})$ (area under the dashed line in Figure 3.18).



Figure 3.18 Forward and backward Euler and Trapezoid integrations.

The core of the integrator is an OTA. Thus, both are discussed together in this section. Figure 3.19 shows a generic fully differential switched-capacitor integrator. The benefits of the circuit are high PSRR, reduced clock feedthrough and switch charge injection errors, improved linearity, and increased dynamic range [15]. Note that the clock signals outside the parentheses are for non-inverting integrators and those inside are for the inverting version (delay-free).



Figure 3.19 Switched-capacitor implementation of a fully differential integrator.

The single-ended version of the integrator is shown in Figure 3.20. It is one half of the fully differential version and is easier to analyze.



Figure 3.20 Switched-capacitor implementation of an integrator.

The major functions of the integrator are to integrate (sum) charges and to maintain the noise level. The former is simple and is described in Reference [3]; the latter requires special attention. In Figure 3.20, the dominant noise source is the thermal noise generated by the MOS switch resistance. Capacitors do not generate any noise but accumulate noise generated by other noise sources [4]. Thus, $C_S$ must be designed sufficiently large to set the noise 9*dB* lower than the modulator noise floor [15]. Equation 3.26 shows the integrator RMS noise voltage. Note that 8 switches (a PMOS and an NMOS in parallel represent one switch in Figure 3.20) are involved in the sample-and-hold process, with the NMOS assumed to be the dominant noise source. Finally, the switch resistance values can be ignored in the noise calculation [4].

$$V_{n(rms)-\text{integrator}} = \sqrt{\frac{4kT}{C}}$$
(3.26)

Since the oversampling technique is used in the modulator design, the sampling capacitor can be scaled down and Equation 3.26 is modified as follows

$$V_{n(rms)-\text{integrator}} = \sqrt{\frac{4kT}{OSR \cdot C}}$$
(3.27)

The SNR of the integrator is

$$SNR_{NF} = 10\log\left(\frac{\left(V_{FS}/2\sqrt{2}\right)^{2}}{\left(V_{n(rms)-\text{integrator}}\right)^{2}}\right) = 10\log\left(\frac{V_{FS}^{2} \cdot OSR \cdot C}{32kT}\right) \le SNR_{\Delta-\Sigma} - 9dB \quad (3.28)$$

The first integrator dominates the overall system noise floor, and any noise injected at the 1<sup>st</sup> integrator is seen as signal by the following stages. Therefore, the first sampling capacitor must be designed large enough to minimize the noise level (Equation 3.29).

$$C_{s} \ge \frac{10^{SNR_{NF}/10} \cdot 32kT}{V_{FS}^{2} \cdot OSR}$$
(3.29)
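Equations 3.28 and 3.29 translate directly into a small sizing helper. The sketch below is illustrative only: the function names and the example numbers (100 dB target, 2 V full scale, 300 K) are hypothetical values chosen for the demonstration, not the design values of this work.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def min_sampling_cap(snr_nf_db, v_fs, osr, temp_k=300.0):
    """Minimum sampling capacitor from Equation 3.29, in farads."""
    return (10.0 ** (snr_nf_db / 10.0) * 32.0 * K_BOLTZMANN * temp_k
            / (v_fs ** 2 * osr))


def integrator_snr_db(c_s, v_fs, osr, temp_k=300.0):
    """Thermal-noise-limited SNR from Equation 3.28, in dB."""
    return 10.0 * math.log10(v_fs ** 2 * osr * c_s
                             / (32.0 * K_BOLTZMANN * temp_k))


# Hypothetical example values, not the design values of this work.
c_min = min_sampling_cap(snr_nf_db=100.0, v_fs=2.0, osr=64)
snr_check = integrator_snr_db(c_min, v_fs=2.0, osr=64)

print(f"C_S >= {c_min * 1e12:.2f} pF")
print(f"SNR with that capacitor: {snr_check:.1f} dB")
```

Substituting the computed capacitor back into Equation 3.28 returns the target SNR, which is a quick consistency check on the two expressions.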

The feedback capacitors accumulate the noise from 2 sets of switches and the OTA. The OTA noise (Equation 3.30) is assumed to be small and can be ignored since the $g_m$ of the OTA is large. In addition, 1/f noise can be reduced by increasing the channel area.

$$V_{n(rms)-OTA} \cong \sqrt{4kT\left(\frac{2}{3}\right)\frac{1}{g_m} + \frac{K}{WLC_{ox}f}}$$
(3.30)

where K is a constant that depends on device characteristics and can vary widely for different devices in the same process [4].

The first loop coefficient is realized by making $C_F$ two times larger than $C_S$ ($C_F = 2C_S$), so $C_F$ accumulates less noise than $C_S$. As a result, the integrator noise floor is dominated by the size of its sampling capacitor ($C_S$).

After the sampling and feedback capacitors are sized to the desired values, sizing the sampling switches and designing the OTAs are the next tasks. The sampling switches are sized according to the settling time and resolution. The sampling frequency dictates the former, and the latter depends on the SNR of the modulator. In this project, the switches are sized to meet the requirement of 18-bit resolution, settling in one-half of the sampling clock period. In practice, the resolution was set to 19-bit for one additional bit of safety margin. Note that the sampling switches consist of a PMOS and an NMOS in parallel in order to reach both power rails, and 2 sets of these switches are involved in the sampling process. The constraint on the switch on resistance is

$$\frac{V_o}{V_i} = 1 - \frac{V_{FS}/2^{(n+1)}}{V_{FS}} \le 1 - e^{-(t_s/2)/\tau}$$
(3.31)

$$R_{on} \le \frac{1}{4\pi C_I f_s(n+1)\ln 2}$$
(3.31a)

where n is the number of bits of resolution. When the sampling process begins, the sampling switches move from the off region to either the linear or the saturation region, depending on the initial V<sub>DS</sub> and its relationship with V<sub>GS</sub> ($V_{DS} \ge V_{GS} - V_{th}$). For this project, the switches are assumed to be at their highest resistance ($V_{DS} \cong \frac{1}{2}V_{DD}$), where the increasing NMOS resistance becomes larger than the decreasing PMOS resistance. The channel resistance of a transistor is approximately given by Equation 3.32. Combining Equations 3.31a and 3.32, the transistor geometry can be solved from Equation 3.32a.

$$R_{channel} \cong \frac{2}{\mu C_{ox} \frac{W}{L} V_{OV}}$$
(3.32)

$$\frac{W}{L} \ge \frac{2}{\mu C_{ox} R_{channel} V_{OV}}$$
(3.32a)
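The switch sizing chain of Equations 3.31a and 3.32a can be sketched as two small functions. All numeric values below (capacitance, sampling frequency, $\mu C_{ox}$, overdrive voltage) are hypothetical placeholders used only to exercise the formulas; the function names are likewise assumptions.

```python
import math


def max_on_resistance(c_i, f_s, n_bits):
    """Upper bound on switch on-resistance from Equation 3.31a, in ohms."""
    return 1.0 / (4.0 * math.pi * c_i * f_s * (n_bits + 1) * math.log(2.0))


def min_aspect_ratio(r_channel, mu_cox, v_ov):
    """Lower bound on W/L from Equation 3.32a."""
    return 2.0 / (mu_cox * r_channel * v_ov)


# Hypothetical example values, not the design values of this work.
C_I = 5e-12        # capacitance being charged, F
F_S = 1e6          # sampling frequency, Hz
N_BITS = 19        # 18-bit requirement plus one bit of margin
MU_COX = 100e-6    # process transconductance parameter, A/V^2
V_OV = 0.5         # switch overdrive voltage, V

r_on = max_on_resistance(C_I, F_S, N_BITS)
w_over_l = min_aspect_ratio(r_on, MU_COX, V_OV)

print(f"R_on <= {r_on:.0f} ohm")
print(f"W/L  >= {w_over_l:.1f}")
```

Raising the resolution or the sampling frequency lowers the allowed on-resistance and therefore raises the required W/L.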

Except for the common mode feedback circuit described in the next section, the detailed transistor level design and noise analysis of a fully differential folded cascode OTA are covered in References [15-16]. Here, the discussion covers the relationship between settling time, bandwidth, slew rate, and gain. Figure 3.21 shows the OTA without the CMFB circuit. Note that the circuit on the left side of the axis of symmetry is identical to the one on the right side.



Figure 3.21 A fully differential folded cascode OTA without the CMFB circuit.

In order to simplify the equations, the following OTA parameters (gain, BW, and GBP) are calculated for the single sided output. Except for GBP, these parameters are readily converted for the fully differential OTA by simply doubling their values, assuming that no external component connection crosses the axis.

$$GBP = \frac{g_{m-1p}}{C_L} \tag{3.33}$$

$$A_{0} = g_{m-1p} \left( R_{3p} / / R_{3n} \right)$$
 (3.34)

The load capacitance is the prerequisite for starting the analysis. In closed loop operation, where the sampling and feedback capacitors form the feedback network around the OTA, the closed loop GBP is calculated as

$$\omega_{CL} = \beta \cdot GBP \tag{3.35}$$

From Figure 3.20, the feedback network and the load of the OTA are

$$\beta = \frac{\frac{1}{s(C_s + C_{gs})}}{\frac{1}{s(C_s + C_{gs})} + \frac{1}{sC_F}} = \frac{C_F}{C_s + C_{gs} + C_F}$$
(3.36)

$$C_{L-OTA} = C_L + \frac{C_F (C_S + C_{gs})}{C_F + C_S + C_{gs}}$$
(3.37)

where $C_L$ here is the sum of $C_{db-p}$ and $C_{db-n}$ in parallel and the capacitance of the next stage that the OTA must drive. Note that the former are very small and can thus be ignored. Substituting Equations 3.36 and 3.37 into Equation 3.35 and rearranging terms gives

$$\omega_{CL} = \beta \cdot \frac{g_m}{C_{L-OTA}} = \frac{g_m}{C_L + (C_S + C_{gs}) + \frac{C_L(C_S + C_{gs})}{C_F}}$$
(3.38)

$$C_{Leff} = C_L + (C_S + C_{gs}) + \frac{C_L(C_S + C_{gs})}{C_F}$$
(3.39)

Equation 3.39 shows the effective load capacitance. Substituting $\tau = \frac{2\pi}{\omega_{CL}}$ into Equation 3.31 and rearranging the equation gives

$$\frac{t_s}{2} \ge \frac{2\pi}{\omega_{CL}} (n+1) \ln 2 = \frac{2\pi C_{Leff}}{g_m} (n+1) \ln 2$$
(3.40)

In order to fulfill the bandwidth requirement (settling ½ of the sampling period), OTA transconductance must follow Equation 3.41.

$$g_m \ge 4\pi C_{Leff} f_s(n+1) \ln 2$$
 (3.41)
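The chain from Equation 3.36 to Equation 3.41 can be checked numerically. The sketch below uses hypothetical capacitor values (with $C_F = 2C_S$ as stated earlier) and assumed function names; it also verifies that $\beta / C_{L-OTA}$ equals $1 / C_{Leff}$, which is the identity behind Equation 3.38.

```python
import math


def feedback_factor(c_s, c_gs, c_f):
    """Feedback factor, Equation 3.36."""
    return c_f / (c_s + c_gs + c_f)


def ota_load(c_l, c_s, c_gs, c_f):
    """Total OTA load capacitance, Equation 3.37."""
    return c_l + c_f * (c_s + c_gs) / (c_f + c_s + c_gs)


def effective_load(c_l, c_s, c_gs, c_f):
    """Effective load capacitance, Equation 3.39."""
    return c_l + (c_s + c_gs) + c_l * (c_s + c_gs) / c_f


def min_gm(c_leff, f_s, n_bits):
    """Minimum OTA transconductance, Equation 3.41, in siemens."""
    return 4.0 * math.pi * c_leff * f_s * (n_bits + 1) * math.log(2.0)


# Hypothetical example values, not the design values of this work.
C_S, C_F, C_GS, C_L = 5e-12, 10e-12, 0.2e-12, 2e-12
F_S, N_BITS = 1e6, 19

beta = feedback_factor(C_S, C_GS, C_F)
c_lota = ota_load(C_L, C_S, C_GS, C_F)
c_leff = effective_load(C_L, C_S, C_GS, C_F)
gm = min_gm(c_leff, F_S, N_BITS)

print(f"beta = {beta:.3f}, C_L-OTA = {c_lota * 1e12:.2f} pF")
print(f"C_Leff = {c_leff * 1e12:.2f} pF, g_m >= {gm * 1e3:.2f} mS")
```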

In reference [17], simulation shows that an integrator with lower bandwidth than the sampling rate, and correspondingly inaccurate settling, will not impair the $\Delta$-$\Sigma$ modulator performance, provided that the settling process is linear. This statement assumes that the OTA is not slew rate limited. Slew rate limiting of the integrator is, in any case, non-linear behavior and must be avoided. With the use of a multi-bit quantizer, slew rate limiting is of little concern, as will be demonstrated later. Equation 3.42 shows that SR is the ratio of an OTA's differential pair tail current to its effective capacitive load and is also related to the full power bandwidth.

$$SR = \frac{I_{tail}}{C_{Leff}} = 2\pi f_M \frac{V_{FS}}{2} = \pi f_M V_{FS}$$
(3.42)

$$f_{M} = \frac{V_{OV}GBP}{\pi V_{FS}} = \frac{4V_{OV}2^{B}f_{S}}{V_{FS}}$$
(3.43)

The full power bandwidth is linearly proportional to the tail current; increasing the tail current improves the bandwidth. The drawback is that increasing the tail current increases the bias current and results in an overall increase in power consumption.

The last parameter is the gain of an OTA. An ideal integrator has the transfer function as

$$H(z) = \frac{1}{z - 1} \tag{3.44}$$

Finite DC gain of the OTA transforms Equation 3.44 into Equation 3.45 and results in a leaky integrator. The NTF zeros move away from the unit circle and toward Z = 0, which reduces the attenuation of the quantization error in the base-band and thus deteriorates the SNR [17].

$$H(z) = \frac{1}{z - \left(1 - \frac{1}{A_0}\right)}$$
(3.45)
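The difference between Equations 3.44 and 3.45 is easy to see in the time domain. The following sketch runs the corresponding difference equation on a unit step; the function name and the gain value $A_0 = 64$ are hypothetical choices for illustration.

```python
def integrate(samples, dc_gain=None):
    """Run y[n+1] = p*y[n] + x[n], i.e. H(z) = 1/(z - p).

    p = 1 for the ideal integrator (Equation 3.44) and
    p = 1 - 1/A0 for a finite OTA DC gain A0 (Equation 3.45).
    """
    pole = 1.0 if dc_gain is None else 1.0 - 1.0 / dc_gain
    y = 0.0
    out = []
    for x in samples:
        out.append(y)          # output before the current sample is added
        y = pole * y + x
    return out


step = [1.0] * 2000
ideal = integrate(step)                   # grows without bound
leaky = integrate(step, dc_gain=64.0)     # A0 = 64 is a hypothetical value

print("ideal output after 2000 samples:", ideal[-1])
print("leaky output after 2000 samples:", leaky[-1])
```

The ideal integrator keeps accumulating, while the leaky one saturates at a DC gain of $A_0$, which is the time domain view of the pole moving from $z = 1$ to $z = 1 - 1/A_0$.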

According to reference [17], the gain of an OTA can be as low as the OSR. The SNR obtained with such a low gain OTA is only 1dB worse than that obtained with infinite DC gain. The simulation results and Equation 3.30, however, demonstrate that the noise of the OTA can be minimized by a large $g_m$, which implies that a large gain is beneficial.

#### 3.4.2 Dynamic common mode feedback

The requirement for a common mode feedback circuit is the main drawback of a fully differential OTA. A carelessly designed CMFB circuit will increase noise and power dissipation and reduce the bandwidth and output swing of the OTA. The function of this circuit is to set and control the output common mode voltage at a desired level, i.e., $V_{ref} = (V_{DD} - V_{SS})/2$. Typically, the CMFB is used to control the current sources in the output stage to establish the common mode output level of the OTA. For a larger output signal swing, a switched-capacitor CMFB is used instead of a continuous time approach. The design shown in Figure 3.22, initially proposed in Reference [18], is particularly suited for low power applications.



Figure 3.22 Architecture of the switched-capacitor CMFB circuit.

The circuits inside the dashed box of Figure 3.22 are the SC circuits. They sample the output and then average it; $\phi 1$ is for sampling and $\phi 2$ is for averaging. Transistors M<sub>1</sub>, M<sub>2</sub>, M<sub>3</sub>, and M<sub>4</sub> form a CMFB OTA that amplifies the voltage difference between V<sub>cm</sub> and V<sub>ref</sub>. The amplified output V<sub>CNTL</sub> controls the adjust transistors, M<sub>1n</sub> and M<sub>2n</sub> of the OTA (Figure 3.21), so that the output common mode voltage is adjusted to equal V<sub>ref</sub>.

The gain of the CMFB loop is designed to be high enough to correct $V_{cm}$, but not so high as to destabilize the OTA [18]. Figure 3.23 shows the diagram of the CMFB loop, which consists of half of the OTA and CMFB circuit and the CMFB OTA.



Figure 3.23 Circuit diagram of the CMFB loop.

The CMFB loop gain is approximately equal to

$$A_{loop-CMFB} = \frac{\Delta V_{cm}}{\Delta V_{ref}} \cong \frac{g_{m-2n}(R_n //R_p) \frac{1}{2} \frac{g_{m-2}}{g_{m-4}}}{1 + \left(g_{m-2n}(R_n //R_p) \frac{1}{2} \frac{g_{m-2}}{g_{m-4}}\right)}$$
(3.46)

where $R_n//R_p$ is the resistance looking into the output node $V_0$ of the OTA. For simplicity, the ratio $\left(\frac{g_{m-2}}{g_{m-4}}\right)$ can be set to 2 so that the whole loop gain is dominated by the device gain of the OTA. Note that the transistor sizes of the CMFB OTA are designed to be one half of those in the OTA. This size reduction does not degrade the GBP of the CMFB OTA since its capacitive load is small. As a result, the power dissipation is reduced. The preceding discussion covers the primary function of the CMFB circuit. Its secondary function is that the SC circuits low pass filter the control voltage, so that high frequency injected noise does not interfere with the feedback loop during the sampling period. Figure 3.24 shows the equivalent circuit of the dashed box section of Figure 3.22.



Figure 3.24 Equivalent circuit of the switched-capacitor section of the CMFB circuit.

At DC and low frequency, the equivalent R network dominates the final $V_{cm}$. As the frequency increases, the C<sub>0</sub> branch takes over. Both the resistor and capacitor networks work as voltage dividers: C<sub>0</sub> and the C<sub>GS</sub> of M<sub>1</sub> at high frequency (stop-band) and the two equivalent resistors at low frequency (pass-band). Equation 3.47 shows that the C<sub>0</sub> capacitance controls the high frequency attenuation, since C<sub>GS-M1</sub> is fixed when the gain of the CMFB OTA is designed.

$$V_{cm} \cong \frac{2C_o}{2C_o + C_{GS}} (V_{o+} + V_{o-}) \frac{R}{R + R}$$
(3.47)

$C_R$ is then sized for the desired bandwidth. Since the pole and zero are in close vicinity, simulation tools such as OrCAD<sup>®</sup> are required to confirm the bandwidth, which should be 2~3 times wider than the sampling frequency.

3.4.3 Quantizer

The quantizer is designed using a regenerative circuit as comparator. The difference between the input signals and the reference voltages generated from the serial resistor network sets the initial condition for the regenerative process. Note that quantizer capacitors,  $C_Q$ , are used to sample the 2<sup>nd</sup> integrator output and hold the charges. Figure 3.25 shows a single regenerative quantizer.



Figure 3.25 Circuit diagram of the low power regenerative quantizer.

The number of quantizers depends on the desired level of quantization and the available excess digital BW. Note that the number of quantizers need not be a power of two. Except in extreme cases, the more quantizers that can fit into the wider digital BW, the better. For a B-bit quantizer, 2<sup>B</sup> quantization levels require 2<sup>B</sup>-1 regenerative circuits. The reference string requires 2<sup>B</sup> resistors. Note that the two end resistors are half the value of the others, which creates a 0.5 LSB offset. This results in a mid-tread quantizer.

The clock sequence is as follows. At phase 1a, the quantizer capacitors sample both the input and reference voltages. Concurrently, the reset switches are closed to rebalance the regenerative circuits. At phase 2a, both quantizer capacitors are connected to each side of the circuits, charging the $C_{GS}$ of every inverter. Note that the power supplies are off at this stage. After the circuits settle, the power is applied at phase 2 and the regenerative process begins.
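The reference string described above can be modeled in a few lines to see where the comparator thresholds fall. This is a behavioral sketch only: the function names are assumptions, the resistor values are normalized to the inner resistors, and ideal comparators stand in for the regenerative circuits.

```python
def ladder_thresholds(n_bits, v_ref=1.0):
    """Tap voltages of a string of 2**n_bits resistors whose two end
    resistors are half the value of the others."""
    n_res = 2 ** n_bits
    values = [1.0] * n_res
    values[0] = values[-1] = 0.5          # half-size end resistors
    total = sum(values)
    taps, running = [], 0.0
    for r in values[:-1]:                 # one tap between adjacent resistors
        running += r
        taps.append(v_ref * running / total)
    return taps


def thermometer_code(v_in, thresholds):
    """One ideal comparator decision per threshold, lowest threshold first."""
    return [1 if v_in > t else 0 for t in thresholds]


taps = ladder_thresholds(n_bits=4)
code = thermometer_code(0.52, taps)

print("number of thresholds:", len(taps))
print("first threshold / spacing:", taps[0] / (taps[1] - taps[0]))
print("thermometer code for 0.52 * V_ref:", code)
```

For B = 4 the string yields 15 thresholds, matching the 2<sup>B</sup>-1 regenerative circuits, and the first threshold sits half a step above the bottom rail, which is the 0.5 LSB offset mentioned above.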

The latch time of the regenerative circuit can be found from Equation 3.48 [4].

$$t_{latch} \cong K \frac{L^2}{\mu \cdot V_{ov}} \ln \left( \frac{V_{FS}}{V_{FS}/2^{B+1}} \right) = K \frac{L^2}{\mu \cdot V_{ov}} (B+1) \ln 2$$
(3.48)

where K is the loading factor. The equation shows that the latch time depends only on the technology and not on the design, assuming that the overdrive voltage is maximized and the loading capacitor is minimized. Note that if the initial voltage is smaller than $\frac{V_{FS}}{2^{B+1}}$, the latch time will be larger than $t_{latch}$, which results in metastability [4].

Since Equation 3.48 reveals that the latch time cannot be improved by design, the focus now shifts to the voltage droop between phases 2a and 2. The circuit relies on symmetry: the circuit seen from the input and reference sides should have the same load. Under this condition, the droop rates are equal on both sides. The circuit symmetry is achieved through careful layout techniques, i.e. common-centroid and interdigitated layout. More details will be discussed later.

Immediately after the rise of phase 2a, the quantizer capacitors charge or discharge the gate capacitors on both sides of regenerative circuits. Based on the charge sharing and energy conservation theories, the final equilibrium voltage is equal to

$$V_f = \frac{V_1 C_{Q} + V_2 \sum C_{GS-inverter}}{C_{Q} + \sum C_{GS-inverter}}$$
(3.49)

where $\Sigma C_{GS-inverter}$ is the sum of the gate capacitors of all inverters connected to V<sub>2</sub>. The worst-case scenarios are two initial conditions: V<sub>1</sub> at full scale and V<sub>2</sub> at zero, and vice versa. Since the quantization error is modeled in the range of $\pm \frac{\Delta}{2}$ or $\pm \frac{LSB}{2}$, the quantizer capacitors must be selected large enough that the final voltage will not drop or rise out of this range.

Rearranging Equation 3.49 solves the capacitance of the quantizer capacitor.

$$C_{Q} = \sum C_{GS-inverter} \frac{V_{f} - V_{2}}{V_{1} - V_{f}}$$
(3.50)

Substituting $V_1 = V_{FS}$ and $V_2 = 0$ into Equation 3.50 for one set of initial conditions and setting the error of the final equilibrium voltage within half of the quantum ($V_f = V_{FS} - \frac{V_{FS}}{2^{B+1}}$) gives

$$C_{Q} \ge C_{GS}(2^{B+1} - 1) \tag{3.51}$$

Substituting V<sub>1</sub> = 0 and V<sub>2</sub> = V<sub>FS</sub> and setting $V_f = \frac{V_{FS}}{2^{B+1}}$ gives the same result as Equation 3.51.
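The charge sharing result can be verified numerically for both worst-case initial conditions. The sketch below uses assumed function names and hypothetical values for the total gate capacitance and full scale voltage.

```python
def equilibrium_voltage(v1, v2, c_q, c_gs_total):
    """Final voltage after charge sharing, Equation 3.49."""
    return (v1 * c_q + v2 * c_gs_total) / (c_q + c_gs_total)


def min_quantizer_cap(c_gs_total, n_bits):
    """Minimum quantizer capacitor, Equation 3.51."""
    return c_gs_total * (2 ** (n_bits + 1) - 1)


# Hypothetical example values, not the design values of this work.
B = 4
V_FS = 3.0
C_GS_TOTAL = 10e-15

c_q = min_quantizer_cap(C_GS_TOTAL, B)
half_lsb = V_FS / 2 ** (B + 1)

vf_high = equilibrium_voltage(V_FS, 0.0, c_q, C_GS_TOTAL)   # V1 = V_FS, V2 = 0
vf_low = equilibrium_voltage(0.0, V_FS, c_q, C_GS_TOTAL)    # V1 = 0, V2 = V_FS

print(f"C_Q >= {c_q * 1e15:.0f} fF")
print(f"droop from full scale: {V_FS - vf_high:.4f} V (limit {half_lsb:.4f} V)")
print(f"rise from zero:        {vf_low:.4f} V (limit {half_lsb:.4f} V)")
```

With the capacitor sized at the bound of Equation 3.51, both cases land exactly half a quantum away from the sampled value; any larger $C_Q$ keeps the error inside that range.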

Scaling the resistors of the reference network uses the same equation as scaling the sampling resistors. Since the error (bow effect [4]) in the reference voltage is greatest at the center and decreases when moving away from the center, optimizing the resistors should start from the center. The equivalent resistance can be found from Equation 3.52.

$$R \le \frac{R_i}{\left[\left(2^{B-1} - \left|2^{B-1} - i\right|\right) / / \left(2^{B-1} + \left|2^{B-1} - i\right|\right)\right]}$$
(3.52)

where i is the reference point counting from top or bottom and satisfies the condition  $0 < i < 2^{B}$ . The resistors should be chosen as large as possible (shorten the settling time to  $1/4 \sim 1/8$  of the sampling period if necessary) so the power dissipation can be minimized.

# 3.4.4 Serial D/AC

The quantizer output is thermometer coded digital data. The data are then distributed in two directions: directly to the serial D/AC (Figure 3.26) and to the encoder that converts the thermometer code to binary for the following decimation filters (Chapter 5). The D/AC consists of a parallel-to-serial shift register, combinational logic and three switches, and two reference currents. Note that the serial output is clocked out by the D/AC's clock ($(2^{B+1}-2)f_s$) during the on periods of phase 1 for the 2<sup>nd</sup> integrator and phase 2 for the 1<sup>st</sup> integrator.



Figure 3.26 Circuit diagram of the parallel-to-serial current feedback D/AC.

The parallel-to-serial shift register (Figure 3.27) is a group of D type flip-flops and data multiplexers. When the Sel signal is high, the data are loaded from the quantizer and stored into the registers. After the signal goes low, the data paths change from a parallel load to serial out register with the output of each DFF connecting to the input of the next one. The last stage outputs serve as the control signals to steer the current via the feedback switches. Note that the last stage output (Q only) is also fed back to the first stage input so the data are in circulation and used during both clock phases.



Figure 3.27 Circuit diagram of the parallel-to-serial shift register.

The feedback D/AC is just a group of current steering switches for the reference currents. While the output of the shift register (Q) is high, $I_{FB}$ is integrated onto one of the feedback capacitors of the integrators. The on time of the reference current is proportional to the quantizer output serially processed by the shift register. The equivalent voltage output of the D/AC can be expressed as follows

$$V_{DAC} = \frac{I_{FB} t_{D/AC}}{C_F} \frac{CNT_Q}{2^B - 1}$$
(3.53)

where  $CNT_Q$  is the total number of "1s" in the thermometer code of the quantizer and  $t_{D/AC}$  is the on time of the D/AC clock.
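The behavior of the circulating shift register and Equation 3.53 can be illustrated with a short behavioral model. This is a sketch under stated assumptions, not the circuit of Figure 3.27: the function names are hypothetical, the register is modeled as a Python list, and the current, on time, and capacitor values are placeholders.

```python
def serialize(thermometer, cycles=1):
    """Parallel-load a thermometer code and shift it out one bit per D/AC
    clock. The last-stage output is fed back to the first stage, so the
    word circulates and can be replayed for the second clock phase."""
    register = list(thermometer)
    stream = []
    for _ in range(cycles * len(register)):
        q = register[-1]                   # last-stage output steers I_FB
        stream.append(q)
        register = [q] + register[:-1]     # shift by one and recirculate
    return stream


def dac_voltage(stream, i_fb, t_dac, c_f, n_bits):
    """Equation 3.53 with CNT_Q taken as the number of ones in the stream."""
    cnt_q = sum(stream)
    return i_fb * t_dac / c_f * cnt_q / (2 ** n_bits - 1)


# Hypothetical example: 4-bit quantizer with 9 of 15 comparators tripped.
B = 4
code = [1] * 9 + [0] * 6
stream = serialize(code, cycles=2)
first_pass, second_pass = stream[:15], stream[15:]

v_dac = dac_voltage(first_pass, i_fb=10e-6, t_dac=10e-9, c_f=10e-12, n_bits=B)

print("serial word, first pass: ", first_pass)
print("serial word, second pass:", second_pass)
print(f"V_DAC = {v_dac * 1e3:.2f} mV")
```

Because the word recirculates, both passes carry the same number of ones, so the same feedback value is available to each integrator on its own clock phase.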

This D/AC is a 1-bit D/AC and inherently linear. It is charge-based and monotonic by design [2]. Its accuracy, however, is at the center of this D/AC design. From Equation 3.53, the D/AC output depends on the signal resolution, including the rise and fall times and jitter, the matching of the capacitors, and the resolution of the reference currents. The total D/AC error can be modeled as Equation 3.54 [15].

$$\sigma_{D/AC} = \sqrt{\left(t_{jitter}\right)^2 + \left(\frac{\Delta I_{FB}}{I_{FB}}\right)^2}$$
(3.54)

where  $\Delta I_{FB}$  is the noise of the D/AC reference current and

$$t_{jitter} = \frac{\Delta t_{D/AC}}{t_{D/AC}} = \frac{\sqrt{0.7kTR_n}}{A\sqrt{\pi t_r}}$$
(3.55)

where  $R_n$  is the equivalent noise resistance of the D/AC and A is the amplitude of the ideal sine wave used to generate the D/AC clock.

Capacitor matching can be improved by common-centroid and interdigitated layout techniques, which will be described later. Equation 3.55 shows that maximizing the rise and fall times minimizes the clock jitter; this is done by sizing the output driver (transistor length) of the D/AC clock and the current steering switches. Moreover, the central limit theorem is applied for further reduction. By nature, thermometer coded data are grouped together; the output of the shift register is a cluster of ones followed by a period of all zeros, or vice versa. By gating the data with the D/AC clock, the output becomes a train of zero and one pulses. The jitter can thus be effectively reduced to

$$t_{jitter-effective} = \sqrt{\frac{\sum \left(\frac{\Delta t_{D/AC}}{t_{D/AC}}\right)^2}{n}}$$
(3.56)

where $\Delta t_{D/AC}$ is the uncertainty of the clock signal and n is the number of pulses used to achieve each feedback (2<sup>B</sup> - 1 at maximum, or 2<sup>B-1</sup> - 1 on average). Note that once the jitter noise is reduced below the $\frac{kT}{C}$ noise, the reference current noise dominates; it is subject to the same constraint as the jitter noise in order to maintain the SNR of the modulator set by the sampling capacitor (Equation 3.29).

The D/AC error raises three practical issues for the modulator: linear scaling error, leaky integration, and INL. The switched reference current scales the modulator feedback coefficients of the NTF zeros, and the shift in zero placement degrades the noise shaping. Leaky integration resulting from leaky current steering transistors has the same impact as the linear scaling error. In order to minimize INL, the D/AC error must be constrained within $\pm \frac{\Delta}{2}$ to realize the accuracy of the modulator [19]. Thus, fine calibration of the references and long channel lengths for the current steering transistors are required for a high resolution A/DC. Moreover, the fully differential architecture allows one reference current per feedback coefficient, which ensures excellent D/AC feedback accuracy and optimizes power dissipation.

Although the D/AC clock frequency is designed at  $(2^{B+1}-2)$  times the oversampling frequency, the increase in digital power dissipation is minimal as the gate count and area of the shift register and circuits to drive the D/AC are quite small.

3.4.5 Clocking sequence

In order to maximize the modulator loop efficiency and the utilization of the clock period, each component in the modulator is assigned to operate at a specific phase. Thus, there is no excess delay in the loop and the whole operation is a pipelined process.

The assignment of clock phases starts from the integrators because they need the whole clock-on period to sample and settle the data. Since the second integrator (inverted) samples and settles in the same period, it is the center of this scheduling. Integrator 2 can be assigned to either phase 1 or phase 2. Once that is decided, integrator 1 is assigned to the same phase and the quantizer is assigned to the opposite phase to streamline the data flow.

The CMFB circuits sample and settle at the opposite phases of the integrators because the CMFB sampling capacitors are connected to the OTA output all the time as the load of the OTA. They sample the settled integrator output, process it, and feed it back to the adjust transistors at the beginning of the next stage's sampling phase. This adjustment time is designed to be as short as possible, which leaves the whole clock-on period for the next stage to settle. Therefore, the CMFB OTAs must have wider bandwidth than the main OTAs for faster settling.

The D/AC is assigned to feed the output back to the OTAs during the first half of the clock-on time. The other half is reserved for OTA settling. During the feedback process, the OTA input voltages are a function of the D/AC feedback currents, and the OTA cannot begin settling before the feedback completes. As a result, the OTAs must be designed to settle in <sup>1</sup>/<sub>4</sub> of the sampling period.

The above discussion is summarized in Table 3.3. Note that the mark 'X' indicates resetting the sampling capacitors.

|         | Int 1  | CMFB (Int 1) | Int 2         | CMFB (Int 2) | Quantizer | D/AC  |
|---------|--------|--------------|---------------|--------------|-----------|-------|
| Phase 1 | Sample | Settle       | Sample/Settle | Sample       | Settle    | Int 2 |
| Phase 2 | Settle | Sample       | X             | Settle       | Sample    | Int 1 |

Table 3.3 Clock phase assignment of the modulator.

3.5 Device

For low power circuits, the design always starts from the architecture and moves down to the device level (top-down) when minimizing the power consumption of the total system. In Section 3.1, the modulator implemented with a multi-bit quantizer demonstrated a significant improvement in power efficiency. After the architecture is selected, circuit implementation can commence. For all modulators, the OTA of the 1<sup>st</sup> integrator dominates the power dissipation because it must settle the sampling capacitors that determine the noise floor. Balancing the drive capability and accuracy (bandwidth and gain) is the major challenge of this OTA design. For easier reference, Equations 3.33, 3.34, and 3.42 are rewritten in general form to show the designer-controlled parameters.

$$GBP = \frac{g_m}{C_L} = \frac{g_m}{nC_{GS}} \propto \frac{V_{OV}}{L^2}$$
(3.57)

$$A_0 = \frac{g_m}{g_0} = \frac{g_m}{\lambda I_{bias}} = \frac{2}{\lambda V_{OV}}$$
(3.58)

$$SR = \frac{2I_{bias}GBP}{g_m} \propto \frac{V_{OV}^2}{L^2}$$
(3.59)

Gain, GBP, and SR are closely related, and they all depend on the design requirements. The following discussion focuses only on adjusting these general parameters: bias current, transistor channel width and length, and overdrive voltage. The first parameter is the bias current (or leg current). SR is proportional to it, GBP is proportional to its square root, and the gain is inversely proportional to its square root (Table 3.4). Since the power dissipation is proportional to the bias current, the rule of thumb is not to over-design the bandwidth. In general, a BW  $2\sim3$  times wider than the requirement for low frequency applications and  $1\sim2$  times wider for high frequency ones should compensate quite well for process variation.

Because most OTA designs require high gain, analog designers often scale the transistor's channel width to obtain high transconductance, or its channel length to obtain high output resistance. Lengthening the channel, in addition, reduces the effect of channel length modulation: the longer the channel is, the less the modulation affects the transistor performance. Practically, the channel lengths of analog transistors must be scaled to 4~5 times the minimum in order to minimize the modulation and improve matching [20]. The drawback of this approach is that the available bandwidth is reduced by approximately 56~87 times (Equation 3.24), but the gain is substantially increased. After the channel length is selected, the channel width is scaled for the required bandwidth.

Selecting the overdrive voltage is a dilemma when designing OTAs. Table 3.4 shows that the voltage is involved in all three parameters. Increasing the voltage appears to benefit all of them except the gain. However, a higher overdrive voltage results in a narrower OTA output swing and a reduced DR. In general, the voltage is selected within the range of 200 to 350 mV to maintain high gain. A more precise value can be extracted from the method described in the next section. In addition, as process technologies advance and device sizes shrink, the power supply is also scaled down. This trend makes it very difficult for analog designers to obtain high SNR. Therefore, selecting an optimal overdrive voltage is essential.

| Gain ($A_0$)                | Gain bandwidth product (GBP) | Slew rate (SR)         |
|-----------------------------|------------------------------|------------------------|
| $\frac{1}{\sqrt{I_{bias}}}$ | $\sqrt{I_{bias}}$            | $I_{bias}$             |
| $\frac{2}{\lambda V_{OV}}$  | $\frac{V_{OV}}{L^2}$         | $\frac{V_{OV}^2}{L^2}$ |

Table 3.4 Design parameters vs. OTA parameters.
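The bias current row of Table 3.4 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `scale_with_bias` and the scale factor `k` are assumptions introduced here, and the code applies nothing more than the proportionalities listed in the table, with all other parameters held fixed.

```python
import math

def scale_with_bias(k):
    """Relative change of the OTA parameters when I_bias is multiplied by k.

    Uses only the proportionalities of Table 3.4:
    A0 ~ 1/sqrt(I_bias), GBP ~ sqrt(I_bias), SR ~ I_bias.
    """
    return {
        "gain": 1.0 / math.sqrt(k),
        "gbp": math.sqrt(k),
        "slew_rate": k,
    }

# Quadrupling the bias current (and hence the power) only doubles the GBP
# while halving the gain.
factors = scale_with_bias(4.0)
print(factors)
```

Under these assumptions, a fourfold increase in power buys only a twofold increase in bandwidth, which is why over-designing the bandwidth is costly.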

An OTA is just a complex version of a single transistor from the amplification point of view. In order to increase the amplification with minimum resources, the single transistor must be biased at the correct overdrive voltage. Figure 3.28 shows the simulation setup.



Figure 3.28 Circuit schematics of testing a single transistor.

After the drain current ( $I_{D2}$ ) is measured, its first derivative (Equation 3.60) is taken and plotted in Figure 3.29.

$$\frac{\partial I_D}{\partial V_{GS}} = g_m = \mu C_{OX} \frac{W}{L} V_{OV} \tag{3.60}$$

The transconductance improves as the gate voltage increases. However, this gives no additional information beyond confirming that  $g_m$  is proportional to the overdrive voltage. The second derivative (Equation 3.61) is then taken and plotted in Figure 3.30.

$$\frac{\partial^2 I_D}{\partial V_{GS}^2} = \mu C_{OX} \frac{W}{L}$$
(3.61)

In the figure, there is a maximum at a specific  $V_{GS} (\cong 0.96V)$ , which is the transition point between triode and saturation. Based on Equation 3.61, the only variable is the mobility ( $\mu$ ) of electrons (or holes). At this voltage, the effective mobility is at its highest. Further increasing the overdrive voltage increases  $g_m$  at the expense of decreasing mobility. Transistors biased at this voltage yield the most gain for the minimum power. Note that throughout this dissertation, all transistors are designed to operate in the saturation region, biased at a voltage higher than that of the maximum, in order to prevent the transistors from drifting out of the region during operation and to compensate for process variation. For transistors biased at the maximum or at lower voltages for extremely low power design, details can be found in many references related to EKV modeling.
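The derivative procedure can be sketched numerically. The example below does not use the simulated SOS device data; it assumes a hypothetical smooth drain current model (a softplus effective overdrive with a first-order mobility degradation term), and every device value in it (`V_TH`, `BETA`, `THETA`, `A`) is an illustrative assumption. It only shows how the first and second derivatives are taken from a swept $I_D$ curve and how the peak of the second derivative is located.

```python
import math

# All device values below are illustrative assumptions, not extracted SOS data.
V_TH = 0.7              # threshold voltage (V)
BETA = 1e-3             # mu0 * Cox * W / L (A/V^2)
THETA = 0.5             # mobility degradation coefficient (1/V)
A = 2 * 1.3 * 0.0259    # smoothing voltage 2 * n * U_T (V)

def drain_current(vgs):
    # Smooth effective overdrive: ~0 below threshold, ~(vgs - V_TH) above it.
    v_eff = A * math.log1p(math.exp((vgs - V_TH) / A))
    return 0.5 * BETA * v_eff ** 2 / (1.0 + THETA * v_eff)

STEP = 1e-3
vgs = [i * STEP for i in range(0, 2001)]      # 0 V to 2 V sweep
i_d = [drain_current(v) for v in vgs]

# Central differences: the first derivative is g_m, the second tracks beta.
gm = [(i_d[k + 1] - i_d[k - 1]) / (2 * STEP) for k in range(1, len(vgs) - 1)]
d2 = [(i_d[k + 1] - 2 * i_d[k] + i_d[k - 1]) / STEP ** 2
      for k in range(1, len(vgs) - 1)]

peak_index = max(range(len(d2)), key=d2.__getitem__)
v_peak = vgs[peak_index + 1]   # +1 because d2 starts at the second sample
print(f"second-derivative peak at VGS = {v_peak:.3f} V")
```

With measured or simulated data, the lists `vgs` and `i_d` would simply be replaced by the swept values; the location of the peak depends entirely on the device and will differ from the toy model used here.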



Figure 3.29 The 1<sup>st</sup> derivative of the simulated  $I_D$  ($g_m$ vs. $V_{OV}$).



Figure 3.30 The  $2^{nd}$  derivative of the simulated  $I_D$  ($\beta$ vs. $V_{OV}$).

# 3.6 Layout

In digital layout, area and timing constraints are the two major parameters, beyond circuit function, that pressure the "digital designers". Analog designers are not so lucky. Besides those constraints, they must be concerned with noise interference and with circuit matching for  $V_{OS}$, CMFB, CMRR, and PSRR.

Circuit isolation is important in analog circuit design. As more and more analog and digital circuits are fabricated on the same die, and digital circuits run at higher and higher operating frequencies (generating more noise), wider spacing between circuits is required. Although there is no substrate in the SOS process to couple the noise, high frequency signals can still radiate when their wavelength is comparable to the interconnect length (which acts as an antenna) and can induce electrical coupling, known as crosstalk, on neighboring wires through mutual capacitance and inductance. Wider spacing, shorter lines, narrower line width (higher line resistance), GND separation traces, and a shielded multi-interconnection scheme [21] can mitigate this interference.

Analog circuits demand precision craftsmanship, but wafer processes do not provide it. This has been a problem since the first analog circuits were fabricated on silicon. Thus, besides employing ingenious circuit designs, analog designers require a certain knowledge of wafer processes and layout techniques. Since processing is another specialty and requires a long discussion, readers can obtain such knowledge from reference [22]. The remainder of this section briefly describes the layout techniques.

The majority of analog circuits involve symmetry (e.g. a differential pair) or mirroring (e.g. a current mirror). Matching of related circuits or components, in this case, is important. In order to compensate for process gradients, two special layout techniques are discussed: common-centroid and interdigitated.

Figure 3.31 shows the common-centroid layout. The gradient (shown in dashed lines) is the same in any direction. This is the optimal layout for matching. However, the component values are sometimes not divisible by  $2^{2k}$, where k = 1, 2, ....



Figure 3.31 Layout pattern of a common-centroid.



Figure 3.32 Layout pattern of an interdigitate.

The alternative is to apply the interdigitated technique (Figure 3.32) to the layout. Note that the interdigitated technique can be used for any values that are divisible by 2.

In both layouts, the components used in the circuit are surrounded by elements referred to as dummies. They are made of the same elements as the main components but are not used in the circuit and have no electrical connections. Their purpose is to improve the matching. Without these dummies, the outermost structures will not see the same process gradient as the inner ones. Figure 3.31 shows that the dashed lines can be extended to include the dummies. However, the interdigitated layout cannot reach the same symmetry with respect to its dummy elements. This demonstrates that the common-centroid method is the better layout technique.
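The effect of a process gradient on the two arrangements can be illustrated with a small numerical sketch. It assumes a purely linear gradient, unit elements of nominal value 1, a cross-coupled quad as the common-centroid pattern, and a single alternating A B A B row as one common interdigitated arrangement; the exact patterns of Figures 3.31 and 3.32 are not reproduced here, and the function name `mismatch` and the gradient values are hypothetical.

```python
def mismatch(layout, gx, gy):
    """Difference between the summed unit values of devices A and B.

    layout is a grid of 'A'/'B' unit labels; each unit has the value
    1 + gx*column + gy*row, i.e. a linear process gradient.
    """
    totals = {"A": 0.0, "B": 0.0}
    for row, line in enumerate(layout):
        for col, name in enumerate(line):
            totals[name] += 1.0 + gx * col + gy * row
    return totals["A"] - totals["B"]

common_centroid = ["AB",
                   "BA"]        # cross-coupled quad
alternating_row = ["ABAB"]      # simple alternating (interdigitated) row

cc_error = mismatch(common_centroid, gx=0.01, gy=0.02)
row_error = mismatch(alternating_row, gx=0.01, gy=0.02)
print(cc_error, row_error)
```

In this toy model the cross-coupled quad cancels the linear gradient in both directions, while the alternating row leaves a residual error along the row because the centroids of A and B do not coincide.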

#### CHAPTER 4

#### DECIMATION FILTER OVERVIEW

The output of the modulator represents the combination of the input signal with its out-of-band components and many other noise sources. These noise sources include quantization noise, circuit noises from transistors and resistors, interference of wire coupling, and noise injection from the power supply. This modulator output is operating at an oversampling frequency and cannot be used by the following digital circuits without being downsampled to the Nyquist frequency and filtered. Without filtering, the aliasing (high frequency contents overlapping the low frequency ones) occurs and degrades the resolution of the A/DC. Thus, the architecture of the  $\Delta-\Sigma$  A/DC is always a modulator loop, followed by a decimation filter as shown in Figure 4.1.

Low frequency analog input → Modulator loop → High frequency, high resolution undecoded digital data → Decimation filter → Low frequency decoded digital output

Figure 4.1 Block diagram of a general  $\Delta$ - $\Sigma$  A/DC.

Figure 4.2 shows a classic frequency response of the A/DC and the decimation filter output. An ideal decimation filter (in dashed line) removes all the out-of-band noise, leaving only the noise that resides inside the pass-band. In reality, such a filter does not exist. Filters will always have some slope in the transition-band, droop in the pass-band, and/or ripple (leakage) in the pass- and stop-bands. Moreover, approximating such a "brick-wall" filter requires a large area because of the high filter order needed. The better approach is to use multiple filters in cascade, which achieves comparable performance with a lower total order. More details are described in Chapter 5.



Figure 4.2 Frequency responses of an A/DC output and a decimation filter.

Based on the impulse response, digital filters can be categorized into two groups: finite impulse response (FIR) and infinite impulse response (IIR). The FIR filter can provide linear phase and is an all-zero filter, which has no feedback term in the filter structure. The IIR filter, however, has both feed-forward and feedback terms, which result in nonlinear phase and potential instability. Although the IIR filter can achieve the same frequency response as the FIR filter with a smaller area, its stabilization and phase adjustment circuits can sometimes be overwhelming and possibly require a larger area than the FIR filter. As mentioned before, circuit simplification improves the success rate of chip fabrication. Thus, IIR filters are excluded from this project.

The following sections discuss the three most popular decimation filter architectures: moving average, comb (or Sinc), and half-band filters. The innovative two-path filter, which combines all three filter techniques, is introduced at the end of this chapter, and its detailed implementation is presented in Chapter 5.

4.1 Moving average filter

The filter structure (Figure 4.3) is realized directly from its equation (Equation 4.1), which is called the direct form I realization. Note that Equation 4.1 is the expression before the decimation process. The implementation is simple and uses no sophisticated technique, but it illustrates that a filter can be implemented with only three kinds of components: adders, registers, and multipliers.

$$Y(z) = \sum_{i=0}^{n} C_{i} \cdot X(z) z^{-i}$$
(4.1)



Figure 4.3 Block diagram of a general moving average filter.

Note that no multiplier is required if the filter coefficients are powers of two, since each multiplication then reduces to a bit shift. By varying the coefficients, the filter applies different window functions to the data. For coefficients as simple as the rectangular window, where all the coefficients equal the reciprocal of the number of points ( $\frac{1}{n+1}$  for the  $n+1$  taps of Equation 4.1), the filter sums all the data points and then divides the total by the number of points. The resolution of this approach is proportional to the number of points averaged. As a result, high accuracy requires a large number of registers to store the data points, and thus the area and power consumption of the filter can grow rapidly.
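As an illustration of the direct form structure, the following sketch implements Equation 4.1 in the time domain with a register chain, multipliers, and an adder. The function name `fir_direct`, the four-tap rectangular window, and the step input are assumptions chosen for the example.

```python
def fir_direct(x, coeffs):
    """Direct-form FIR: y[k] = sum_i coeffs[i] * x[k - i], zero initial state."""
    delay_line = [0.0] * len(coeffs)   # registers holding x[k], x[k-1], ...
    y = []
    for sample in x:
        delay_line = [sample] + delay_line[:-1]                    # shift registers
        y.append(sum(c * d for c, d in zip(coeffs, delay_line)))   # multiply-add
    return y

# Rectangular window: four equal taps of 1/4 each (a power of two).
taps = [0.25] * 4
step = [0.0] * 2 + [1.0] * 8
print(fir_direct(step, taps))
```

The output ramps from 0 to 1 over four samples, which is the averaging behavior described above; a longer window would average more points at the cost of more registers.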

Adjusting the coefficients to match more sophisticated window functions, such as Hanning and Kaiser, enhances the filter attenuation in the transition- and stop-bands. As a result, the filter can be implemented in a smaller area with reduced power using such an approach.

As mentioned before, the single stage decimation filter is not practical. The purpose of the illustration is to introduce the basic form of the FIR filter. Any filter, regardless of its function (low pass, high pass, or band pass), can be implemented in this form. The drawback of this approach is the size required for the implementation. Since the power dissipation is proportional to the size of the circuit at a constant operating frequency, a smaller circuit consumes less power. The following filters use techniques that reduce the circuit size and power dissipation.

#### 4.2 Comb (Sinc) filter

Comb filter (Figure 4.4) operation is equivalent to a rectangular window FIR filter [23-24] and its transfer function is

$$H(z) = \frac{1}{M^{O+1}} \left( \frac{1 - z^{-M}}{1 - z^{-1}} \right)^{O+1}$$
(4.2)

Note that the order of the filter must be one order higher than that of the modulator in order to efficiently attenuate the rising quantization noise. This filter is the simplest but not very effective at removing the large quantity of the out-of-band quantization noise generated by the modulator and is seldom used in practice without additional digital filters. For many applications that cannot tolerate this distortion, the comb filter must be used in conjunction with one or more additional digital filter stages [12]. Since the filter is designed to obtain maximum attenuation only at the higher frequency components which will be aliased into base-band after decimation, the characteristic of the analog input signal is preserved while the out-of-band shaped noise of the modulator output has been attenuated.



Figure 4.4 Block diagram of a general Sinc filter.

The advantages of the comb filter are:

- 1. No storage or multipliers are required for filter coefficients.
- 2. Intermediate storage is reduced by integrating at the high sampling rate and differentiating at low sampling rate, compared to the equivalent implementation using cascaded uniform FIR filters.
- 3. The structure of comb filters is very regular consisting of two basic building blocks (integrator and differentiator).
- 4. Little external control or complicated local timing is required.
- 5. The same filter design can easily be used for a wide range of rate change factors, M, with the addition of a scaling circuit and minimal changes to the filter timing.

Some problems encountered with comb filters include the following: 1) register widths can become large for large rate change factors; 2) the frequency response is fully determined by only two parameters (rate change factor and number of filter stages), resulting in a limited range of filter characteristics.
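The integrate-at-high-rate, differentiate-at-low-rate structure can be sketched as follows. This is a minimal behavioral model, not the circuit of this work: the function names, the test sequence, and the parameter choices are hypothetical, `stages` plays the role of $O+1$ in Equation 4.2, and the $1/M^{O+1}$ scaling is omitted so that the arithmetic stays in integers. The second function is a reference that applies the equivalent cascade of boxcar filters at the full rate and then decimates.

```python
def cic_decimate(x, rate, stages):
    """Unscaled comb decimator: integrate at the input rate, then differentiate."""
    data = list(x)
    for _ in range(stages):                 # integrators at the high rate
        acc, out = 0, []
        for v in data:
            acc += v
            out.append(acc)
        data = out
    data = data[::rate]                     # keep every rate-th sample
    for _ in range(stages):                 # differentiators at the low rate
        data = [v - (data[i - 1] if i else 0) for i, v in enumerate(data)]
    return data

def fir_then_decimate(x, rate, stages):
    """Reference: cascade of `stages` length-`rate` boxcar filters, then decimate."""
    data = list(x)
    for _ in range(stages):
        data = [sum(data[max(0, i - rate + 1): i + 1]) for i in range(len(data))]
    return data[::rate]

samples = [1, -1, 1, 1, -1, 1, 1, 1, -1, -1, 1, 1, 1, -1, 1, 1]
print(cic_decimate(samples, rate=4, stages=3))
print(fir_then_decimate(samples, rate=4, stages=3))
```

Both functions give the same output, but the comb version needs no coefficients and only one register per integrator or differentiator. The growth of the accumulated values in the integrators also illustrates why the register widths become large for large rate change factors.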

## 4.3 Half-band filter

A half-band filter is a special case of the FIR filter. The design concept is to construct a filter whose frequency response is symmetrical around half of the Nyquist frequency ( $\omega = \pi/2$ ):

$$H(e^{j\omega}) = 1 - H(e^{j(\pi-\omega)}) \tag{4.3}$$

or, in terms of H(z),

$$H(z) + H(-z) = 1 \tag{4.4}$$

Figure 4.5 shows the overall criteria of a half-band filter. The frequency response of the optimal filter is symmetric around  $\omega = \pi/2$  and  $\omega_p + \omega_s = \pi$ , where  $\omega_p$  and  $\omega_s$  denote the pass-band and stop-band edges, respectively. Moreover, the pass-band and stop-band ripples are equal ( $\delta_p = \delta_s = \delta$ ).



Figure 4.5 Design criteria of a half-band filter.

A half-band filter gives the required filter response and is computationally much simpler than conventional FIR filters [25-26]. Equation 4.5 shows the transfer function of a linear phase FIR filter with an odd number of coefficients (N) and symmetric coefficient values.

$$H(z) = \sum_{n=0}^{N-1} C_n z^{-n}$$
(4.5)

where, with  $M = \frac{(N-1)}{2}$  denoting the index of the center coefficient,

$$C_n = 0, \quad \text{where} \quad n = 1, 3, 5, \ldots \quad \text{and} \quad n \neq M$$

$$C_M = 0.5 \tag{4.6}$$

That is, every odd-numbered sample of the impulse response is equal to zero (except the center coefficient  $C_M = 0.5$ ).
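The half-band conditions can be verified numerically for a small example. The 7-tap coefficient set below is an assumed textbook-style example, not the filter designed in this work; the code checks the symmetry of Equation 4.3 on the zero-phase response (the linear phase delay $z^{-M}$ factored out) and counts the nonzero coefficients.

```python
import math

# 7-tap half-band example (an assumed coefficient set, N = 7, M = 3).
coeffs = [-1/32, 0.0, 9/32, 0.5, 9/32, 0.0, -1/32]
center = (len(coeffs) - 1) // 2          # M = (N - 1) / 2

def amplitude(omega):
    """Zero-phase response: the linear-phase delay z^-M is factored out."""
    return sum(c * math.cos((n - center) * omega) for n, c in enumerate(coeffs))

# Symmetry of Equation 4.3: H(w) + H(pi - w) = 1 at every frequency.
worst = max(abs(amplitude(w) + amplitude(math.pi - w) - 1.0)
            for w in [k * math.pi / 64 for k in range(65)])
nonzero_multiplies = sum(1 for c in coeffs if c != 0.0)
print(worst, nonzero_multiplies)
```

The odd-indexed coefficients other than the center are zero, so only five of the seven taps need any arithmetic, and the coefficient symmetry allows the remaining multiplications to be shared.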

The tapped delay lines in the even and odd branches of Figure 4.6 are doubled back to exploit the symmetric impulse response of the linear phase filter and reduce the number of multiplications by a factor of two [25]. Since only half of the coefficients are required, the total number of multiplications for the half-band filter is ¼ of that needed for arbitrary FIR filter designs.



Figure 4.6 Poly-phase implementation of a half-band filter.

The advantage of the half-band filter is that its coefficients can be designed using a window method. As a result, the filter has higher attenuation in the stop-band than the comb filter. The minor drawback of this filter design is the delay chain required in the odd data path. This long chain serves only one coefficient in the path, which is not very power efficient.

#### 4.4 Two-path filter

The two-path filter is derived from the moving average filter form described in Section 4.1, with a cosine window used to create the coefficients. Instead of the specific pass-, transition-, and stop-bands shown in most classic filter frequency responses, the two-path filter specifies only the pass-band and the out-of-band region, as the comb filter does. Therefore, the filter has maximum attenuation only at the higher frequency components that are aliased into base-band after decimation. In order to reduce power dissipation, the architecture of the half-band filter is used as the blueprint of the multi-path structure. Figure 4.7 shows the general structure of the two-path filter. Note that the sum of the filter transfer functions,  $H_0(z)$  and  $H_1(z)$, is equal to a moving average FIR filter.



Figure 4.7 Block diagram of a general two-path filter.

The two-path filter has been demonstrated to be at least 50% more power efficient than Sinc filters [41] because of fewer add operations and a lower average operating frequency. Figures 4.8 and 4.9 are both reproduced from that paper, and the latter compares the number of add operations. Note that the number of add operations is used to gauge the power efficiency because the power dissipation of the adders dominates that of the DFFs.



Figure 4.8 Block diagrams of various Sinc filters.

As shown in Figure 4.9, Case 4 has the highest power dissipation due to its single stage approach. Case 2 uses the multistage approach with the decimation process at the last stage. Its improvement is not significant compared to Case 1, in which the decimation process is embedded in each stage. The goal of this research on the two-path decimation filter is to develop a filter that consumes less power than Case 1. The architecture, with its multistage and multi-path approaches, has demonstrated better power efficiency than the best Sinc filter case in the paper. The concept and implementation of this two-path filter are described in detail in Chapter 5.



Figure 4.9 Comparison of filter power dissipations.

# CHAPTER 5

# TWO-PATH FILTER DESIGN AND IMPLEMENTATION

The decimation filter is a data interpolation device. By averaging the coarse output data from the modulator, it interpolates between the coarse quantization levels. Therefore, it is a signal processing device and not a conditioning one, which means the resolution of the modulator output will not be improved by the decimation process. In other words, the decimation filter can only degrade the signal if it is not carefully designed.

As described in Chapter 3, the integrators require most of the power budget in order to achieve the resolution, which leaves little available for the filter. Aggressively minimizing the power dissipation, therefore, is the primary goal of the decimation filter design. Before further discussion, the equation for digital power dissipation is repeated as follows:

$$P_{digital} = CV^2 f \tag{5.1}$$

From Equation 5.1, the most efficient power reduction method is to lower the power supply voltage of the filter because the power dissipation is proportional to the square of the voltage. The effective capacitance and the operating frequency, however, are only linearly proportional. As a result, the first rule of power reduction is to minimize the power supply voltage. For power reduction using the other two parameters, two low power filter design strategies, cascade and multi-path, are introduced.
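The relative weight of the three parameters in Equation 5.1 can be seen from a short calculation. The helper name `relative_power` and the scale factors are hypothetical; the code only evaluates $CV^2f$ relative to an arbitrary baseline.

```python
def relative_power(c_scale=1.0, v_scale=1.0, f_scale=1.0):
    """Power relative to a baseline when C, V and f of Equation 5.1 are scaled."""
    return c_scale * v_scale ** 2 * f_scale

print(relative_power(v_scale=0.5))   # halving the supply voltage
print(relative_power(f_scale=0.5))   # halving the operating frequency
print(relative_power(c_scale=0.5))   # halving the effective capacitance
```

Halving the supply voltage cuts the power to one quarter, whereas halving either the capacitance or the frequency only cuts it to one half.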



Figure 5.1 Block diagram of a filter in (a) cascade and (b) parallel.

The priorities for lowering circuit power dissipation, listed from most to least effective, are architecture, sub-circuit, and device. Figure 5.1 shows two possible filter approaches. The effect of sub-filters in cascade is multiplication ( $H_1(z) \cdot H_2(z) \cdot H_3(z)$ ). On the other hand, the parallel formation has the effect of summation only ( $H_1(z) + H_2(z) + H_3(z)$ ). Therefore, from the architectural viewpoint, implementing a filter with smaller sub-filters in cascade improves the area efficiency, which in turn often reduces the power consumption.

Inside the sub-filters, the parallel architecture is implemented. To parallelize the processing, additional circuits (e.g. a data multiplexer) may be required to direct the data. However, the overall power dissipation decreases even with the power of the added circuits. The important concept is that the operating frequency is reduced in proportion to the degree of parallelism of the filter. The two-path filter, for example, has demonstrated a 50% improvement in power efficiency in this research.

With these two approaches, the filter power efficiency can be substantially improved. In the following sections, the concepts of these approaches will be described in detail, along with the circuit design and implementation.

#### 5.1 Filter in cascade

Filter design for a fixed sampling frequency is fairly simple. However, a filter with a variable sampling frequency poses greater challenges to the designer. Aliasing is the phenomenon in digital signal processing that occurs when a signal is decimated without a filter to limit its bandwidth. After decimation, the spectrum beyond half of the Nyquist frequency folds back to the lower spectrum and degrades the base-band resolution if the folded spectrum has a higher noise floor. Placing a low pass filter before decimation can correct the problem but increases power dissipation since the filter is clocked at the higher frequency. Moreover, for high order filters, the benefit of a steep transition-band comes at the expense of power dissipation. The filter order is equal to the number of DFFs, so a high order filter requires a large number of DFFs. In terms of area, the filter is equivalent to a large capacitor, and its power dissipation may not be suitable for this project.

Figure 5.2 shows the alternative solution. In order to obtain a filter with a steep transition-band, one can gradually decimate the frequency with multiple low order filter stages (wide transition-band) in cascade.



Figure 5.2 Block diagram of a general decimation filter.

For example, a single stage filter with the order of 256 can be implemented as 4 stages in cascade and each stage order is equal to 4. As a result, the filter frequency response is equivalent to its single stage counterpart, but the overall filter order is reduced dramatically, which results in a direct reduction of the power dissipation. Assuming that the single stage filter consumes 256 units of power, the cascade filter power can be expressed as

$$P_{cascade} = 4(1) + 4\left(\frac{1}{2}\right) + 4\left(\frac{1}{4}\right) + 4\left(\frac{1}{8}\right) = 7.5$$
(5.2)

and the power ratio of single stage filter to the cascade filter is  $256/7.5 \cong 34$ , which means, for this particular filter configuration, the power efficiency of the cascade filter is approximately 34 times better. Adjusting the combination of the filter order in each block will yield different results but none of them consumes more power than a single stage approach.
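The arithmetic of Equation 5.2 can be reproduced with a short loop. The sketch assumes, as the terms of the equation imply, that the power of each stage is proportional to its order times its clock rate and that each stage halves the rate seen by the next one; the variable names are hypothetical.

```python
stage_orders = [4, 4, 4, 4]    # four cascaded stages, each of order 4
decimation = 2                 # assumed: each stage halves the following clock rate

cascade_power = 0.0
clock = 1.0                    # the first stage runs at the full input rate
for order in stage_orders:
    cascade_power += order * clock   # power ~ order x clock rate
    clock /= decimation

single_stage_power = 256.0
ratio = single_stage_power / cascade_power
print(cascade_power, ratio)
```

Changing `stage_orders` in this sketch shows how different order combinations shift the total, with the first (fastest) stage carrying the largest share of the power.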

The power dissipation can be reduced further by implementing the filter block with a multi-path approach, which is discussed in detail in the later section. Therefore, the decimation filter design in this dissertation focuses on the multistage in cascade and multi-path approach. In addition, the filter is developed or designed to process different bandwidths and OSR of low pass  $\Delta$ - $\Sigma$  modulators by simply adding and removing the stages.

### 5.2 Finite impulse response filter

The design concept begins with the filter without decimation. The 1<sup>st</sup> order FIR filter is the basis of this whole decimation filter design (Figure 5.3). Note that the  $Z^{-1}$  is a delay and its circuit implementation is a DFF. The filter architecture consists of a DFF, an adder, and two filter coefficients. Both coefficients (taps) are  $\frac{1}{2}$  (Equation 5.3) and the sum of all coefficients of a FIR filter should be equal to one (Equation 5.4).



Figure 5.3 Circuit diagram of the 1<sup>st</sup> order FIR.

$$y[n] = \frac{1}{2}x[n] + \frac{1}{2}x[n-1]$$
(5.3)

$$Y(z) = \frac{1}{2}X(z) + \frac{1}{2}X(z)z^{-1}$$
(5.3a)

$$y[n] = \sum_{i=0}^{n} c_i x[n-i] \quad where \quad \sum_{i=0}^{n} c_i = 1$$
 (5.4)

These specific coefficients are chosen for two reasons. First, coefficients that are powers of two require no storage. Since division by two in digital logic is equal to shifting the data bus right once, hardwiring the bus lines with a 1-bit offset to the right at the next input requires no additional power, whereas a ROM version would consume power to retrieve the coefficients. Second, it is a cosine window filter and its bandwidth is ½ of the sampling frequency.
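A behavioral sketch of the multiplier-free 1<sup>st</sup> order filter is given below. It operates on integers and replaces the two coefficients of ½ with a single 1-bit right shift after the adder; this add-then-shift ordering, the function name `fir1_shift`, and the input values are assumptions for illustration. Note that the shift truncates (rounds toward negative infinity) when the sum is odd.

```python
def fir1_shift(samples):
    """1st order FIR of Equation 5.3 on integers: (x[n] + x[n-1]) >> 1."""
    previous = 0                          # DFF contents (the z^-1 delay), reset to 0
    out = []
    for x in samples:
        out.append((x + previous) >> 1)   # adder followed by a 1-bit right shift
        previous = x
    return out

print(fir1_shift([8, 8, 4, 0, -4, -8]))
```

No coefficient is stored or fetched anywhere in this loop, which mirrors the hardwired bus offset described above.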



Figure 5.4 Frequency response of the 1<sup>st</sup> order FIR.

In the time domain, the filter works as a rectangular window function by averaging two data points: the two data points are summed and then divided by two. In the frequency domain, it performs as a low pass filter attenuating the higher frequency components. Figure 5.4 shows the frequency response of the  $1^{st}$  order filter. The attenuation can be as great as -50dB and the phase is linear within the frequency region.
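The frequency response of Equation 5.3a can be evaluated directly. Substituting $z = e^{j\omega}$ gives $H(e^{j\omega}) = e^{-j\omega/2}\cos(\omega/2)$, so the magnitude follows a cosine and the phase is the straight line $-\omega/2$. The sketch below checks this numerically; the function name `response` and the chosen frequencies are hypothetical.

```python
import cmath
import math

def response(coeffs, omega):
    """H(e^{j omega}) = sum_n coeffs[n] * e^{-j omega n}."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(coeffs))

taps = [0.5, 0.5]
for omega in (0.0, math.pi / 2, 0.9 * math.pi):
    h = response(taps, omega)
    print(f"omega = {omega:.3f}: |H| = {abs(h):.4f}, "
          f"cos(omega/2) = {math.cos(omega / 2):.4f}, phase = {cmath.phase(h):.4f}")
```

The magnitude is unity at DC and falls to a null at $\omega = \pi$, which is where the deep attenuation in the plotted response comes from.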

The 2<sup>nd</sup> order filters can be implemented by cascading two 1<sup>st</sup> order filters (Figure 5.5). Based on linear system theory, the overall system response is identical to that of a single filter since all the sub-filters are convolved. Note that time domain convolution is frequency domain multiplication. Equation 5.5 shows the result in frequency domain and 5.5a is its time domain representation.

$$Y(z) = X(z) \left[ \left(\frac{1}{2} + \frac{1}{2}z^{-1}\right) \left(\frac{1}{2} + \frac{1}{2}z^{-1}\right) \right] = \frac{1}{4}X(z) + \frac{1}{2}X(z)z^{-1} + \frac{1}{4}X(z)z^{-2}$$
(5.5)

$$y[n] = \frac{1}{4}x[n] + \frac{1}{2}x[n-1] + \frac{1}{4}x[n-2]$$
(5.5a)
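The coefficients of Equation 5.5a follow from convolving the taps of two 1<sup>st</sup> order filters, and the same step can be repeated for higher orders. The sketch below uses exact fractions; the helper name `convolve` is an assumption.

```python
from fractions import Fraction

def convolve(a, b):
    """Polynomial product of two tap lists (time-domain convolution)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

first_order = [Fraction(1, 2), Fraction(1, 2)]
second_order = convolve(first_order, first_order)
third_order = convolve(second_order, first_order)
print(second_order)
print(third_order)
```

Every cascaded result keeps coefficients that sum to one and whose denominators are powers of two, so the shift-based implementation still applies.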



Figure 5.5 Architecture of the  $2^{nd}$  order FIR.

As the filter order increases, the shape of the window function changes from rectangular to triangular (Figure 5.6). This weighted average technique shows an improved result in the frequency domain with higher attenuation within the transition and stop bands and less leakage in time domain.



Figure 5.6 A triangular window as the result of convolution of two rectangular windows.

Figure 5.7 shows that the phase of the filter remains linear within the whole frequency range. The high frequency attenuation is improved to -100dB.

For the higher order systems, the method of building 2<sup>nd</sup> order system can be duplicated by cascading multiple 1<sup>st</sup> order blocks. The phase of the filter will remain linear as discussed previously. Moreover, the filter attenuation will be greatly improved.



Figure 5.7 Frequency response of the  $2^{nd}$  order FIR.

# 5.3 Decimation filter

Decimation is the process of resampling the data at a lower frequency. The number of data points removed by the process depends on the ratio of the original to the resampled frequency, or the OSR. Once the frequency is reduced, aliasing might occur if the frequency content of the signal occupies the full range of the spectrum (0 to  $\pi$ ). Diagram (d) of Figure 5.8 shows the aliasing after the process.



Figure 5.8 Decimation processes with and without filtering.

The straightforward solution is to place a low pass filter before decimation. In general, to avoid aliasing in down sampling by a factor of M requires

$$\omega_N M < \pi \quad or \quad \omega_N < \frac{\pi}{M}$$
 (5.6)

where  $\omega_N$  is the bandwidth of the signal. If the condition does not hold, aliasing occurs. The decimation processes with M = 2 are shown in the diagrams of Figure 5.8.

The intuitive approach is to filter the data before decimating it (Figure 5.9). This design is straightforward but gives no consideration to power dissipation. A better approach is to apply the multi-path concept to the filter structure. Figure 5.9a shows the filter in a two data path structure. As an example, if

$$H(z) = 4 + 3z^{-1} + 2z^{-2} + z^{-3}$$
(5.7)

then

$$H(z) = H_0(z^2) + z^{-1}H_1(z^2), \quad H_0(z) = 4 + 2z^{-1}, \quad H_1(z) = 3 + z^{-1}$$
 (5.8)

The overall filter power dissipation remains the same since the decimation process is still after the filter structure. Reference [27] describes that the decimation process can be moved forward and placed before the filtering (Figure 5.9b). The frequency in each path is now  $\frac{1}{2}$  of the input frequency and the power dissipation is, therefore,  $\frac{1}{2}$  of that of the single path approach.
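The equivalence between filtering before decimation (Figure 5.9) and decimating before filtering (Figure 5.9b) can be illustrated with the example filter of Equation 5.7. The following Python sketch is an illustration under stated assumptions: the even-indexed coefficients (4, 2) form the even path, the odd-indexed coefficients (3, 1) form the odd path, the odd path sees the input delayed by one sample before down sampling, and all states start at zero. The function names are hypothetical.

```python
def fir(h, x):
    """Direct-form FIR: y[n] = sum_i h[i] * x[n - i], zero initial state."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]

def filter_then_decimate(h, x):
    """Single path: filter at the full rate, then keep every 2nd output."""
    return fir(h, x)[::2]

def decimate_then_filter(h, x):
    """Two paths: split the stream first, then filter each path at half rate."""
    h_even, h_odd = h[0::2], h[1::2]       # (4, 2) and (3, 1) for Eq. 5.7
    x_even = x[0::2]                       # x[2k]
    x_odd = ([0] + x[1::2])[:len(x_even)]  # x[2k-1], with x[-1] taken as 0
    y_even = fir(h_even, x_even)
    y_odd = fir(h_odd, x_odd)
    return [a + b for a, b in zip(y_even, y_odd)]

h = [4, 3, 2, 1]                 # H(z) = 4 + 3z^-1 + 2z^-2 + z^-3
x = [1, 2, 3, 4, 5, 6, 7, 8]     # arbitrary test input
print(filter_then_decimate(h, x))
print(decimate_then_filter(h, x))
```

Both calls produce the same output sequence, while every multiply-accumulate in the second form runs at half of the input rate.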

$$\xrightarrow{x(n)} H(z) = H_0(z) + H_1(z) \longrightarrow \downarrow 2 \xrightarrow{y(2n)}$$

Figure 5.9 Single path filter approach (data filtered BEFORE frequency decimation).



Figure 5.9a Two-path filter approach (data filtered BEFORE frequency decimation).



Figure 5.9b Two-path filter approach (data filtered AFTER frequency decimation).

The same design concept can be expanded to filter designs with 3, 4, 5, or more paths, with the operating frequency of each path reduced accordingly. This is similar to parallelism: the more highly parallel a circuit is, the lower the frequency at which it needs to operate, which in theory reduces power. Two-path decimation exploits this concept by splitting the data stream into two paths and processing each at ½ of the input frequency. Since no data point is removed, the decimation is formed intrinsically. Unlike many implementations [3] that perform the filter function before the decimation process, this filter decimates the frequency first (without losing any data) and then filters. For a digital circuit, the power is proportional to the operating frequency. The two-path decimation operates at ½ of the input frequency, which reduces the power dissipation by ½.

Since no multiplier is used in this filter implementation, the power dissipation of the filter is just a function of the number of adders and DFFs. Assuming that the functions  $H_0$  and  $H_1$  are each just a delay (or DFF) and that the power supply is constant, the power dissipation of the structures in Figures 5.9 and 5.9b can be expressed as follows

$$P_{figure 5.9} = P_{DFF} + P_{adder} \propto 2C_{DFF} f + C_{adder} f$$
(5.9)

$$P_{figure 5.9b} \propto \left[ 2C_{DFF} \left( \frac{f}{2} \right) + C_{DFF} f \right] + C_{adder} \left( \frac{f}{2} \right) = 2C_{DFF} f + C_{adder} \left( \frac{f}{2} \right)$$
(5.10)

Experimental data show that a DFF consumes 3 to 6 times less power than an adder. As a result, the adders dominate the power dissipation. Because the adder term is halved in Equation 5.10, the two-path architecture provides an improvement in power efficiency approaching 50%. The same derivation can be extended to multi-path architectures and is shown as

$$P_{n\text{-path}} \propto C_{adder} \left(\frac{f}{n}\right)$$
 (5.11)

where n is the number of data paths in the filter architecture.

Designing the two-path filter coefficients is simple. Since this filter has exactly the same frequency response as the FIR filter, except over ½ of the frequency range, designers can use FIR filter design methods to obtain the filter coefficients. Before transforming the classic FIR filter to its two-path form, the crucial component, the data multiplexer (Dmux) that distributes data into two streams, is introduced to simplify the rest of the discussion.

### 5.3.1 Data multiplexer

The data multiplexer is the circuit used to distribute the data into two streams so that the processing frequency is reduced by ½ and, as a result, power is saved. During the process, no data specification or alignment is required. The circuit arbitrarily picks a data point as even or odd, the next one is automatically assigned to the opposite (odd or even), and so forth. A convenience of this circuit is that it requires no reset or initialization, which directly benefits layout and circuit simplicity. Figure 5.10 shows the circuit diagram of the data multiplexer.



Figure 5.10 Circuit diagram of the data multiplexer (Dmux).

The circuit consists of three rising-edge DFFs, two positive-level sensitive latches and one inverter. Note that the data are transparent during the positive level clock period of the latch. Figure 5.11 shows the timing diagram of the data multiplexer.

DFF D1 is a clock divider. Its inverted output is fed back to its input so the output state toggles at every rising edge of the clock signal. The output frequency is exactly  $\frac{1}{2}$  of the driving clock frequency, and the output is delayed slightly with respect to the driving clock. This delay is important to clock sequencing and will be described in detail later.

Latch L1 is transformed into a negative-level sensitive latch by the clock inversion at its input. This latch holds each data point valid for one more clock period, which gives L2 and D2 sufficient latch time. Note that both DFFs and latches require adequate setup and hold time for the data to be stored successfully.



Figure 5.11 Timing diagram of the Dmux.

Latch L2 is clocked at  $\frac{1}{2}$  of the input clock. For example, immediately after CLK #2 rises, the data is latched by L2 at the falling edge of the half clock (Q<sub>D1</sub>). After CLK #3 rises, both latches L1 and L2 hold valid data. After half clock #2 rises, both data points are stored into D2 and D3, respectively, and the process then repeats.

The delay between the input clock and the half clock provides sufficient setup and hold time for the components L2, D2, and D3, which is required to avoid metastability. The critical path occurs when D3 is storing the data while L2 is turning transparent, since both components are clocked by the half clock. The short-term improvement is to shorten the clock of D3, which provides additional hold time. The long-term solution is to insert a negative-level sensitive latch, clocked by the input clock, between L2 and D3 (Figure 5.12). As a result, no critical path exists in the Dmux.



Figure 5.12 Circuit diagram of the improved data multiplexer.

## 5.3.2 Two-path filter

A two-path filter is just a special case of multi-path filters; i.e. splitting the data stream into two. The technique simply applies the coefficients, derived from classic FIR filters, to the new two-path architecture. Instead of decimation after filtering, the high frequency data are divided into two streams (even and odd) and then filtered. In such a case, the power dissipation can be reduced by 50%.

The designation of a data point as even or odd is arbitrary, and designers can choose based on their preferences. Once decided, the selected convention must be followed throughout the complete filter design. The transformation from a classic FIR filter to a two-path filter is readily observable from its transfer function. Equation 5.12a is a duplication of Equation 5.3 for better reference.

$$y[k] = \frac{1}{2} x_e[k] + \frac{1}{2} x_o[k]$$
(5.12)

$$y[n] = \frac{1}{2}x[n] + \frac{1}{2}x[n-1]$$
(5.12a)

This design assigns the first input term to the even data points and the following term to the odd data points. Note that there is no timing difference between the two data points in Equation 5.12 since they are in parallel and available in the same time frame in the two-path structure. Figure 5.13 shows the circuit implementation of the 1<sup>st</sup> order two-path filter.



Figure 5.13 Block diagram of the 1<sup>st</sup> order two-path filter.

For the  $2^{nd}$  order system, the first two terms are selected as the  $1^{st}$  order system. The  $3^{rd}$  term is assigned to even data points. Since only two data points are allowed at the same time frame in the two-path structure, the  $3^{rd}$  term must be assigned to be one delay after the  $1^{st}$  and  $2^{nd}$  terms. Equation 5.13 shows the newly rearranged representation for the 2<sup>nd</sup> order, two-path filter. After transformation, the circuit implementation is simply seeking the correct data path.

$$y[k] = \frac{1}{4}x_e[k] + \frac{1}{2}x_o[k] + \frac{1}{4}x_e[k-1]$$
(5.13)

$$y[n] = \frac{1}{4}x[n] + \frac{1}{2}x[n-1] + \frac{1}{4}x[n-2]$$
(5.13a)



Figure 5.14 Block diagram of the 2<sup>nd</sup> order two-path filter.

To verify the implementation of the  $2^{nd}$  order two-path filter, the equation is written by tracing the data paths of Figure 5.14, which gives

$$Y(z) = \frac{1}{4}X_e(z)z^{-1} + \frac{1}{2}X_o(z)z^{-1} + \frac{1}{4}X_e(z)z^{-2}$$
(5.14)

The time domain expression is

$$y[k] = \frac{1}{4} x_e[k-1] + \frac{1}{2} x_o[k-1] + \frac{1}{4} x_e[k-2]$$
(5.14a)

The difference between Equations 5.13 and 5.14 is one additional delay in every term. This delay is necessary because the data must be stable before the last adder (the one before the output), especially in the even data path. Without a DFF placed ahead of the adders, the output data may not be valid, since a change at the even output of the Dmux would propagate all the way to the output of the filter, which violates the pipeline process. As a result, one more DFF is required in the even data path to isolate the two adders in series. Moreover, since both data paths require equal timing in order to match the original equation, an additional DFF is inserted in the odd data path. Note that such a DFF insertion in the data paths does not invalidate the equation as long as the delays on both sides are equal. The added delay does not reduce the performance since the filter works as a pipelined device, and nothing except powering off flushes the complete filter.

The  $3^{rd}$  order filter implementation consists of adding a  $1^{st}$  order filter in front of the  $2^{nd}$  order two-path filter as shown in Figure 5.15. The time domain equation is

$$y[k] = \frac{1}{8}x_e[k] + \frac{3}{8}x_o[k] + \frac{3}{8}x_e[k-1] + \frac{1}{8}x_o[k-1]$$
(5.15)

$$y[n] = \frac{1}{8}x[n] + \frac{3}{8}x[n-1] + \frac{3}{8}x[n-2] + \frac{1}{8}x[n-3]$$
(5.15a)

The filter added at the front has little effect on the power dissipation, and its simplicity helps to expedite filter development. The same approach can be applied to the 4<sup>th</sup> order design, but a lower power version is developed as an improved alternative.



Figure 5.15 Block diagram of the 3<sup>rd</sup> order two-path filter.

Figure 5.16 shows the straightforward version of the  $4^{th}$  order two-path filter, which adds one more  $1^{st}$  order filter in front of the  $3^{rd}$  order filter. The output can be expressed as

$$y[k] = \frac{1}{16} x_e[k] + \frac{1}{4} x_o[k] + \frac{3}{8} x_e[k-1] + \frac{1}{4} x_o[k-1] + \frac{1}{16} x_e[k-2]$$
(5.16)

$$y[n] = \frac{1}{16}x[n] + \frac{1}{4}x[n-1] + \frac{3}{8}x[n-2] + \frac{1}{4}x[n-3] + \frac{1}{16}x[n-4]$$
(5.16a)



Figure 5.16 Block diagram of the 4<sup>th</sup> order two-path filter.



Figure 5.16a Low power version of the 4<sup>th</sup> order two-path filter.

Figure 5.16a shows the low power version of the 4<sup>th</sup> order filter, which decimates the input frequency first. The low power version is developed to be implemented as the 1<sup>st</sup> stage of the decimation filter. From Figure 5.2, the 1<sup>st</sup> stage filter block is operated at the highest frequency of the complete filter since it receives the data directly from the modulator loop that is operated at OSR\*f<sub>n</sub>. Therefore, the 1<sup>st</sup> stage filter block is the dominant term of the power dissipation. Applying the low power version at the 1<sup>st</sup> stage can dramatically improve the power efficiency of the decimation filter.

In general, the two-path filter function can be written as

$$y[k] = \sum_{i=0}^{n} \left( c_{ei}\, x_e[k-i] + c_{oi}\, x_o[k-i] \right) \quad \text{where} \quad \sum_{i=0}^{n} \left( c_{ei} + c_{oi} \right) = 1$$
(5.17)

Higher order filters can be developed through the same procedure outlined above as long as they satisfy Equation 5.17.
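The coefficient assignment of Equations 5.12 through 5.16 and the normalization condition of Equation 5.17 can be tabulated with a short Python sketch. It assumes that the filter of a given order is a cascade of first-order averaging sections, so that its coefficients are binomial weights divided by a power of two, and that even-indexed coefficients go to the even path and odd-indexed coefficients to the odd path. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb

def cascade_taps(order):
    """Taps of `order` cascaded (1/2 + 1/2 z^-1) sections."""
    return [Fraction(comb(order, i), 2 ** order) for i in range(order + 1)]

def split_taps(h):
    """Return (c_e, c_o): even-indexed taps and odd-indexed taps."""
    return h[0::2], h[1::2]

for order in (1, 2, 3, 4):
    c_e, c_o = split_taps(cascade_taps(order))
    total = sum(c_e) + sum(c_o)   # normalization condition of Eq. 5.17
    print(f"order {order}: c_e={[str(c) for c in c_e]} "
          f"c_o={[str(c) for c in c_o]} sum={total}")
```

For order 4 the even path receives 1/16, 3/8, 1/16 and the odd path receives 1/4, 1/4, in agreement with Equation 5.16, and the coefficients of every order sum to one.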

## 5.4 Overall filter response

With a family of basic two-path filter blocks developed, the discussion progresses to the system level. System level design is not a trivial task and requires simulation tools for accurate design of the complete filter.

In Chapter 3, the modulator was designed to oversample 64 times (OSR = 64). Therefore, the decimation filter must include 6 stages ( $\log_2 OSR = 6$ ) of two-path filters in order to decimate the frequency back to the Nyquist rate. Note also that the filter order of the 1<sup>st</sup> stage must be, at least, one order higher than that of the modulator to effectively attenuate the noise generated by the AFE loop.

Figure 5.17 shows the overall filter structure. The filter order of each block is chosen based on its frequency response: pass-band and transition-band widths and attenuation. Figure 5.18 shows the response of each filter and the NTF of the modulator.



Figure 5.17 Architecture of the 64 times decimation filter.

Design must start from the first stage due to the aliasing after decimation. Since the sampling frequency at this stage is 64 times higher than the base-band, the wider spectrum allows higher order filters with higher attenuation to be built. Note that two-path filters do not have specific stop-band frequencies as most filters do. The objective here is to build the filter with no attenuation in the base-band and high attenuation over the rest of the spectrum.



Figure 5.18 Frequency response of the filter and the NTF of the modulator.

The technique herein uses the transition-band attenuation to suppress the noise spectrum. After the first stage filter and before the decimation, the high frequency spectrum of the NTF is attenuated as shown in Figure 5.19. The attenuation beyond  $f_s/2$  must be designed high enough that the aliased spectrum does not overwhelm the original after decimation (see Figure 5.20). In other words, the noise level of the alias must be lower than that of the original. As a result, the noise level of the original spectrum dominates after decimation.



Figure 5.19 Frequency response of the 1<sup>st</sup> stage filter output BEFORE decimation.



Figure 5.20 Frequency response of the 1<sup>st</sup> stage filter output AFTER decimation.

For the noise floor of the original spectrum to dominate, the alias noise floor must be greatly attenuated. Since high attenuation requires a filter with high order and the filter order is proportional to the power dissipation, designing an optimal filter with lower power and sufficient attenuation is important.

The first step is to model the two spectra separated by  $f_s/4$  as two uncorrelated noise sources: one for the original and the other for the alias. After decimation the final noise level is [39]

$$e_{final} = \sqrt{e_{original}^2 + e_{alias}^2}$$
(5.18)

The final noise level should not exceed the original noise level by more than ~2%, so that the noise level over the complete decimation rises by less than 1*dB*. Table 5.1 and Figure 5.21 show that the final noise floor is set by the separation of the original and alias noise floors. As the separation widens, the effect of the alias noise floor on the original decreases. A 15*dB* noise floor separation satisfies the requirement, but a 20*dB* separation is chosen for a safe margin.

| $e_{original} - e_{alias}$ (dB) | 0     | 5     | 10    | 15    | 20    | 25    | 30    |
|---------------------------------|-------|-------|-------|-------|-------|-------|-------|
| Noise floor change (dB)         | 3.010 | 1.193 | 0.414 | 0.135 | 0.043 | 0.014 | 0.004 |
| Noise floor change (%)          | 41.42 | 14.72 | 4.88  | 1.57  | 0.50  | 0.16  | 0.05  |

Table 5.1 Aliasing effect to the noise floor.
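The entries of Table 5.1 follow directly from Equation 5.18. As a minimal sketch (the function name `noise_floor_change` is illustrative), the rise of the noise floor can be computed in Python from the separation between the original and alias noise floors, treating the two as uncorrelated sources.

```python
import math

def noise_floor_change(separation_db):
    """Rise of the noise floor (dB, percent) for a given alias separation.

    separation_db is e_original - e_alias expressed in dB; the two noise
    sources are combined as a root-sum-square per Equation 5.18.
    """
    alias_ratio = 10 ** (-separation_db / 20)   # e_alias / e_original
    total = math.sqrt(1 + alias_ratio ** 2)     # e_final / e_original
    return 20 * math.log10(total), (total - 1) * 100

for separation in (0, 5, 10, 15, 20, 25, 30):
    rise_db, rise_pct = noise_floor_change(separation)
    print(f"{separation:2d} dB separation: +{rise_db:.3f} dB, +{rise_pct:.2f} %")
```

Under these assumptions, a 15*dB* separation raises the noise floor by about 1.6% and a 20*dB* separation by about 0.5%, which matches the tabulated values.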

The alias noise floor, once separated from the original by more than 20dB, has little effect on the final noise floor. As a result, designing the filter to attenuate the alias noise floor by more than 20dB is not power efficient. Note that the attenuation is proportional to the filter order and the filter order is equal to the number of DFFs. In addition, each DFF is followed by an adder, so the power dissipation grows in proportion to the filter order.



Figure 5.21 Aliasing effect to the noise floor.



Figure 5.22 Frequency response of the 2<sup>nd</sup> stage filter output BEFORE decimation.



Figure 5.23 Frequency response of the 2<sup>nd</sup> stage filter output.



Figure 5.24 Frequency response of the last stage filter output.

Figure 5.22 and 5.23 show the 2<sup>nd</sup> stage output spectrum. The 3<sup>rd</sup>, the 4<sup>th</sup>, and following stages are developed using the same approach. The final filter noise level is shown in Figure 5.24.

The preservation of pass-band magnitude is as important as the noise floor attenuation. For simplicity, the previous discussions assumed little or no attenuation in the pass-band, which is much too optimistic. Since SNR is defined as the ratio of the signal to the noise, pass-band attenuation can be significant. If the filter design focuses only on the noise floor, the attenuation of the signal will degrade the final SNR. Therefore, filter design requires a balance between high stop-band attenuation and pass-band preservation, which requires numerous simulations.

Figure 5.25 shows an expanded view of the filter response of each stage in the pass-band. Note that the last stage exerts the highest attenuation in the pass-band, so the last stage cannot use a high order filter. However, the first few stages have little or no effect on the pass-band and, as a result, high order filters are recommended there. Since the pass-band is narrow relative to the sampling frequency for a large OSR, a higher order filter does not degrade the pass-band signal; the roll-off within the pass-band remains small even when the order is high. The final pass-band attenuation is calculated by simply summing the pass-band responses (in dB) of all stages. In Figure 5.26, the SNR of the DBE degrades at higher frequency due to the characteristic of the modulator NTF and the attenuation of the decimation filter. There are two solutions to increase the final SNR. First, design the modulator with a lower noise floor (higher SNR), since the decimation filter cannot boost the SNR. Second, increase the sampling frequency so the valid resolution expands beyond the required BW.
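The trend described above can be illustrated numerically. The magnitude of a first-order averaging section is $|\cos(\pi f / f_{s})|$ at its input sampling frequency $f_s$, so an N<sup>th</sup> order stage contributes N times that value in dB. The Python sketch below sums the per-stage responses for a six-stage decimation by 64. The stage orders used are hypothetical placeholders chosen only for illustration, since the actual orders are given in Figure 5.17, and the function names are not part of the original design.

```python
import math

def stage_droop_db(order, f, fs_in):
    """Response in dB of `order` cascaded (1/2 + 1/2 z^-1) sections at f."""
    return 20 * order * math.log10(abs(math.cos(math.pi * f / fs_in)))

def total_droop_db(orders, f, f_out):
    """Sum the stage responses; stage i runs at f_out * 2**(n_stages - i)."""
    n = len(orders)
    return sum(stage_droop_db(order, f, f_out * 2 ** (n - i))
               for i, order in enumerate(orders))

HYPOTHETICAL_ORDERS = (4, 4, 3, 3, 2, 2)  # assumed, first stage to last stage
F_OUT = 1.0                               # output (Nyquist) rate, normalized
F_TEST = 0.25                             # a frequency inside the pass-band

for i, order in enumerate(HYPOTHETICAL_ORDERS):
    fs_in = F_OUT * 2 ** (len(HYPOTHETICAL_ORDERS) - i)
    print(f"stage {i + 1}: {stage_droop_db(order, F_TEST, fs_in):.4f} dB")
print(f"total: {total_droop_db(HYPOTHETICAL_ORDERS, F_TEST, F_OUT):.4f} dB")
```

With any choice of orders, the early stages contribute only a small fraction of a dB while the last stage dominates the droop, which is why the last stage is kept at a low order.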



Figure 5.25 Frequency response of each stage at pass-band.



Figure 5.26 Frequency response of signal vs noise floor.

# CHAPTER 6

### MEASUREMENT RESULTS

Useful circuits must be implemented on silicon, with confirming test results, to prove the accuracy of the modeling and simulation. A designed circuit that cannot be confirmed by repeatable results is the same as no design. Therefore, both the designed modulator and the decimation filter were submitted for fabrication on the Peregrine 0.5um SOS process. The testing methods and results for the two circuits are described separately below, and both discussions include signal quality and power dissipation. Before testing, the knowledge required to produce quality Discrete Fourier Transform (DFT) results is reviewed. References [36][40] describe some good solutions. In this chapter, the relationship between the number of sampled points and cycles and the effect of the applied window are summarized to obtain accurate results.

A specially designed probecard was also ordered and fabricated for this testing. The card is a 4 metal layer epoxy PCB; the two outer layers separate the analog and digital signals, and the inner layers are ground planes. The center layers channel the noise coupled from the signals on either side to ground instead of to each other. Decoupling capacitors are inserted between every  $V_{DD}$  and ground (or  $V_{SS}$ ) to minimize the switching noise or simultaneous ground bounce [21]. In addition, the chip statistics along with screen capture and the wafer layouts are attached in Appendix A.

#### 6.1 Discrete Fourier Transform

The theory and equation derivation are not described in this section. Instead, techniques to obtain better DFT results are discussed. Unlike the continuous time Fourier Transform, the discrete time version takes an arbitrary portion of the sampled data. The best scenario is that this portion of data represents the whole and repeats indefinitely [37]. However, this might not be easy without a seasoned signal processing engineer. To overcome this drawback, some techniques can be used to screen the preliminary results and obtain quite accurate information.

The first technique is to capture whole cycles of the signal. Figure 6.1 shows data captured over a complete cycle; no leakage appears near the signal in the frequency response. In Figure 6.2, the same signal is shown but the cycle is incomplete. The additional data beyond the complete cycle appear as leakage, shown as multiple spikes (skirt) near the signal spectrum in the frequency domain. To avoid such spectral leakage, the method of coherent sampling is recommended [29]. Coherent sampling requires that the input and clock frequency generators are phase locked, and that the input frequency is chosen based on the following

$$\frac{f_{in}}{f_s} = \frac{N_{cycle}}{N_{record}}$$
(6.1)

where  $N_{cycle}$  is the number of cycles in the data window (odd or prime numbers to make all samples unique), and  $N_{record}$  is the data record length.
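Equation 6.1 can be turned into a small helper that selects a coherent input frequency close to a desired one. The Python sketch below is only an illustration: the function name and the numerical values (sampling frequency, record length, target frequency) are hypothetical and are not the settings used in this work. It searches for the number of cycles nearest to the target that shares no common factor with the record length, which for a power-of-two record length means an odd number of cycles.

```python
import math

def coherent_input_frequency(f_target, f_s, n_record):
    """Return (N_cycle, f_in) satisfying f_in / f_s = N_cycle / N_record.

    N_cycle is the integer closest to the target that is coprime with
    N_record, so that every sample in the record is unique.
    """
    ideal = round(f_target * n_record / f_s)
    for offset in range(n_record):
        for candidate in (ideal - offset, ideal + offset):
            if candidate >= 1 and math.gcd(candidate, n_record) == 1:
                return candidate, candidate * f_s / n_record
    raise ValueError("no coherent frequency found")

# Hypothetical example values, for illustration only.
n_cycle, f_in = coherent_input_frequency(f_target=1000.0, f_s=64000.0,
                                         n_record=4096)
print(n_cycle, f_in)
```

In this example the ideal count of 64 cycles is rejected because it shares factors with the record length, and the nearest coprime count is used instead, which shifts the input frequency slightly away from the target.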



Figure 6.1 Signal is captured in a complete cycle.

The drawback of the 1<sup>st</sup> technique arises when multiple signals are involved, which makes it difficult to identify a complete cycle. In such a case, a window function applied to the sampled data can improve the quality of the frequency response. The purpose of windowing is to minimize the effect of the leakage; in the time domain, the window increases the weight of the center portion of the data and decreases the weight on both sides.



Figure 6.2 Signal is captured in incomplete cycle.



Figure 6.3 Comparison of window functions.

As Figure 6.3 demonstrates, the leakage is minimized after the windowing and a Hanning window is less leaky than a Hamming window. Unfortunately, there are no definite rules for the best window but just rules of thumb. Table 6.1 summarizes the most useful windows.

| Window | Characteristics | Maximum side-lobe level | Side-lobe roll-off rate |
|--------|-----------------|-------------------------|-------------------------|
| Rectangular (no window) | Good amplitude accuracy, narrow main-lobe, slow roll-off rate, poor frequency resolution, more spectral leakage | -13dB | 20dB/decade |
| Hanning | High maximum side-lobe level, good frequency resolution, reduced leakage, faster roll-off rate | -32dB | 60dB/decade |
| Hamming | Good spectral resolution, narrow main-lobe | -43dB | 20dB/decade |
Table 6.1 Window functions and characteristics.
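The maximum side-lobe levels listed in Table 6.1 can be estimated from the window definitions. The Python sketch below is an illustration under assumptions: it uses the common symmetric cosine definitions of the Hanning (0.5, 0.5) and Hamming (0.54, 0.46) windows, a hypothetical window length of 64 points, and a zero-padded direct DFT. The function names are illustrative.

```python
import cmath
import math

def window(name, n):
    """Symmetric window of length n (rectangular, hanning, or hamming)."""
    if name == "rectangular":
        return [1.0] * n
    a0 = {"hanning": 0.5, "hamming": 0.54}[name]
    return [a0 - (1 - a0) * math.cos(2 * math.pi * i / (n - 1))
            for i in range(n)]

def peak_sidelobe_db(w, oversample=16):
    """Highest side-lobe relative to the main-lobe peak, in dB."""
    n = len(w)
    m = n * oversample                      # zero-padded DFT length
    mags = [abs(sum(w[i] * cmath.exp(-2j * math.pi * k * i / m)
                    for i in range(n)))
            for k in range(m // 2)]
    k = 1
    while k < len(mags) and mags[k] < mags[k - 1]:
        k += 1                              # walk down the main lobe
    return 20 * math.log10(max(mags[k - 1:]) / mags[0])

for name in ("rectangular", "hanning", "hamming"):
    print(f"{name:12s} {peak_sidelobe_db(window(name, 64)):.1f} dB")
```

The computed levels should land close to the values listed in Table 6.1; small differences are expected because the result depends on the window length and on the exact window definition.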

A window whose side-lobes have a high roll-off rate should be used if the signal contains strong interfering frequency components distant from the frequency of interest. Otherwise, a window with a low maximum side-lobe level is more suitable. If the frequency band of interest contains two or more signals close to each other, a window with a narrow main-lobe is better. For a single frequency component in which the focus is on amplitude accuracy rather than its precise location in the frequency bin, a window with a broad main-lobe is recommended. An application consisting of only transient signals should use no spectral window at all, because windows tend to attenuate important information at the beginning of the sample block. The window selection is summarized in Table 6.2.

The Hanning window, which provides good frequency resolution and reduced spectral leakage, yields satisfactory results in most applications. In most testing, the output signal content is not known, in which case a good starting place is to use the Hanning window.

| Window | Sampled signal |
|--------|----------------|
| Rectangular (no window) | 1. Separation of two tones with frequencies very close to each other, but with almost equal amplitudes; 2. Frequency response measurements (system analysis); 3. Transient duration is shorter than the length of window; 4. Broad-band random, closely spaced sine-wave signals; 5. Accurate single tone amplitude measurements |
| Hanning | 1. General purpose application; 2. Frequency response measurements (system analysis); 3. Transient duration is longer than the length of window; 4. Narrow-band random signals, nature of content is unknown, sine-wave or combination of sine-wave signals |
| Hamming | Closely spaced sine-wave signals |

Table 6.2 General guideline to select applied windows.

### 6.2 Modulator loop

Signal to noise ratio, spurious free dynamic range, integral non-linearity, differential non-linearity, and missing code are key parameters to characterize an A/DC before further quality testing. They can be roughly grouped in three separate tests: SNR and missing code, INL and DNL, and SFDR. In addition, power measurement accompanies these tests.

### 6.2.1 SNR and missing code

First of all, high resolution input sources are required for all tests. The  $\Delta$ - $\Sigma$  A/DC is a data conversion device, which at best maps analog values to their digital equivalents. It cannot condition the signals as a filter does. Thus, a lower resolution input results in a low resolution output, which cannot accurately measure the true SNR of the modulator. For this project, an input source with 20-bit resolution is required.

This current feedback modulator is not ready to use as built, unlike most digital circuits. It requires precise adjustment of its quantizer reference voltage supplies and of the two feedback current sources. The reference voltage adjustment allows the maximum input without clipping. The current adjustment not only fulfills the feedback coefficient requirement but is also useful for compensating for process variation.

The rough estimate of the reference voltage supplies can be calculated as

$$V_{DDREF} \cong V_{DDA} - 2V_{OV} \quad \text{and} \quad V_{SSREF} \cong V_{SSA} + 2V_{OV}$$
(6.2)

The two overdrive voltage decrements from the two voltage supplies are required by the NMOS and PMOS cascode current mirror structures of the fully differential OTAs. They are not exact values since the overdrive voltages vary across the wafer.

To adjust the currents, the starting values can be calculated using Equation 3.48. The adjustment process is quite simple but requires a few iterations. A better solution is to program a DAQ<sup>®</sup> (Data Acquisition) board from National Instruments to capture the digital output, average the data, and automatically adjust the currents accordingly. Note that the outer loop dominates in determining the low frequency properties of the circuit, while the inner loop serves to stabilize the system and determines the high frequency properties [12]. Thus, the adjustment sequence is to start with the 1<sup>st</sup> feedback current for the approximate digital output and then adjust the 2<sup>nd</sup> current for improved settling.

First, the input voltage is set to zero (or  $\frac{1}{2}$  of the power supply) and the output digital codes are verified. The output should fluctuate within two codes, and their average value should equal  $\frac{1}{2}(V_{DDREF} - V_{SSREF})$  after a period of time. If not, adjust I<sub>ref1</sub> first since it relates to the gain of the modulator, and then I<sub>ref2</sub> to stabilize the fluctuation within the region of two codes. After this adjustment, only the systematic offset voltage has been removed (Figure 6.4).

The gain error is more difficult to adjust than the offset error since it involves two points, which require balance. Assuming that the offset error has been removed at this point and the line intersects the origin, the slope of the line is modulator gain. Two points are required to be balanced so that the positive input maps the same output codes as the negative input exclusive of the sign.

For the positive side, the input is set to  $\frac{1}{2}$  of V<sub>DDREF</sub>, and the feedback currents are adjusted. After that, the input is set to  $\frac{1}{2}$  of V<sub>SSREF</sub> for the negative side, and the currents are adjusted accordingly. The process is repeated until both sides map to output codes that are equal in magnitude but opposite in sign.



Figure 6.4 Gain and offset errors of an A/DC.

After the adjustment, the SNR testing can commence. Using coherent sampling, a calculated sine wave is injected. The SNR and missing code are measured and plotted in Figure 6.5 and 6.6, respectively.



Figure 6.5 Frequency response of the AFE output.



Figure 6.5a A section zoom-in of the Figure 6.5.



Figure 6.6 The AFE output in time domain.

The preliminary test results show the SNR is 80dB with no missing codes.

## 6.2.2 SFDR (Two-tone test)

The definition of the SFDR is the power ratio of the signal to the third-order intermodulation products. Figure 6.7 shows the simplified version of the test. The modulator is injected with two signals of equal amplitude: one at the pass-band and the other at stop-band. After decimation, the 3<sup>rd</sup> order intermodulation term  $(2\omega_1 - \omega_2)$  cannot be removed since it resides inside the pass-band. The maximum SFDR can be obtained by adjusting the input amplitude, but finding the maximum is not a trivial task. Figure 6.8 shows that it can be extrapolated using 2 or 3 different amplitude inputs and their corresponding 3<sup>rd</sup> order intermodulation amplitudes [4][30].



Figure 6.7 Intermodulation of two signals.



Figure 6.8 Graphical interpolation of SFDR.

### 6.2.3 INL and DNL

A very low frequency triangular wave sweep can reveal both non-linearities. The important consideration is that an integer number of cycles must be sampled or the data will be skewed. In addition, during the capture, the input must not drift in amplitude, frequency, or wave shape. The collected data are then plotted as a histogram, and the counts in each code are expected to be equal; otherwise, non-linearities exist.
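The histogram evaluation described above can be sketched in Python. The sketch follows the widely used code-density approach, which is an assumption here since the exact post-processing is not detailed in this section: the DNL of a code is its count relative to the average count minus one, and the INL is the running sum of the DNL. The two end codes are excluded because they also collect any over-range samples. The function name and the 3-bit example data are hypothetical.

```python
def dnl_inl_from_codes(codes, n_codes):
    """Estimate DNL and INL (in LSB) from a linear-ramp code histogram."""
    counts = [0] * n_codes
    for code in codes:
        counts[code] += 1
    inner = counts[1:-1]                 # drop the two end codes
    ideal = sum(inner) / len(inner)      # expected count per code
    dnl = [count / ideal - 1 for count in inner]
    inl, running = [], 0.0
    for value in dnl:
        running += value
        inl.append(running)
    return dnl, inl

# Hypothetical 3-bit capture: code 3 is too wide and code 4 too narrow.
samples_per_code = [100, 100, 100, 150, 50, 100, 100, 100]
codes = [code for code, n in enumerate(samples_per_code) for _ in range(n)]
dnl, inl = dnl_inl_from_codes(codes, n_codes=8)
print("DNL:", dnl)
print("INL:", inl)
```

In this convention a code that never appears in the capture has a DNL of -1 LSB, which is how a missing code shows up in the histogram.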

The problem with this approach is the limited memory depth of the data capture device. The maximum capture depth of our existing test device is only 64K states. Commercial devices can go as high as 4M in depth, but at a higher cost. The following example reveals the challenge of such testing. For an 18-bit A/DC, the total number of output states is  $2^{18}$ . To capture a complete cycle of a triangular wave, the required memory depth is  $2^{19}$ , or 512K. For a 1*Hz* signal, the required time to finish is  $2^{19}/(60*60*24) \cong 6$  days! However, with a commercial capture device with 4Mx32 in memory depth and a 10*Hz* triangular waveform, 8 cycles can be captured in 5 days. Note that synchronization is required to make the best use of the capture device and fill it with the most data.
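The capture times quoted above can be reproduced with a few lines of Python. The sketch assumes that one captured state corresponds to one code step of the triangular wave and interprets the quoted 1*Hz* and 10*Hz* figures as the rate at which states are collected; this interpretation is an assumption made here because it reproduces the quoted durations. The function name is illustrative.

```python
SECONDS_PER_DAY = 60 * 60 * 24

def capture_days(n_states, states_per_second):
    """Days needed to collect n_states at the given collection rate."""
    return n_states / states_per_second / SECONDS_PER_DAY

states_per_cycle = 2 ** 19     # up and down sweep of an 18-bit converter
deep_memory = 4 * 2 ** 20      # 4M states of a commercial capture device

print(f"one cycle at 1 per second: {capture_days(states_per_cycle, 1):.2f} days")
print(f"cycles in 4M memory: {deep_memory // states_per_cycle}")
print(f"4M states at 10 per second: {capture_days(deep_memory, 10):.2f} days")
```

Under these assumptions, a single cycle takes about 6 days at one state per second, and a 4M-deep memory holds 8 cycles that take a little under 5 days at ten states per second.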

## 6.3 Two-path decimation filter

Frequency response and impulse response are the two tests required for the decimation filter; the former verifies the resolution and the latter verifies the amplitude attenuation. Since the SNR of the A/DC is determined by its modulator loop, the frequency response test confirms that the filter does not degrade the SNR. Figure 6.9 shows that the SNR is maintained. A time domain plot of the DBE output data is shown in Figure 6.10, and a zoomed-in section is shown in Figure 6.10a. Careful software examination of this data strongly suggests that the A/DC is monotonic. The data cannot be used for non-linearity analysis, however, since the input is not a triangular wave and its frequency is not low enough. Moreover, the memory of the capture device is not deep enough.

The brute force method of testing the amplitude attenuation is to inject one frequency at a time and measure the amplitude of the output. This takes a very long time if the bandwidth under test is wide or the desired resolution is high. A better approach is impulse response testing. Note that the width of the impulse is inversely proportional to its bandwidth: as the width approaches zero, the bandwidth approaches infinity.
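The inverse relation between pulse width and bandwidth can be illustrated with the spectrum of a sampled rectangular pulse. This is a generic sketch, not a model of the test setup; the function names are hypothetical, and frequency is expressed in cycles per sample.

```python
import math


def pulse_spectrum_db(width, f):
    """Normalized magnitude (dB) of a `width`-sample rectangular pulse.

    f is the frequency in cycles per sample (0 < f <= 0.5).
    """
    mag = abs(math.sin(math.pi * f * width) / (width * math.sin(math.pi * f)))
    return 20.0 * math.log10(max(mag, 1e-12))  # clamp to avoid log10(0)


def first_null(width):
    """Frequency (cycles/sample) of the first spectral null: 1 / width."""
    return 1.0 / width


for width in (1, 4, 8):
    print(width, first_null(width), round(pulse_spectrum_db(width, 0.05), 2))
```

A one-sample pulse has a flat spectrum, while doubling the width halves the frequency of the first null, so a wider pulse excites less of the band.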



Figure 6.9 Frequency response of the DBE output.



Figure 6.10 The DBE output in time domain.



Figure 6.10a A zoomed-in section of Figure 6.10.

Care must be taken in selecting the width of the applied pulse. A narrow pulse (and hence a wide bandwidth) will saturate the filter because of the excessive bandwidth folded back to the base-band after decimation. As a result, the spectrum of the final output will show the pulse only. Thus, the pulse should be narrow enough for characterization but not so narrow that it saturates the filter. Figure 6.11 shows that the filter has no attenuation up to 150Hz and attenuates the amplitude by more than 6dB beyond 800Hz.

Three strategies can be used to overcome the attenuation. The first is to increase the sampling frequency by 20% and use no additional circuits. The second is to design an equalizer after the filter to compensate for the droop. The third is to place a half-band filter at the last stage [25]. The latter two solutions are not considered power inefficient, since the operating frequency of the last filter stages is low and the power dissipation of the decimation filter is very small compared to that of the modulator.



Figure 6.11 Impulse response of the DBE.

## 6.4 Power dissipation measurement

Table 6.3 shows the power dissipation of each component in the modulator. The power dissipation of the 1<sup>st</sup> integrator is 840*uW*, which by design is twice that of the 2<sup>nd</sup> integrator. The power dissipation of a single comparator is 22.3*uW*. The power ratio of the comparator to the 1<sup>st</sup> integrator is 0.027, which is very close to the model (K<sub>Q</sub> = 0.02). Figure 6.12 is a re-plot of Equation 3.13 with the new coefficient.

| Component       | Power dissipation |
|-----------------|-------------------|
| Integrator      | 1260 *uW*         |
| 4-bit Quantizer | 138 *uW*          |
| Shift register  | 195 *uW*          |
| D/AC            | 6.6 *uW*          |

Table 6.3 Power dissipation of the components in AFE.
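The quoted ratio and the totals can be cross-checked from the tabulated values. The sketch below uses only the numbers given in the text and in Table 6.3; it assumes, as the values suggest, that the "Integrator" row is the sum of the two integrators. Variable names are illustrative.

```python
# Measured values quoted in the text and in Table 6.3 (all in uW).
P_INTEGRATOR_1 = 840.0
P_COMPARATOR = 22.3
TABLE_6_3 = {
    "Integrator": 1260.0,
    "4-bit Quantizer": 138.0,
    "Shift register": 195.0,
    "D/AC": 6.6,
}

k_q = P_COMPARATOR / P_INTEGRATOR_1  # comparator-to-integrator power ratio
# Assumption: the Integrator row covers both integrators.
p_integrator_2 = TABLE_6_3["Integrator"] - P_INTEGRATOR_1
total_uw = sum(TABLE_6_3.values())

print(f"K_Q = {k_q:.3f}")
print(f"Integrator 2 = {p_integrator_2:.0f} uW")
print(f"AFE total = {total_uw:.1f} uW")
```

Under that assumption the 2<sup>nd</sup> integrator dissipates 420 uW, half of the first, and the components sum to about 1.6 mW.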



Figure 6.12 Re-plot of the power dissipation per bit of Figure 3.7 with  $K_Q = 0.027$ .

The new figure shows that the model is quite accurate, except that a 3-bit quantizer, rather than a 4-bit one, has a slight advantage in power dissipation. Clearly, it is more power efficient to increase the resolution by adding quantizer bits up to 4. Beyond that, increasing the modulator order is more advantageous.

Digital power is measured by applying the same input data at multiple clock frequencies, with the results plotted in Figure 6.13. The maximum operating frequency of the two-path decimation filter is 23MHz, which is limited by the probecard. Based on the ring oscillator measurement, the projected limit of this filter is approximately 200MHz. For the desired  $\Delta$ - $\Sigma$  A/DC operating frequency of 128KHz and a power supply of 1.5V, the standby power of the filter is 1.5uW and the operational power is 16.95uW.



Figure 6.13 Power dissipation and projection of the DBE.

The measurement results verify the power dissipation model of the modulator. Because the model is relatively accurate, it can be used to predict the power dissipation of future single-loop modulator designs before submission. The model is also robust in that its quantizer is not limited to the regenerative approach used in this work; the freedom to adopt different implementations increases the possibility of low power architectures. The resolution is the minor setback of this implementation, and it can be remedied in future designs. The newly designed two-path decimation filter has achieved both its resolution and low power objectives, so future design effort can focus on the modulator alone. In addition, the preliminary test results demonstrate that the low power properties are maintained at high frequencies, in which case the filter can be applied to wide-band applications.

## CHAPTER 7

## CONCLUSIONS

The design methodology of the 2<sup>nd</sup> order  $\Delta$ - $\Sigma$  A/DC with a multi-bit quantizer has been demonstrated in achieving 18-bit resolution under a power budget of 1*mW*. The strategies for reducing power dissipation while maintaining the resolution are the multi-bit quantizer approach in the modulator and parallel processing in the decimation filter. The innovation of the parallel-to-serial D/AC implementation, which takes advantage of the underutilized bandwidth between the analog and digital circuits, makes the multi-bit approach possible without multi-bit D/AC non-linearity. The concept of decimating the frequency before filtering is made possible by the two-path decimation filter. The filter is designed to split the filter functions and data streams in two and to process the data in parallel at one half of the input operating frequency. In addition, the cascade approach greatly reduces the filter order, which reduces power consumption even further.

The A/DCs, including modulators and decimation filters, have been fabricated on the Peregrine 0.5*um* SOS process and tested. The measured SNR of the modulator is 13.5-bit with a power dissipation of 1.6*mW* at 128Ksps. For the filter, at an operating frequency of 128*KHz* and a power supply of 1.5*V*, the standby power is 1.5*uW* and the operating power is 16.95*uW*. The filter shows no resolution degradation for test frequencies up to 23*MHz*, and the projected limit of 200*MHz* suggests possible application to wideband  $\Delta$ - $\Sigma$  communication modulators.

## 7.1 Discussion

The measured resolution, which is 4.5 bits lower than the design goal (18-bit), is the only setback of this project. The architecture and functionality of the modulator are confirmed to be correct, since the transfer function has the expected shape in the frequency domain and the time domain output waveform shows no missing codes. As a result, there are two possible sources of error: component or device degradation, and noise injection from outside sources. They are listed as follows:

- Leaky integrators and transmission gates
- Noisy transistors
- Noisy current feedback sources
- Floating body of the transistors

## 7.1.1 Leaky integrator

As described in Chapter 3, low gain OTAs can result in integrator leakage. A leaky integrator shifts the locations of the NTF zeros, which are the locations of the integrator poles. This shift degrades the designed noise shaping ability of the NTF and raises the noise level in the pass-band. Figure 7.1 compares the simulated modulator with OTA gains of 128 and 512. With an OTA gain of 128, the noise level rises significantly and resembles the test result. Note that as the gain increases, the noise floor goes down, which reinforces the statement that the OTA requires high gain to prevent leakage.



Figure 7.1 Simulation of the NTF with the gains of OTAs at 128 and 512.
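The trend in Figure 7.1 can be reproduced qualitatively with a simple model. The sketch below is not the simulation used for the figure; it assumes the common textbook approximation that a finite OTA gain A moves each integrator pole, and therefore each NTF zero, from z = 1 to z = 1 - 1/A. The function names are hypothetical.

```python
import cmath
import math


def ntf_mag_db(ota_gain, w):
    """|NTF| in dB at frequency w (rad/sample) for a 2nd-order loop.

    Assumed model: NTF(z) = (1 - p / z)**2 with p = 1 - 1/A, where A is
    the finite OTA gain that makes each integrator leaky.
    """
    p = 1.0 - 1.0 / ota_gain
    ntf = (1.0 - p * cmath.exp(-1j * w)) ** 2
    return 20.0 * math.log10(abs(ntf))


def dc_floor_db(ota_gain):
    """Noise-shaping floor at DC: |NTF(1)| = (1/A)**2, or -40*log10(A) dB."""
    return -40.0 * math.log10(ota_gain)


for gain in (128, 512):
    print(gain, round(dc_floor_db(gain), 1), round(ntf_mag_db(gain, 1e-3), 1))
```

Under this model the low-frequency floor of the NTF sits at about -84 dB for a gain of 128 and about -108 dB for a gain of 512, so a higher OTA gain lowers the floor.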

To verify the simulation results, measurements of the NMOS and PMOS cascodes (Figures 7.2 and 7.3) are used to predict the available gain of the OTA. Based on the measured bias voltages of the OTA (the vertical lines), the OTA achieves gains higher than 500 (2*um* channel length), which suggests no gain problem.



Figure 7.2 Projected gain of the OTA using the NMOS cascode measurement.



Figure 7.3 Projected gain of the OTA using the PMOS cascode measurement.

The prediction suggests that the designed gain is adequate and that device failure is the possible source. Note that since this is a static test, some distinct phenomena of the SOS process cannot be revealed without more elaborate testing schemes. For example, the kink effect reduces the gain of the OTA because the output conductance increases with a dependence on frequency and bias. The effects resulting from the floating body (i.e. in the sampling switches) are discussed later.

## 7.1.2 Noisy transistors and current feedback sources

A recent publication [31] suggests that the transistor noise level is 2-3 times higher than theory predicts. As a result, the designed sampling capacitors of the 1<sup>st</sup> integrator are not large enough to suppress the thermal noise. The solution to this raised noise floor is to increase the capacitance so that the kT/C noise remains 9*dB* below the modulator noise level.
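The capacitor scaling implied here can be sketched with the kT/C relation. This is only an illustration: it assumes the 2-3 times excess is a noise power ratio, and the 1 pF starting value is a hypothetical figure, not the designed capacitance. The function names are also hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(cap_farads, temp_kelvin=300.0, excess=1.0):
    """RMS sampled noise voltage sqrt(excess * k * T / C).

    excess: assumed ratio of actual to theoretical noise power.
    """
    return math.sqrt(excess * BOLTZMANN * temp_kelvin / cap_farads)


def cap_for_same_noise(cap_farads, excess):
    """Capacitance that restores the original noise for a given excess."""
    return cap_farads * excess


c0 = 1e-12  # hypothetical 1 pF sampling capacitor
print(ktc_noise_vrms(c0))                                    # theoretical noise
print(ktc_noise_vrms(c0, excess=3.0))                        # with 3x excess
print(ktc_noise_vrms(cap_for_same_noise(c0, 3.0), excess=3.0))  # restored
```

Because the noise power scales as 1/C, a noise power that is 2-3 times higher than expected calls for a sampling capacitance that is 2-3 times larger to hold the same margin.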

The feedback current of the 1<sup>st</sup> integrator has the same constraint as the input signal: its noise must be kept 9*dB* below the modulator noise level. Since any noise injected into the 1<sup>st</sup> integrator appears as a signal to the following stages, this noise must be reduced below the kT/C noise so that the latter dominates. The external current mirror presently used is still noisy even when battery powered (~ -90*dB* or 15-bit). Therefore, future designs should include the current mirrors on chip to minimize this problem.

## 7.1.3 Floating body of the transistors

The key benefit of the SOS process over bulk is the greatly reduced drain/source to body capacitance. The insulating substrate material provides good isolation between devices, which eliminates latch-up. However, it also introduces some circuit behaviors that do not exist in the bulk process: kink effect, pass-gate leakage, and history dependence. All of these are caused by the floating body, which is possibly the only drawback of the SOS process relative to bulk.

## 7.1.3.1 Kink effect and pass-gate leakage

The kink effect is a phenomenon in which the output conductance increases as impact ionization starts and  $V_{BS}$  becomes positive [32-35]. The increased conductance reduces the gain of the OTA, which is the product of transconductance and output resistance  $(g_mR_o)$ . Pass-gate leakage [32-35] occurs when the drain and source are both initially high and the body is charged all the way up to  $V_{DD}$ . When the source is then pulled low, current can still flow from drain to source even though the gate is off. Note that the leakage is high in a fully-depleted device since the bipolar gain of the device is high [32-35]. Figure 7.4 shows that a significant amount of leakage current exists in both P and N devices. The leakage of the NMOS is approximately double that of the PMOS. The low gain OTA and leaky transmission gates shift the modulator coefficients, which in turn introduces additional noise to the system.



Figure 7.4 Measured leakage current of low  $V_{th}$  transistors.

## 7.1.3.2 History dependence

History dependence is the change in delay through a gate as a function of its switching history [32-35]. Before further discussion, the circuit elements that determine  $V_{BS}$  are shown in Figure 7.5. The body voltage is determined by the p-n diode leakage and the impact ionization current (I<sub>I</sub>) (Figure 7.5(a)) and by capacitive coupling to the external nodes during switching (Figure 7.5(b)).



Figure 7.5 Circuit elements determining the body bias.

During high frequency switching,  $V_{BS}$  is determined by its capacitive coupling to the gate, drain, and source voltages ( $V_{BS_C}$ ). In steady state,  $V_{BS}$  is determined by the diode leakage and impact ionization current ( $V_{BS_S}$ ). Note that the capacitive coupling voltage,  $V_{BS_C}$ , is superimposed on the steady state value,  $V_{BS_S}$ . It is the variation of the steady state  $V_{BS}$  that affects the threshold voltage and the delay. At every switching event of a transistor, its body voltage is set by  $V_{BS_C}$ . Between switching events, the body voltage slowly converges to a value set by the drain, source, and gate voltages as well as by leakage and impact ionization. If the switching interval is shorter than the convergence period, the delay will vary. In other words, the history dependence is eliminated only if no switching event occurs while  $V_{BS}$  is converging. This delay variation has a greater impact on the switches of the integrators. The switching time uncertainty caused by the variation of the clock drivers changes the charge transfer time of the sampling capacitors, which results in a gain error. It has the same effect as the pass-gate leakage.

## 7.2 Suggestion

The solution to the floating body effects is to prevent the body from floating. A ready solution is to tie the body to a fixed voltage point. For the transmission gate, an H-gate approach is used because its drain and source nodes are undetermined (the nodes are decided by their voltage potentials, which vary constantly). The H-gate body is tied to the power supplies (Figure 7.6). For the rest of the circuits, a body-tie-to-source (BTS) gate is available, with its body tied to the source node (Figure 7.7). Both implementations are available in the future Peregrine process.



Figure 7.6 Two possible layouts of the H-gate.



Figure 7.7 Layout of the BTS gate.

#### REFERENCES

- [1] L. R. Rabiner and R. W. Schafer, *Digital Processing of Speech Signals*, Prentice-Hall Inc., Englewood Cliffs, NJ, 1978.
- [2] Rudy Van De Plassche, *Integrated Analog-to-Digital and Digital-to-Analog Converters*, Kluwer Academic Publishers, 1994.
- [3] Steven R. Norsworthy, Richard Schreier, and Gabor C. Temes, *Delta-Sigma Data Converters: Theory, Design, and Simulation*, IEEE Press, New York, 1997.
- [4] David A. Johns and Ken Martin, *Analog Integrated Circuit Design*, John Wiley & Sons, New York, 1997.
- [5] John G. Kenney and Richard Carley, "CLANS: A High-Level Synthesis Tool for High Resolution Data Converters," *IEEE International Conference on Computer-Aided Design*, Santa Clara, California, 1988.
- [6] Henrik T. Jensen and Ian Galton, "A low-complexity dynamic element matching DAC for direct digital synthesis," *IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 45, pp. 13-27, January 1998.
- [7] Atsushi Iwata et al., "Architecture of Delta Sigma Analog-to-Digital Converters Using a Voltage-Controlled Oscillator as a Multibit Quantizer," *IEEE Trans. on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 46, no. 7, pp. 941-945, July 1999.
- [8] Feng Wang and Ramesh Harjani, *Design of Modulators for Oversampled Converter*, Kluwer Academic, Massachusetts, 1998.
- [9] Ping Wah Wong and Robert M. Gray, "Two-Stage Sigma-Delta Modulator," *IEEE Trans. on Acoustics, Speech, and Signal Processing*, vol. 38, no. 11, pp. 1937-1952, November 1990.
- [10] Ichiro Fujimori et al., "A 90-dB SNR 2.5-MHz Output-Rate ADC Using Cascade Multibit Delta-Sigma Modulator at 8x Oversampling Ratio," *IEEE J. of Solid-State Circuits*, vol. 35, no. 12, pp. 1820-1828, December 2000.
- [11] Joo-Sun Choi and Kwyro Lee, "Design of CMOS Tapered Buffer for Minimum Power-Delay Product," *IEEE J. of Solid-State Circuits*, vol. 29, no. 9, pp. 1142-1145, September 1994.
- [12] James C. Candy and Gabor C. Temes, *Oversampling Delta-Sigma Data Converters: Theory, Design, and Simulation*, John Wiley & Sons, New York, 1992.
- [13] Peicheng Ju and D. G. Vallancourt, "Quantisation Noise Reduction in Multibit Oversampling  $\Sigma - \Delta$  A/D Converters," *Electronics Letters*, vol. 28, no. 12, pp. 1162-1164, June 1992.
- [14] *Matlab<sup>®</sup> Delta-Sigma Toolbox*, Natick, MA: MathWorks, Inc., 1999.
- [15] Soon G. Lim, *A 2KSPS 1mW Low Power Multibit  $\Delta\Sigma$  Modulator with 16-bit Dynamic Range*, Master's Thesis, Oklahoma State University, May 1999.
- [16] Paul R. Gray et al., *Analysis and Design of Analog Integrated Circuits*, John Wiley & Sons, New York, 2001.
- [17] James A. Cherry, *Theory, Practice, and Fundamental Performance Limits of High-Speed Data Conversion Using Continuous-Time Delta-Sigma Modulator*, Ph.D. Dissertation, Carleton University, November 1998.
- [18] Mohammad Sarhang-Nejad and Gabor C. Temes, "A High-Resolution Multibit  $\Sigma \Delta$  with Digital Correction and Relaxed Amplifier Requirement," *IEEE J. of Solid-State Circuits*, vol. 28, no. 6, June 1993.
- [19] C. Hutchens and A. Tyagi, *Proposed 16 bit 1mW 2KSPS ADC*, Advanced Analog VLSI Design Center, Oklahoma State University, June 1995.
- [20] R. Jacob Baker, David E. Boyce, and Harry W. Li, *CMOS Circuit Design, Layout, and Simulation*, IEEE Press, New York, 1997.
- [21] Jie Wen, *Interconnect Design with Large Transistor Constraints for Multi-Chip Modules and Large Die SOI/SOS*, Master's Thesis, Oklahoma State University, May 1998.
- [22] Peter Van Zant, *Microchip Fabrication*, McGraw-Hill, New York, 2000.
- [23] S. Chu and C. S. Burrus, "Multirate Filter Designs Using Comb Filters," *IEEE Trans. on Circuits and Systems*, vol. 31, no. 10, Nov. 1984.
- [24] E. B. Hogenauer, "Economical Class of Digital Filter for Decimation and Interpolation," *IEEE Trans. on Acoustics, Speech, and Signal Processing*, vol. 29, no. 2, Apr. 1981.
- [25] Brian P. Brandt and Bruce A. Wooley, "A Low-Power, Area-Efficient Digital Filter for Decimation and Interpolation," *IEEE J. of Solid-State Circuits*, vol. 29, no. 6, June 1994.
- [26] N. J. Fliege, *Multirate Digital Signal Processing*, John Wiley & Sons, New York, 1994.
- [27] P. P. Vaidyanathan, *Multirate Systems and Filter Banks*, Prentice-Hall, New Jersey, 1993.
- [28] *Defining and Testing Dynamic Parameters in High-Speed ADCs, Part 1*, Sunnyvale, CA: Maxim Integrated Products, Inc., 2001.
- [29] *Dynamic Testing of High-Speed ADCs, Part 2*, Sunnyvale, CA: Maxim Integrated Products, Inc., 2001.
- [30] Behzad Razavi, *RF Microelectronics*, Prentice-Hall, New Jersey, 1997.
- [31] P. Klein, "An Analytical Thermal Noise Model of Deep-Submicron MOSFETs for Circuit Simulation with Emphasis on the BSIM3v3 SPICE Model," *Proc. of the European Solid-State Dev. Res. Conf.*, pp. 460-463, September 1998.
- [32] Ghavam G. Shahidi et al., "Device and Circuit Design Issues in SOI Technology," *IEEE Custom Integrated Circuits Conference*, San Diego, California, 1999.
- [33] Srinath Krishnan and Jerry G. Fossum, "Grasping SOI Floating-Body Effects," *IEEE Circuits and Devices Magazine*, July 1998.
- [34] Andy Wei, Melanie J. Sheroney, and Dimitri A. Antoniadis, "Effect of Floating-Body Charge on SOI MOSFET Design," *IEEE Trans. on Electron Devices*, vol. 45, no. 2, pp. 430-438, February 1998.
- [35] James B. Kuo and Ker-Wei Su, *CMOS VLSI Engineering: Silicon-on-Insulator (SOI)*, Kluwer Academic, Massachusetts, 1998.
- [36] David F. Hoeschele, *Analog-to-Digital and Digital-to-Analog Conversion Techniques*, John Wiley & Sons, New York, 1994.
- [37] Alan V. Oppenheim and Ronald W. Schafer, *Discrete-Time Signal Processing*, Prentice-Hall, New Jersey, 1989.
- [38] Ronald E. Crochiere and Lawrence R. Rabiner, *Multirate Digital Signal Processing*, Prentice-Hall, New Jersey, 1983.
- [39] C. D. Motchenbacher and J. A. Connelly, *Low-Noise Electronic System Design*, John Wiley & Sons, New York, 1993.
- [40] Audrey F. Harvey and Michael Cerna, *Application Note 041: The Fundamentals of FFT-Based Signal Analysis and Measurement in LabVIEW and LabWindows*, National Instruments, November 1993.
- [41] Chia-Ming Liu, Soon Guan Lim, and Chris Hutchens, "Low Power Decimation Filter Design for Delta-Sigma Converters," in *IEEE Emerging Technologies Symposium on Wireless Communications & Systems*, Richardson, Texas, 1999.
- [42] Chia-Ming Liu and Chris Hutchens, "Implementation of a Multi-bit  $\Delta\Sigma$  A/DC Without a Correction RAM," in *IEEE International SOI Conference*, Durango, Colorado, 2001.
- [43] Chia-Ming Liu and Chris Hutchens, "Implementation of 1.5V Low Power Two-Path Decimation Filters For Communications  $\Delta$ - $\Sigma$  Converters," in *Proceedings of the 45<sup>th</sup> IEEE Midwest Conference on Circuits and Systems*, Tulsa, Oklahoma, 2002.
- [44] Chris Hutchens and Chia-Ming Liu, "Power/Bit Optimization of Δ-Σ ADCs Using Multi-Bit Quantizers," in *Proceedings of the 45<sup>th</sup> IEEE Midwest Conference on Circuits and Systems*, Tulsa, Oklahoma, 2002.
- [45] XunYu Zhu, Chris Hutchens, and Chia-Ming Liu, "Efficient Use of Two-Path Filter in the Low Power Decimation Filter Design," in *Proceedings of the 45<sup>th</sup> IEEE Midwest Conference on Circuits and Systems*, Tulsa, Oklahoma, 2002.

# APPENDIX A

# CHIP STATISTICS AND LAYOUT AND DIE PHOTOGRAPHS

| Chip statistics   | Transistor count | Area (mm <sup>2</sup> ) |
|-------------------|------------------|-------------------------|
| Modulator         | 2,731            | 2.6                     |
| Decimation filter | 47,656           | 5.58                    |
| Pad driver        | 16,272           | 2.64                    |

Table A.1 Transistor count and sizes of the components.



Figure A.1 Screen capture of the chip layout.



Figure A.2 Die photograph of the 18-bit modulator (upper left).



Figure A.3 Die photograph of the 16-bit modulator (upper right).



Figure A.4 Die photograph of the decimation filter (lower left).



Figure A.5 Die photograph of the pad driver (lower right).

# VITA

## Chia-Ming Liu

### Candidate for the Degree of

#### Doctor of Philosophy

## Thesis: AN IMPLEMENTATION OF A LOW POWER DELTA-SIGMA A/DC WITH A MULTI-BIT QUANTIZER ON SILICON-ON-SAPPHIRE

Major Field: Electrical and Computer Engineering.

**Biographical Information:** 

- Personal Data: Born in City of Miao-Li, Taiwan, Republic of China on February 16, 1968.
- Education: Graduated from Chien-Hsin College of Technology, Chung-Li, Taiwan, Republic of China in June 1988. Received Bachelor of Science and Master of Science degrees in Electrical and Computer Engineering from Oklahoma State University, Stillwater, Oklahoma in December 1994 and December 1996, respectively. Completed the requirements for the Doctor of Philosophy with a major in Electrical and Computer Engineering at Oklahoma State University in May 2002.
- Experience: Employed as a System Administrator and Laboratory Monitor in Advanced Analog VLSI Lab from January 1997 to May 2002; employed by Oklahoma State University, Department of Electrical and Computer Engineering as a Graduate Research Assistant from January 1997 to present, including research work for Naval Research and Development, San Diego, California (between summer of 1997 and 2001).