University of Pennsylvania
Department of Electrical and Systems Engineering
Circuit-Level Modeling, Design, and Optimization for Digital Systems

Final
Tuesday, May 7

- Problem weightings shown.
- Calculators allowed.
- Closed book: no text or notes allowed.
- Additional workspace in exam book. Note where to find work in exam book if relevant.


## Name:

| Question | Score |
| :---: | :--- |
| Q1 |  |
| Q2 |  |
| Q3 |  |
| Q4 |  |
| Q5 |  |
| Total |  |

Default technology:

- 22 nm Low Standby Power Process (LSTP)
- $\gamma=1$
- $V_{d d}=800 \mathrm{mV}$
- nominal $V_{t h n}=-V_{t h p}=300 \mathrm{mV}$
- $C_{0}=2 \times 10^{-17} \mathrm{~F}$ (for $W=1$ device)
- $I_{d, s a t_{0}}=10 \mu \mathrm{~A}$ (for $W=1$ device)
- $I_{\text{sd,leak}_{0}}=0.3 \mathrm{pA}$ (for $W=1$ device)
- velocity saturated operation
- $R_{\text {wire }}=700 \mathrm{~k} \Omega / \mathrm{cm}$
- $C_{\text {wire }}=1.7 \mathrm{pF} / \mathrm{cm}$

| Device | $V_{g s}$ | $I_{d s}$ (A) |
| :---: | :---: | :---: |
| NMOS | $V_{g s}<V_{t h n}$ | $\left(1 \times 10^{-6}\right) W e^{\frac{V_{g s}-V_{t h n}}{40 \mathrm{mV}}}$ |
|  | $V_{g s}>V_{t h n}$ | $2 \times 10^{-5} W\left(V_{g s}-V_{t h n}\right)$ |
| PMOS | $V_{g s}>V_{t h p}$ | $\left(-1 \times 10^{-6}\right) W e^{-\left(\frac{V_{g s}-V_{t h p}}{40 \mathrm{mV}}\right)}$ |
|  | $V_{g s}<V_{t h p}$ | $2 \times 10^{-5} W\left(V_{g s}-V_{t h p}\right)$ |
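
For reference, the piecewise device model in the table can be evaluated with a short script. This is only a sketch: the function names `nmos_ids` and `pmos_ids` are illustrative, voltages are assumed to be in volts and currents in amperes, and the table does not define the current exactly at threshold, so the code arbitrarily uses the above-threshold branch there.

```python
import math

V_THN = 0.3    # V, nominal NMOS threshold from the default technology
V_THP = -0.3   # V, nominal PMOS threshold (V_thn = -V_thp)
V_SUB = 0.040  # V, the 40 mV constant in the subthreshold exponent


def nmos_ids(vgs: float, w: float = 1.0) -> float:
    """NMOS drain current (A) for gate-source voltage vgs (V) and width w."""
    if vgs < V_THN:
        # Subthreshold: exponential in (vgs - V_thn)
        return 1e-6 * w * math.exp((vgs - V_THN) / V_SUB)
    # Above threshold: velocity-saturated, linear in overdrive.
    # Using this branch at vgs == V_thn is an assumption.
    return 2e-5 * w * (vgs - V_THN)


def pmos_ids(vgs: float, w: float = 1.0) -> float:
    """PMOS drain current (A); negative by the table's sign convention."""
    if vgs > V_THP:
        # Subthreshold: exponent is negative because vgs > V_thp
        return -1e-6 * w * math.exp(-(vgs - V_THP) / V_SUB)
    # Above threshold (vgs at or below V_thp): linear in overdrive
    return 2e-5 * w * (vgs - V_THP)


if __name__ == "__main__":
    # Fully-on W=1 devices at V_dd = 800 mV
    print(nmos_ids(0.8), pmos_ids(-0.8))
```

With $V_{gs}=V_{dd}$ and $W=1$, the NMOS expression gives $2 \times 10^{-5} \times 0.5 = 10\ \mu\mathrm{A}$, which agrees with the listed $I_{d, sat_0}$.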

Timing constraints:

$$
\begin{gather*}
T \geq t_{\text {clk } \rightarrow q}+t_{\text {plogic }}+t_{\text {setup }}  \tag{1}\\
t_{\text {cdlatch }}+t_{\text {cdlogic }} \geq t_{\text {hold }} \tag{2}
\end{gather*}
$$
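
As an illustration of how constraints (1) and (2) are applied, the sketch below computes the minimum clock period and checks the hold condition. The function names and the numeric values in the usage section are hypothetical and are not taken from any problem on this exam; times are integers in picoseconds.

```python
def min_clock_period(t_clk_to_q: int, t_plogic: int, t_setup: int) -> int:
    """Smallest T satisfying constraint (1); all times in ps."""
    return t_clk_to_q + t_plogic + t_setup


def hold_satisfied(t_cdlatch: int, t_cdlogic: int, t_hold: int) -> bool:
    """True when constraint (2) holds; all times in ps."""
    return t_cdlatch + t_cdlogic >= t_hold


if __name__ == "__main__":
    # Hypothetical example values (ps), chosen only for illustration
    T = min_clock_period(t_clk_to_q=100, t_plogic=800, t_setup=100)
    f_max_hz = 1e12 / T  # convert a period in ps to a frequency in Hz
    print(T, f_max_hz)
    print(hold_satisfied(t_cdlatch=50, t_cdlogic=30, t_hold=60))
```

Constraint (1) bounds the clock period from below, so it sets the maximum frequency. Constraint (2) does not depend on $T$, so a hold violation cannot be fixed by slowing the clock.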

Optimal buffering:

$$
\begin{gather*}
L_{\text {seg }}=2 \sqrt{\frac{R_{0}(\gamma+1) C_{0}}{R_{\text {wire }} C_{\text {wire }}}}  \tag{3}\\
W_{\text {buf }}=\sqrt{\frac{R_{0} C_{\text {wire }}}{2 R_{\text {wire }} C_{0}}} \tag{4}
\end{gather*}
$$
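
Equations (3) and (4) can be evaluated directly, as in the sketch below. The function names are illustrative. The wire and capacitance values come from the default technology, but $R_{0}$ is not listed there: the usage example assumes $R_{0}=V_{dd} / I_{d, sat_0}$, which is an assumption made here for illustration only.

```python
import math


def segment_length_cm(r0, c0, gamma, r_wire, c_wire):
    """Equation (3): buffered segment length in cm.

    r0 in ohms, c0 in farads, r_wire in ohm/cm, c_wire in F/cm.
    """
    return 2.0 * math.sqrt(r0 * (gamma + 1.0) * c0 / (r_wire * c_wire))


def buffer_width(r0, c0, r_wire, c_wire):
    """Equation (4): buffer width in multiples of the W=1 device."""
    return math.sqrt(r0 * c_wire / (2.0 * r_wire * c0))


if __name__ == "__main__":
    C0 = 2e-17        # F, default technology
    GAMMA = 1.0       # default technology
    R_WIRE = 700e3    # ohm/cm, default technology
    C_WIRE = 1.7e-12  # F/cm, default technology
    # Assumption: R0 = V_dd / I_d,sat0 (not stated in the parameter list)
    R0 = 0.8 / 10e-6

    print(segment_length_cm(R0, C0, GAMMA, R_WIRE, C_WIRE))
    print(buffer_width(R0, C0, R_WIRE, C_WIRE))
```

Because $R_{\text{wire}}$ and $C_{\text{wire}}$ are given per centimeter, the segment length comes out in centimeters, and the buffer width is dimensionless.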

1. (20pts) Speed and Power. Consider minimum-sized CMOS nand2 gates in the default technology. Specify units in all answers.
(a) Assume the default technology and calculate $\tau=R_{0} C_{0}$.

(b) Assume the critical path in the design (including flip-flop setup time and clock-to-q delay) can be modeled as a series chain of 10 of these gates, each loaded by 4 equivalent gates. What is the maximum frequency of operation possible?

## Max Frequency

(c) Assuming chip cooling allows a maximum dynamic power dissipation of 1 W (leakage is negligible), when operating at the frequency from part (b), what is the maximum number of gates that can switch during a clock cycle, on average?

## Max gate-evals/clock

(d) Assuming the output of one of these gates drives a single gate input through an unbuffered wire with $R_{\text {wire }}=700 \mathrm{~k} \Omega / \mathrm{cm}$ and $C_{\text {wire }}=1.7 \mathrm{pF} / \mathrm{cm}$, what is the maximum distance the signal can travel in one clock cycle at the maximum clock frequency identified in part (b)?

## Max Distance

2. (20pts) Consider the following dynamic logic circuit. What logic function does it evaluate?
Assume the output is loaded by $7 C_{0}$. Assume $C_{\text {diff }}=0.5 C_{\text {gate }}$ and $\mu_{n}=2 \mu_{p}$. Assume the CLK signal is driven strongly such that the rise time on the clock is $R_{0} C_{0}$. Use Elmore delay calculations where appropriate. For full credit (and partial credit consideration), show your delay components (the stages and the components of each Elmore delay calculation).


| Out as a function of the inputs? |  |
| :--- | :--- |

3. (20pts) Below is a register built by cascading two dynamic latches. $C_{1}$ and $C_{2}$ are not explicit capacitors; they represent the parasitic capacitance at each node. Assume the clocks are ideal non-overlapping clocks with a frequency of 250 MHz. Each transmission gate has a delay of 100 ps and each inverter has a delay of 200 ps.

(a) Is this a positive or negative edge-triggered device? $\square$

(b) Determine the register timing parameters. Include units.

i. Setup time $\square$

ii. Hold time $\square$

iii. Worst-case clock-to-q delay $\square$

4. (20pts) SRAM. Four different 6T SRAM cells are designed and tested for correct operation. There are three input test signals: clk, $\overline{\mathrm{Rd}} / \mathrm{Wr}$, and data. A single read or write operation occurs in a single clock period. When $\overline{\mathrm{Rd}} / \mathrm{Wr}$ is high, a write operation should occur; when it is low, a read operation should occur. The data signal specifies the value to be written into the cell during a write operation. Below are the test signals and the BL and $\overline{\mathrm{BL}}$ waveforms for all four cells. For each cell, indicate whether the cell exhibits correct operation. If not, explain what is incorrect about the operation. Answer in the table that follows.


| Cell | Correct operation? If not, what is incorrect? |
| :--- | :--- |
| Bitcell 1 |  |
| Bitcell 2 |  |
| Bitcell 3 |  |
| Bitcell 4 |  |
5. (20pts) Short Answer Questions: Answer the questions briefly. Include diagrams and equations as needed. Be clear in your explanation and handwriting.

A Identify and describe two differences between SRAM and DRAM memory cells.

B What is a sense amplifier and why might you need it?

C What is a memory read upset and what is one way you can avoid it?

D Draw a schematic of a tristate buffer and draw its truth table.

