# Dynamic Read V<sub>MIN</sub> and Yield Estimation for Nanoscale SRAMs

Shourya Gupta, Graduate Student Member, IEEE, and Benton H. Calhoun, Fellow, IEEE

Abstract—The design and verification process for SRAMs can be long and tedious due to the very large multi-dimensional design-space and the large computational time of Monte-Carlo (MC) simulations. In this work, we propose a fast analytical model, which takes into account the supply-voltage, temperature, process-variations, and array-design variables to characterize the critical read path and the small signal differential sensing and then evaluates the read-access failure probability and the corresponding  $V_{\mbox{MIN}}$  and yield. With a low evaluation time of 15 seconds and <6% error, the model is used to evaluate ~160K different SRAM designs in 20 hours. The results of the dataset are used to analyze the effect of key design-variables on yield and performance, determine inter-variable correlation, and calculate feature importance. In particular, important statistical results about sense-amplifier-enable timing and dynamic behavior of frequency correlation are presented in this work. Thus, the method can be very useful for SRAM designers to quickly calculate design feasibility and analyze the design space to optimize power, area, and speed.

Index Terms—Bit-cell, failure probability, noise margin, read access, SRAM, subthreshold, $V_{MIN}$, yield.

# I. INTRODUCTION

Random variations in nanoscale Static Random Access Memories (SRAMs) pose a major challenge to achieving design robustness due to their large effect on bit-cell and array characteristics [1]–[3]. These variations include device threshold voltage ( $V_T$ ) mismatch due to random dopant fluctuations (RDF) and line edge roughness (LER) [4]. The device $V_T$ mismatch in deep sub-micrometer technologies is greatest in minimum-sized devices, which are often used in SRAMs [5]. The worst-case $V_T$ mismatch, combined with the increased sensitivity of current in the subthreshold region, greatly affects the minimum operating voltage ( $V_{MIN}$ ) and yield of the memory. Since the yield and the directly related $V_{MIN}$ parameter determine the extent of supply-voltage scaling, their accurate estimation is important for maximizing energy and performance savings. Monte-Carlo (MC) simulation is a well-known approach to determine the worst-case $V_{MIN}$ for a given memory. However, memory arrays can require millions of MC simulations, which is prohibitively expensive. Additionally, the SRAM design space is multidimensional, with variables that have interdependent trade-offs and varying levels of correlation with design feasibility. The SRAM read critical path is a prime example, since it involves the largest number of design variables. Therefore, it becomes challenging and time consuming to arrive at an optimized design solution. Many analytical and semi-analytical approaches have previously been proposed to determine the $V_{MIN}$ and yield of the memory, which we describe in Section II. While some of these approaches greatly reduce the simulation time over conventional MC simulations to help determine design feasibility more quickly, they do not help to resolve the design space or quantify the statistical importance of underlying design variables.

Manuscript received July 7, 2020; revised November 8, 2020 and December 8, 2020; accepted December 10, 2020. This work was funded in part by the Defense Advanced Research Projects Agency (DARPA) under agreement no. FA8650-18-2-7844. This article was recommended by Associate Editor T. Hanyu. (*Corresponding author: Shourya Gupta.*)

Shourya Gupta is with the Charles L. Brown Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22903 USA (e-mail: shourya.gupta94@gmail.com).

Benton H. Calhoun is with the Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA 22903 USA.

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSI.2020.3044836.

Digital Object Identifier 10.1109/TCSI.2020.3044836

In this work, we propose an analytical model that evaluates the read-access failure probability and the corresponding  $V_{MIN}$ . The model takes a large number of variables into account, including supply voltage, temperature, process variations, and array design parameters including bit-cell sizing, read current, bit-line capacitance (number of rows), wordline rise time (number of columns), sense amplifier strobe timing, bit-line leakage, and sense amplifier offset voltage. The method can complete a design evaluation within a few seconds with small error (<6%).

The following are the key contributions of this work:

1. A new analytical time-based relationship describing the average bit-line discharge rate and its corresponding distribution.
2. For the first time, an analytical description of the read-access operation using the modified Bessel function of the second kind, which is then approximated with an asymmetrical gaussian distribution with finitely limited skew and kurtosis.
3. These mathematical developments allow the model to compute the $V_{MIN}$ and failure probability very fast (~15 sec) and with low error (<6%).
4. The model is then used to create a dataset (with ~160K unique SRAM design points) in 20 hours, which would otherwise have taken >100 years to generate with MC simulations. The dataset is then used to observe the effect of SRAM design variables, quantify their importance, and determine inter-variable correlation.

An SRAM designer who is accustomed to using MC simulation (or a faster equivalent tool) would be able to supplement their design approach by using this model to analyze the timing distribution of various components in the SRAM read critical path, target the most impactful design variables to save design time and effort, and co-optimize iteratively for speed, area, and power using a yield-aware approach.

1549-8328 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

The remainder of the paper is organized as follows. Section II discusses the prior methods and approaches to determine yield, Section III discusses the mechanisms of the SRAM read-access operation, Section IV describes the proposed read-access yield model and corresponding analysis, and Section V summarizes and concludes the paper.

# II. SRAM YIELD DETERMINATION APPROACHES

The most straightforward approach to determine the yield of SRAMs is Monte-Carlo simulation. In the conventional Monte-Carlo approach, we estimate the yield for a metric of interest $y = f(x)$, where $x$ is the random variable. If $y_{limit}$ is the threshold for the performance metric, then the pass or fail function $I(x)$ is defined as

$$I(x) = I(y_i > y_{limit}) = \begin{cases} 1 & \text{if } y_i > y_{limit}, \\ 0 & \text{if } y_i \le y_{limit} \end{cases} \tag{1}$$

The probability of failure $P_f$ can then be defined as

$$P_f = P\left(y_i > y_{limit}\right) = \int_{-\infty}^{\infty} I(x)\, R(x)\, dx \tag{2}$$

where $R(x)$ is the probability density function of the random variable $x$ (e.g., $V_T$).

Since the distribution of $I(x)$ is generally unknown, a large number of samples of the random variable must be generated. To obtain an estimate with $(1-\varepsilon)100\%$ accuracy and $(1 - \delta)100\%$ confidence, the required number of samples $N(\varepsilon, \delta)$ is given by [6]

$$N\left(\varepsilon,\delta\right) \approx \frac{\log\left(\frac{1}{\delta}\right)}{\varepsilon^2 P_f} \tag{3}$$

To achieve >95% yield for a 10 Mbit memory, a failure probability of less than 1e-9 should be reached. To ascertain this probability with 95% confidence interval and 10% error, more than 1e11 samples would be required, which is not practically possible. In SRAM circuit design, where the performance metric depends on multiple variables, determining I(x) becomes even more challenging due to the large multi-dimensional design space. Therefore, there is a need to develop alternate methods to verify SRAM design yield.
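The sample count quoted above can be reproduced directly from Eq. (3). The following is a minimal sketch, assuming that the logarithm in Eq. (3) is the natural logarithm and that "95% confidence" and "10% error" correspond to $\delta = 0.05$ and $\varepsilon = 0.1$; the function name `mc_samples_required` is ours and is not part of the original work.

```python
import math

def mc_samples_required(p_fail, eps, delta):
    """Approximate Monte-Carlo sample count from Eq. (3).

    Assumption: log() in Eq. (3) is the natural logarithm.
    """
    return math.log(1.0 / delta) / (eps ** 2 * p_fail)

# Failure probability of 1e-9, 10% error, 95% confidence.
n_required = mc_samples_required(p_fail=1e-9, eps=0.10, delta=0.05)
print(f"Required MC samples: {n_required:.2e}")
```

Under these assumptions the estimate is roughly 3e11 samples, which is consistent with the "more than 1e11" figure in the text. Reading the logarithm as base 10 instead would lower the number but would still leave it above 1e11.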

Some of the alternate methods aim to reduce the simulation time for determining yield by analyzing the impact of process variations on the SRAM. The works in [7]–[11] use sensitivity analysis to estimate the SRAM failure probability and yield. This method simplifies the simulation significantly because only $(n + 1)$ partial derivatives with respect to $V_T$ need to be evaluated to estimate the sensitivities ($n$ is the number of independent variables). The partial derivatives for the six-transistor (6T) SRAM cell are shown in Fig. 1 for a bulk 32 nm process. They are used to calculate the mean $(\mu_{f(x)})$ and standard deviation $(\sigma_{f(x)})$ of the distribution as

$$\mu_{f(\mathbf{x})} \approx f(\mu_x) + \sum_{i=1}^n \left(\frac{1}{2} \frac{\partial^2 f(\mu_x)}{\partial x_i^2}\right) \sigma_{x_i}^2 \tag{4}$$

$$\sigma_{f(\mathbf{x})}^2 \approx \sum_{i=1}^n \left[ \left( \frac{\partial f(\mu_x)}{\partial x_i} \right) \sigma_{x_i} \right]^2 \tag{5}$$

Fig. 1. Partial derivatives with respect to $V_T$ using Sensitivity Analysis for a 6T SRAM cell.

Fig. 2. Importance Sampling using the Mean-Shift approach [12].

However, this method can be quite inaccurate, because the Taylor expansion can yield errors in approximations away from the nominal point. The sensitivity of the metric with respect to  $V_T$  can be highly non-linear in some processes and for some design points, leading to large inaccuracies. Additionally, applying this method to large circuits can be unwieldy, due to the large number of variables involved.
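The Taylor-expansion estimate of Eqs. (4) and (5), and its loss of accuracy for a non-linear metric, can be illustrated with a short numerical sketch. Everything specific here is an assumption made for illustration: `toy_metric` is a hypothetical exponentially $V_T$-dependent metric (not a circuit simulation), the variables are taken as independent, and the derivatives are obtained by central finite differences, which costs $2n + 1$ evaluations because the second derivatives are also needed.

```python
import numpy as np

def sensitivity_moments(f, mu, sigma, h=1e-4):
    """Mean and standard deviation of f(x) from Eqs. (4) and (5).

    Derivatives are taken by central differences around the nominal point mu.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    f0 = f(mu)
    mean, var = f0, 0.0
    for i in range(mu.size):
        step = np.zeros_like(mu)
        step[i] = h
        f_plus, f_minus = f(mu + step), f(mu - step)
        d1 = (f_plus - f_minus) / (2.0 * h)          # first derivative
        d2 = (f_plus - 2.0 * f0 + f_minus) / h ** 2  # second derivative
        mean += 0.5 * d2 * sigma[i] ** 2             # Eq. (4)
        var += (d1 * sigma[i]) ** 2                  # Eq. (5)
    return float(mean), float(np.sqrt(var))

def toy_metric(x):
    """Hypothetical metric: normalized current with exponential V_T dependence."""
    x = np.asarray(x, dtype=float)
    return np.exp(-(x[..., 0] + 0.5 * x[..., 1]) / 0.08)

mu = np.zeros(2)                    # V_T shifts around nominal (V)
sigma = np.array([0.03, 0.03])      # hypothetical 30 mV sigma per device

mean_sa, std_sa = sensitivity_moments(toy_metric, mu, sigma)

# Reference: brute-force Monte Carlo on the same toy metric.
rng = np.random.default_rng(1)
samples = toy_metric(rng.normal(mu, sigma, size=(200_000, 2)))
mean_mc, std_mc = samples.mean(), samples.std()

print(f"sensitivity analysis: mean={mean_sa:.4f}, std={std_sa:.4f}")
print(f"Monte Carlo:          mean={mean_mc:.4f}, std={std_mc:.4f}")
```

For this toy metric the first-order variance of Eq. (5) underestimates the Monte-Carlo standard deviation, which is the kind of error away from the nominal point that the paragraph above describes.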

Another method that helps reduce the simulation time is Importance Sampling (IS). In one variation of this method, known as Mean-Shift IS, samples are generated away from the mean, where failures are much more likely to occur, rather than around the mean of the distribution, where usually no failures occur [12] (shown in Fig. 2). In Mean-Shift IS, the center of the original distribution $p(x)$, which has zero mean and standard deviation $\sigma_j$, is shifted by a shift-vector $s = (s_1, \ldots, s_M)$, giving the sampling distribution

$$g(x) = \prod_{j=1}^{M} \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\left(-\frac{(x_j - s_j)^2}{2\sigma_j^2}\right) \tag{6}$$


GUPTA AND CALHOUN: DYNAMIC READ VMIN AND YIELD ESTIMATION FOR NANOSCALE SRAMS



Fig. 3. Optimization problem in 2D-space illustrating the Most Probable Failure Point [14].

Then the probability estimated using IS becomes

$$P_{IS} = \frac{1}{N} \sum_{i=1}^{N} I(x_i)\, w(x_i), \quad x_i \sim g(x) \tag{7}$$

where the weight function $w(x)$ is represented as

$$w(x_i) = \frac{p(x_i)}{g(x_i)} = \exp\left(-\sum_{j=1}^{M} \frac{s_j (2x_j - s_j)}{2\sigma_j^2}\right) \tag{8}$$
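Eqs. (6)-(8) translate into only a few lines of code. The sketch below is an illustration under stated assumptions, not the implementation of [12]: the failure criterion is a hypothetical one-dimensional threshold at $4\sigma$, for which the exact tail probability is known, and the shift-vector is simply placed at that threshold. The helper name `mean_shift_is` is ours.

```python
import math
import numpy as np

def mean_shift_is(indicator, sigma, shift, n_samples, rng):
    """Mean-shift importance-sampling estimate of a failure probability.

    Samples are drawn from g(x) of Eq. (6) and re-weighted with w(x) of Eq. (8).
    """
    sigma = np.asarray(sigma, dtype=float)
    shift = np.asarray(shift, dtype=float)
    x = rng.normal(loc=shift, scale=sigma, size=(n_samples, sigma.size))
    log_w = -np.sum(shift * (2.0 * x - shift) / (2.0 * sigma ** 2), axis=1)
    return float(np.mean(indicator(x) * np.exp(log_w)))  # Eq. (7)

def fails(x):
    """Hypothetical failure criterion: the single variable exceeds 4 sigma."""
    return x[:, 0] > 4.0

rng = np.random.default_rng(0)
p_is = mean_shift_is(fails, sigma=[1.0], shift=[4.0], n_samples=100_000, rng=rng)
p_exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # exact Gaussian tail beyond 4 sigma

print(f"IS estimate: {p_is:.3e}, exact: {p_exact:.3e}")
```

With plain Monte Carlo, 100,000 samples would contain only a handful of failures at this probability level. The example works well because the failure region is known in advance, so the shift-vector can be chosen by construction.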

The main disadvantage of this approach is the ambiguity in determining the shift-vector, because it is difficult to estimate where the failure region might lie. Additionally, the search region might be too wide and therefore difficult to explore with a small number of samples. In another IS method [13], a mixture of distributions $g_{\lambda}(x)$ is used to model the shifted density function,

$$g_{\lambda}(x) = \lambda_1 R(x) + \lambda_2 U(x) + (1 - \lambda_1 - \lambda_2) R(x - s_j) \tag{9}$$

where $0 < \lambda_1 + \lambda_2 < 1$. This method enables efficient sampling without leaving any non-sampled regions in the event of outliers. Another IS approach improves over mean-shift IS by using norm minimization to reduce the variance [6]. Still, the overall efficiency of all Importance Sampling methods depends on the shift-vector, because the modified distribution function must be sampled where the largest number of failure points is likely to occur. This makes it hard to implement IS-based methods to assess the yield of SRAMs.

Most Probable Failure Point (MPFP) is another method that is used to evaluate the yield of SRAMs [14]. In this method, the failure probability determination is treated as a process of optimization as shown in Fig. 3. It aims to find the worst-case variations which maximize the failure probability  $P_{fail}$ .

$$P_{fail} = \prod_{i=1}^{6} P\left(\Delta V_{t_i} > k_i \sigma_{V_{t_i}}\right)$$
(10)

where $k_i$ represents the threshold voltage variation of the $i^{th}$ transistor of the SRAM bit-cell, expressed in units of its standard deviation, at the most probable failure point. In this approach, the search region is divided into a six-dimensional space (assuming a 6T SRAM bit-cell) with sixty-four regions. The search is then performed in only those regions where failures are more likely to occur. Although this method is applicable to a large number of cases, even where the failure region might not be known, this brute-force approach can quickly become unwieldy as the number of variables increases.

Fig. 4. (a) Illustration of the Statistical Blockade (SB), showing the body and tail in the parameter space. The region inside the body, marked by the classifier solid line, is blocked [15]. (b) Steps to perform SB analysis [15].

Another method that is used to quickly estimate the yield of SRAMs is Statistical Blockade. In this approach, an initial sampling using MC or other sampling methods is performed to build a classifier for the metric of interest, as shown in Fig. 4 [15]. Only points that are beyond the classifier threshold are simulated, and all other points are blocked. This allows a large speed-up of simulation by only simulating points that are more likely to fail. In an improved version called the recursive statistical blockade, the search starts with a lower threshold classifier, which is then used to estimate a higher threshold classifier multiple times until the target threshold classifier is reached [16]. This method reduces the simulation time for larger memories, where the regular statistical blockade can become unwieldy. Although the statistical blockade method enables a large speed-up over conventional MC simulations, it can still require up to sixty hours to determine the yield for a given design [10].

Some methods reduce simulation time by modelling the behavior of the SRAM [17], [18]. However, these methods can still require up to several hundred thousand MC simulations for pre-characterization and evaluation. Another method [4] also models the SRAM behavior, but it has been shown in [7] that the analytical method presented underestimates the failure probability.

# III. SRAM READ-ACCESS

Read-access time is defined as the time required to generate a given potential difference between the two bit-lines (e.g., 100 mV). If generating this voltage difference takes longer than the given word-line pulse width, the sense amplifier (SA) might not be able to evaluate the correct data, resulting in a read-access failure. The conventional method for determining the read-access failure probability $P_A$ can be expressed as [9], [11], [19]

$$P_A = Prob\left(T_A > T^A_{WL}\right) \tag{11}$$

where,  $T_A$  is the read-access time and  $T_{WL}^A$  is the word-line pulse width.  $T_A$  can be evaluated using

$$T_A = \int_{V_{DD}}^{V_{DD} - \Delta V_{BL}} \frac{C_{BL} dV_{BL}}{I_{BL}}$$
(12)

where,  $V_{BL}$  is the bit-line voltage,  $C_{BL}$  is the bit-line capacitance, and  $I_{BL}$  is the read access current. It has previously been established that the read-access time  $T_A$  does not follow a normal distribution, but  $1/T_A$  does [9]. Therefore, the read access failure probability can be expressed as [9]

$$P_A = Prob\left(\frac{1}{T_A} < \frac{1}{T_{WL}^A}\right) = \Phi\left(\frac{\frac{1}{T_{WL}^A} - \left(\frac{1}{T_A}\right)_{nom}}{\sigma_A}\right) \tag{13}$$

Here, $\Phi$ represents the standard normal cumulative distribution function, $(1/T_A)_{nom}$ and $\sigma_A$ are the nominal value and standard deviation of $1/T_A$, and $T_{WL}^A$ is the word-line pulse width. However, the conventional approach fails to consider many of the failure mechanisms that affect the read-access operation. We briefly discuss these mechanisms below.
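As a worked illustration of the conventional method, the sketch below evaluates Eq. (12) for the special case of a constant bit-line current and then computes $Prob(1/T_A < 1/T^A_{WL})$ for a normally distributed $1/T_A$. All numerical values (bit-line capacitance, current, spread of $1/T_A$, word-line pulse width) are hypothetical and chosen only to make the arithmetic easy to follow; the function names are ours.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def access_time_constant_current(c_bl, delta_v_bl, i_bl):
    """Magnitude of Eq. (12) when the bit-line current is assumed constant."""
    return c_bl * delta_v_bl / i_bl

def conventional_access_fail_prob(inv_t_a_nom, sigma_a, t_wl):
    """Prob(1/T_A < 1/T_WL) for 1/T_A ~ N(inv_t_a_nom, sigma_a^2)."""
    return normal_cdf((1.0 / t_wl - inv_t_a_nom) / sigma_a)

# Hypothetical values: 50 fF bit-line, 100 mV differential, 10 uA read current.
t_a_nom = access_time_constant_current(c_bl=50e-15, delta_v_bl=0.1, i_bl=10e-6)
inv_t_a_nom = 1.0 / t_a_nom
sigma_a = 0.10 * inv_t_a_nom          # assumed 10% spread in 1/T_A
p_a = conventional_access_fail_prob(inv_t_a_nom, sigma_a, t_wl=1e-9)

print(f"T_A (nominal) = {t_a_nom * 1e9:.2f} ns, P_A = {p_a:.2e}")
```

In this example the nominal access time is 0.5 ns against a 1 ns word-line pulse, so the failure threshold sits five standard deviations below the nominal value of $1/T_A$.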

The method described above considers a pre-defined bitline differential voltage threshold point and ignores the sense amplifier offset distribution. Therefore, this approach considers an arbitrary worst-case point, which leads to overdesign and loss of performance. It also does not consider the negative effect of the bit-line leakage current which reduces the effective read current for a bit-cell. For a column with N bit-cells, the effective read current  $I_{eff}$  is

$$I_{eff} = I_{read} - \sum_{i=1}^{N-1} I_{off-PG_i}$$
(14)

$$I_{eff} = I_{read} - (N-1) \,\mu_{I_{off-PG}} \tag{15}$$

$$I_{eff} \approx I_{read} - (N-1) I_{off-PG} \left(1 + \frac{ln^2 (10)}{2} \left(\frac{\sigma_{V_{TH}}}{S}\right)^2\right)$$
(16)

where $I_{read}$ is the bit-cell read current, $I_{off-PG}$ is the access transistor leakage current, $\sigma_{V_{TH}}$ is the standard deviation of the threshold voltage, and $S$ is the subthreshold slope. The effect of bit-line leakage is especially great in the near-threshold and sub-threshold regions of operation, where the $I_{on/off}$ ratio is severely degraded.

Fig. 5. (a) Schematic of the SRAM read-accessed column (b) Timing diagram for the SRAM read-access operation [18].

Fig. 6. Comparison between the failure probability evaluated using Monte-Carlo simulations and the conventional access method highlights the large error.
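The leakage correction of Eqs. (14)-(16) can be evaluated with a few lines of code. The sketch below implements Eq. (16) as written; the numerical values (read current, leakage per cell, column height, $\sigma_{V_{TH}}$, and subthreshold slope) are hypothetical and only illustrate the order of magnitude of the effect.

```python
import math

def effective_read_current(i_read, i_off_pg, n_cells, sigma_vth, s):
    """Effective read current from Eq. (16).

    sigma_vth and s must share units (e.g. V and V/decade).
    """
    spread = 1.0 + 0.5 * (math.log(10.0) * sigma_vth / s) ** 2
    return i_read - (n_cells - 1) * i_off_pg * spread

# Hypothetical near-threshold example: 1 uA read current, 1 nA leakage per
# unaccessed cell, 256 cells per column, 30 mV sigma, 90 mV/decade slope.
i_eff = effective_read_current(
    i_read=1e-6, i_off_pg=1e-9, n_cells=256, sigma_vth=0.03, s=0.09
)
i_eff_no_variation = 1e-6 - 255 * 1e-9

print(f"I_eff with V_TH variation:    {i_eff * 1e9:.1f} nA")
print(f"I_eff without V_TH variation: {i_eff_no_variation * 1e9:.1f} nA")
```

In this example, leakage removes about a third of the read current, and the threshold-voltage spread raises the mean leakage by roughly 29% relative to its nominal value.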

The above method also does not consider the sensing window, which is determined by the time elapsed between the word-line enable and sense amplifier strobe enable. This window of time determines the total time available to develop a differential voltage on the bit-lines, as opposed to the word-line pulse width indicated in the method above. The read-accessed column and the sensing window are shown in Fig. 5. Small changes in the sensing window can greatly affect the read-access performance. The timing variations in both the word-line and sense amplifier strobe signal can greatly alter the sensing window, which is why it becomes imperative to consider sensing window variations when assessing the read-access failure probability. Therefore, the above method fails to capture many of the read-access failure mechanisms, thereby resulting in an underestimation of the failure probability. This results in large errors as shown in Fig. 6.

# IV. PROPOSED READ-ACCESS YIELD MODEL

## A. Read-Access Model Description and Analysis

The variation in threshold voltage due to Random Dopant Fluctuations ( $\sigma_{VT,RDF}$ ), transistor length variations ( $\sigma_{VT,L}$ ), Random Telegraph Noise ( $\sigma_{VT,RTN}$ ), and other sources of variability ( $\sigma_{VT,Other}$ ), which affect the stability and performance of the cell, can be modelled as given in [20]:

$$\sigma_{VT} = \sqrt{\sigma_{VT,RDF}^2 + \sigma_{VT,L}^2 + \sigma_{VT,RTN}^2 + \sigma_{VT,Other}^2} \tag{17}$$

Fig. 7. Sources of variation that affect the read-access operation [18].

For a given technology with given minimum transistor sizing ( $W_{min}$ and $L_{min}$ ), the deviation in threshold voltage ( $\sigma_{V_{t_i}}$ ) for any transistor can be calculated by using Pelgrom's Law [21]. However, advanced technologies exhibit deviation from Pelgrom's Law. Therefore, to accurately model $V_T$ variations, we use the modified Pelgrom's Law [22], [23], which is given as

$$\sigma_{V_{t_i}} = \sigma_{VT} \times \sqrt{\frac{L_{min}W_{min}}{(W)^{\alpha} (L)^{\beta}}}$$
(18)

where  $\alpha$  and  $\beta$  are technology constants.
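Eqs. (17) and (18) can be combined into a small helper, sketched below. The component sigmas and device sizes are hypothetical, and the defaults $\alpha = \beta = 1$ are our choice: they reduce Eq. (18) to the classical Pelgrom area scaling, whereas a real technology would supply fitted constants. Widths and lengths are assumed to be expressed in the same units as $W_{min}$ and $L_{min}$.

```python
import math

def sigma_vt_total(rdf, length, rtn, other):
    """Root-sum-square combination of V_T variation sources, Eq. (17)."""
    return math.sqrt(rdf ** 2 + length ** 2 + rtn ** 2 + other ** 2)

def sigma_vt_device(sigma_vt, w, l, w_min, l_min, alpha=1.0, beta=1.0):
    """Per-device V_T sigma from the modified Pelgrom's Law, Eq. (18).

    alpha = beta = 1 recovers the classical 1/sqrt(W*L) dependence.
    """
    return sigma_vt * math.sqrt((l_min * w_min) / (w ** alpha * l ** beta))

# Hypothetical component sigmas in volts.
sigma_min = sigma_vt_total(rdf=30e-3, length=10e-3, rtn=5e-3, other=0.0)

# Hypothetical sizes in nm: minimum device and a device with 2x width and length.
sigma_at_min = sigma_vt_device(sigma_min, w=60.0, l=30.0, w_min=60.0, l_min=30.0)
sigma_upsized = sigma_vt_device(sigma_min, w=120.0, l=60.0, w_min=60.0, l_min=30.0)

print(f"sigma_VT (minimum device): {sigma_at_min * 1e3:.2f} mV")
print(f"sigma_VT (2W x 2L device): {sigma_upsized * 1e3:.2f} mV")
```

With the default exponents, quadrupling the gate area halves the threshold-voltage sigma, which is the trade-off between bit-cell area and variability that the model captures.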

The sources of variation which affect the read access operation are shown in Fig. 7. Process variations in the logic circuitry path of the word-line signal cause deviations in timing, which change the time after which the bit-line starts to discharge, thereby affecting read-access performance. The word-line logic path timing variations can be analyzed by modelling it as a chain of inverters. Let  $\mu_{t_d}$  and  $\sigma_{t_d}$  be the mean and standard deviation, respectively, for the delay of a minimum sized inverter. For a chain of inverters, the standard deviation of delay grows as the square root of the number of stages [24]. If the word-line logic path is modelled as a chain of q inverters, then the distribution for the delay can be expressed as  $Z_{WL} \sim \mathcal{N}(\mu_{t_{WL}}, q\sigma_{t_d}^2)$ . This distribution can be scaled accordingly with change in inverter sizing [22]. Similarly, the Sense-Amplifier strobe signal (SAE) can be modelled as a chain of r inverters. Then the distribution for it can be expressed as  $Z_{SAE} \sim \mathcal{N}(\mu_{t_{SAE}}, r\sigma_{t_d}^2)$ . Therefore, the amount of time elapsed between word-line enable and SAE enable can be modelled as

$$Z_t \sim \mathcal{N}\left(\mu_t, \sigma_t^2\right) \tag{19}$$

where

$$\mu_{t} = \mu_{t_{SAE}} - \mu_{t_{WL}} = r(\mu_{t_{d}}) - q(\mu_{t_{d}}) = (r - q) \mu_{t_{d}} \tag{20}$$

$$\sigma_{t}^{2} = \sigma_{t_{WL}}^{2} + \sigma_{t_{SAE}}^{2} = \left[\sqrt{q} (\sigma_{t_{d}})\right]^{2} + \left[\sqrt{r} (\sigma_{t_{d}})\right]^{2} = (q + r) \sigma_{t_{d}}^{2} \tag{21}$$



Fig. 8. (a) Sense-Amp-Enable distribution with varying inverter chain length (b) Read-access failure probability (produced using the model) as a function of the number of SAE inverters across supply voltage.

Alternatively, for a single inverter chain of length $r$ that is tapped at different locations to generate the SAE (at the $r^{th}$ inverter) and word-line (at the $q^{th}$ inverter; $q < r$) timing signals, the distribution can be modelled as

$$\mu_{t} = \mu_{t_{SAE}} - \mu_{t_{WL}} = r(\mu_{t_d}) - q(\mu_{t_d}) = (r - q)\mu_{t_d} \tag{22}$$

$$\sigma_t^2 = (r - q)\sigma_{t_d}^2 \tag{23}$$

The above expressions indicate that the mean of the elapsed time depends on the difference between the number of inverters in both paths, and the standard deviation depends on the number of inverters. This means that the uncertainty in timing can be quite large, thereby worsening the read-access yield. This effect is shown in Fig. 8 (a), where increasing the inverter chain length results in greater deviation, which dampens its intended positive effect. The read access failure probability as a function of length of the sense-amp-enable is shown in Fig. 8 (b). As seen in Fig. 8 (b), the failure probability decreases slowly with increase in inverter chain length, indicating that sense amplifier strobe signal timing does not have a very strong impact on yield. A large change in the inverter chain length is therefore required to achieve a given yield threshold. The analysis shown in the figure can thus be very useful to precisely ascertain the sense amplifier strobe signal timing to meet specific yield targets corresponding to various memory sizes.
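The two sensing-window models, Eqs. (20)-(21) for separate chains and Eqs. (22)-(23) for a single tapped chain, can be compared numerically. The sketch below evaluates both closed forms and checks them against a small Monte-Carlo experiment in which every inverter delay is an independent gaussian sample. The stage counts and the per-stage delay statistics are hypothetical, and the function names are ours.

```python
import math
import numpy as np

def window_separate_chains(q, r, mu_td, sigma_td):
    """Sensing-window mean and sigma for two independent chains, Eqs. (20)-(21)."""
    return (r - q) * mu_td, math.sqrt(q + r) * sigma_td

def window_tapped_chain(q, r, mu_td, sigma_td):
    """Sensing-window mean and sigma for one chain tapped at q and r, Eqs. (22)-(23)."""
    return (r - q) * mu_td, math.sqrt(r - q) * sigma_td

q, r = 10, 30                      # hypothetical word-line and SAE stage counts
mu_td, sigma_td = 20e-12, 2e-12    # hypothetical per-inverter delay statistics (s)

mu_sep, sigma_sep = window_separate_chains(q, r, mu_td, sigma_td)
mu_tap, sigma_tap = window_tapped_chain(q, r, mu_td, sigma_td)

# Monte-Carlo check with independent gaussian stage delays.
rng = np.random.default_rng(2)
n = 50_000
wl = rng.normal(mu_td, sigma_td, size=(n, q)).sum(axis=1)
sae = rng.normal(mu_td, sigma_td, size=(n, r)).sum(axis=1)
mc_separate = sae - wl             # the two paths share no stages
chain = rng.normal(mu_td, sigma_td, size=(n, r))
mc_tapped = chain[:, q:].sum(axis=1)   # only stages after the word-line tap contribute

print(f"separate chains: sigma = {sigma_sep * 1e12:.2f} ps (MC {mc_separate.std() * 1e12:.2f} ps)")
print(f"tapped chain:    sigma = {sigma_tap * 1e12:.2f} ps (MC {mc_tapped.std() * 1e12:.2f} ps)")
```

Both arrangements give the same mean window, but the tapped chain has a smaller spread because the stages shared by the word-line and SAE paths cancel out of the difference.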

Fig. 9. (a) Variations in sampling (SAE timing) distribution of the mean with increase in number of samples (*ON* bit-cells) (b) Variations in SAE signal timing using various techniques of generation (c) Resultant distribution of bit-line discharge and SAE for various techniques.

Another method that can be used to generate the SAE signal is the replica bit-line [25]. In this technique, the sense amplifier enable signal is generated using replica bit-line capacitance and pre-tied bit-cells. The number of bit-cells on the replica bit-line defines its discharge time and consequently the SAE timing. The timing for the replica bit-line can be modelled using the principles of the sampling distribution as

$$\mu_t = \frac{\mu_{t_{cell}}}{N} \tag{24}$$

$$\sigma_t = \frac{\sigma_{t_{cell}}}{\sqrt{N}} \tag{25}$$

where $t_{cell}$ is the replica bit-line discharge time for a single ON bit-cell on the replica bit-line and $N$ is the number of bit-cells that are turned on. These relationships show exactly the opposite trend in variability compared with inverter-chain-based techniques, in which the variability increases with the number of elements. Another interesting observation about the replica bit-line technique is that its resultant timing distribution will always tend towards a gaussian distribution, irrespective of the region of operation, due to the Central Limit Theorem. This not only makes it easier to model, but also suppresses the far-off outliers seen in heavy, long-tailed distributions.

The replica bit-line can be constructed either using a short fractional bit-line with very few ON bit-cells or using a full-array-length bit-line with a larger number of ON bit-cells. Fig. 9 (a) shows the effect on the variations of increasing the number of ON bit-cells in a replica bit-line. As the number of ON cells per bit-line increases, the variations in the timing decrease. As such, it would be desirable to have a long replica bit-line with a large capacitive load and a large number of ON bit-cells to minimize the variations. However, this will also increase the area and power. Fig. 9 (b) shows the effect of using a long bit-line and a large number of ON bit-cells on the SAE timing. As seen in Fig. 9 (b), the replica bit-line technique nearly halved the variations in the enable signal in comparison to inverter-chain-based methods, suggesting its viability in timing-sensitive circuits. However, despite this large improvement, the resultant distribution of bit-line discharge and SAE sees only a modest improvement, because the variations of the bit-line discharge of the accessed bit-line are unchanged, as seen in Fig. 9 (c). The overall improvement in timing variations using the replica bit-line technique is then observed to be about 15%.

The read performance depends on the variations in bit-cell read current ( $I_{read}$ ). Since the bit-line discharge rate ( $V_r$ ) depends on the read current, the statistical distribution of rate of bit-line discharge will follow the distribution of read current [19]. This can be expressed as

$$\left. \frac{\sigma}{\mu} \right|_{V_r} = \left. \frac{\sigma}{\mu} \right|_{I_{read}} \tag{26}$$

The bit-line discharge rate can be defined as the change in bit-line voltage per unit time. Its distribution is calculated based on the supply voltage, array design variables, temperature, and process variations. Although the bit-line discharge rate is nearly constant at the beginning of the read operation, it quickly falls as the bit-line voltage reduces further. The average bit-line discharge rate ( $\alpha_t$ ) can then be approximated by the following derived relationship.

$$\alpha_{t} = \int_{0}^{\mu_{t}} \frac{V_{DD}}{\mu_{t}} \left[ \frac{d}{dx} \left( 1 - \frac{tan^{-1} \left( \left( \frac{\Delta V_{BL}}{\Delta t} \right) x \right)}{\lim_{x \to \infty} tan^{-1} \left( \left( \frac{\Delta V_{BL}}{\Delta t} \right) x \right)} \right) \right] (dx)$$
(27)

Here, $(\Delta V_{BL}/\Delta t)$ represents the initial constant slope of the bit-line discharge voltage.

TABLE I SUMMARY OF EVALUATION

| S. No. | Metric | Capacity | P<sub>FAIL</sub> for >95% yield | Method | Case I (r = 20) | Case II (r = 30) | Case III (r = 40) |
|--------|--------|----------|---------------------------------|--------|-----------------|------------------|-------------------|
| 1 | V<sub>MIN</sub> (mV) | 10Kbit | ~1e-6 | MC | 855 | 809 | 776 |
|   |   |   |   | Prop. Method | 851 | 805 | 771 |
|   |   | 100Kbit | ~1e-7 | MC | 881 | 836 | 803 |
|   |   |   |   | Prop. Method | 876 | 832 | 795 |
|   |   | 10Mbit | ~1e-9 | MC | 922 | 882 | 846 |
|   |   |   |   | Prop. Method | 916 | 874 | 838 |
| 2 | Percentage Error | - | - | Min | 0.47 | 0.48 | 0.64 |
|   |   |   |   | Max | 0.65 | 0.91 | 1.00 |
|   |   |   |   | Mean | 0.56 | 0.63 | 0.87 |
| 3 | Time to Evaluate | - | - | Prop. Method (sec) | 14.68 | 14.58 | 14.61 |
|   |   |   |   | MC (hours) (~1.3M runs) | 72.3 | 72.5 | 72.6 |

r: Number of inverters in the Sense-Amplifier Enable strobe signal

Consequently, we can derive the approximated distribution of the bit-line discharge rate $Z_{V_r} \sim \mathcal{N}(\mu_{V_r}, \sigma_{V_r}^2)$ as

$$\mu_{V_r} = |\alpha_t| \left( 1 - \frac{(N-1)\, \mu_{I_{off-PG}}}{\mu_{I_{read}}} \right) \tag{28}$$

$$\sigma_{V_r} = \left| V_{DD} \left. \frac{d}{dx} \left( 1 - \frac{\tan^{-1} \left( \left( \frac{\Delta V_{BL}}{\Delta t} \right) x \right)}{\lim_{x \to \infty} \tan^{-1} \left( \left( \frac{\Delta V_{BL}}{\Delta t} \right) x \right)} \right) \right|_{x=\mu_t} \right| \times \left( \frac{\sigma_{I_{read}}}{\mu_{I_{read}}} \right) \tag{29}$$

where the read current  $Z_{I_{read}} \sim \mathcal{N}\left(\mu_{I_{read}}, \sigma_{I_{read}}^2\right)$  follows

$$\mu_{I_{read}} = I_{read} + \sum_{i=1}^{n} \left( \frac{1}{2} \frac{\partial^2 I_{read}}{\partial V_{t_i}^2} \right) \sigma_{V_{t_i}}^2 + \sum_{k=1}^{n} \sum_{\substack{i=1\\i \neq k}}^{n} \frac{\partial^2 I_{read}}{\partial V_{t_i} \partial V_{t_k}} r(i,k) \sigma_{V_{t_i}} \sigma_{V_{t_k}}$$
(30)

$$\sigma_{I_{read}}^{2} = \sum_{i=1}^{n} \left[ \left( \frac{\partial I_{read}}{\partial V_{t_{i}}} \right) \sigma_{V_{t_{i}}} \right]^{2} + 2 \sum_{k=1}^{n} \sum_{\substack{i=1\\i \neq k}}^{n} \left( \frac{\partial I_{read}}{\partial V_{t_{i}}} \right) \left( \frac{\partial I_{read}}{\partial V_{t_{k}}} \right) r(i,k) \sigma_{V_{t_{i}}} \sigma_{V_{t_{k}}}$$
(31)

Here, r(i, k) is the correlation coefficient and  $V_{t_i}$  represents the threshold voltage of the  $i^{th}$  transistor.
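Eqs. (27)-(29) can be evaluated in closed form, because the limit in the denominator equals $\pi/2$ for a positive slope and the integrand of Eq. (27) is an exact derivative. The sketch below follows the equations as written and checks the closed form of Eq. (27) against a direct numerical integration. The supply voltage, slope, sensing window, and current statistics are hypothetical values in consistent units (V, V/ns, ns); the read-current mean and sigma are passed in directly rather than derived from Eqs. (30) and (31).

```python
import math

def shape_derivative(slope, x):
    """d/dx of [1 - atan(slope*x) / (pi/2)], the bracketed term in Eqs. (27) and (29)."""
    return -(2.0 / math.pi) * slope / (1.0 + (slope * x) ** 2)

def avg_discharge_rate(v_dd, slope, mu_t):
    """Closed form of Eq. (27), valid for slope > 0."""
    return -(2.0 * v_dd / (math.pi * mu_t)) * math.atan(slope * mu_t)

def discharge_rate_distribution(v_dd, slope, mu_t, n_cells,
                                mu_i_off, mu_i_read, sigma_i_read):
    """Mean and sigma of the bit-line discharge rate, Eqs. (28) and (29)."""
    alpha_t = avg_discharge_rate(v_dd, slope, mu_t)
    mu_vr = abs(alpha_t) * (1.0 - (n_cells - 1) * mu_i_off / mu_i_read)
    sigma_vr = abs(v_dd * shape_derivative(slope, mu_t)) * (sigma_i_read / mu_i_read)
    return mu_vr, sigma_vr

# Hypothetical operating point: 0.8 V supply, 0.5 V/ns slope, 0.4 ns window.
v_dd, slope, mu_t = 0.8, 0.5, 0.4
alpha_closed = avg_discharge_rate(v_dd, slope, mu_t)

# Midpoint-rule integration of Eq. (27) as a cross-check of the closed form.
steps = 10_000
dx = mu_t / steps
alpha_numeric = sum(
    (v_dd / mu_t) * shape_derivative(slope, (k + 0.5) * dx) * dx
    for k in range(steps)
)

mu_vr, sigma_vr = discharge_rate_distribution(
    v_dd, slope, mu_t, n_cells=256,
    mu_i_off=1e-9, mu_i_read=1e-6, sigma_i_read=1e-7,
)
print(f"alpha_t = {alpha_closed:.4f} V/ns (numeric {alpha_numeric:.4f} V/ns)")
print(f"mu_Vr = {mu_vr:.4f} V/ns, sigma_Vr = {sigma_vr:.4f} V/ns")
```

Because the whole evaluation reduces to a handful of elementary function calls, this part of the model adds negligible cost to a design-point evaluation.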

The read-access failure probability is the probability of the voltage differential developed between the bit-lines being less than the Sense Amplifier offset ( $V_{OS}$ ). This can be expressed as

$$P_{FAIL} = Prob \{ V_{SA_{in}} < V_{OS} \} = Prob \{ (V_r \cdot t) < V_{OS} \} = Prob \{ V'_r < V_{OS} \} \tag{32}$$

Here, both bit-line discharge rate  $V_r$  and time t have been assumed to have a gaussian distribution. The distribution for the product of two gaussian variables with zero mean can be expressed as [26]

$$P_{XY}(u) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{e^{-\frac{x^2}{2\sigma_x^2}}}{\sigma_x \sqrt{2\pi}} \frac{e^{-\frac{y^2}{2\sigma_y^2}}}{\sigma_y \sqrt{2\pi}} \delta(xy - u) \, dx \, dy \tag{33}$$

$$P_{XY}(u) = \frac{K_0\left(\frac{|u|}{\sigma_x \sigma_y}\right)}{\pi \sigma_x \sigma_y} \tag{34}$$

where  $\delta(x)$  is a delta function and  $K_n(Z)$  is the modified Bessel function of the second kind. Similarly, for two variables with non-zero mean, the distribution can be expressed as

$$P_{XY}(z) = \frac{1}{\pi} K_0(\bar{z})$$
(35)

where

$$\bar{z} = \left(\frac{x - \mu_x}{\sigma_x}\right) \left(\frac{y - \mu_y}{\sigma_y}\right) \tag{36}$$

To solve these equations, we calculate the first two moments of Q = XY (in the context of the SRAM, Q represents  $V_{SA_{in}}$ ), and then find a distribution whose parameters match the moments of Q. We shall derive the moment-generating function for Q and show that Q can be approximated by a normal curve. We previously showed (in Fig. 10) that this distribution is nearly normal using data based on SRAM functional behavior. Here, we derive this approximation mathematically and quantify the limits of these assumptions. The moment-generating function for Q = XY can be written as

$$M_{Q}(t) = \frac{1}{2\pi\sigma_{x}\sigma_{y}} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \exp\left( -\frac{(x-\mu_{x})^{2}}{2\sigma_{x}^{2}} - \frac{(y-\mu_{y})^{2}}{2\sigma_{y}^{2}} \right) e^{xyt} \, dx \, dy \quad (37)$$
$$M_{Q}(t) = \frac{\exp\left\{ \frac{t\mu_{x}\mu_{y} + \frac{1}{2} \left( \mu_{y}^{2}\sigma_{x}^{2} + \mu_{x}^{2}\sigma_{y}^{2} \right) t^{2}}{1 - t^{2}\sigma_{x}^{2}\sigma_{y}^{2}} \right\}}{\sqrt{1 - t^{2}\sigma_{x}^{2}\sigma_{y}^{2}}} \quad (38)$$



Fig. 10. Interaction of bit-line discharge and sense-amplifier-enable results in a near-Gaussian distribution across the majority of the small signal sensing operation. The same analysis also shows that a change in the sense amplifier strobe signal does not have a very strong effect on the resulting distribution. This trend will be similar irrespective of the region of operation, since the resultant distribution will always spread in both the X and Y directions.

Defining the variables  $\delta_x = \frac{\mu_x}{\sigma_x}$  and  $\delta_y = \frac{\mu_y}{\sigma_y}$ , the moment-generating function can be rewritten as

$$M_{Q}(t) = \frac{\exp\left\{\frac{t\mu_{x}\mu_{y}\left(t\delta_{y}^{2}\mu_{x}\mu_{y} + \delta_{x}^{2}\left(2\delta_{y}^{2} + t\mu_{x}\mu_{y}\right)\right)}{2\delta_{x}^{2}\delta_{y}^{2} - 2t^{2}\mu_{x}^{2}\mu_{y}^{2}}\right\}}{\sqrt{1 - \frac{t^{2}\mu_{x}^{2}\mu_{y}^{2}}{\delta_{x}^{2}\delta_{y}^{2}}}}$$
(39)

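Although the product of two normal variables is not itself normally distributed, its moment-generating function approaches that of a normal distribution in the limit of large  $\delta_x$  and  $\delta_y$  [27]. As these ratios increase, the term  $\frac{t^{2}\mu_{x}^{2}\mu_{y}^{2}}{\delta_{x}^{2}\delta_{y}^{2}} = t^{2}\sigma_{x}^{2}\sigma_{y}^{2}$  becomes negligible relative to the other terms, and the moment-generating function tends to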

$$M_{Q}(t) = exp\left\{t\mu_{x}\mu_{y} + \frac{1}{2}\left(\mu_{x}^{2}\sigma_{y}^{2} + \mu_{y}^{2}\sigma_{x}^{2}\right)t^{2}\right\}$$
(40)

The corresponding first four moments can then be written as


$$E(Q) = \mu_{V'_r} = \mu_{V_r} \mu_t$$
 (41)

$$V(Q) = \sigma_{V_r'}^2 = \mu_{V_r}^2 \sigma_t^2 + \mu_t^2 \sigma_{V_r}^2 + \sigma_{V_r}^2 \sigma_t^2 = \left(1 + \delta_{V_r}^2 + \delta_t^2\right) \sigma_{V_r}^2 \sigma_t^2$$
(42)

$$\xi_{3}(Q) = \frac{6\delta_{V_{r}}\delta_{t}\sigma_{V_{r}}^{3}\sigma_{t}^{3}}{\left(\left(1+\delta_{V_{r}}^{2}+\delta_{t}^{2}\right)\sigma_{V_{r}}^{2}\sigma_{t}^{2}\right)^{\frac{3}{2}}}$$
(43)

$$\zeta_{4}(Q) = \frac{6\sigma_{V_{r}}^{4}\sigma_{t}^{4}\left\{2\left(\delta_{V_{r}}^{2}+\delta_{t}^{2}\right)+1\right\}}{\left(\left(1+\delta_{V_{r}}^{2}+\delta_{t}^{2}\right)\sigma_{V_{r}}^{2}\sigma_{t}^{2}\right)^{2}}$$
(44)

The moments obtained in eqn. (41)-(44) represent the distribution of the resultant bit-line voltage  $(V'_r)$ . This resultant voltage is the input to the Sense Amplifier and must exceed its offset  $(V_{OS})$  for a successful read. The model is evaluated and shown in Fig. 11. As seen in Fig. 11, the model shows near-normal behavior in the super-threshold region, with very little error in comparison with distributions obtained from MC simulations. The error increases in the sub-threshold region, where the model predicts the failure pessimistically. The normal probability plot shows a deviation of about 8% in the right tail when comparing the model and MC simulations in the sub-threshold region. Despite this deviation, the model can provide insightful results, as shown later in section *B*.

The skewness and kurtosis of the resulting distribution depend on the value of  $\delta$ . For small  $\delta$ , the skewness becomes large but is always  $\leq \frac{2\sqrt{3}}{3}$ . The excess kurtosis is always  $\leq 6$ . As  $\delta \rightarrow \infty$ , the skewness tends to zero. As shown in Fig. 12, the skewness ranges from 0.04 to 0.16 and the excess kurtosis ranges from 0.02 to 0.18. These results suggest that the normal approximation for the product of variables is very close. The deviation from values obtained from MC simulations is also small (< 0.6).
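The closed-form expressions (41)-(44) are straightforward to evaluate and to compare against a direct sampling of the product. The sketch below is illustrative only: the function names and the numerical inputs are hypothetical, both factors are assumed to be independent Gaussians, and the Monte-Carlo check is a plain sampling of $X \cdot Y$ rather than a circuit simulation.

```python
import math
import random


def product_moments(mu_x, sigma_x, mu_y, sigma_y):
    """Mean, variance, skewness and excess kurtosis of Q = X*Y per (41)-(44)."""
    dx = mu_x / sigma_x
    dy = mu_y / sigma_y
    s = 1.0 + dx * dx + dy * dy
    mean = mu_x * mu_y
    var = s * sigma_x**2 * sigma_y**2
    skew = 6.0 * dx * dy / s**1.5
    excess_kurt = 6.0 * (2.0 * (dx * dx + dy * dy) + 1.0) / s**2
    return mean, var, skew, excess_kurt


def monte_carlo_product_moments(mu_x, sigma_x, mu_y, sigma_y, n=100_000, seed=0):
    """Sample mean and variance of X*Y for independent Gaussian X and Y."""
    rng = random.Random(seed)
    samples = [rng.gauss(mu_x, sigma_x) * rng.gauss(mu_y, sigma_y) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((q - mean) ** 2 for q in samples) / (n - 1)
    return mean, var


# Hypothetical inputs in arbitrary units (delta_x = 6, delta_y = 5).
mean_a, var_a, skew_a, kurt_a = product_moments(3.0, 0.5, 2.0, 0.4)
mean_mc, var_mc = monte_carlo_product_moments(3.0, 0.5, 2.0, 0.4)
print(mean_a, var_a, skew_a, kurt_a)
print(mean_mc, var_mc)
```

Evaluating `product_moments` at $\delta_x = \delta_y = 1$ reproduces the skewness bound of $\frac{2\sqrt{3}}{3}$, and at $\delta_x = \delta_y = 0$ it reproduces the excess kurtosis bound of 6.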

The read-access failure probability can be calculated as

$$P_{FAIL} = Prob\left\{V_r' < V_{OS}\right\} \tag{45}$$




Fig. 11. Comparison of read-access distribution using MC simulations and proposed method in super-threshold and subthreshold regions.



Fig. 12. Comparison of skew and kurtosis evaluated using model and MC simulations. Evaluation depicts a deviation of less than 0.6 skew and kurtosis in the worst case.

If  $Z = V_{OS} - V'_r$ , then  $Z \sim \mathcal{N} \left( \mu_{V_{OS}} - \mu_{V'_r}, \sigma^2_{V'_r} + \sigma^2_{V_{OS}} \right)$  and

$$P\left(Z > 0\right) = \int_0^\infty \frac{1}{\sqrt{2\pi \left(\sigma^2_{V'_r} + \sigma^2_{V_{OS}}\right)}} \exp\left(\frac{-\left(z + \mu_{V'_r} - \mu_{V_{OS}}\right)^2}{2\left(\sigma^2_{V'_r} + \sigma^2_{V_{OS}}\right)}\right) dz$$
(46)

With  $t = \frac{z + \mu_{V'_r} - \mu_{V_{OS}}}{\sqrt{2\left(\sigma_{V'_r}^2 + \sigma_{V_{OS}}^2\right)}}$  and using the complementary error function,

$$erfc(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} e^{-t^2} \, dt$$
(47)

we get

$$P(Z > 0) = P_{FAIL} = \frac{1}{2} erfc \left( \frac{\mu_{V'_r} - \mu_{V_{OS}}}{\sqrt{2 \left(\sigma_{V'_r}^2 + \sigma_{V_{OS}}^2\right)}} \right)$$
(48)

where the moments  $\mu_{V'_r}$  and  $\sigma^2_{V'_r}$  are shown in Eqn. (41) and (42) respectively. The yield for a given memory size (*N* number of cells) can then be expressed as

$$Yield = (1 - P_{FAIL})^N \tag{49}$$
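Equations (41), (42), (48), and (49) chain together into a short calculation. The following is a minimal sketch under stated assumptions: all function names and the numerical inputs are hypothetical, the discharge rate, sensing time, and offset are taken as independent Gaussians, and `max_cell_fail_probability` simply inverts (49) to show what per-cell failure probability a target yield implies.

```python
import math


def read_fail_probability(mu_vr, sigma_vr, mu_t, sigma_t, mu_vos, sigma_vos):
    """P_FAIL per (48), with the moments of V'_r taken from (41) and (42)."""
    mu_q = mu_vr * mu_t
    var_q = (mu_vr**2 * sigma_t**2
             + mu_t**2 * sigma_vr**2
             + sigma_vr**2 * sigma_t**2)
    arg = (mu_q - mu_vos) / math.sqrt(2.0 * (var_q + sigma_vos**2))
    return 0.5 * math.erfc(arg)


def array_yield(p_fail, n_cells):
    """Yield per (49); log1p keeps precision when p_fail is very small."""
    return math.exp(n_cells * math.log1p(-p_fail))


def max_cell_fail_probability(target_yield, n_cells):
    """Largest per-cell P_FAIL that still meets target_yield (inverse of (49))."""
    return -math.expm1(math.log(target_yield) / n_cells)


# Hypothetical example: discharge rate in V/ns, time in ns, offset in V.
p_fail = read_fail_probability(
    mu_vr=0.5, sigma_vr=0.05, mu_t=0.2, sigma_t=0.02, mu_vos=0.0, sigma_vos=0.015
)
print(p_fail, array_yield(p_fail, 10_000))
print(max_cell_fail_probability(0.99, 10_000_000))
```

Because the yield is raised to the power $N$, the per-cell failure probability that a design can tolerate shrinks roughly in proportion to $1/N$ as the array capacity grows.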

# B. Dataset-Based Dimensional Analysis

To analyze the multidimensional SRAM design space, the model is used to create a dataset with nearly 160K unique SRAM design datapoints in a bulk 65nm CMOS technology. Each datapoint is a set of values of design variables for a given design and the corresponding failure probability. All design

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: REGULAR PAPERS



Fig. 13. (a) Correlation between various SRAM design variables and failure probability (b) Dynamic behavior of correlation between frequency and failure probability as a function of separation.

TABLE II Summary of Dataset-Based Design Space Analysis

| Method          | MC Sim.   | This Work |
|-----------------|-----------|-----------|
| Time to Compute | 102 years | 20 hours  |

Number of data points: 159,746. Each data point is assumed to require 100K MC runs. Number of CPU threads for MC eval. = 16. Number of CPU threads for model = 1.

variables are swept across a wide range to generate the dataset. Evaluating each datapoint using MC simulations would require  $\sim$ 5.6 hours (assuming 100K runs), which would equate to >100 years for a dataset of this size. In comparison, the model generates the dataset in about 20 hours, as shown in Table II. The results of the dataset are used to observe the effect of SRAM design variables, quantify their importance, and determine inter-variable correlation. The results are summarized in a correlation matrix in Fig. 13 (a) and arranged in descending order of importance in Fig. 14.
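As a quick check of the MC estimate, taking the 159,746 datapoints listed with Table II at  $\sim$ 5.6 hours each, and 8760 hours per year:

$$159{,}746 \times 5.6\ \text{h} \approx 8.95 \times 10^{5}\ \text{h} \approx \frac{8.95 \times 10^{5}}{8760}\ \text{years} \approx 102\ \text{years}$$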

The frequency is the only variable that spans several orders of magnitude, and thus its effect is analyzed separately across several regions according to its magnitude relative to the critical path delay. Its correlation with failure probability is analyzed across three regions: clock pulse width  $\ll$  critical path delay, pulse width  $\approx$  critical path delay, and pulse width  $\gg$  critical path delay, as shown in Fig. 13 (b). The results indicate that there is weak correlation between frequency and failure probability when pulse width  $\gg$  critical path delay, suggesting the bottleneck in such a case could be the design variables involved



Fig. 14. Importance of various SRAM design variables in descending order.

in the critical path. The correlation peaks when the pulse width  $\approx$  critical path delay and then falls rapidly when pulse width is reduced further. This analysis can also explain why some design points can show little to no decrease in failure rate even when the pulse width is increased indefinitely. In such a case, the unoptimized critical path variable(s) might be causing a high dynamic failure rate irrespective of the frequency.

#### C. Evaluation

The read-access failure probability given by (48) has been evaluated and compared against results from Monte-Carlo simulations. The MC simulations consider the entire read path, including variations in the peripheral circuitry. The resulting distributions from the MC simulations are imported into MATLAB and then used to evaluate the final failure probability. Three cases with varying sense amplifier strobe timing have been considered



Fig. 15. Comparison of various yield prediction methods based on speed and error.

as shown in Table I. The  $V_{MIN}$  is calculated for various capacities (10Kbit to 10Mbit) and their corresponding failure rate. As seen in Table I, the time required to evaluate the  $V_{MIN}$  is less than 15 seconds in all cases, with low error.

The comparison of various yield prediction methods based on speed and percentage error is shown in Fig. 15. Based on this comparison, together with the results shown in Tables I and II, we can observe the method's convenience and effectiveness for SRAM design evaluation and exploration.

# V. CONCLUSION

In this work, we discussed the major mechanisms which affect the SRAM read-access operation and the common evaluation techniques which help to analyze them. The benefits and issues of various such analytical and semi-analytical techniques were discussed. To evaluate the read-access failure probability and the corresponding V<sub>MIN</sub>, we presented a fast analytical model which investigates key SRAM read-access components and analytically models their behavior in the small signal sensing region of operation. This model takes into account several variables, such as the supply voltage, temperature, process variations, and array design variables, i.e., bit-cell sizing, read current, bit-line capacitance (number of rows), word-line rise time (number of columns), sense amplifier strobe timing, bit-line leakage, and sense amplifier offset voltage. Simulations in a commercial bulk 65nm technology showed that the proposed method is able to evaluate the failure probability within a few seconds ( $\sim 15$  sec) with small error. With this gain in speed over other evaluation methods, the model was used to evaluate about 160K different SRAM designs. The results of this evaluation were used to analyze the multidimensional SRAM design space and determine the importance of various design variables. This analysis also provided insightful results about the effect of operating frequency and sense-amplifier strobe timing on read-access failure probability.

Thus, the proposed model can be very useful for SRAM designers to quickly calculate design feasibility and analyze the design space to optimize power, area, and speed.

#### REFERENCES

- [1] K. Cho, J. Park, T. W. Oh, and S.-O. Jung, "One-sided Schmitt-Trigger-based 9T SRAM cell for near-threshold operation," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 67, no. 5, pp. 1551–1561, May 2020.
- [2] Y.-C. Chien and J.-S. Wang, "A 0.2 V 32-kb 10T SRAM with 41 nW standby power for IoT applications," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 65, no. 8, pp. 2443–2454, Aug. 2018.
- [3] K. Shin, W. Choi, and J. Park, "Half-select free and bit-line sharing 9T SRAM for reliable supply voltage scaling," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 64, no. 8, pp. 2036–2048, Aug. 2017.
- [4] B. H. Calhoun and A. P. Chandrakasan, "Static noise margin variation for sub-threshold SRAM in 65-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 41, no. 7, pp. 1673–1679, Jul. 2006.
- [5] A. Sheikholeslami, "Process variation and Pelgrom's law," *IEEE Solid-State Circuits Mag.*, vol. 7, no. 1, pp. 8–9, Feb. 2015.
- [6] L. Dolecek, M. Qazi, D. Shah, and A. Chandrakasan, "Breaking the simulation barrier: SRAM evaluation through norm minimization," in *Proc. IEEE/ACM Int. Conf. Comput.-Aided Design*, Nov. 2008, pp. 322–329.
- [7] R. Saeidi, M. Sharifkhani, and K. Hajsadeghi, "Statistical analysis of read static noise margin for near/sub-threshold SRAM cell," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 61, no. 12, pp. 3386–3393, Dec. 2014.
- [8] H. Makino *et al.*, "Reexamination of SRAM cell write margin definitions in view of predicting the distribution," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 58, no. 4, pp. 230–234, Apr. 2011.
- [9] K. Agarwal and S. Nassif, "Statistical analysis of SRAM cell stability," in *Proc. 43rd Annu. Conf. Design Autom. (DAC)*, 2006, pp. 57–62.
- [10] J. Boley, V. Chandra, R. Aitken, and B. Calhoun, "Leveraging sensitivity analysis for fast, accurate estimation of SRAM dynamic write VMIN," in *Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE)*, 2013, pp. 1819–1824.
- [11] S. Mukhopadhyay, H. Mahmoodi, and K. Roy, "Modeling of failure probability and statistical design of SRAM array for yield enhancement in nanoscaled CMOS," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 24, no. 12, pp. 1859–1880, Dec. 2005.
- [12] T. Date, S. Hagiwara, K. Masu, and T. Sato, "Robust importance sampling for efficient SRAM yield analysis," in *Proc. 11th Int. Symp. Qual. Electron. Design (ISQED)*, Mar. 2010, pp. 15–21.
- [13] R. Kanj, R. Joshi, and S. Nassif, "Mixture importance sampling and its application to the analysis of SRAM designs in the presence of rare failure events," in *Proc. 43rd ACM/IEEE Design Autom. Conf.*, 2006, pp. 69–72.
- [14] D. Khalil, M. Khellah, N.-S. Kim, Y. Ismail, T. Karnik, and V. K. De, "Accurate estimation of SRAM dynamic stability," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 16, no. 12, pp. 1639–1647, Dec. 2008.
- [15] A. Singhee and R. A. Rutenbar, "Statistical blockade: A novel method for very fast Monte Carlo simulation of rare circuit events, and its application," in *Proc. Design, Autom., Test Eur.*, Apr. 2008, pp. 235–251.
- [16] J. Wang, A. Singhee, R. A. Rutenbar, and B. H. Calhoun, "Two fast methods for estimating the minimum standby supply voltage for large SRAMs," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 29, no. 12, pp. 1908–1920, Dec. 2010.
- [17] J. Wang, A. Singhee, R. A. Rutenbar, and B. H. Calhoun, "Statistical modeling for the minimum standby supply voltage of a full SRAM array," in *Proc. 33rd Eur. Solid-State Circuits Conf. (ESSCIRC)*, Sep. 2007, pp. 400–403.
- [18] M. H. Abu-Rahma, K. Chowdhury, J. Wang, Z. Chen, S. S. Yoon, and M. Anis, "A methodology for statistical estimation of read access yield in SRAMs," in *Proc. 45th Annu. Conf. Design Autom. DAC*, Anaheim, CA, USA, 2008, pp. 205–210.
- [19] J. P. Kulkarni and K. Roy, "Ultralow-voltage process-variation-tolerant Schmitt-Trigger-based SRAM design," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 20, no. 2, pp. 319–332, Feb. 2012.
- [20] M. H. Abu-Rahma and M. Anis, "A statistical design-oriented delay variation model accounting for within-die variations," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 27, no. 11, pp. 1983–1995, Nov. 2008.
- [21] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid-State Circuits*, vol. 24, no. 5, pp. 1433–1440, Oct. 1989.
- [22] V. Wang, K. Agarwal, S. R. Nassif, K. J. Nowka, and D. Markovic, "A simplified design model for random process variability," *IEEE Trans. Semicond. Manuf.*, vol. 22, no. 1, pp. 12–21, Feb. 2009.
- [23] C. Couso *et al.*, "Dependence of MOSFETs threshold voltage variability on channel dimensions," in *Proc. Joint Int. Workshop Int. Conf. Ultimate Integr. Silicon (EUROSOI-ULIS)*, Apr. 2017, pp. 87–90.

- [24] A. Datta, S. Bhunia, S. Mukhopadhyay, N. Banerjee, and K. Roy, "Statistical modeling of pipeline delay and design of pipeline under process variation to enhance yield in sub-100nm technologies," in *Proc. Design, Autom. Test Eur.*, Munich, Germany, vol. 2, 2005, pp. 926–931.
- [25] B. S. Amrutur and M. A. Horowitz, "A replica technique for wordline and sense control in low-power SRAM's," *IEEE J. Solid-State Circuits*, vol. 33, no. 8, pp. 1208–1219, Aug. 1998.
- [26] C. C. Craig, "On the frequency function of xy," *Ann. Math. Statist.*, vol. 7, no. 1, pp. 1–15, Mar. 1936.
- [27] A. Oliveira and A. Seijas-Macias, "An approach to distribution of the product of two normal variables," *Discussiones Mathematicae Probab. Statist.*, vol. 32, nos. 1–2, p. 87, 2012.



Benton H. Calhoun (Fellow, IEEE) received the B.S. degree from the University of Virginia in 2000 and the M.S. and Ph.D. degrees in electrical engineering from the Massachusetts Institute of Technology in 2002 and 2006, respectively. He is currently a Professor with the Department of Electrical and Computer Engineering, University of Virginia. He is the coauthor of *Sub-threshold Design for Ultra Low-Power Systems* (Springer, 2006) and the author of *Design Principles for Digital CMOS Integrated Circuit Design* (NTS Press, 2012). His research interests include body area sensor nodes (BSNs), wireless sensor networks (WSNs), low power digital circuit design, sub-threshold digital circuits, sub-threshold FPGAs, SRAM design for end-of-the-roadmap silicon, power delivery circuits and architectures, variation tolerant circuit design methodologies, and low energy electronics for medical applications. He is also a Campus Director and a Technical Thrust Leader with the NSF Nanosystems Engineering Research Center for Advanced Self-Powered Systems of Integrated Sensors and Technologies (ASSIST).



Shourya Gupta (Graduate Student Member, IEEE) was born in New Delhi, India, in 1994. He received the B.Tech. degree in electronics and communication engineering from Guru Gobind Singh Indraprastha University, New Delhi, in 2017. He is currently pursuing the Ph.D. degree in electrical engineering with the University of Virginia, Charlottesville, VA, USA.

His current research interests include the design of low power logic and memory circuits, and circuit design automation.