# Experimental Assessment of Logic Circuit Performance Variability with Regular Fabrics at 90nm Technology Node

Sungdae Choi, Katsuyuki Ikeuchi, Hyunkyung Kim, Kenichi Inagaki, Masami Murakata\*, Nobuyuki Nishiguchi\*, Makoto Takamiya and Takayasu Sakurai

The University of Tokyo, Japan

\* Semiconductor Technology Academic Research Center (STARC), Japan

(sungdae@iis.u-tokyo.ac.jp)

*Abstract* Regular fabric structures are expected to reduce process variations and increase yield in the sub-micron technology regime. However, few experimental assessments of the effectiveness of regular structures have been carried out. In this paper, three kinds of circuit blocks are implemented in four layout styles with different degrees of regularity, and the effect of regularity on circuit performance variations is evaluated. A test chip is fabricated in a 90nm CMOS logic process. Measured results show that the regular structure increases the average delay, and that the worst-case delay of the regular structure is no better than that of conventional circuits built from irregular standard cells.

## I. INTRODUCTION

Performance variations due to device mismatch and random process parameters are a critical obstacle to reducing chip cost as technology scales down into the deep-submicron regime. Many studies have identified the sources of these variations [1, 2, 3], and regular fabric structures are expected to be one of the promising solutions at the circuit design level [4, 5, 6]. There are, however, few studies on how attractive regular structures actually are from the variation and performance points of view, although layout has been shown to have a large impact on lithography-induced variability [7]. This work focuses on the effect of layout regularity on the performance and variations of the chip.

## II. TEST STRUCTURES WITH VARIOUS LAYOUT STYLES

3- and 11-stage ring oscillators (RO3, RO11) are implemented with four layout patterns, which combine regular or irregular MOSFET fabric structures with regular or irregular interconnections. The effects of regularity on circuit performance and variation are analyzed by measuring the oscillation frequency of each RO. The oscillation frequency is determined only by the devices that make up the RO chain, and frequency is more robust to noise than voltage or current. Adders with best-effort clock operation are also implemented with various layout patterns to survey performance and variations in practical logic that is more complicated than simple ROs. They are designed to generate a periodic signal as soon as they finish a calculation, so that the measured frequency distributions represent the performance and variations. Each RO pattern has 128 samples and each adder pattern has 64 samples, and the frequency is measured using specially designed frequency dividers (FDs). Since the variation of each device is averaged out as the number of oscillation stages increases [8], the oscillation frequency of RO3, which has the smallest number of stages, maximizes the effect of local random variations. RO11s are measured in order to compare the effect of the number of logic stages. 16-bit ripple-carry adders with best-effort clock operation show the results for practical circuit situations.
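The averaging argument can be illustrated with a minimal sketch. It assumes that every stage delay is an independent random variable with the same mean and standard deviation, an idealization that the paper does not state explicitly. The function name `relative_sigma` and the 10% per-stage spread are hypothetical and are not measured values.

```python
import math


def relative_sigma(stage_sigma_over_mu: float, n_stages: int) -> float:
    """Return sigma/mu for the sum of n_stages independent, identical stage delays."""
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    # The mean grows as n and the standard deviation as sqrt(n),
    # so the ratio sigma/mu falls as 1/sqrt(n).
    return stage_sigma_over_mu / math.sqrt(n_stages)


# Hypothetical per-stage spread of 10%; not a measured value from the paper.
STAGE_SPREAD = 0.10
ro3_spread = relative_sigma(STAGE_SPREAD, 3)
ro11_spread = relative_sigma(STAGE_SPREAD, 11)
spread_ratio = ro3_spread / ro11_spread  # equals sqrt(11 / 3)
```

Under this assumption, the relative spread of RO3 is larger than that of RO11 by a factor of sqrt(11/3), which is why the shortest oscillator is the most sensitive probe of local random variation.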

Fig. 1 shows the block diagram of the RO3s and RO11s. The ROs consist of static logic gates such as NAND, NOR and inverter gates. All ROs are placed adjacent to one another to keep the layout regular, and all edges of the regularity-controlled area are filled with regular dummy patterns to prevent systematic variations in samples placed at the edge. In the peripheral area, an FD with minimum capacitive load is dedicated to each RO to minimize the interconnection load and preserve the RO's characteristic frequency. Once the frequency is divided and buffered, it passes through the shared dividing path, which consists of FDs and XOR gates. An external address activates only one RO at a time, and the corresponding frequency passes through the shared dividing blocks. The shared dividing paths of RO3, RO11 and the adder have 10, 7 and 6 FDs, respectively, so that the frequencies at the pad are on the order of MHz and are easy to measure off-chip.
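The relation between the on-chip oscillation frequency and the frequency observed at the pad can be sketched as follows. This is a rough illustration under two assumptions that the paper does not confirm: every FD in the chain divides by two, and the XOR gates only merge paths without changing the frequency. The function names and the default of one dedicated FD per block are hypothetical.

```python
# Number of FDs in the shared dividing path, as stated in the text.
SHARED_FDS = {"RO3": 10, "RO11": 7, "adder": 6}


def pad_frequency_hz(core_frequency_hz: float, shared_fds: int,
                     dedicated_fds: int = 1) -> float:
    """Frequency at the pad, assuming each FD divides by two."""
    return core_frequency_hz / 2 ** (shared_fds + dedicated_fds)


def core_frequency_hz(pad_hz: float, shared_fds: int,
                      dedicated_fds: int = 1) -> float:
    """Recover the on-chip oscillation frequency from the pad measurement."""
    return pad_hz * 2 ** (shared_fds + dedicated_fds)


# Example: a 4.7 GHz RO3 oscillation divided by 2**11 gives roughly 2.3 MHz.
ro3_pad = pad_frequency_hz(4.7e9, SHARED_FDS["RO3"])
ro3_core = core_frequency_hz(ro3_pad, SHARED_FDS["RO3"])
```

With these assumptions, a gigahertz-range RO3 signal lands in the low-MHz range at the pad, consistent with the stated design goal.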



Fig. 1 Block diagram of the RO structure (16 clusters x 8 ROs)

The FD with minimum capacitive load, shown in Fig. 2, is required to capture the oscillation frequency of RO3, which oscillates in the gigahertz range. The simple D-flip-flop-based FD divides signals of up to 8GHz by two in post-layout simulation. Owing to its simplicity and a layout that minimizes parasitic capacitance, it consumes 0.14mW at an input frequency of 4.7GHz. Measured data show that it successfully divides the RO3 oscillation frequency, which averages 4.7GHz.



Fig. 2 Minimum capacitive load frequency divider

Fig. 3 shows the block diagram and operation flow of the best-effort clock operation adder. It executes two calculations, a critical-path calculation and a recovery calculation, so that the MSB carry-out generates an oscillatory signal.



Each RO and adder block is implemented in four layout styles, namely regular transistor layout (RT) with regular interconnections (RI), RT with irregular interconnections (iRI), irregular transistor layout (iRT) with RI, and iRT with iRI. Fig. 4 shows the layout regularity of each style. RT is implemented in the Sea-of-Gates (SOG) design style using the gate-isolation method, as shown in Fig. 4 (a), to keep the pattern of all MOSFETs identical, while iRT is implemented with normal standard cells. The cell height of RT is 27% larger than that of iRT, as is found in other RT designs [5]. For regular interconnections, all widths and spaces use the minimum values allowed by the design rules, and dummy metal bars are inserted in the empty areas. For irregular interconnections, the metals have various widths and no dummy bars are used.

Fig. 5 shows equivalent schematics of RO3 and RO11 with RT and iRT patterns. SOG using the gate-isolation method introduces redundant MOSFETs at the boundaries between logic gates, as shown in Fig. 5 (a) and (c). SOG using oxide isolation eliminates such redundancy, but it does not guarantee diffusion regularity. All transistors have the same size in all ROs, while the transistor sizes in the adders vary. The core area of the RO3, RO11 and adder blocks with the RT design style is 25%, 28% and 85% larger than that of the iRT implementations, respectively.



(d) RO11 with irregular transistor fabric (128 w/ RI & 128 w/ iRI on a chip)

Fig. 5 RO schematics with RT and iRT layouts

Fig. 6 shows the test chip implemented in a 90nm CMOS logic process. 11 kinds of patterns are implemented in a 1.6mm x 0.9mm core area. For ROs, each layout style has 128 samples, and the adder block has 64 samples for each style on a chip. The frequencies of 11 dies, with 1216 samples per die, are measured automatically using LabVIEW [9] and analyzed to characterize the performance variations.



Fig. 6 Measured test chip

## III. CYCLE TIME DISTRIBUTIONS

Fig. 7 shows the average cycle time ( $\mu$ ) and its  $3\sigma$  range ( $\sigma$ : standard deviation) for the 11 test chips. For RO3, shown in Fig. 7 (a), the combination of RT and RI shows the largest  $\mu$ , and iRT+iRI shows the smallest  $\mu$  among the four layout styles. With the transistor layout style fixed (either RT or iRT), RI is 3% slower than iRI. RT is 18% and 17% slower than iRT with RI and iRI, respectively. The  $3\sigma$  ranges are similar regardless of the layout regularity. This is because random dopant fluctuation, which cannot be controlled by layout style, dominates the variation.
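The quantities compared above can be computed from measured frequencies as in the following minimal sketch. It assumes that the cycle time is the reciprocal of the reconstructed oscillation frequency and that the sample standard deviation is used; the paper does not specify either choice. The function names and the example frequencies are hypothetical.

```python
import statistics


def cycle_time_stats(frequencies_hz):
    """Return mean cycle time, 3-sigma range and sigma/mu for one layout style.

    Needs at least two samples because the sample standard deviation is used.
    """
    cycle_times = [1.0 / f for f in frequencies_hz]
    mu = statistics.mean(cycle_times)
    sigma = statistics.stdev(cycle_times)
    return {"mu": mu, "three_sigma": 3.0 * sigma, "sigma_over_mu": sigma / mu}


def slowdown_percent(mu_slow: float, mu_fast: float) -> float:
    """Relative increase of the average cycle time, in percent."""
    return 100.0 * (mu_slow - mu_fast) / mu_fast


# Illustrative frequencies only; these are not data from the test chip.
example = cycle_time_stats([4.0e9, 5.0e9])
```

For instance, `slowdown_percent(1.18, 1.0)` evaluates to about 18, which corresponds to the reported RT versus iRT gap for RO3 with RI when the cycle times are normalized to the iRT value.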

For RO11, shown in Fig. 7 (b), the average cycle time of RT is 23% and 24% longer than that of iRT under the same RI and iRI conditions, respectively. RI is 1% and 2% slower than iRI with RT and iRT, respectively. Again, random dopant fluctuation, which is independent of the layout style, dominates the variation. For all 11 test chips, the worst cycle time of iRT does not exceed the best cycle time of RT, which means that regular fabrics show no advantage over the normal standard cell approach in either worst-case performance or area.
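The worst-case comparison can be expressed as a simple check. This sketch assumes that "worst" and "best" cycle times mean the upper and lower ends of the  $3\sigma$  range around  $\mu$ ; the paper may instead use the extreme measured samples. The function names and the 2% standard deviation are hypothetical, while the 23% gap in the average is taken from the RO11 result with RI.

```python
def worst_cycle_time(mu: float, sigma: float, k: float = 3.0) -> float:
    """Upper end of the k-sigma range (slowest expected sample)."""
    return mu + k * sigma


def best_cycle_time(mu: float, sigma: float, k: float = 3.0) -> float:
    """Lower end of the k-sigma range (fastest expected sample)."""
    return mu - k * sigma


def irt_worst_beats_rt_best(irt_mu: float, irt_sigma: float,
                            rt_mu: float, rt_sigma: float) -> bool:
    """True if even the slowest iRT sample is no slower than the fastest RT sample."""
    return worst_cycle_time(irt_mu, irt_sigma) <= best_cycle_time(rt_mu, rt_sigma)


# Cycle times normalized to the iRT average; sigma = 0.02 is illustrative only.
no_rt_advantage = irt_worst_beats_rt_best(1.00, 0.02, 1.23, 0.02)
```

When the gap between the averages exceeds the sum of the two  $3\sigma$  ranges, as in this example, reducing the variation of the regular fabric alone could not recover its worst-case disadvantage.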

Since the blocks in the adders run asynchronously to generate a clock, the adders turn out to be very sensitive to glitch noise. This phenomenon unexpectedly acts as a magnifier of performance variability. Because of this magnifying effect, the RT styles show much larger variation than their iRT counterparts, as shown in Fig. 7 (c). The RT design style has more coupling capacitance among lines, which makes for a noisier environment. The increase in average delay time in RT is mainly due to the area increase and the corresponding load increase.





Normalized variation  $(\sigma/\mu)$  increases as  $V_{DD}$  decreases, as seen in Fig. 8. However, the conclusion that regular fabric does not help improve the worst-case performance is unchanged, as shown in Fig. 9, which depicts the variation at a  $V_{DD}$  of 0.7V.



Fig. 8 Normalized variation  $(\sigma/\mu)$  of ROs at various V<sub>DD</sub>



Fig. 9 Delay and  $3\sigma$  error bars of ROs at 0.7V V<sub>DD</sub>

## IV. CONCLUSIONS

Three kinds of circuits are implemented in four layout styles with different degrees of regularity, and their oscillation frequencies are measured to investigate the effect of regular structures on performance. Regular transistor fabric significantly degrades performance because of its larger parasitic capacitance but does not reduce the variations, since the variations mostly come from random dopant effects. Regular interconnection fabric causes a negligible performance reduction but no variation reduction. Thus, regular fabric does not help improve the worst-case performance.

#### ACKNOWLEDGEMENT

This work is partially supported by STARC. The VLSI chips were fabricated through the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo, in collaboration with STARC, Fujitsu Limited, Matsushita Electric Industrial Company Limited, NEC Electronics Corporation, Renesas Technology Corporation, and Toshiba Corporation.

#### REFERENCES

- [1] Marcel J.M. Pelgrom, et al., "Matching Properties of MOS Transistors," IEEE J. Solid-State Circuits, pp. 1433-1440, Oct. 1989
- [2] Sani Nassif, "Delay Variability: Sources, Impacts and Trends," ISSCC Dig. Tech. Papers, pp. 368-369, Feb. 2000
- [3] Saibal Mukhopadhyay, et al., "Statistical Characterization and On-Chip Measurement Methods for Local Random Variability of a Process Using Sense-Amplifier-Based Test Structure," IEEE Dig. Tech. Papers, pp. 400-401, Feb. 2007
- [4] L. Pileggi, et al., "Exploring Regular Fabrics to Optimize the Performance-Cost Trade-Off," DAC, pp. 782-787, 2003
- [5] V. Kheterpal, et al., "Design Methodology for IC Manufacturability Based on Regular Logic-Bricks," DAC, pp. 353-358, 2005
- [6] T. Jhaveri, et al., "Maximization of Layout Printability/Manufacturability by Extreme Layout Regularity," SPIE Invited paper, Feb. 2006
- [7] Borivoje Nikolic, et al., "Measurements and Analysis of Process Variability in 90nm CMOS," IEEE ICSICT, pp. 505-508, Oct. 2006
- [8] Bowman, K. A., et al., "Impact of die-to-die and within-die parameter fluctuations on the maximum clock frequency distribution for gigascale integration," IEEE J. Solid-State Circuits, pp. 183-190, Feb. 2002
- [9] LabVIEW by National Instruments: http://www.ni.com/labview/