DAC'99, pages 1-6 

An Efficient Lyapunov Equation-Based Approach for Generating 

Reduced-Order Models of Interconnect 

Jing-Rebecca Li, Frank Wang, Jacob White 

Research Laboratory of Electronics, Massachusetts Institute of Technology 

Cambridge, MA 02139 

Abstract 

In this paper we present a new algorithm for computing reduced-order models of interconnect 

which utilizes the dominant controllable subspace of the system. The dominant controllable 

modes are computed via a new iterative Lyapunov equation solver, Vector ADI. This new 

algorithm is as inexpensive as Krylov subspace-based moment matching methods, and often 

produces a better approximation over a wide frequency range. A spiral inductor and a 

transmission line example show this new method can be much more accurate than moment 

matching via Arnoldi. 
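
The idea of reducing onto the dominant controllable subspace can be sketched as follows. This is a minimal illustration, not the paper's Vector ADI solver: it assumes a small, stable state-space system (A, B, C), solves the Lyapunov equation A P + P A^T + B B^T = 0 with a dense Kronecker-product solve, and projects onto the leading eigenvectors of the Gramian P. The function names and the test system are hypothetical.

```python
import numpy as np

def controllability_gramian(A, B):
    """Solve A P + P A^T + B B^T = 0 densely (assumption: small n, stable A)."""
    n = A.shape[0]
    I = np.eye(n)
    # Column-major vec: vec(A P) = (I kron A) vec(P), vec(P A^T) = (A kron I) vec(P).
    K = np.kron(I, A) + np.kron(A, I)
    rhs = -(B @ B.T).reshape(-1, order="F")
    P = np.linalg.solve(K, rhs).reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

def reduce_by_dominant_subspace(A, B, C, k):
    """Project (A, B, C) onto the k dominant eigenvectors of the Gramian."""
    P = controllability_gramian(A, B)
    eigvals, eigvecs = np.linalg.eigh(P)
    V = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # orthonormal columns
    return V.T @ A @ V, V.T @ B, C @ V

# Hypothetical 6-state stable system, reduced to order 3.
n = 6
A = -np.diag(np.arange(1.0, n + 1)) + 0.1 * np.eye(n, k=1)
B = np.ones((n, 1))
C = np.ones((1, n))
P = controllability_gramian(A, B)
Ar, Br, Cr = reduce_by_dominant_subspace(A, B, C, k=3)
```

The dense solve costs far more than an iterative method would on a large sparse interconnect model; it is used here only to make the projection step concrete.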

References 

[1] J. E. Bracken. Passive Modeling of Linear Interconnect Networks. IEEE Trans. on Circuits and Systems, (Part I: 

Fundamental Theory and Applications), to appear 

[2] E. Chiprout and M. Nakhla. Generalized Moment-Matching Methods for Transient Analysis of Interconnect 

Networks. In 29th ACM/IEEE Design Automation Conference, pp. 201-206, Anaheim, CA, June 1992 

[3] N. Ellner and E. L. Wachspress. Alternating Direction Implicit Iteration for Systems with Complex Spectra. 

Siam J. Numer. Anal., Vol. 28, No.3, pp.859-870, June 1991 

[4] D. F. Enns. Model Reduction with Balanced Realizations: An Error Bound and a Frequency Weighted 

Generalization. Proc. of 23rd Conf. on Decision and Control, Dec. 1984 

[5] P. Feldmann and R. W. Freund. Efficient Linear Circuit Analysis by Padé Approximation via the Lanczos 

Process. IEEE Trans. Computer-Aided Design, Vol. 14, No.5, pp.639-649, May 1995 

[6] R. W. Freund, G. H. Golub, and N. M. Nachtigal. Iterative Solution of Linear Systems. Acta Numerica(1991), 

pp.57-100 

[7] K. Gallivan, E. J. Grimme, and P. Van Dooren. A Rational Lanczos Algorithm for Model Reduction. Numer. 

Algorithms, 1996 

[8] K. Glover. All Optimal Hankel-norm Approximations of Linear Multivariable Systems and Their L∞-error 

Bounds. Int. J. Control, Vol.39, No.6, pp.1115-1193, 1984 

[9] E. J. Grimme, D. C. Sorensen, and P. Van Dooren. Model Reduction of State Space Systems via an Implicitly 

Restarted Lanczos Method. Numer. Algorithms, 1995 

[10] A. S. Hodel and K. Poolla. Numerical Solution of Very Large, Sparse Lyapunov Equations through 

Approximate Power Iteration. IEEE Conf. on Decision and Control, Dec. 1990 

[11] M. Kamon, F. Wang, and J. White. Recent Improvements for Fast Inductance Extraction and Simulation. EPEP 

Conference, to appear 

[12] K. J. Kerns, I. L. Wemple, and A. T. Yang. Stable and Efficient Reduction of Substrate Model Networks Using 

Congruence Transforms. In IEEE/ACM International Conference on Computer Aided Design, pp. 207-214, San 

Jose, CA, Nov. 1995 

[13] S. Kumashiro, R. Rohrer, and A. Strojwas. A New Efficient Method for the Transient Simulation of Three- 

Dimensional Interconnect Structures. In Proc. Int. Electron Devices Meeting, Dec. 1990 

[14] J. Li and J. White. Model Order Reduction using Approximate Dominant Singular Subspaces of the System 

Grammians. in preparation 

[15] A. Lu and E. L. Wachspress. Solution of Lyapunov equations by ADI Iteration. Computers Math. Applic., Vol. 

21 No. 9, 1991 

[16] N. Marques, M. Kamon, J. White, and L. M. Silveira. A Mixed Nodal-Mesh Formulation for Efficient 

Extraction and Passive Reduced-Order Modeling of 3D Interconnects. In 35th ACM/IEEE Design Automation 

Conference, pp. 297-302, San Francisco, CA, June 1998

[17] A. Odabasioglu, M. Celik, and L. Pileggi. PRIMA: Passive Reduced-Order Interconnect Macromodeling 

Algorithm. IEEE Conference on Computer-Aided Design, San Jose, CA, 1997 

[18] L. Pernebo and L. M. Silverman. Model Reduction via Balanced State Space Representations. IEEE Trans. on 

Automatic Control, Vol. 27, No. 2:382–387, April 1982 

[19] L. T. Pillage and R. A. Rohrer. Asymptotic Waveform Evaluation for Timing Analysis. IEEE Trans. CAD, 

9(4):352–366, April 1990 

[20] L. M. Silveira, M. Kamon, I. Elfadel, and J. K. White. A Coordinate-Transformed Arnoldi Algorithm for 

Generating Guaranteed Stable Reduced-Order Models of RLC Circuits. pp.288–294, ICCAD, San Jose, CA. Nov. 

1996 

[21] G. Starke. Optimal Alternating Direction Implicit Parameters for Nonsymmetric Systems of Linear Equations. 

Siam J. Numer. Anal, Vol. 28, No. 5, pp. 1431-1445, Oct. 1991 

[22] E. L. Wachspress. The ADI Model Problem. Windsor, CA, 1995


Error Bounded Padé Approximation via Bilinear Conformal Transformation 

Chung-Ping Chen 1 and D. F. Wong 2 

1 Strategic CAD Labs, Intel Corp., Hillsboro, Oregon 97124. 

2 Department of Computer Sciences, University of Texas at Austin, Austin, Texas 78712. 

Abstract 

Since Asymptotic Waveform Evaluation (AWE) was introduced in [5], many interconnect model 

order reduction methods via Padé approximation have been proposed. Although the stability and 

precision of model reduction methods have been greatly improved, the following important 

question has not been answered: "What is the error bound in the time domain?". This problem is 

mainly caused by the "gap" between the frequency domain and the time domain, i.e. a good 

approximated transfer function in the frequency domain may not be a good approximation in the 

time domain. All of the existing methods approximate the transfer function directly in the 

frequency domain and hence can not provide error bounds in the time domain. In this paper, we 

present new moment matching methods which can provide guaranteed error bounds in the time 

domain. Our methods are based on the classic work by Teasdale in [1] which performs Padé 

approximation in a transformed domain by the bilinear conformal transformation s = (1-z)/(1+z). 
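
As a small illustration of the bilinear transformation named above (read here as s = (1-z)/(1+z), which is an assumption about the intended grouping), the sketch below maps points between the s-plane and the z-plane. It shows that the imaginary axis lands on the unit circle and that a left-half-plane point lands outside it. The helper names are hypothetical and this is not the authors' approximation procedure.

```python
def s_to_z(s):
    """Invert s = (1 - z)/(1 + z): solving for z gives z = (1 - s)/(1 + s)."""
    return (1 - s) / (1 + s)

def z_to_s(z):
    """Forward map s = (1 - z)/(1 + z)."""
    return (1 - z) / (1 + z)

# Points s = j*w on the imaginary axis map to |z| = 1.
on_axis = [abs(s_to_z(1j * w)) for w in (0.0, 0.5, 1.0, 10.0, 1e3)]

# A stable pole (Re(s) < 0) maps outside the unit circle under this convention.
stable_pole = s_to_z(-1 + 2j)

# The two maps are inverses of each other.
round_trip = z_to_s(s_to_z(0.3 + 0.7j))
```

With this sign convention, Re(s) < 0 corresponds to |z| > 1, because |1 - s| > |1 + s| exactly when the real part of s is negative.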

References 

[1] R.D. Teasdale, “Time domain approximation by use of Padé approximants", IRE Convention Record, Vol.1, pt.5 

pp. 89-94, 1953. 

[2] R. S. Tsay, “An exact zero-skew clock routing algorithm" IEEE Transactions on Computer-Aided Design of 

Integrated Circuits and Systems, February 1993. 

[3] P. Feldmann, R. W. Freund, “Efficient linear circuit analysis by Padé approximation via the Lanczos process", 

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, May 1995. 

[4] L. M. Silveira, M. Kamon, I. Elfadel, and J. White, “A coordinate-transformed Arnoldi algorithm for generating 

guaranteed stable reduced-order models of RLC circuits" Proc. ICCAD, 1996. 

[5] L. T. Pillage and R. A. Rohrer, “Asymptotic waveform evaluation for timing analysis", IEEE Transactions on 

Computer-Aided Design of Integrated Circuits and Systems, April, 1990. 

[6] E. Chiprout, and M. S. Nakhla, “Analysis of interconnect networks using complex frequency hopping", IEEE 

Transactions on Computer-Aided Design of Integrated Circuits and Systems, February, 1995. 

[7] G. H. Golub and C. F. Van Loan, “Matrix computations", The Johns Hopkins University Press, Baltimore, 

Maryland, 1983. 

[8] K.J. Kerns, I.L. Wemple, and A.T. Yang, “Stable and Efficient reduction of large, multiport RC networks by 

pole analysis via congruence transformations", 33rd ACM/IEEE DAC, 1996. 

[9] A. Odabasioglu, M. Celik, and L. Pileggi, “PRIMA: Passive reduced-order interconnect macromodeling 

algorithm", Proc. ICCAD, 1998. 

[10] P. Rabiei, and M. Pedram, “Model Order Reduction for Large Circuit Using Balance Truncation", Proc. ASP- 

DAC, 1999. 

[11] Qingjian Yu, Janet M. Wang, and Ernest S. Kuh, “Multipoint Moment Matching Model For Multiport 

Distributed Interconnect Networks", Proc. ICCAD, 1998. 

[12] I.M. Elfadel, and D.D. Ling, “A Block Rational Arnoldi Algorithm for Multipoint Passive Model-Order 

Reduction of Multiport RLC Networks", Proc. ICCAD, 1997. 

[13] T.V. Nguyen, and J. Li, “Multipoint Padé Approximation Using a Rational Block Lanczos Algorithm", Proc. 

ICCAD, 1997. 

[14] K. Gallivan, E. Grimme, and P. Van Dooren, “Asymptotic waveform evaluation via a Lanczos method", App. 

Math. Lett., vol. 7, pp. 75-80, 1994.


Model-Reduction of Nonlinear Circuits Using Krylov-Space Techniques 

Pavan K. Gunupudi, Michel S. Nakhla 

Department of Electronics, Carleton University, Ottawa, Canada K1S 5B6 

ABSTRACT 

A new algorithm based on Krylov subspace methods is proposed for model-reduction of large 

nonlinear circuits. Reduction is obtained by projecting the original system described by nonlinear 

differential equations into a subspace of a lower dimension. The reduced model can be simulated 

using conventional numerical integration techniques. Significant reduction in computational 

expense is achieved as the size of the reduced equations is much less than that of the original 

system. 

Keywords: Model-reduction, nonlinear circuits, Krylov-subspace 
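
The projection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the nonlinear system is taken to be x' = f(x, u), the subspace is a Krylov space built from the linear part A and input vector b (the abstract does not say which matrices the paper uses), and the reduced model z' = Q^T f(Q z, u) is integrated with forward Euler. All names and the test system are hypothetical.

```python
import numpy as np

def krylov_basis(A, b, q):
    """Orthonormal basis of span{b, A b, ..., A^(q-1) b} (Arnoldi, modified Gram-Schmidt)."""
    n = b.shape[0]
    Q = np.zeros((n, q))
    v = b / np.linalg.norm(b)
    for j in range(q):
        Q[:, j] = v
        w = A @ v
        for i in range(j + 1):
            w = w - (Q[:, i] @ w) * Q[:, i]
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # subspace exhausted early
            return Q[:, : j + 1]
        v = w / norm
    return Q

def reduced_rhs(f, Q):
    """Reduced right-hand side: lift z to Q z, evaluate f, project back with Q^T."""
    return lambda z, u: Q.T @ f(Q @ z, u)

# Hypothetical 50-state system with a cubic nonlinearity, reduced to 5 states.
n, q = 50, 5
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
b = np.zeros(n)
b[0] = 1.0
f = lambda x, u: A @ x - 0.1 * x**3 + b * u

Q = krylov_basis(A, b, q)
f_red = reduced_rhs(f, Q)
z = np.zeros(Q.shape[1])
for _ in range(100):  # forward Euler on the reduced equations, step 0.01
    z = z + 0.01 * f_red(z, 1.0)
```

Note that this naive form still evaluates f on the full-size vector Q z; the saving comes from integrating q equations instead of n.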

REFERENCES 

[1] J. K. White and A. S. Vincentelli, Relaxation Techniques for the simulation of VLSI Circuits, Boston: Kluwer 

Academic Publishers, 1990. 

[2] D. O. Pederson, “A historical review of circuit simulation,” IEEE Transactions on Circuits and Systems, vol. 

CAS-31, no. 1, Jan. 1984. 

[3] A. S. Vincentelli, “Circuit Simulation,” in Computer Design Aids for VLSI Circuits, P. Antognetti, D. O. 

Pederson and H. De Man (editors). Martinus Nijhoff Publishers, 1986, pp. 19-112. 

[4] J. K. Ousterhout, “CRYSTAL: A timing analyzer for NMOS VLSI Circuits,” in Proc. 3rd Caltech. Conf. on 

VLSI, Mar. 1983, pp. 57-69 

[5] Norman P. Jouppi, “Timing analysis and performance improvement of MOS VLSI design,” IEEE Trans. 

Computer-Aided Design, vol. 6, no 4, pp. 650-665, Jul. 1987. 

[6] S. Lin, M. M. Sadowska, and E. S. Kuh, “SWEC: A step wise equivalent conductance timing simulator for 

CMOS VLSI circuits,” in Proc. Electron. Design Automation Conf., 1991, pp. 142-148. 

[7] A. S. Vincentelli, E. Lelarasmee, and A. Ruehli, “The waveform relaxation method for the time-domain analysis 

of large scale integrated circuits,” IEEE Trans. Computer-Aided Design, vol. 1, no. 3, pp. 131-145, Aug. 1982. 

[8] A. Devgan and R. A. Rohrer, “Adaptively controlled explicit simulation,” IEEE Trans. on Computer-Aided 

Design, vol. 13, no. 6, Jun. 1994. 

[9] E. Chiprout and M. S. Nakhla, “Analysis of Interconnect Networks Using Complex Frequency Hopping (CFH),” 

IEEE Trans. on CAD of Integrated Circuits and Systems, vol. 14 no. 2, pp. 186-200, Feb 1995. 

[10] I. M. Elfadel and D. D. Ling, “A block rational Arnoldi algorithm for multiport passive model-order reduction 

of multiport RLC networks,” Proc. of ICCAD-97, pp. 66-71, Nov. 1997. 

[11] Q. Yu, J. M. L. Wang and E. S. Kuh, “Multipoint multiport algorithm for passive reduced-order model of 

interconnect networks,” Proc. of ISCAS-98, vol. 6, pp. 74-77, Jun. 1998. 

[12] A. Odabasioglu, M. Celik, L. T. Pileggi, “PRIMA: passive reduced-order interconnect macromodeling 

algorithm,” Proc. of ICCAD-97, pp. 58-65, Nov. 1997. 

[13] K. J. Kerns and A. T. Yang, “Preservation of passivity during RLC network reduction via split congruence 

transformations,” IEEE/ACM Proc. DAC, pp. 34-39, Jun. 1997. 

[14] J. W. Demmel, Applied Numerical Linear Algebra, Philadelphia, PA: SIAM Publishers, 1997. 

[15] C. W. Ho, A. E. Ruehli and P. A. Brennan, “The modified nodal approach to network analysis,” IEEE Trans. 

Circuits and Systems, vol. CAS-22, pp. 504-509, Jun. 1975. 

[16] R. Griffith and M. S. Nakhla, “A new high-order absolutely stable explicit numerical integration algorithm for 

the time-domain simulation of nonlinear circuits,” Proc. of ICCAD-97, pp. 276-280, Nov. 1997. 

[17] J. Vlach and K. Singhal, Computer methods for circuit analysis and design, New York, NY: Van Nostrand 

Reinhold, 1983.


ENOR: Model Order Reduction of RLC Circuits Using Nodal Equations for Efficient 

Factorization 

Bernard N. Sheehan 

Mentor Graphics, Wilsonville OR 

Abstract 

ENOR is an innovative way to produce provably-passive, reciprocal, and compact 

representations of RLC circuits. Beginning with the nodal equations, ENOR formulates 

recurrence relations for the moments that involve factorizing a symmetric, positive definite 

matrix; this contrasts with other RLC order reduction algorithms that require expensive LU 

factorization. It handles floating capacitors, inductor loops, and resistor links in a uniform way. It 

distinguishes between active and passive ports, does Gram-Schmidt orthogonalization on the fly, 

and controls error in the time domain. ENOR is a superbly simple, flexible, and well-conditioned 

algorithm for lightning-fast reduction of mega-sized RLC trees, meshes, and coupled interconnects, all 

with excellent accuracy. 
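
The abstract does not give ENOR's recurrences, so the sketch below only illustrates two ingredients it mentions: factorizing a symmetric positive definite matrix once (Cholesky rather than LU) and orthogonalizing moment vectors on the fly with Gram-Schmidt. It assumes a simplified RC-only nodal system (G + sC) v = b u with G SPD, which is a stand-in and not ENOR's RLC formulation. All names are hypothetical.

```python
import numpy as np

def orthonormal_moment_basis(G, C, b, q):
    """Krylov basis for G^{-1} b, (G^{-1} C) G^{-1} b, ... using one Cholesky factor of G."""
    L = np.linalg.cholesky(G)  # G = L L^T, valid because G is assumed SPD

    def solve_G(rhs):
        # Two solves with the Cholesky factor (a dedicated triangular solver
        # would be used in practice; np.linalg.solve keeps the sketch NumPy-only).
        y = np.linalg.solve(L, rhs)
        return np.linalg.solve(L.T, y)

    basis = []
    m = solve_G(b)
    for _ in range(q):
        w = m.copy()
        for v in basis:  # Gram-Schmidt against earlier vectors, on the fly
            w -= (v @ w) * v
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            break
        basis.append(w / norm)
        m = -solve_G(C @ basis[-1])  # next direction in the moment recurrence
    return np.column_stack(basis)

# Hypothetical 20-node RC ladder.
n = 20
G = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
C = np.diag(np.linspace(1.0, 2.0, n))
b = np.zeros(n)
b[0] = 1.0

V = orthonormal_moment_basis(G, C, b, q=4)
G_red = V.T @ G @ V  # congruence transforms keep symmetry and definiteness
C_red = V.T @ C @ V
```

Because the reduced matrices are obtained by congruence with an orthonormal V, they stay symmetric positive definite in this RC setting.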

References 

[1] A. Odabasioglu, M. Celik, L. T. Pileggi, "PRIMA: Passive Reduced-order Interconnect Macromodeling 

Algorithm," 34th DAC, pp. 58-65, 1997 

[2] L. Rohrer and L. Pillage, "Asymptotic Waveform Evaluation for Timing Analysis," IEEE Trans. Computer 

Aided Design, vol. 9, pp. 352-66, 1990. 

[3] P. Feldmann and R. W. Freund, "Reduced-Order Modeling of Large Linear Subcircuits via a Block Lanczos 

Algorithm," 32nd DAC, pp. 474-79, 1995. 

[4] M. Silveira, M. Kamon, I. Elfadel and J. White, "A Coordinate-Transformed Arnoldi Algorithm for Generating 

Guaranteed Stable Reduced-Order Models of RLC Circuits," 33rd DAC, pp. 288-94, 1996. 

[5] K. Kerns, I. Wemple, A. Yang, "Stable and Efficient Reduction of Substrate Model Networks Using Congruence 

Transforms," ICCAD 1995, pp. 207-14. 

[6] R. W. Freund and P. Feldmann, "Reduced-Order Modelling of Large Passive Linear Circuits by Means of the 

SyPVL Algorithm," 33rd DAC, pp. 280-87, 1996. 

[7] K. L. Shepard, V. Narayanan, P. C. Elmendorf, G. Zheng, "Global Harmony: Coupled Noise Analysis for Full- 

Chip RC Interconnect Networks", 34th DAC, pp. 139-46, 1997. 

[8] B. Sheehan, “Projective Convolution: RLC Model-Order Reduction Using the Impulse Response”, DATE 99, 

1999. 

[9] C. Ratzlaff and L. Pillage, "RICE: Rapid Interconnect Circuit Evaluation Using AWE," IEEE Trans. CAD, vol. 

13, pp. 763-76, 1994. 

[10] A. George and J. W-H Liu. Computer Solution of Large Sparse Positive Definite Systems. Prentice-Hall, New 

Jersey, 1981. 

[11] E. A. Guillemin, Synthesis of Passive Networks, John Wiley and Sons, 1957.


Why is ATPG easy? 

Mukul R. Prasad, Philip Chong, Kurt Keutzer 

Department of Electrical Engineering and Computer Sciences 

University of California, Berkeley, CA 94720 

Abstract 

Empirical observation shows that practically encountered instances of ATPG are efficiently 

solvable. However, it has been known for more than two decades that ATPG is an NP-complete 

problem. This work is one of the first attempts to reconcile these seemingly disparate results. We 

introduce the concept of circuit cut-width and characterize the complexity of ATPG in terms of 

this property. We provide theoretical and empirical results to argue that an interestingly large 

class of practical circuits have cut-width characteristics which ensure a provably efficient 

solution of ATPG on them. 
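
The abstract does not define circuit cut-width precisely, so the following sketch uses the standard graph notion as an assumption: for a given linear ordering of the nodes, the cut-width is the largest number of edges crossing any gap between consecutive nodes. The function name and the toy netlist are hypothetical.

```python
def cut_width(edges, order):
    """Maximum number of edges crossing any gap of the linear ordering `order`."""
    position = {node: i for i, node in enumerate(order)}
    width = 0
    for gap in range(len(order) - 1):  # gap sits between positions gap and gap + 1
        crossing = sum(
            1
            for u, v in edges
            if min(position[u], position[v]) <= gap < max(position[u], position[v])
        )
        width = max(width, crossing)
    return width

# A four-node chain a - b - c - d.
chain = [("a", "b"), ("b", "c"), ("c", "d")]
natural = cut_width(chain, ["a", "b", "c", "d"])    # ordering follows the chain
scrambled = cut_width(chain, ["a", "c", "b", "d"])  # ordering interleaves the chain
```

The same graph has width 1 under the natural ordering and width 3 under the scrambled one, so the quantity of interest is the minimum over all orderings, which this helper does not search for.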

References 

[1] C. L. Berman. Circuit Width, Register Allocation and Ordered Binary Decision Diagrams. IEEE Trans. CAD, 

10(8):1059–1066, Aug 1991. 

[2] E. Boros, Y. Crama, and P. L. Hammer. Polynomial-time Inference of All Valid Implications for Horn and 

Related Formulae. Ann. Math Art. Intell., 1:21–32, 1990. 

[3] D. Brand. Verification of Large Synthesized Designs. In IEEE ICCAD, pages 534–537, 1993. 

[4] R. Brayton, R. Rudell, A. Sangiovanni-Vincentelli, and A. Wang. MIS: A Multiple-Level Logic Optimization 

System. IEEE Trans. on CAD/ICAS, CAD-6(6):1062–1082, Nov 1987. 

[5] F. Brglez and H. Fujiwara. A Neutral Netlist of 10 Combinational Benchmark Circuits and a Target Translator in 

Fortran. In Intl. Symp. on Circuits and Systems, Jun 1985. 

[6] K.-T. Cheng and L. A. Entrena. Multi-level Logic Optimization by Redundancy Addition and Removal. In 

European Conference on Design Automation, pages 373–377, Jun 1993. 

[7] P. Chong, M. R. Prasad, and K. Keutzer. Why is ATPG Easy? Technical Report UCB/ERL M99/9, ERL, 

University of California, Berkeley, Feb 1999. 

[8] O. Coudert. Exact Covering of Real-Life Graphs is Easy. In Proceedings of the DAC, pages 121–126, Jun 1997. 

[9] S. Devadas, H.-K. T. Ma, and A. Sangiovanni-Vincentelli. Logic Verification, Testing and Their Relationship to 

Logic Synthesis. In Testing and Diagnosis of VLSI and ULSI, pages 181–246. Kluwer Academic Publishers, 1988. 

[10] H. Fujiwara. Computational Complexity of Controllability/Observability Problems for Combinational Circuits. 

In Intl. Symp. on Fault-Tolerant Computing, pages 64–69, Jun 1988. 

[11] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. 

H. Freeman and Company, 1979. 

[12] J. Gu, P. W. Purdom, J. Franco, and B. W. Wah. Algorithms for the Satisfiability (SAT) Problem: A Survey. 

DIMACS Series in Discrete Mathematics and Computer Science, 35:19–151, 1997. 

[13] D. S. Hochbaum, editor. Approximation Algorithms for NP-Hard Problems. PWS Publishing Company, 1997. 

[14] M. Hutton, J. Grossman, J. Rose, and D. Corneil. Characterization and Parameterized Random Generation of 

Digital Circuits. In 33rd Design Automation Conference, pages 94–99, 1996. 

[15] O. H. Ibarra and S. K. Sahni. Polynomially Complete Fault Detection Problems. IEEE Trans. Computers, C- 

24(3):242–249, Mar 1975. 

[16] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar. Multilevel Hypergraph Partitioning: Application in VLSI 

Domain. In 34th Design Automation Conference, pages 526–529, 1997. 

[17] A. Kuehlmann, A. Srinivasan, and D. P. LaPotin. Verity - A Formal Verification Program for Custom CMOS 

Circuits. IBM Journal of Research and Development, 39:149–165, 1995. 

[18] T. Larrabee. Efficient Generation of Test Patterns Using Boolean Difference. In Intl. Test Conference, pages 

795–801, 1989. 

[19] K. L. McMillan. Symbolic model checking: An approach to the state explosion problem. PhD thesis, School of 

Computer Science, Carnegie Mellon University, 1992. 

[20] S. L. Meyer. Data Analysis For Scientists and Engineers. Wiley and Sons, 1975.

[21] P. W. Purdom and C. A. Brown. Polynomial-Average-Time Satisfiability Problems. Information Sciences, 

41:23–42, 1987. 

[22] E. M. Sentovich et al. SIS: A System for Sequential Circuit Synthesis. Technical Report UCB/ERL M92/41, 

ERL, College of Engineering, University of California, Berkeley, May 1998. 

[23] J. P. M. Silva and K. A. Sakallah. GRASP - A New Search Algorithm for Satisfiability. In ICCAD, pages 220– 

227, 1996. 

[24] P. Stephan, R. K. Brayton, and A. L. Sangiovanni-Vincentelli. Combinational Test Generation Using 

Satisfiability. IEEE Trans. on CAD/ICAS, 15(9):1167–1176, Sep 1996. 

[25] T. W. Williams and K. Parker. Testing Logic Networks and Designing for Testability. Computer, pages 9–21, 

Oct 1979. 

[26] S. Yang. Logic Synthesis and Optimization Benchmarks User Guide, Version 3.0. Technical report, 

Microelectronics Center of North Carolina, 1991.


Using Lower Bounds during Dynamic BDD Minimization 

Rolf Drechsler, Wolfgang Günther 

Institute of Computer Science, Albert-Ludwigs-University, 

79110 Freiburg im Breisgau, Germany 

Abstract 

Ordered Binary Decision Diagrams (BDDs) are a data structure for representation and 

manipulation of Boolean functions often applied in VLSI CAD. The choice of the variable 

ordering largely influences the size of the BDD; its size may vary from linear to exponential. The 

most successful methods for finding good orderings are based on dynamic variable reordering, 

i.e. exchanging of neighboring variables. This basic operation has been used in various variants, 

like sifting and window permutation. 

In this paper we show that lower bounds computed during the minimization process can speed up 

the computation significantly. First, lower bounds are studied from a theoretical point of view. 

Then these techniques are incorporated in dynamic minimization algorithms. By the computation 

of good lower bounds large parts of the search space can be pruned resulting in very fast 

computations. Experimental results are given to demonstrate the efficiency of our approach. 
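
To make the dependence of BDD size on variable ordering concrete, the sketch below counts the internal nodes of a reduced ordered BDD by brute force: at each level it collects the distinct subfunctions that actually depend on that level's variable. It is only an illustration for tiny functions and does not implement sifting or the lower bounds studied in the paper. The function name and the example function are assumptions.

```python
from itertools import product

def bdd_size(f, order):
    """Number of internal ROBDD nodes of f under `order` (brute force, small n only)."""
    n = len(order)
    total = 0
    for level in range(n):
        nodes = set()
        for prefix in product((0, 1), repeat=level):
            # Truth table of the cofactor over the remaining variables order[level:].
            table = []
            for rest in product((0, 1), repeat=n - level):
                bits = [0] * n
                for var, val in zip(order, prefix + rest):
                    bits[var] = val
                table.append(int(f(bits)))
            table = tuple(table)
            half = len(table) // 2
            # First half has order[level] = 0, second half has it = 1.
            if table[:half] != table[half:]:
                nodes.add(table)
        total += len(nodes)
    return total

# f = x0 x1 + x2 x3 + x4 x5
f = lambda b: (b[0] & b[1]) | (b[2] & b[3]) | (b[4] & b[5])

good = bdd_size(f, [0, 1, 2, 3, 4, 5])  # paired variables adjacent
bad = bdd_size(f, [0, 2, 4, 1, 3, 5])   # paired variables separated
```

For this function the paired ordering needs 6 internal nodes and the separated ordering needs 14; with more variable pairs the gap grows from linear to exponential.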

References 

[1] B. Bollig, M. Löbbing, and I. Wegener. On the effect of local changes in the variable ordering of ordered 

decision diagrams. Information Processing Letters, 59:233-239, 1996. 

[2] B. Bollig and I. Wegener. Improving the variable ordering of OBDDs is NP-complete. IEEE Trans. on Comp., 

45(9):993-1002, 1996. 

[3] R.E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Trans. on Comp., 35(8):677- 

691, 1986. 

[4] R.E. Bryant. On the complexity of VLSI implementations and graph representations of Boolean functions with 

application to integer multiplication. IEEE Trans. on Comp., 40:205-213, 1991. 

[5] R. Drechsler and B. Becker. Binary Decision Diagrams - Theory and Implementation. Kluwer Academic 

Publishers, 1998. 

[6] R. Drechsler, N. Drechsler, and W. Günther. Fast exact minimization of BDDs. In Design Automation Conf., 

pages 200-205, 1998. 

[7] S.J. Friedman and K.J. Supowit. Finding the optimal variable ordering for binary decision diagrams. In Design 

Automation Conf., pages 348-356, 1987. 

[8] H. Fujii, G. Ootomo, and C. Hori. Interleaving based variable ordering methods for ordered binary decision 

diagrams. In Int'l Conf. on CAD, pages 38-41, 1993. 

[9] M. Fujita, Y. Matsunaga, and T. Kakuda. On variable ordering of binary decision diagrams for the application of 

multi-level synthesis. In European Conf. on Design Automation, pages 50-54, 1991. 

[10] N. Ishiura, H. Sawada, and S. Yajima. Minimization of binary decision diagrams based on exchange of 

variables. In Int'l Conf. on CAD, pages 472-475, 1991. 

[11] S.-W. Jeong, T.-S. Kim, and F. Somenzi. An Efficient method for optimal BDD ordering computation. In 

International Conference on VLSI and CAD, 1993. 

[12] C. Meinel and A. Slobodová. Speeding up variable reordering of OBDD. In Int'l Conf. on Comp. Design, pages 

338-343, 1997. 

[13] S. Panda and F. Somenzi. Who are the variables in your neighborhood. In Int'l Conf. on CAD, pages 74-77, 

1995. 

[14] R. Rudell. Dynamic variable ordering for ordered binary decision diagrams. In Int'l Conf. on CAD, pages 42- 

47, 1993. 

[15] A. Slobodová and C. Meinel. Sample method for minimization of OBDD. In Int'l Workshop on Logic Synth., 

pages 311-316, 1998.

[16] F. Somenzi. CUDD: CU Decision Diagram Package Release 2.2.0. University of Colorado at Boulder, 1998.


Optimization-Intensive Watermarking Techniques for Decision Problems 

Gang Qu, Jennifer L. Wong, and Miodrag Potkonjak 

Computer Science Department, University of California, Los Angeles, CA 90095 

Abstract 

Recently, a number of watermarking-based intellectual property protection techniques have been 

proposed. Although they have been applied to different stages in the design process and have a 

great variety of technical and theoretical features, all of them share two common properties: they 

all have been applied solely to optimization problems and do not involve any optimization during 

the watermarking process. 

In this paper, we propose the first set of optimization-intensive watermarking techniques for 

decision problems. In particular, we demonstrate how one can select a subset of superimposed 

watermarking constraints so that the uniqueness of the signature and the likelihood of satisfying 

an instance of the satisfiability problem are simultaneously maximized. We have developed three 

watermarking SAT techniques: adding clauses, deleting literals, and push-out and pull-back. Each 

technique targets different types of signature-induced constraint superimposition on an instance 

of the SAT problem. In addition to comprehensive experimental validation, we theoretically 

analyze the potentials and limitations of the proposed watermarking techniques. Furthermore, we 

analyze the three proposed optimization-intensive watermarking SAT techniques in terms of 

their suitability for copy detection. 
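
The "adding clauses" technique can be pictured with the sketch below. The abstract does not describe how the authors encode a signature or how they select which constraints to keep, so everything here is an assumption: extra clauses are derived deterministically from a SHA-256 hash of a signature string and appended to a CNF instance, with no optimization step. Function and variable names are hypothetical.

```python
import hashlib

def signature_clauses(signature, num_vars, num_clauses, width=3):
    """Derive `num_clauses` clauses of `width` literals from a signature (hypothetical encoding)."""
    if num_vars < width:
        raise ValueError("need at least `width` variables")
    clauses = []
    counter = 0
    while len(clauses) < num_clauses:
        digest = hashlib.sha256(f"{signature}:{counter}".encode()).digest()
        counter += 1
        variables = {digest[i] % num_vars + 1 for i in range(width)}
        if len(variables) < width:
            continue  # skip draws that repeat a variable
        # One further digest byte per literal decides its polarity.
        clause = tuple(
            v if digest[width + i] % 2 else -v
            for i, v in enumerate(sorted(variables))
        )
        clauses.append(clause)
    return clauses

# DIMACS-style clauses: positive integer = variable, negative = its negation.
original = [(1, -2, 3), (-1, 4, 5), (2, -4, -5)]
extra = signature_clauses("owner-1999", num_vars=20, num_clauses=5)
watermarked = original + extra
```

Every solution of the watermarked instance also satisfies the original one; whether the watermarked instance remains satisfiable is exactly what the selection step described in the abstract has to protect, and this sketch does not check it.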

References 

[1] P. Cheeseman, B. Kanefsky, and W.M. Taylor. Where the Really Hard Problems Are. Twelfth International 

Joint Conference on Artificial Intelligence, pp. 331- 337, 1991. 

[2] J. Franco, and Y.C. Ho. Probabilistic Performance of A Heuristic for the Satisfiability Problem. Discrete 

Applied Mathematics, Vol. 22, pp. 35-51, 1988. 

[3] A.B. Kahng, J. Lach, W.H. Mangione-Smith, S. Mantik, I.L. Markov, M. Potkonjak, P. Tucker, H. Wang and G. 

Wolfe. Watermarking Techniques for Intellectual Property Protection. 35th Design Automation Conference 

Proceedings, pp. 776-781, 1998. 

[4] G. Qu, and M. Potkonjak. Analysis of Watermarking Techniques for Graph Coloring Problem. IEEE/ACM 

International Conference on Computer Aided Design Proceedings, 1998. 

[5] B. Selman, H. Kautz, and D. McAllester. Ten Challenges in Propositional Reasoning and Search. Proceedings of 

the 15th International Joint Conference on Artificial Intelligence (IJCAI-97) pp. 50-54, 1997. 

[6] J.P.M. Silva, and K.A. Sakallah. GRASP— A New Search Algorithm for Satisfiability. Proceedings of ICCAD- 

96, pp. 220-227, 1996. 

[7] http://dimacs.rutgers.edu/ 

[8] http://aida.intellektik.informatik.th-darmstadt.de/ hoos/SATLIB/


Efficient Algorithms for Optimum Cycle Mean and Optimum Cost to Time Ratio Problems 

Ali Dasdan*, Sandy S. Irani and Rajesh K. Gupta 

*Dept. of Computer Science, University of Illinois, Urbana, IL 61801 

Dept. of Information and Computer Science, University of California, Irvine, CA 92697 

Abstract 

The goal of this paper is to identify the most efficient algorithms for the optimum mean cycle 

and optimum cost to time ratio problems and compare them with the popular ones in the CAD 

community. These problems have numerous important applications in CAD, graph theory, 

discrete event system theory, and manufacturing systems. In particular, they are fundamental to 

the performance analysis of digital systems such as synchronous, asynchronous, data flow, and 

embedded real-time systems. For instance, algorithms for these problems are used to compute 

the cycle period of any cyclic digital system. Without loss of generality, we discuss these 

algorithms in the context of the minimum mean cycle problem (MCMP). We performed a 

comprehensive experimental study of ten leading algorithms for MCMP. We programmed these 

algorithms uniformly and efficiently. We systematically compared them on a test suite composed 

of random graphs as well as benchmark circuits. Above all, our results provide important insight 

into the performance of these algorithms in practice. One of the most surprising results of this 

paper is that Howard's algorithm, known primarily in the stochastic control community, is by far 

the fastest algorithm on our test suite although the only known bound on its running time is 

exponential. We provide two stronger bounds on its running time. 
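
For readers unfamiliar with the minimum mean cycle problem, the sketch below implements Karp's classical dynamic-programming algorithm (reference [15]), not Howard's algorithm that the abstract singles out as fastest. It assumes nodes are numbered 0..n-1 and edges are (u, v, weight) triples; starting every node at distance 0 plays the role of a virtual source connected to all nodes. The function name is hypothetical.

```python
def min_mean_cycle(n, edges):
    """Karp's algorithm: minimum mean weight over all directed cycles, or inf if acyclic."""
    INF = float("inf")
    # D[k][v] = minimum weight of a walk with exactly k edges ending at v.
    D = [[INF] * n for _ in range(n + 1)]
    D[0] = [0.0] * n
    for k in range(1, n + 1):
        for u, v, w in edges:
            if D[k - 1][u] + w < D[k][v]:
                D[k][v] = D[k - 1][u] + w
    best = INF
    for v in range(n):
        if D[n][v] == INF:
            continue
        worst = max(
            (D[n][v] - D[k][v]) / (n - k) for k in range(n) if D[k][v] < INF
        )
        best = min(best, worst)
    return best

# Cycle 0->1->2->0 has mean (1+2+3)/3 = 2; cycle 0->1->0 has mean (1+5)/2 = 3.
graph = [(0, 1, 1.0), (1, 2, 2.0), (2, 0, 3.0), (1, 0, 5.0)]
result = min_mean_cycle(3, graph)
acyclic = min_mean_cycle(2, [(0, 1, 1.0)])
```

Karp's algorithm always runs in time proportional to the number of nodes times the number of edges, which is one reason experimental comparisons against algorithms with weaker bounds, such as the study described above, are informative.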

References 

[1] Ahuja, R. K., Kodialam, M., Mishra, A. K., and Orlin, J. B. Computational investigation of maximum flow 

algorithms. European J. of Operational Research, 97 (1997), 509-542. 

[2] Ahuja, R. K., Magnanti, T. L., and Orlin, J. B. Network Flows. Prentice Hall, Upper Saddle River, NJ, USA, 

1993. 

[3] Baccelli, F., Cohen, G., Olsder, G. J., and Quadrat, J.-P. Synchronization and Linearity. John Wiley & Sons, New 

York, NY, USA, 1992. 

[4] Burns, S. M. Performance analysis and optimization of asynchronous circuits. PhD thesis, California Institute of 

Technology, 1991. 

[5] Cherkassky, B. V., Goldberg, A. V., and Radzik, T. Shortest path algorithms: Theory and experimental 

evaluation. In Proc. 5th ACM-SIAM Symp. on Discrete Algorithms (1994), pp. 516-525. 

[6] Cochet-Terrasson, J., Cohen, G., Gaubert, S., McGettrick, M., and Quadrat, J.-P. Numerical computation of 

spectral elements in max-plus algebra. In Proc. IFAC Conf. on Syst. Structure and Control (1998). 

[7] Cuninghame-Green, R. A., and Yixun, L. Maximum cycle-means of weighted digraphs. Applied Math.-JCU 11 

(1996), 225-34. 

[8] Dasdan, A., and Gupta, R. K. Faster maximum and minimum mean cycle algorithms for system performance 

analysis. IEEE Trans. Computer-Aided Design 17, 10 (Oct. 1998). 

[9] Dasdan, A., Irani, S., and Gupta, R. K. An experimental study of minimum mean cycle algorithms. Tech. rep. 

#98-32, Univ. of California, Irvine, July 1998. 

[10] Gerez, S. H., de Groot, S. M. H., and Herrmann, O. E. A polynomial-time algorithm for the computation of the 

iteration-period bound in recursive data-flow graphs. IEEE Trans. on Circuits and Syst.-1 39, 1 (Jan. 1992), 49-52. 

[11] Gondran, M., and Minoux, M. Graphs and Algorithms. John Wiley and Sons, New York, NY, USA, 1984. 

[12] Hartmann, M., and Orlin, J. B. Finding minimum cost to time ratio cycles with small integral transit times. 

Networks 23 (1993), 567-74. 

[13] Hulgaard, H., Burns, S. M., Amon, T., and Borriello, G. An algorithm for exact bounds on the time separation 

of events in concurrent systems. IEEE Trans. Comput. 44, 11 (Nov. 1995), 1306-17. 

[14] Ito, K., and Parhi, K. K. Determining the minimum iteration period of an algorithm. J. VLSI Signal Processing 

11, 3 (Dec. 1995), 229-44.

[15] Karp, R. M. A characterization of the minimum cycle mean in a digraph. Discrete Mathematics 23 (1978), 309- 

11. 

[16] Karp, R. M., and Orlin, J. B. Parametric shortest path algorithms with an application to cyclic staffing. Discrete 

Applied Mathematics 3 (1981), 37-45. 

[17] Lawler, E. L. Combinatorial Optimization: Networks and Matroids. Holt, Rinehart, and Winston, New York, 

NY, USA, 1976. 

[18] Mathur, A., Dasdan, A., and Gupta, R. K. Rate analysis of embedded systems. ACM Trans. on Design 

Automation of Electronic Systems 3, 3 (July 1998). 

[19] Megiddo, N. Combinatorial optimization with rational objective functions. Mathematics of Operations 

Research 4, 4 (Nov. 1979), 414-424. 

[20] Mehlhorn, K., and Naher, S. LEDA: A platform for combinatorial and geometric computing. Comm. of the 

ACM 38, 1 (1995), 96-102. 

[21] Orlin, J. B., and Ahuja, R. K. New scaling algorithms for the assignment and minimum mean cycle problems. 

Mathematical Programming 54 (1992), 41-56. 

[22] Szymanski, T. G. Computing optimal clock schedules. In Proc. 29th Design Automation Conf. (1992), 

ACM/IEEE, pp. 399-404. 

[23] Teich, J., Sriram, S., Thiele, L., and Martin, M. Performance analysis and optimization of mixed asynchronous 

synchronous systems. IEEE Trans. Computer-Aided Design 16, 5 (May 1997), 473-84. 

[24] Yang, S. Logic synthesis and optimization benchmarks user guide version 3.0. Tech. rep., Microelectronics 

Center of North Carolina, Jan. 1991. 

[25] Young, N. E., Tarjan, R. E., and Orlin, J. B. Faster parametric shortest path and minimum-balance algorithms. 

Networks 21 (1991), 205-21.

DAC'99, page 43 

IP-Based Design Methodology 

Daniel D. Gajski 

University of California, Irvine, California 92697 

Silicon capacity is doubling every 18 months and allowing more complex systems to be built on 

a single chip of silicon. However, our capability in designing such complex systems in 

reasonable time is diminishing with complexity. This gap between capacity and productivity 

seems to be growing and threatening to slow the growth of the semiconductor industry. The 

solutions to the productivity problem lie in raising the abstraction levels in design technology 

and tools and introducing reuse of components and system parts. The reuse, on the other hand, 

has generated a whole new branch of the semiconductor industry, called the IP business, which includes 

IP providers, IP brokers, IP tool makers, IP services, and IP integrators. However, it is 

unreasonable to believe that IP community can integrate all possible designs, in all possible 

technologies, for all possible systems using all possible tools and languages developed so far. As 

we know, the solutions that integrate all possible ideas rarely work. That is the reason why IP 

community is in search for an efficient IP-based methodology. 

There are basically two approaches to the above problem: bottom-up and top-down. The first

school of thought believes that present CAD tools and methods are sufficient and that the IP

community just has to define some standards for information exchange and train designers to

follow the guidelines. The other school of thought believes that the present methodology for

designing systems on silicon must be changed to accommodate the IP business. The change must

include, they believe, the way we specify different models of computation, the way we model

systems for IP insertion and replacement, and the development of new CAD tools for synthesis and

test of systems with IPs.

In this presentation we will cover both schools of thought and explain the advantages and

disadvantages of each approach. We will start with the requirements for IP-based design, explain

the obstacles to success for the IP community, and identify the problems to be solved to remove those

obstacles. We will briefly review the present status and conclude with some predictions for the

future.


IPCHINOOK: An Integrated IP-based Design Framework for Distributed Embedded 

Systems 

Pai Chou*, Ross Ortega, Ken Hines, Kurt Partridge, and Gaetano Borriello 

*Consystant Design Technologies, Inc., Seattle, WA 

Department of Computer Science and Engineering, 

University of Washington, Seattle, WA 98195-2350 USA 

Abstract 

IPCHINOOK is a design tool for distributed embedded systems. It gains leverage from the use of 

a carefully chosen set of design abstractions that raise the level of designer interaction during the 

specification, synthesis, and simulation of the design. IPCHINOOK focuses on a component-based

approach to system building that enhances the ability to reuse existing software modules.

This is accomplished through a new model for constructing components that enables 

composition of control-flow as well as data-flow. The designer then maps the elements of the 

specification to a target architecture: a set of processing elements and communication channels. 

IPCHINOOK synthesizes all of the detailed communication and synchronization instructions. 

Designers get feedback via a co-simulation engine that permits rapid evaluation. By shortening 

the design cycle, designers are able to more completely explore the design space of possible 

architectures and/or improve time-to-market. IPCHINOOK is embodied in a system 

development environment that supports the design methodology by integrating a user interface 

for system specification, simulation, and synthesis tools. By raising the level of abstraction of 

specifications above the low-level target-specific implementation, and by automating the 

generation of these difficult and error-prone details, IPCHINOOK lets designers focus on global 

architectural and functionality decisions. 

References 

[1] BALBONI, A., FORNACIARI, W., AND SCIUTO, D. Cosynthesis and co-simulation of control-dominated 

embedded systems. Design Automation for Embedded Systems (July 1996).

[2] BERRY, G. Programming a digital watch in Esterel v3.2. Tech. Rep. 1032, Institut National de Recherche en

Informatique et Automatique (INRIA), May 1989. 

[3] BERRY, G., RAMESH, S., AND SHYAMASUNDAR, R. K. Communicating reactive processes. In Conference 

Record of the Twentieth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages 

(January 1993), pp. 85–98. 

[4] BOLSENS, I., DEMAN, H. J., LIN, B., ROMPAEY, K. V., VERCAUTEREN, S., AND VERKEST, D. 

Hardware/software co-design of digital telecommunication systems. Proceedings of the IEEE 85, 3 (March 1997), 

391–418. 

[5] CHIODO, M., ENGELS, D., GIUSTO, P., HSIEH, H., JURECSKA, A., LAVAGNO, L., SUZUKI, K., AND

SANGIOVANNI-VINCENTELLI, A. A case study in computer-aided co-design of embedded controllers. Design

Automation for Embedded Systems 1, 1-2 (January 1996), 51–67. 

[6] CHOU, P. Control Composition and Synthesis of Distributed Real-Time Embedded Systems. PhD thesis, 

University of Washington, 1998. 

[7] CHOU, P., AND BORRIELLO, G. An analysis-based approach to composition of distributed embedded 

systems. In Proc. International Workshop on Hardware/Software Codesign (CODES/CASHE) (1998).

[8] CHOU, P., AND BORRIELLO, G. Modal processes: Towards enhanced retargetability through control 

composition of distributed embedded systems. In Proc. Design Automation Conference (June 1998), pp. 88–93. 

[9] CHOU, P., HINES, K., PARTRIDGE, K., AND BORRIELLO, G. Control generation for embedded systems 

based on composition of modal processes. In Proc. International Conference on Computer-Aided Design (1998). 

[10] CHOU, P., ORTEGA, R., AND BORRIELLO, G. Synthesis of the hardware/software interface in 

microcontroller-based systems. In Proc. International Conference on Computer-Aided Design (1992), pp. 488–495.

[11] CHOU, P., ORTEGA, R., AND BORRIELLO, G. Interface co-synthesis techniques for embedded systems. In 

Proc. International Conference on Computer-Aided Design (1995), pp. 280–287. 

[12] DAVEAU, J.-M., MARCHIORO, G. F., BEN-ISMAIL, T., AND JERRAYA, A. A. Protocol selection and 

interface generation for hw-sw codesign. IEEE Transactions on VLSI Systems 5, 1 (March 1997), 136–144. 

[13] ERNST, R., HENKEL, J., BENNER, T., YE, W., HOLTMANN, U., HERRMANN, D., AND TRAWNY, M. 

The COSYMA environment for hardware/software cosynthesis of small embedded systems. Microprocessors and 

Microsystems 20, 3 (May 1996), 159–166. 

[14] HAREL, D. StateCharts: a visual formalism for complex systems. Science of Programming 8, 3 (June 1987), 

231–274. 

[15] HINES, K., AND BORRIELLO, G. Dynamic communication models in embedded system co-simulation. In 

Proc. Design Automation Conference (June 1997), pp. 395–400. 

[16] HINES, K., AND BORRIELLO, G. Optimizing communication in hardware-software co-simulation. In 

CODES/CASHE '97 (1997), IEEE, ACM.

[17] HINES, K., AND BORRIELLO, G. Debugging distributed implementations of modal process systems. Lecture 

Notes in Computer Science 1474 (1998), 98–107. 

[18] HINES, K., AND BORRIELLO, G. A geographically distributed framework for embedded system design and 

validation. In Proc. Design Automation Conference (June 1998), pp. 140–145. 

[19] ISMAIL, T. B., AND JERRAYA, A. A. Synthesis steps and design models for codesign. IEEE Computer 28, 2 

(February 1995), 44–53. 

[20] ISO 11898. Road vehicles - Interchange of Digital Information - Controller Area Network (CAN) for

High-Speed Communication, 1st ed., 1993.

[21] ORTEGA, R., AND BORRIELLO, G. Communication synthesis for distributed embedded systems. In Proc. 

International Conference on Computer-Aided Design (1998). 

[22] PASSERONE, C., LAVAGNO, L., CHIODO, M., AND SANGIOVANNI-VINCENTELLI, A. Fast 

hardware/software co-simulation for virtual prototyping and trade-off analysis. In Proc. Design Automation 

Conference (1997), pp. 389–394. 

[23] ROWSON, J. Hardware/software co-simulation. In Proceedings of the Design Automation Conference (1994), 

pp. 439–440. 

[24] ROWSON, J. A., AND SANGIOVANNI-VINCENTELLI, A. Interface-based design. In Proceedings of the 

Design Automation Conference (June 1997), pp. 178–83. 

[25] SELIC, B., GULLEKSON, G., AND WARD, P. T. Real-Time Object-Oriented Modeling. Wiley, 1994. 

[26] VALDERRAMA, C. A., NACABAL, F., PAULIN, P., AND JERRAYA, A. A. Automatic generation of 

interfaces for distributed C-VHDL cosimulation of embedded systems: an industrial experience. In 7th International 

Workshop on Rapid Systems Prototyping (June 1996). 

[27] WubbleU hand held PDA benchmark for co-design, http://www.it.dtu.dk/jan/WubbleU.

DAC'99, pages 50-55

Virtual Simulation of distributed IP-based designs 

Marcello Dalpasso, Alessandro Bogliolo*, Luca Benini* 

DEI - Università di Padova, Via Gradenigo, 6/A - 35131 Padova, Italy 

*DEIS - Università di Bologna, Viale Risorgimento, 2 - 40136 Bologna, Italy 

Abstract 

One key issue in design flows based on reuse of third-party intellectual property (IP) components 

is the need to estimate the impact of component instantiation within complex designs. In this 

paper we introduce JavaCAD, an internet-based EDA tool built on a secure client-server 

architecture that enables designers to perform simulation and cost estimation of circuits 

containing IP components without actually purchasing them. At the same time, the tool ensures 

intellectual property protection for the vendors of IP components, and for the IP-users as well. 

Moreover, JavaCAD supports negotiation of the amount of information and the accuracy of cost 

estimates, thereby providing seamless transition between IP evaluation and purchase. 

References 

[1] A. Bedenfeld and R. Camposano. Tool integration and construction using generated graph-based design 

representation. Proc. of the Design Automation Conference, pages 94-99, 1995. 

[2] A. Bogliolo, L. Benini, G. De Micheli and B. Riccò. Power and Current Estimation of Cell-Based CMOS

Circuits. IEEE Transactions on VLSI Systems, pages 473-488, 1997. 

[3] D. Lidsky and J. Rabaey. Early power exploration - a World Wide Web application. Proc. of the Design 

Automation Conference, pages 27-32, 1996. 

[4] F. Chan, M. Spiller and R. Newton. WELD - An environment for Web-based electronic design. Proc. of the 

Design Automation Conference, pages 146-151, 1998. 

[5] H. Lavana, A. Khetawat, F. Brglez and K. Kozminski. Executable workflows: a paradigm for collaborative design

on the Internet. Proc. of the Design Automation Conference, pages 553-558, 1997. 

[6] J. Gosling, B. Joy and G. Steele. The Java Language Specification. Addison-Wesley, 1996. 

[7] J. Young et al. Design and specification of embedded systems in Java using successive, formal refinement. Proc. 

of the Design Automation Conference, pages 70-75, 1998. 

[8] L. Benini, A. Bogliolo and G. De Micheli. Distributed EDA tool integration: the PPP paradigm. Proc. of the 

International Conference on Computer Design, pages 448-453, 1996. 

[9] L. Geppert. IC Design on the World Wide Web. IEEE Spectrum, June 1998. 

[10] L. Gong. The Java Security Model and Architecture. Addison-Wesley, announced. 

[11] M. J. Silva and R. H. Katz. The case for design using the World Wide Web. Proc. of the Design Automation 

Conference, pages 579-585, 1995. 

[12] M. Spiller and R. Newton. EDA and the Network. Proc. of the International Conference on Computer-Aided 

Design, pages 470-475, 1997. 

[13] P. Chan. The Java Developers Almanac. Addison-Wesley, 1998. 

[14] P. G. Ploger et al. WWW Based structuring of codesigns. Proc. of the International Symposium on System 

Synthesis, pages 138-143, 1995. 

[15] R. Helaihel and K. Olukotun. Java as a specification language for hardware-software systems. Proc. of the 

International Conference on Computer-Aided Design, pages 690-697, 1997. 

[16] S. Hauck and S. Knoll. Data security for Web-based CAD. Proc. of the Design Automation Conference, pages 

788-793, 1998. 

[17] T. J. Barnes et al. Electronic CAD frameworks. Kluwer Academic Publishers, 1992.


Common-Case Computation: A High-Level Technique for Power and Performance 

Optimization 

Ganesh Lakshminarayana†, Anand Raghunathan†, Kamal S. Khouri‡, Niraj K. Jha‡,

and Sujit Dey§ 

† CCRL-NEC USA, 

‡ Dept. of Electrical Engg., Princeton University 

§ Dept. of Electrical Engg., Univ. of California, San Diego 

Abstract 

This paper presents a design methodology, called common-case computation (CCC), and new 

design automation algorithms for optimizing power consumption or performance. The proposed 

techniques are applicable in conjunction with any high-level design methodology where a 

structural register-transfer level (RTL) description and its corresponding scheduled behavioral 

(cycle-accurate functional RTL) description are available. It is a well-known fact that in 

behavioral descriptions of hardware (also in software), a small set of computations (CCCs) often 

accounts for most of the computational complexity. However, in hardware implementations 

(structural RTL or lower level), CCCs and the remaining computations are typically treated 

alike. This paper shows that identifying and exploiting CCCs during the design process can lead 

to implementations that are much more efficient in terms of power consumption or performance. 

We propose a CCC-based high-level design methodology with the following steps: extraction of 

common-case behaviors and execution conditions from the scheduled description, simplification 

of the common-case behaviors in a stand-alone manner, synthesis of common-case detection and 

execution circuits from the common-case behaviors, and composing the original design with the 

common-case circuits, resulting in a CCC-optimized design. We demonstrate that CCC-optimized

designs reduce power consumption by up to 91.5%, or improve performance by up to

76.6% compared to designs derived without special regard for CCCs. 
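
The four steps listed in the abstract can be loosely illustrated in software. The sketch below is only an analogy, not the paper's hardware synthesis flow: the function names, the 8-bit clamp, and the choice of `gain == 1 and offset == 0` as the common case are all hypothetical, chosen to show extraction of an execution condition, stand-alone simplification, and composition with the original behavior.

```python
def original_behavior(sample, gain, offset):
    """General computation: scale, shift, then clamp to the range 0..255."""
    value = sample * gain + offset
    return max(0, min(255, value))


def is_common_case(gain, offset):
    """Execution condition of the (assumed) common case: identity scaling."""
    return gain == 1 and offset == 0


def common_case_behavior(sample):
    """Original behavior simplified under gain == 1 and offset == 0.

    The multiply and the add disappear; only the clamp remains.
    """
    return max(0, min(255, sample))


def ccc_optimized(sample, gain, offset):
    """Compose the detection logic, the fast path, and the original design."""
    if is_common_case(gain, offset):
        return common_case_behavior(sample)
    return original_behavior(sample, gain, offset)
```

The composed function must agree with the original on every input. Any benefit comes only from how often the detection condition holds and how much cheaper the simplified path is.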

References 

[1] D. D. Gajski, N. D. Dutt, A. C.-H. Wu, and S. Y.-L. Lin, High-level Synthesis: Introduction to Chip and System 

Design, Kluwer Academic Publishers, Norwell, MA, 1992. 

[2] G. De Micheli, Synthesis and Optimization of Digital Circuits, McGraw-Hill, New York, NY, 1994. 

[3] D. A. Patterson and J. L. Hennessy, Computer Architecture: A Quantitative Approach, Morgan Kaufmann

Publishers, San Mateo, CA, 1989. 

[4] J. A. Fisher, “Trace scheduling: A technique for global microcode compaction,” IEEE Trans. Computers,

vol. C-30, pp. 478–490, July 1981.

[5] M. Alidina, J. Monteiro, S. Devadas, A. Ghosh, and M. Papaefthymiou, “Precomputation-based sequential logic

optimization for low power,” IEEE Trans. VLSI Systems, vol. 2, pp. 426–436, Dec. 1994. 

[6] L. Benini, E. Macii, M. Poncino, and G. De Micheli, “Telescopic units: A new paradigm for performance 

optimization of VLSI designs,” IEEE Trans. Computer-Aided Design, vol. 17, pp. 220–232, Mar. 1998. 

[7] S. K. Bommu, N. O’Neill, and M. Ciesielski, “Retiming based factorization for sequential logic optimization,” 

ACM Trans. Design Automation Electronic Systems, to appear, 1998. 

[8] A. Raghunathan, N. K. Jha, and S. Dey, High-level Power Analysis and Optimization, Kluwer Academic 

Publishers, Norwell, MA, 1998. 

[9] H. Trickey, “Flamel: A high-level hardware compiler,” IEEE Trans. Computer-Aided Design, vol. 6, pp. 259– 

269, Mar. 1987. 

[10] A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. Brodersen, “Optimizing power using 

transformations,” IEEE Trans. Computer-Aided Design, vol. 14, pp. 12–31, Jan. 1995. 

[11] G. Casella and R. L. Berger, Statistical Inference, Duxbury Press, Belmont, CA, 1990.

[12] L. Benini and G. De Micheli, Dynamic Power Management: Design Techniques and CAD Tools, Kluwer 

Academic Publishers, Norwell, MA, 1997. 

[13] A. Chatterjee and R. K. Roy, “Synthesis of low power DSP circuits using activity metrics,” in Proc. Intl. Conf. 

VLSI Design, pp. 255–270, Jan. 1994. 

[14] G. Lakshminarayana and N. K. Jha, “FACT: A framework for the application of throughput and power 

optimizing transformations to control-flow intensive behavioral descriptions,” in Proc. Design Automation Conf., 

pp. 102–107, June 1998. 

[15] OpenCAD V 5 Users Manual, NEC Electronics, Inc., Sept. 1997.


Layout Techniques Supporting the Use of Dual Supply Voltages for Cell-Based Designs 

Chingwei Yeh, Yin-Shuin Kang, Shan-Jih Shieh, Jinn-Shyan Wang 

EE Dept., Nat’l Chung-Cheng Univ., Chiayi 621, Taiwan, ROC 

Abstract 

Gate-level voltage scaling is an approach that allows different supply voltages for different gates 

in order to achieve power reduction. Previous research focused on determining the voltage

level for each gate and ascertaining the power-saving capability of the approach via logic-level

power estimation. In this paper, we present layout techniques that make the approach feasible in a

cell-based design environment. A new block layout style is proposed to support voltage

scaling with conventional standard cell libraries. The block layout can be automatically 

generated via a simulated annealing based placement algorithm. In addition, we propose a new 

cell layout style with built-in multiple supply rails. Using the cell layout, gate-level voltage 

scaling can be immediately embedded in a typical cell-based design flow. Experimental results 

show that the proposed techniques produce very promising results.

References 

[1] Chandrakasan, A. P. and Brodersen, R. W. Low-power CMOS digital design. Kluwer Academic Publishers, 

1995. 

[2] Chang, J. M. and Pedram, M. Energy minimization using multiple supply voltages. Proc. 1996 Int. Symp. on 

Low Power Electronics and Design, pp. 157-162, 1996. 

[3] Kirkpatrick, S. et al., Optimization by Simulated Annealing. Science, Vol. 220, May 1983, pp. 671-680.

[4] Raje, S. and Sarrafzadeh, M. Variable voltage scheduling. Int. Symp. on Low Power Design, 1995, pp. 9-14. 

[5] Deng, C. Power analysis for CMOS/BiCMOS circuits. Proc. Int. Workshop on Low Power Design, Apr. 1994, 

pp. 3-8. 

[6] Johnson, M. C. and Roy, K. Optimal selection of supply voltages and level conversions during data path 

scheduling under resource constraints. Proc. Int. Conf. on Computer Design, pp. 72-77, 1996. 

[7] Johnson, M. C. and Roy, K. Scheduling and optimal voltage selection for low power multi-voltage DSP 

datapaths. Proc. Int. Symp. on Circuits and Systems, vol. 3, pp. 2152-2155, 1997. 

[8] Sechen, C. et al., TimberWolf: mixed macro/standard cell floorplanning, placement, and routing package. Yale 

University, May 1, 1992. 

[9] Usami, K. and Horowitz, M. Clustered voltage scaling technique for low-power design. Int. Symp. on Low 

Power Design, 1995, pp. 3-8. 

[10] Usami, K. et al. Automated Low-Power Technique Exploiting Multiple Supply Voltages Applied to a Media 

Processor. IEEE J. Solid-State Circuits, vol. 33, No. 3, Mar. 1998, pp. 463-472. 

[11] Uehara, T. and van Cleemput, W. M. Optimal layout of CMOS functional arrays. IEEE Trans. Comput.,

vol. C-30, pp. 305-312, May 1981.

[12] Chang, M. C., Master Thesis, EE Dept., Nat’l Chung-Cheng Univ., Jul. 1997.


Gate-Level Design Exploiting Dual Supply Voltages for Power-Driven Applications 

Chingwei Yeh, Min-Cheng Chang, Shih-Chieh Chang*, Wen-Bone Jone* 

EE & *CS, Nat’l Chung-Cheng Univ., Chiayi 621, Taiwan, ROC 

Abstract 

The advent of portable and high-density devices has made power consumption a critical design 

concern. In this paper, we address the problem of reducing power consumption via gate-level 

voltage scaling for those designs that are not under the strictest timing budget. We first use a 

maximum-weighted independent set formulation for voltage reduction on the non-critical part of the

circuit. Then, we use a minimum-weighted separator set formulation to do gate sizing and 

integrate the sizing procedure with a voltage scaling procedure to enhance power saving on the 

whole circuit. The proposed methods are evaluated using the MCNC benchmark circuits, and an

average of 19.12% power reduction over the circuits having only one supply voltage has been 

achieved. 

References 

[1] A. P. Chandrakasan and R. W. Brodersen, Low-power CMOS digital design, Kluwer Academic Publishers, 1995. 

[2] T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Algorithms, Chap. 27, MIT Press, McGraw-Hill Book Co., 

1992. 

[3] D. Kagaris and S. Tragoudas, “Maximum independent sets on transitive graphs and their applications in testing

and CAD,” Proc. Int. Conf. on Computer-Aided Design, Nov. 1997, pp. 736-740.

[4] S. Raje and M. Sarrafzadeh, “Variable voltage scheduling,” Int. Symp. on Low Power Design, 1995, pp. 9-14.

[5] J. D. Meindl, “Low power microelectronics: retrospect and prospect,” Proc. IEEE, vol. 83, no. 4, Apr. 1995.

[6] E. M. Sentovich et al., “SIS: A System for Sequential Circuit Synthesis,” Technical report UCB/ERL M92/41,

Univ. of California, Berkeley, May 1992.

[7] D. Singh et al., “Power conscious CAD tools and methodologies: a perspective,” Proc. IEEE, vol. 83, no. 4, Apr.

1995, pp. 570-593.

[8] K. Usami and M. Horowitz, “Clustered voltage scaling technique for low-power design,” Int. Symp. on Low

Power Design, 1995, pp. 3-8.

[9] K. Usami et al., “Automated Low-Power Technique Exploiting Multiple Supply Voltages Applied to a Media

Processor,” IEEE J. Solid-State Circuits, vol. 33, No. 3, Mar. 1998, pp. 463-472.

[10] Wang, J. S., Shieh, S. J., Wang, J. C., and Yeh, C. Design of standard cells used in low power ASICs exploiting 

multiple-supply-voltage scheme. Proc. 11th ASIC Conf., Sept. 1998.


SYNTHESIS OF LOW POWER CMOS VLSI CIRCUITS USING DUAL

SUPPLY VOLTAGES 

Vijay Sundararajan, Keshab K. Parhi 

Dept. of ECE, University of Minnesota, Minneapolis, MN 55455 

ABSTRACT 

Dynamic power consumed in CMOS gates goes down quadratically with the supply voltage. By 

maintaining a high supply voltage for gates on the critical path and by using a low supply voltage 

for gates off the critical path it is possible to dramatically reduce power consumption in CMOS 

VLSI circuits without performance degradation. Interfacing gates operating under multiple 

supply voltages, however, requires the use of level converters, which makes the problem 

modeling difficult. In this paper we develop a formal model and an efficient heuristic for

addressing the use of two supply voltages for low power CMOS VLSI circuits without 

performance degradation. Power consumption savings up to 25% over and above the best known 

existing heuristics are demonstrated for combinational circuits in the ISCAS85 benchmark suite. 
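
The quadratic dependence mentioned at the start of the abstract follows from the standard switching-power relation P = a * C * Vdd^2 * f. The sketch below is a back-of-the-envelope calculation under stated assumptions, not the paper's model: the function names are hypothetical, and the overhead of level converters, which the abstract identifies as the hard part, is deliberately ignored.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Switching power of a CMOS gate: activity * C * Vdd^2 * f (in watts)."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def dual_vdd_saving(fraction_low, vdd_high, vdd_low):
    """Fraction of dynamic power saved when `fraction_low` of the switched
    capacitance (weighted by activity) moves from vdd_high to vdd_low.

    Level-converter power and delay are ignored in this estimate.
    """
    if not 0.0 <= fraction_low <= 1.0:
        raise ValueError("fraction_low must lie in [0, 1]")
    ratio = (vdd_low / vdd_high) ** 2  # quadratic scaling with supply voltage
    return fraction_low * (1.0 - ratio)
```

For instance, halving the supply on half of the switched capacitance would save 0.5 * (1 - 0.25) = 37.5% of dynamic power under these idealized assumptions.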

REFERENCES 

[1] A. P. Chandrakasan and R. W. Brodersen, “Low Power CMOS Digital Design,” IEEE Journal of Solid State 

Circuits, vol. 27, pp. 473–484, April. 1992. 

[2] S. Mutoh et al., “1-V Power Supply High-Speed Digital Circuits Technology with Multithreshold-voltage 

CMOS,” IEEE Journal of Solid State Circuits, vol. 30, pp. 847–854, Aug. 1995. 

[3] T. Kuroda et al., “A High-Speed Low-Power 0.3 µm CMOS Gate Array with Variable Threshold Voltage (VT)

Scheme,” in Proc. IEEE Custom Integrated Circuits Conference, pp. 53–56, May 1996. 

[4] I. Mutsunori et al., “Low Power Design Method Using Multiple Supply Voltages,” Proceedings of ISLPED’97, 

pp. 36–41, 1997. 

[5] S. Raje and M. Sarrafzadeh, “Variable Voltage Scheduling,” Proceedings of ISLPD’95, pp. 9–14, 1995. 

[6] M. C. Johnson and K. Roy, “Optimal Selection of Supply Voltages and Level Conversions During Data Path 

Scheduling Under Resource Constraints,” Proceedings of ICCD’96, pp. 72–77, 1996. 

[7] J. Chang and M. Pedram, “Energy Minimization Using Multiple Supply Voltages,” IEEE Transactions on VLSI 

Systems, vol. 5, pp. 1–8, December 1997. 

[8] W. Shiue and C. Chakrabarthi, “Low Power Scheduling with Resources at Multiple Voltages,” in Proceedings of 

ISCAS-98, (Monterey, CA, USA), June 1998.


Panel: HW and SW in Embedded System Design: Loveboat, Shipwreck, 

or Ships Passing in the Night?

Moderator: Kurt Keutzer – University of California, Berkeley 

Panel Members: Jerry Fiddler, Raul Camposano, Alberto Sangiovanni-Vincentelli, Jim Lansford 

Abstract 

The merging of hardware and software on a single integrated circuit is causing many to rethink 

their approach to embedded system design, and some are forecasting significant changes in the 

dynamics of the associated electronic-design automation (EDA) and embedded software 

industries as well. 

For some, the ocean of silicon ahead is sure to host a loveboat for fully integrated 

hardware/software systems. In this scenario a unifying system-design-environment provides 

implementation-independent modeling that stands above the particulars of hardware and 

software implementation issues. At the implementation level, embedded software design tools 

and electronic-design automation tools work seamlessly together providing a variety of 

architectural targets for the functionality of high-level descriptions. 

Others see a shipwreck on the horizon, as hardheaded hardware designers and softheaded 

software developers clash. In this scenario the system-on-chip becomes a battleground in which 

the two respective communities attempt to dominate both the solution space, and the system-design

budgets of their common customer.

There's an old adage that the most boring scenario is the most likely. Following this adage some 

predict that the future holds more 'no show' than 'showdown'. The holders of this view note that 

as severe power constraints cause hardware-designers to eschew software solutions, and as the 

world of ubiquitous computing significantly broadens the market for embedded system software, 

then the hardware and software communities will be simply chips passing in the night.


Reliability-Constrained Area Optimization of VLSI Power/Ground 

Networks Via Sequence of Linear Programmings 

Xiang-Dong Tan†, C.-J. Richard Shi†, Dragos Lungeanu†, Jyh-Chwen Lee‡ and Li-Pen Yuan‡ 

†Department of Electrical Engineering, University of Washington, Seattle, WA 98195 

‡Avant! Corporation, Fremont, CA 94538, USA

Abstract 

This paper presents a new method for determining the widths of the power and ground routes in 

integrated circuits so that the area required by the routes is minimized subject to the reliability 

constraints. The basic idea is to transform the resulting constrained nonlinear programming 

problem into a sequence of linear programs. Theoretically, we show that the sequence of linear 

programs always converges to the optimum solution of the relaxed convex problem. 

Experimental results demonstrate that the sequence-of-linear-programming method is orders of 

magnitude faster than the best-known method based on conjugate gradients, with consistently

better optimization solutions. 

References 

[1] J. R. Black, “Electromigration failure modes in aluminum metallization for semiconductor devices,” in Proc. of 

IEEE, vol. 57, pp.1587-1597, Sept. 1969.

[2] R. K. Brayton, G. D. Hachtel and A. Sangiovanni-Vincentelli, “A survey of optimization techniques for

integrated circuit design,” in Proc. of IEEE, vol. 69, no. 10, pp. 1334-1362, Oct. 1981. 

[3] M. S. Bazaraa, H. D. Sherali and C. M. Shetty, Nonlinear Programming: Theory and Algorithms, 2nd ed., John Wiley

& Sons, New York, 1993. 

[4] S. Chowdhury and M. A. Breuer, “Minimal area design of power/ground nets having graph topologies,” IEEE 

Trans. Circuits and Systems, vol. CAS-34, no. 12, pp. 1441–1451, Dec. 1987. 

[5] S. Chowdhury and M. A. Breuer, “Optimum design of IC power/ground networks subject to reliability 

constraints,” IEEE Trans. Computer-Aided Design, vol. 7, no. 7, pp. 787–796, July 1988. 

[6] S. Chowdhury, “Optimum design of reliable IC power networks having general graph topologies,” in Proc. 26th 

ACM/IEEE Design Automation Conf., pp. 787–790, 1989. 

[7] R. Dutta and M. Marek-Sadowska, “Automatic sizing of power/ground (P/G) networks in VLSI,” in Proc. 26th

ACM/IEEE Design Automation Conf., pp. 783–786, 1989. 

[8] R. E. Griffith and R. A. Stewart, “A nonlinear programming technique for the optimization of continuous 

process systems,” Management Science, no. 7, pp. 379-392, 1961. 

[9] T. Mitsuhashi and E. S. Kuh, “Power and ground network topology optimization,” in Proc. 29th ACM/IEEE 

Design Automation Conf., pp. 524–529, 1992. 

[10] C. H. Papadimitriou and K. Steiglitz, Combinatorial Optimization: Algorithms and Complexity, Prentice-Hall

Inc., New York, 1992.


FAR-DS: Full-plane AWE Routing with Driver Sizing 

Jiang Hu and Sachin S. Sapatnekar 

Department of Electrical and Computer Engineering, 

University of Minnesota, Minneapolis, MN 55455, USA 

Abstract 

We propose a Full-plane AWE Routing with Driver Sizing (FAR-DS) algorithm for performance 

driven routing in deep sub-micron technology. We employ a fourth order AWE delay model in 

the full plane, including both Hanan and non-Hanan points. Optimizing the driver size 

simultaneously extends our work into a two-dimensional space, enabling us to achieve the 

desired balance between wire and driver cost reduction, while satisfying the timing constraints. 

Compared to SERT, experimental results showed that our algorithm can provide an average 

reduction of 23% in the wire cost and 50% in the driver cost under stringent timing constraints. 
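
The interplay between driver size and wire delay can be seen even with the first-order Elmore model [8], whose delay is the first moment that AWE [9] generalizes to higher orders. The sketch below is not the fourth-order AWE model used by FAR-DS. It is a minimal illustration on a single RC chain, and the function name and the lumped-segment model are assumptions made for this example.

```python
def elmore_delay_chain(driver_res, seg_res, seg_cap, load_cap):
    """First-order (Elmore) delay from the driver to the end of an RC chain.

    Each wire segment is lumped as a series resistance followed by a
    capacitance to ground at its far end; load_cap sits at the last node.
    """
    if len(seg_res) != len(seg_cap):
        raise ValueError("seg_res and seg_cap must have the same length")
    downstream = load_cap + sum(seg_cap)  # total capacitance seen by the driver
    delay = driver_res * downstream
    for res, cap in zip(seg_res, seg_cap):
        # A segment's resistance charges its own far-end capacitance
        # and everything further down the chain.
        delay += res * downstream
        downstream -= cap
    return delay
```

A larger driver has a lower output resistance, which reduces the first term. This gives the router freedom to use a cheaper wire topology while still meeting the same timing constraint, which is the kind of trade-off the abstract describes.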

References 

[1] K. D. Boese, A. B. Kahng, B. A. McCoy and G. Robins, “Near-optimal critical sink routing tree constructions,” 

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 14, No. 12, pp. 1417-36, 

Dec. 1995. 

[2] J. Lillis and P. Buch, “Table-lookup methods for improved performance-driven routing,” Proceedings of the 

ACM/IEEE Design Automation Conference, pp. 368–373, 1998. 

[3] J. Cong and C. K. Koh, “Interconnect layout optimization under higher-order RLC model,” Proceedings of the 

IEEE/ACM International Conference on Computer-Aided Design, pp. 713-720, 1997. 

[4] J. Lillis, C. K. Cheng, T. T. Lin and C. Y. Ho, “New performance driven routing techniques with explicit 

area/delay tradeoff and simultaneous wire sizing,” Proceedings of the 33rd ACM/IEEE Design Automation 

Conference, pp. 395-400, Jun. 1996. 

[5] F. J. Liu, J. Lillis and C. K. Cheng, “Design and implementation of a global router based on a new layout-driven 

timing model with three poles,” Proceedings of the IEEE International Symposium on Circuits and Systems, 1997. 

[6] S. S. Sapatnekar, “RC interconnect optimization under the Elmore delay model,” Proceedings of the ACM/IEEE 

Design Automation Conference, pp. 392-396, 1994. 

[7] H. Hou and S. S. Sapatnekar, “Routing tree topology construction to meet interconnect timing constraints”, 

Proceedings of the International Symposium on Physical Design, pp. 205-210, 1998. 

[8] W. C. Elmore, “The transient response of damped linear network with particular regard to wideband amplifiers,” 

Journal of Applied Physics, Vol. 19, pp. 55-63, 1948. 

[9] L. T. Pillage and R. A. Rohrer, “Asymptotic waveform evaluation for timing analysis,” IEEE Transactions on 

Computer-Aided Design of Integrated Circuits and Systems, Vol. 9, No. 4, pp. 352-366, Apr. 1990. 

[10] J. Qian, S. Pullela and L. T. Pillage, “Modeling the effective capacitance for the RC interconnect of CMOS 

gates,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 13, No. 12, pp. 

1526-35, Dec. 1994. 

[11] J. Rubinstein, P. Penfield and M. A. Horowitz, “Signal delay in RC tree networks,” IEEE Transactions on 

Computer-Aided Design, Vol. CAD-2, No. 3, pp. 202-211, July 1983. 

[12] R. Gupta, B. Krauter, B. Tutuianu, J. Willis and L. T. Pillage, “The Elmore delay as a bound for RC trees with 

generalized input signals,” Proc. 33rd ACM/IEEE Design Automation Conference, 1995. 

[13] C. L. Ratzlaff, N. Gopal and L. T. Pillage, “RICE: rapid interconnect circuit evaluator,” Proc. 28th ACM/IEEE 


[14] D. G. Luenberger, “Linear and Nonlinear Programming,” Addison-Wesley Publishing Company, Inc., 1984.


Noise-Constrained Performance Optimization by Simultaneous Gate and Wire

Sizing Based on Lagrangian Relaxation 

Hui-Ru Jiang 1 , Jing-Yang Jou 1 , and Yao-Wen Chang 2 

1 Department of Electronics Engineering, National Chiao Tung University, 

Hsinchu 30010, Taiwan 

2 Department of Computer and Information Science, National Chiao Tung University, 

Hsinchu 30010, Taiwan 

Abstract 

Noise, as well as area, delay, and power, is one of the most important concerns in the design of 

deep submicron ICs. Existing algorithms cannot handle simultaneous switching 

conditions of signals for noise minimization. In this paper, we model not only physical coupling 

capacitance, but also simultaneous switching behavior for noise optimization. Based on 

Lagrangian relaxation, we present an algorithm that can optimally solve the simultaneous noise, 

area, delay, and power optimization problem by sizing circuit components. Our algorithm, with 

linear memory requirement overall and linear runtime per iteration, is very effective and 

efficient. For example, for a circuit of 6144 wires and 3512 gates, our algorithm solves the 

simultaneous optimization problem to within 1% error using only 2.1 MB of memory and 

47 minutes of runtime on a SUN UltraSPARC-I workstation. 
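
The Lagrangian relaxation idea named in the abstract can be illustrated with a minimal sketch. The model below is a toy assumption, not the paper's formulation: it minimizes total area sum(a_i * x_i) subject to a single delay constraint sum(d_i / x_i) <= D with bounded sizes, and it omits noise and power entirely. The function name, the coefficients, and the fixed-step multiplier update are all hypothetical choices made for illustration.

```python
import math


def size_by_lagrangian_relaxation(area_cost, delay_coef, delay_bound,
                                  lo=0.1, hi=10.0, step=0.1, iters=200):
    """Toy sizing: minimize sum(a*x) subject to sum(d/x) <= delay_bound.

    Hypothetical model for illustration only; not the paper's algorithm.
    """
    lam = 0.5  # Lagrange multiplier for the delay constraint
    sizes = [lo] * len(area_cost)
    for _ in range(iters):
        # For a fixed multiplier the relaxed problem separates per component:
        # minimizing a*x + lam*d/x gives x = sqrt(lam*d/a), clipped to bounds.
        sizes = [min(hi, max(lo, math.sqrt(lam * d / a)))
                 for a, d in zip(area_cost, delay_coef)]
        # Subgradient step on the multiplier: raise it while the delay
        # constraint is violated, lower it (never below 0) when there is slack.
        violation = sum(d / x for d, x in zip(delay_coef, sizes)) - delay_bound
        lam = max(0.0, lam + step * violation)
    return sizes, lam


sizes, lam = size_by_lagrangian_relaxation([1.0, 1.0], [1.0, 4.0], 3.0)
print(sizes, lam)
```

In this sketch each iteration touches every component once, so the work per iteration and the memory are linear in the number of components, which matches the complexity the abstract reports.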

References 

[1] H. B. Bakoglu, Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley Pub. Company Inc., 1990. 

[2] C.-P. Chen, Y.-W. Chang and D. F. Wong, “Fast Performance-Driven Optimization for Buffered Clock Trees 

Based on Lagrangian Relaxation,” Proc. DAC, pp. 405–408, June 1996. 

[3] C.-P. Chen, C. C. N. Chu and D. F. Wong, “Fast and Exact Simultaneous Gate and Wire Sizing by Lagrangian 

Relaxation,” Proc. ICCAD, pp. 617–624, Nov. 1998. 

[4] D.-S. Chen and M. Sarrafzadeh, “An Exact Algorithm for Low Power Library-Specific Gate Re-Sizing,” Proc. 

DAC, June 1996. 

[5] L. O. Chua, C. A. Desoer and E. S. Kuh, Linear and Nonlinear Circuits, McGraw-Hill Book Company, 1987. 

[6] A. Devgan, “Efficient Coupled Noise Estimation for On-Chip Interconnects,” Proc. ICCAD, pp. 147–151, Nov. 

1997. 

[7] W. C. Elmore, “The Transient Response of Damped Linear Networks with Particular Regard to Wide Band 

Amplifiers,” J. Applied Physics, 19(1), 1948. 

[8] F. S. Hillier and G. J. Lieberman, Introduction to Operations Research, 5th ed., McGraw-Hill Publishing, 1990. 

[9] M. Marek-Sadowska, “Impact of Deep Sub-micron Technologies on Physical Design,” Lecture notes and Private 

Communication, Aug. 1998. 

[10] Y. Massoud, S. Majors, T. Bustami and J. White, “Layout Techniques for Minimizing On-Chip Interconnect Self 

Inductance,” Proc. DAC, pp. 566–571, June 1998. 

[11] M. Nemani and F. N. Najm, “High-Level Area and Power Estimation for VLSI Circuits,” Proc. ICCAD, pp. 

114–119, Nov. 1997. 

[12] J. Rabaey, Digital Integrated Circuits: A Design Perspective, Prentice-Hall, Inc., 1996. 

[13] K. L. Shepard, “Design Methodologies for Noise in Digital Integrated Circuits,” Proc. DAC, pp. 94–99, June 

1998. 

[14] H.-P. Tseng, L. Scheffer, and C. Sechen, “Timing and Crosstalk Driven Area Routing,” Proc. DAC, pp. 378– 

381, June 1998. 

[15] A. Vittal and M. Marek-Sadowska, “Crosstalk Reduction for VLSI,” IEEE Trans. CAD, pp. 290–298, Vol. 16, 

No. 3, Mar. 1997. 

[16] W. L. Winston, Operations Research: Applications and Algorithms, 3rd ed., Int Thomson Publishing, 1994. 

[17] T. Xue, E. S. Kuh and D. Wang, “Post Global Routing Crosstalk Risk Estimation and Reduction,” Proc. 

ICCAD, pp. 302–309, Nov. 1996.


Simultaneous Routing and Buffer Insertion with Restrictions on Buffer Locations 

Hai Zhou 1 , D.F. Wong 1 , I-Min Liu 2 , and Adnan Aziz 2 

1 Department of Computer Sciences, University of Texas, Austin, TX 78712 

2 Department of Electrical and Computer Engineering, University of Texas, Austin, TX 78712 

Abstract 

During the routing of global interconnects, macro blocks form useful routing regions which 

allow wires to go through but forbid buffers to be inserted. They give restrictions on buffer 

locations. In this paper, we take these buffer location restrictions into consideration and solve the 

simultaneous maze routing and buffer insertion problem. Given a block placement defining 

buffer location restrictions and a pair of pins (a source and a sink), we give a polynomial time 

exact algorithm to find a buffered route from the source to the sink with minimum Elmore delay. 
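
As a reminder of the delay metric being minimized, the sketch below computes the Elmore delay of a simple RC chain: each resistance is multiplied by the total capacitance downstream of it, and the products are summed. This is only the standard metric for an unbuffered chain, under the assumption that segment i has resistance resistances[i] followed by a grounded capacitance capacitances[i]; the function name is hypothetical, and the routing and buffer insertion algorithm of the paper is not shown.

```python
def elmore_delay_chain(resistances, capacitances):
    """Elmore delay from the source to the far end of an RC chain.

    Segment i is a series resistance followed by a grounded capacitance.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    downstream_cap = 0.0
    # Walk from the sink back to the source, accumulating the capacitance
    # that each resistance has to charge.
    for r, c in zip(reversed(resistances), reversed(capacitances)):
        downstream_cap += c
        delay += r * downstream_cap
    return delay


# Two segments: 1*(3 + 4) + 2*4 = 15
print(elmore_delay_chain([1.0, 2.0], [3.0, 4.0]))
```

For a uniform wire this sum grows quadratically with the number of segments, which is why splitting a long route with buffers can reduce the total delay.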

References 

[1] Semiconductor Industry Association. National technology roadmap for semiconductors, 1994. 

[2] H. B. Bakoglu. Circuits, Interconnections, and Packaging for VLSI. Addison-Wesley, 1990. 

[3] T. H. Cormen, C. E. Leiserson, and R. H. Rivest. Introduction to Algorithms. MIT Press, 1989. 

[4] E. W. Dijkstra. A note on two problems in connection with graphs. Numerische Math., 1:269-271, 1959. 

[5] W. C. Elmore. The transient response of damped linear networks with particular regard to wide-band amplifiers. 

Journal of Applied Physics, 19(1):55-63, January 1948. 

[6] M. R. Garey and D. S. Johnson. Computers and Intractability. W. H. Freeman and Co., 1979. 

[7] L. N. Kannan, P. R. Suaris, and H.-G. Fang. A methodology and algorithms for post-placement delay 

optimization. In DAC, pages 327-332, 1994. 

[8] J. Lillis, C. K. Cheng, and T. T. Lin. Optimal and Efficient Buffer insertion and wire sizing. In CICC, pages 259- 

262, 1995. 

[9] T. Okamoto and J. Cong. Buffered Steiner tree construction with wire sizing for interconnect layout 

optimization. In ICCAD, pages 44-49, 1996. 

[10] A. H. Salek, J. Lou, and M. Pedram. A simultaneous routing tree construction and fanout optimization 

algorithm. In ICCAD, 1998. 

[11] L. P. P. P. van Ginneken. Buffer placement in distributed RC-tree networks for minimal Elmore delay. In 

ISCAS, pages 865-868, 1990.

DAC'99, pages 100-103 

Crosstalk Minimization using Wire Perturbations 

Prashant Saxena 

Strategic CAD Labs, Intel Corporation, Hillsboro, OR 97124 

C. L. Liu 

Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan, R.O.C. 

Abstract 

We study the variation of the crosstalk in a net and its neighbors when one of its trunks is 

perturbed, showing that the trunk's perturbation range can be efficiently divided into subintervals 

having monotonic or unimodal crosstalk variation. We can therefore determine the optimum 

trunk location without solving any non-linear equations. Using this, we construct and 

experimentally verify an algorithm to minimize the peak net crosstalk in a gridless channel. 

References 

[1] Bakoglu, H. B., Circuits, Interconnections and Packaging for VLSI, Addison-Wesley Publishing Company, 

1990. 

[2] Chaudhary, K., A. Onozawa and E. S. Kuh, “A Spacing Algorithm for Performance Enhancement and Crosstalk 

Reduction”, Proc. Intl. Conf. Computer-Aided Design, 697–702, 1993. 

[3] Deutsch, D. N., “A Dogleg Channel Router”, Proc. Design Automation Conf., 425–433, 1976. 

[4] Gao, T. and C. L. Liu, “Minimum Crosstalk Channel Routing”, IEEE Trans. Computer-Aided Design 15 (5), 

465–474, 1996. 

[5] Hall, H. S. and S. R. Knight, Higher Algebra, Macmillan & Co. Ltd., London, 4th ed., 1891. Reprinted, 1960. 

[6] Karnik, T., Intel Corp., Hillsboro, OR, Private communication, 1997. 

[7] Leong, H. W. and C. L. Liu, “A New Channel Router”, Proc. Design Automation Conf., 584–590, 1983. 

[8] Sakurai, T. and K. Tamaru, “Simple Formulas for Two- and Three-Dimensional Capacitances”, IEEE Trans. 

Electron Devices ED-30 (2), 183–185, 1983. 

[9] Saxena, P., The Retiming and Routing of VLSI Circuits, Ph.D. Thesis, Tech. Rep. UIUCDCS-R-98-2059, Dept. 

of Computer Science, Univ. of Illinois at Urbana-Champaign, 1998. 

[10] Wong, D. F. and C. L. Liu, “Compacted Channel Routing with Via Placement Restriction”, Integration 4 (4), 

267–307, 1986. 

[11] Yoshimura, T. and E. S. Kuh, “Efficient Algorithms for Channel Routing”, IEEE Trans. Computer-Aided 

Design CAD-1 (1), 25–35, 1982.

DAC'99, pages 104-109 

Practical Advances in Asynchronous Design and in Asynchronous/Synchronous Interfaces 

Erik Brunvand 

Dept. of Computer Science, University of Utah, SLC, Utah 84112 

Steven Nowick 

Dept. of Computer Science, Columbia University, New York, NY 10027 

Kenneth Yun 

Department of ECE, University of California, San Diego, CA 92093 

Abstract 

Asynchronous systems are being viewed as an increasingly viable alternative to purely 

synchronous systems. This paper gives an overview of the current state of the art in practical 

asynchronous circuit and system design in four areas: controllers, datapaths, processors, and the 

design of asynchronous/synchronous interfaces. 

References 

[1] S.S. Appleton, S.V. Morton, and M.J. Liebelt. Two-phase asynchronous pipeline control. In IEEE Int. Symp. on 

Advanced Research in Asynchronous Circuits and Systems, April 1997. 

[2] P.A. Beerel and T. Meng. Automatic gate-level synthesis of speed-independent circuits. In ICCAD, pages 581- 

586. IEEE Computer Society Press, November 1992. 

[3] M. Benes, S.M. Nowick, and A. Wolfe. A fast asynchronous Huffman decoder for compressed-code embedded 

processors. In IEEE Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, pages 43-56, 1998. 

[4] K. van Berkel, R. Burgess, J. Kessels, A. Peeters, M. Roncken, and F. Schalij. A fully-asynchronous low-power 

error corrector for the DCC player. IEEE JSSC, 29(12):1429-1439, December 1994. 

[5] J.G. Bredeson and P.T. Hulina. Elimination of static and dynamic hazards for multiple input changes in 

combinational switching circuits. Information and Control, 20:114-224, 1972. 

[6] E. Brunvand. The NSR processor. In Proceedings of the 26th International Conference on System Sciences, Jan 

1993. 

[7] E. Brunvand and R.F. Sproull. Translating concurrent programs into delay-insensitive circuits. In ICCAD, pages 

262-265. IEEE Computer Society Press, November 1989. 

[8] S.M. Burns. General condition for the decomposition of state holding elements. In Int. Symp. on Advanced 

Research in Asynchronous Circuits and Systems, pages 48-57. IEEE Computer Society Press, November 1996. 

[9] S.M. Burns and A.J. Martin. Syntax-directed translation of concurrent programs into self-timed circuits. In 

Advanced Research in VLSI, pages 35-50. MIT Press, Cambridge, MA, 1988. 

[10] S. Chakraborty, D.L. Dill, and K.Y. Yun. Min-max timing analysis and its application to asynchronous circuits. 

Proceedings of the IEEE, 87(2), Feb 1999. 

[11] D.M. Chapiro. Globally-Asynchronous Locally-Synchronous Systems. PhD thesis, Stanford University, October 

1984. 

[12] T.-A. Chu. Synthesis of self-timed vlsi circuits from graph-theoretic specifications. Technical Report MIT- 

LCS-TR-393, MIT, 1987. Ph.D. Thesis. 

[13] W.A. Clark. Macromodular computer systems. In Spring Joint Computer Conference. AFIPS, April 1967. 

[14] W.A. Clark and C.E. Molnar. Macromodular system design. Technical Report 23, Computer Systems 

Laboratory, Washington University, April 1973. 

[15] W.S. Coates, J.K. Lexau, I.W. Jones, S.M. Fairbanks, and I. E. Sutherland. A FIFO data switch design 

experiment. In IEEE Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, pages 4-17, 1998. 

[16] J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno, and A. Yakovlev. Methodology and tools for state 

encoding in asynchronous circuit synthesis. In DAC, June 1996. 

[17] A. Davis, B. Coates, and K. Stevens. Automatic synthesis of fast compact self-timed control circuits. In IFIP 

Working Conference on Asynchronous Design Methodologies, 1993. 

[18] A.L. Davis. The architecture and system method for DDM1: A recursively structured data-driven machine. In 

5th Annual Symp. on Computer Architecture, April 1978.

[19] P. Day and J.V. Woods. Investigation into micropipeline latch design styles. IEEE TVLSI, 3(2):264-272, June 

1995. 

[20] M.E. Dean. STRiP: A Self-Timed RISC Processor Architecture. PhD thesis, Stanford University, 1992. 

[21] J.C. Ebergen. A formal approach to designing delay-insensitive circuits. Distributed Computing, 5(3):107-119, 

1991. 

[22] E.B. Eichelberger. Hazard detection in combinational and sequential switching circuits. IBM Journal of 

Research and Development, 9(2):90-99, 1965. 

[23] R.M. Fuhrer, B. Lin, and S.M. Nowick. Symbolic hazard-free minimization and encoding of asynchronous 

finite state machines. In ICCAD, pages 604-611, November 1995. 

[24] S. Furber. Computing without clocks: Micropipelining the ARM processor. In Graham Birtwistle and Al Davis, 

editors, Asynchronous Digital Circuit Design, Workshops in Computing, pages 211-262. Springer-Verlag, 1995. 

[25] S.B. Furber and P. Day. Four-phase micropipeline latch control circuits. IEEE TVLSI, 4(2):247-253, June 1996. 

[26] S.B. Furber, P. Day, J.D. Garside, N.C. Paver, and J.V. Woods. A micropipelined ARM. In Proceedings of 

VLSI93, Grenoble, France, 1993. 

[27] S.B. Furber, J. D. Garside, S. Temple, J. Liu, P. Day, and N.C. Paver. AMULET2e: An asynchronous 

embedded controller. In IEEE Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, April 1997. 

[28] S.B. Furber and J. Liu. Dynamic logic in four-phase micropipelines. In IEEE Int. Symp. on Advanced Research 

in Asynchronous Circuits and Systems, March 1996. 

[29] J. D. Garside, S. Temple, and R. Mehra. The AMULET2e cache system. In IEEE Int. Symp. on Advanced 

Research in Asynchronous Circuits and Systems, March 1996. 

[30] J.D. Garside, S.B. Furber, and S.-H. Chung. AMULET3 revealed. In IEEE Int. Symp. on Advanced Research in 

Asynchronous Circuits and Systems, April 1999. 

[31] R. Ginosar and R. Kol. Adaptive synchronization. In ICCD, pages 188-189, October 1998. 

[32] D. Harris and M.A. Horowitz. Skew-tolerant domino circuits. IEEE JSSC, 32(11):1702-1711, November 1997. 

[33] M.B. Josephs and J.T. Udding. An overview of D-I algebra. In HICSS, volume I, pages 329-338. IEEE 

Computer Society Press, January 1993. 

[34] D. Kearney and N.W. Bergmann. Bundled data asynchronous multipliers with data dependent computation 

times. In IEEE Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, April 1997. 

[35] A. Kondratyev, M. Kishinevsky, B. Lin, P. Vanbekbergen, and A. Yakovlev. Basic gate implementation of 

speed-independent circuits. In DAC, pages 56-62. ACM, June 1994. 

[36] D.S. Kung. Hazard-non-increasing gate-level optimization algorithms. In ICCAD, pages 631-634, November 

1992. 

[37] L. Lavagno, C.W. Moon, R.K. Brayton, and A. Sangiovanni-Vincentelli. Solving the state assignment problem 

for signal transition graphs. In DAC, pages 568-572, June 1992. 

[38] L. Lavagno and A. Sangiovanni-Vincentelli. Algorithms for synthesis and testing of asynchronous circuits. 

Kluwer Academic, 1993. 

[39] B. Lin and S. Devadas. Synthesis of hazard-free multi-level logic under multiple-input changes from binary 

decision diagrams. In ICCAD, pages 542-549, Nov. 1994. 

[40] A. Marshall, B. Coates, and P. Siegel. The design of an asynchronous communications chip. IEEE Design and 

Test, 11(2):8-21, Summer 1994. 

[41] A. Martin, S. Burns, T.K. Lee, D. Borkovic, and P. Hazewindus. The design of an asynchronous 

microprocessor. In Proc. Cal Tech Conference on VLSI, 1989. 

[42] A.J. Martin. Programming in VLSI: From communicating processes to delay-insensitive circuits. In C.A.R. 

Hoare, editor, Developments in Concurrency and Communication, pages 1-64. Addison-Wesley, Reading, MA, 

1990. 

[43] A.J. Martin. Asynchronous datapaths and the design of an asynchronous adder. Formal Methods in System 

Design, 1(1):119-137, July 1992. 

[44] A.J. Martin, A. Lines, R. Manohar, M. Nystroem, P. Penzes, R. Southworth, and U. Cummings. The design of 

an asynchronous MIPS R3000 microprocessor. In Advanced Research in VLSI, September 1997. 

[45] G. Matsubara and N. Ide. A low power zero-overhead self-timed division and square root unit combining a 

single-rail static circuit with a dual-rail dynamic circuit. In IEEE Int. Symp. on Advanced Research in Asynchronous 

Circuits and Systems, April 1997. 

[46] T.H.-Y. Meng, R.W. Brodersen, and D.G. Messerschmitt. Automatic synthesis of asynchronous circuits from 

high-level specifications. IEEE TCAD, 8(11):1185-1205, November 1989. 

[47] S. Moore, P. Robinson, and S. Wilcox. Rotary pipeline processors. IEE Proceedings, Computers and Digital 

Techniques, 143(5), September 1996.

[48] C. Myers and T. Meng. Synthesis of Timed Asynchronous Circuits. IEEE TVLSI, 1(2):106-119, June 1993. 

[49] T. Nanya, Y. Ueno, H. Kagotani, M. Kuwako, and A. Takamura. TITAC: Design of a quasi-delay-insensitive 

microprocessor. IEEE Design & Test of Computers, 11(2):50-63, 1994. 

[50] S.M. Nowick. Automatic synthesis of burst-mode asynchronous controllers. Technical report, Stanford 

University, March 1993. Ph.D. Thesis (available as Stanford Univ. Cptr. Sys. Lab. tech report, CSL-TR-95-686, 

Dec. 95). 

[51] S.M. Nowick, M.E. Dean, D.L. Dill, and M. Horowitz. The design of a high-performance cache controller: a 

case study in asynchronous synthesis. INTEGRATION, the VLSI journal, 15(3):241-262, October 1993. 

[52] S.M. Nowick and D.L. Dill. Synthesis of asynchronous state machines using a local clock. In ICCD, pages 192- 

197. IEEE Computer Society Press, October 1991. 

[53] S.M. Nowick and D.L. Dill. Exact two-level minimization of hazard-free logic with multiple-input changes. 

IEEE TCAD, 14(8):986-997, August 1995. 

[54] S.M. Nowick, N.K. Jha, and F.-C. Cheng. Synthesis of asynchronous circuits for stuck-at and robust path delay 

fault testability. IEEE TCAD, 16(12):1514-1521, December 1997. 

[55] S.M. Nowick, K.Y. Yun, and P.A. Beerel. Speculative completion for the design of high-performance 

asynchronous dynamic adders. In IEEE Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, 

April 1997. 

[56] N.C. Paver. The Design and Implementation of an Asynchronous Microprocessor. PhD thesis, University of 

Manchester, 1994. 

[57] W.F. Richardson and E. Brunvand. Precise exception handling for a self-timed processor. In ICCD, pages 32- 

37, Los Alamitos, CA, October 1995. IEEE Computer Society Press. 

[58] W.F. Richardson and E. Brunvand. Architectural considerations for a self-timed decoupled processor. IEE 

Proceedings, Computers and Digital Techniques, 143(5), September 1996. 

[59] W.F. Richardson and E. Brunvand. Fred: An architecture for a self-timed decoupled computer. In IEEE Int. 

Symp. on Advanced Research in Asynchronous Circuits and Systems, 1996. 

[60] S. Rotem, K. Stevens, R. Ginosar, P. Beerel, C. Myers, K. Yun, R. Kol, C. Dike, M. Roncken, and B. Agapiev. 

RAPPID: an asynchronous instruction length decoder. In IEEE Int. Symp. on Advanced Research in Asynchronous 

Circuits and Systems, 1999. 

[61] P. Siegel, G. De Micheli, and D. Dill. Technology mapping for generalized fundamental-mode asynchronous 

designs. In DAC, pages 61-67. ACM, June 1993. 

[62] R.F. Sproull, I.E. Sutherland, and C.E. Molnar. The counterflow pipeline processor architecture. IEEE Design 

& Test of Computers, 11(3):48-59, Fall 1994. 

[63] I.E. Sutherland. Micropipelines. CACM, 32(6):720-738, June 1989. 

[64] A. Takamura, M. Kuwako, M. Imai, T. Fujii, M. Ozawa, I. Fukasaku, U. Ueno, and T. Nanya. TITAC-2: an 

asynchronous 32-bit microprocessor based on scalable-delay-insensitive model. In ICCD, pages 288-294, October 

1997. 

[65] H. Terada, S. Miyata, and M. Iwata. Ddmps: Self-timed super-pipelined data-driven multimedia processors. 

Proceedings of the IEEE, 87(2), Feb 1999. 

[66] M. Theobald, S.M. Nowick, and T. Wu. Espresso-HF: a heuristic hazard-free minimizer for two-level logic. In 

DAC, pages 71-76, June 1996. 

[67] S. H. Unger. Asynchronous Sequential Switching Circuits. Wiley-Interscience, John Wiley & Sons, Inc., New 

York, 1969. 

[68] C.H. van Berkel and R.W.J.J. Saeijs. Compilation of communicating processes into delay-insensitive circuits. 

In ICCD, pages 157-162. IEEE Computer Society Press, 1988. 

[69] K. van Berkel, R. Burgess, J. Kessels, A. Peeters, M. Roncken, and F. Schalij. Asynchronous Circuits for Low 

Power: a DCC Error Corrector. IEEE Design & Test, 11(2):22-32, June 1994. 

[70] H. van Gageldonk. An asynchronous low-power 80C51 microcontroller. Technical report, Eindhoven 

University of Technology, Sept 1998. Ph.D. Thesis. 

[71] H. van Gageldonk, K. van Berkel, A. Peeters, D. Baumann, D. Gloor, and G. Stegmann. An asynchronous low-power 

80C51 microcontroller. In IEEE Int. Symp. on Advanced Research in Asynchronous Circuits and Systems, 

April 1998. 

[72] V.I. Varshavsky, M.A. Kishinevsky, V.B. Marakhovsky, V.A. Peschansky, L.Y. Rosenblum, A.R. Taubin, and 

B.S. Tzirlin. Self-timed Control of Concurrent Processes. Kluwer 

Academic Publishers, 1990. Russian edition: 1986. 

[73] T. Williams, N. Patkar, and G. Shen. SPARC64: A 64-b 64-active-instruction out-of-order-execution MCM 

processor. IEEE JSSC, 30(11):1215-1226, November 1995.

[74] T.E. Williams. Self-Timed Rings and their Application to Division. PhD thesis, Stanford University, June 1991. 

[75] T.E. Williams and M.A. Horowitz. A zero-overhead self-timed 160ns 54b CMOS divider. IEEE JSSC, 

26(11):1651-1661, November 1991. 

[76] K.Y. Yun, P.A. Beerel, and J. Arceo. High-performance two-phase micropipeline building blocks: double edge-triggered 

latches and burst-mode select and toggle circuits. IEE Proceedings, Circuits, Devices and Systems, 

143(5):282-288, October 1996. 

[77] K.Y. Yun, P.A. Beerel, V. Vakilotojar, A.E. Dooply, and J. Arceo. The design and verification of a high-performance 

low-control-overhead asynchronous differential equation solver. IEEE TVLSI, 6(4):643-655, December 

1998. 

[78] K.Y. Yun, S. Chakraborty, K.W. James, R. Fairlie-Cuninghame, and R.L. Cruz. A self-timed real-time sorting 

network. In ICCD, pages 427-434, October 1998. 

[79] K.Y. Yun and D.L. Dill. Automatic synthesis of 3D asynchronous finite-state machines. In ICCAD, Nov. 1992. 

[80] K.Y. Yun and D.L. Dill. Unifying synchronous/asynchronous state machine synthesis. In ICCAD, pages 255- 

260. IEEE Computer Society Press, November 1993. 

[81] K.Y. Yun and D.L. Dill. A high-performance asynchronous SCSI controller. In ICCD, pages 44-49, Oct. 1995. 

[82] K.Y. Yun and D.L. Dill. Automatic synthesis of extended burst-mode circuits: part I (specification and hazard-free 

implementations). IEEE TCAD, 18(2):101-117, February 1999. 

[83] K.Y. Yun and D.L. Dill. Automatic synthesis of extended burst-mode circuits: part II (automatic synthesis). 

IEEE TCAD, 18(2):118-132, February 1999. 

[84] K.Y. Yun and R.P. Donohue. Pausible clocking: A first step toward heterogeneous systems. In ICCD, pages 

118-123, October 1996. 

[85] K.Y. Yun and A.E. Dooply. Optimal evaluation clocking of self-resetting domino pipelines. In Proc. of Asia 

and South Pacific Design Automation Conference, pages 121-124, January 1999.

DAC'99, pages 110-115 

Automatic synthesis and optimization of partially specified asynchronous systems 

Alex Kondratyev 1 , Jordi Cortadella 2 , Michael Kishinevsky 3 , Luciano Lavagno 4 , 

Alexander Yakovlev 5 

1 Univ. of Aizu, Japan 

2 Univ. Politècnica, Catalunya, Spain 

3 Intel Corp., USA 

4 Univ. of Udine, Italy 

5 Univ. of Newcastle upon Tyne, UK 

Abstract 

A method for automating the synthesis of asynchronous control circuits from high level (CSP-like) 

and/or partial STG (involving only functionally critical events) specifications is presented. 

The method solves two key subtasks in this new, more flexible, design flow: handshake 

expansion, i.e. inserting reset events with maximum concurrency, and event reshuffling under 

interface and concurrency constraints, by means of concurrency reduction. In doing so, the 

algorithm optimizes the circuit both for size and performance. Experimental results show a 

significant increase in the solution space explored when compared to existing CSP-based or 

STG-based synthesis tools. 

References 

[1] Kees van Berkel. Handshake Circuits: an Asynchronous Architecture for VLSI Programming, volume 5 of 

International Series on Parallel Computation. Cambridge University Press, 1993. 

[2] T.-A. Chu. Synthesis of Self-timed VLSI Circuits from Graph-theoretic Specifications. PhD thesis, MIT, June 

1987. 

[3] J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno, and A. Yakovlev. Automatic handshake expansion 

and reshuffling using concurrency reduction. In Workshop on Hardware Design and Petri Nets, pages 86–110, June 

1998. 

[4] Jordi Cortadella, Michael Kishinevsky, Alex Kondratyev, Luciano Lavagno, and Alex Yakovlev. Petrify: a tool 

for manipulating concurrent specifications and synthesis of asynchronous controllers. IEICE Transactions on 

Information and Systems, E80-D(3):315–325, 1997. 

[5] Bill Lin, Chantal Ykman-Couvreur, and Peter Vanbekbergen. A general state graph transformation framework 

for asynchronous synthesis. In Proc. European Design Automation Conference (EURO-DAC), pages 448–453. IEEE 

Computer Society Press, September 1994. 

[6] Alain J. Martin. Synthesis of asynchronous VLSI circuits. In J. Staunstrup, editor, Formal Methods for VLSI 

Design, chapter 6, pages 237–283. North-Holland, 1990. 

[7] T. Murata. Petri Nets: Properties, analysis and applications. Proceedings of the IEEE, pages 541–580, April 

1989. 

[8] Chris J. Myers and Teresa H.-Y. Meng. Synthesis of timed asynchronous circuits. IEEE Transactions on VLSI 

Systems, 1(2):106–119, June 1993. 

[9] Ad Peeters. Implementation of a parallel component in tangram. Personal communication, 1997.

DAC'99, pages 116-121 

CAD Directions for High Performance Asynchronous Circuits 

Ken Stevens 1 , Shai Rotem 1 , Steven M. Burns 1 , Jordi Cortadella 2 , 

Ran Ginosar 1,3 , Michael Kishinevsky 1 , and Marly Roncken 1 

1 Strategic CAD Labs, Intel Corporation, Hillsboro, OR, USA 

2 Universitat Politècnica de Catalunya, Barcelona, Spain 

3 VLSI Systems Research Center, Technion, Haifa, Israel 

Abstract 

This paper describes a novel methodology for high performance asynchronous design based on 

timed circuits and on CAD support for their synthesis using Relative Timing. This methodology 

was developed for a prototype iA32 instruction length decoding and steering unit called RAPPID 

("Revolving Asynchronous Pentium® Processor Instruction Decoder") that was fabricated and 

tested successfully. Silicon results show significant advantages - in particular, performance of 

2.5-4.5 instructions per ns - with manageable risks using this design technology. Relative to a 

comparable 400 MHz clocked circuit, RAPPID is three times faster and has half the latency, while 

dissipating only half the power and incurring a minor area penalty. 

Relative Timing is based on user-defined and automatically extracted relative timing 

assumptions between signal transitions in a circuit and its environment. It supports the 

specification, synthesis, and verification of high-performance asynchronous circuits, such as 

pulse-mode circuits, that can be derived from an initial speed-independent specification. Relative 

timing presents a "middle-ground" between clocked and asynchronous circuits, and is a fertile 

area for CAD development. We discuss possible directions for future CAD development. 

References 

[1] Wendy Belluomini, Chris J. Myers, and H. Peter Hofstee. Verification of Delayed-Reset Domino Circuits using 

ATACS. In 1999 International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems 

(TAU99), pages 39–44, Monterey, CA, March 1999. ACM/IEEE. 

[2] Kees van Berkel. Handshake Circuits: an Asynchronous Architecture for VLSI Programming, volume 5 of 

International Series on Parallel Computation. Cambridge University Press, 1993. 

[3] S.M. Burns. General condition for the decomposition of state holding elements. In Proc. International 

Symposium on Advanced Research in Asynchronous Circuits and Systems. IEEE Computer Society Press, March 

1996. 

[4] J. Cortadella, M. Kishinevsky, A. Kondratyev, L. Lavagno, A. Taubin, and A. Yakovlev. Lazy transition 

systems: application to timing optimization of asynchronous circuits. In Proceedings of the International 

Conference on Computer-Aided Design, pages 324–331, November 1998. 

[5] Henrik Hulgaard and Steven M. Burns. Bounded delay timing analysis of a class of CSP programs. Formal 

Methods in System Design, 11(3):265–294, October 1997. 

[6] M. Kishinevsky, J. Cortadella, and A. Kondratyev. Asynchronous interface specification, analysis and synthesis. 

In Proceedings of the Design Automation Conference, pages 2–7, June 1998. 

[7] Alain J. Martin. Synthesis of asynchronous VLSI circuits. In J. Staunstrup, editor, Formal Methods for VLSI 

Design, chapter 6, pages 237–283. North-Holland, 1990. 

[8] Chris J. Myers. Computer-Aided Synthesis and Verification of Gate-Level Timed Circuits. PhD thesis, Dept. of 

Elec. Eng., Stanford University, October 1995. 

[9] Radu Negulescu and Ad Peeters. Verification of speed-dependences in single-rail handshake circuits. In Proc. 

International Symposium on Advanced Research in Asynchronous Circuits and Systems, pages 159–170, 1998. 

[10] S. Rotem, K. S. Stevens, R. Ginosar, P. A. Beerel, C. J. Myers, K. Yun, R. Kol, C. Dike, M. Roncken, and B. 

Agapiev. RAPPID: An asynchronous instruction length decoder. In Proc. International Symposium on Advanced 

Research in Asynchronous Circuits and Systems, April 1999.

[11] K. S. Stevens, S. Rotem, and R. Ginosar. Relative timing. In Proc. International Symposium on Advanced 

Research in Asynchronous Circuits and Systems, April 1999. 

[12] Kenneth S. Stevens. Practical Verification and Synthesis of Low Latency Asynchronous Systems. PhD thesis, 

University of Calgary, Calgary, Alberta, September 1994. 

[13] Frank C. D. Young, Kenneth S. Stevens, and Robert P. Graham. Timed Logic Conformance and its 

Application. In 1999 International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems 

(TAU99), pages 95–100, Monterey, CA, March 1999. ACM/IEEE. 

[14] Kenneth Yi Yun. Synthesis of Asynchronous Controllers for Heterogeneous Systems. PhD thesis, Stanford 

University, August 1994.

DAC'99, pages 122-127 

A Low Power Hardware/Software Partitioning Approach for Core-based Embedded 

Systems 

Jörg Henkel 

C&C Research Laboratories, NEC USA, Princeton, NJ 08540 

Abstract 

We present a novel approach that minimizes the power consumption of embedded core-based 

systems through hardware/software partitioning. Our approach is based on the idea of mapping 

clusters of operations/instructions to a core that yields a high utilization rate of the involved 

resources (ALUs, multipliers, shifters, ...) and thus minimizes power consumption. Our 

approach is comprehensive since it takes into consideration the power consumption of a whole 

embedded system comprising a microprocessor core, application specific (ASIC) core(s), cache 

cores and a memory core. We report high reductions of power consumption between 35% and 

94% at the cost of a relatively small additional hardware overhead of less than 16k cells while 

maintaining or even slightly increasing the performance compared to the initial design. 

References 

[1] M. Keating, P. Bricaud, Reuse Methodology Manual For System–On–A–Chip Designs, Kluwer Academic 


[2] TI’s 0.07 Micron CMOS Technology Ushers In Era of Gigahertz DSP and Analog Performance, Texas 

Instruments, Published on the Internet, http://www.ti.com/sc/docs/news/1998/98079.htm, 1998. 

[3] R.K. Gupta, Y. Zorian, Introducing Core-Based System Design, IEEE Design & Test of Computers Magazine, 

Vol. 13, No. 4, pp. 15–25. 1997. 

[4] F. Vahid, D.D. Gajski, J. Gong, A Binary–Constraint Search Algorithm for Minimizing Hardware during 

Hardware/Software Partitioning, IEEE/ACM Proc. of The European Conference on Design Automation (EuroDAC) 

1994, pp. 214–219, 1994. 

[5] R.K. Gupta and G.D. Micheli, System-level Synthesis using Reprogrammable Components, IEEE/ACM Proc. of 

EDAC’92, IEEE Comp. Soc. Press, pp. 2–7, 1992. 

[6] Z. Peng, K. Kuchcinski, An Algorithm for Partitioning of Application Specific System, IEEE/ACM Proc. of The 

European Conference on Design Automation (EuroDAC) 1993, pp. 316–321, 1993. 

[7] J. Madsen, P. V. Knudsen, LYCOS Tutorial, Handouts from Eurochip course on Hardware/Software Codesign, 

Denmark, 14.–18. Aug. 1995. 

[8] T. Y. Yen, W. Wolf, Multiple–Process Behavioral Synthesis for Mixed Hardware–Software Systems, 

IEEE/ACM Proc. of 8th. International Symposium on System Synthesis, pp. 4–9, 1995. 

[9] A. Kalavade, E. Lee, A Global Criticality/Local Phase Driven Algorithm for the Constrained Hardware/Software 

Partitioning Problem, Proc. of 3rd. IEEE Int. Workshop on Hardware/Software Codesign, pp. 42–48, 1994. 

[10] I. Hong, D. Kirovski et al., Power Optimization of Variable Voltage Core-Based Systems, IEEE Proc. of 35th. 

Design Automation Conference (DAC98), pp.176-181, 1998. 

[11] B.P. Dave, G. Lakshminarayana, N.K. Jha, COSYN: Hardware-Software Co-Synthesis of Embedded Systems, 

IEEE Proc. of 34th. Design Automation Conference (DAC97), pp.703-708, 1997. 

[12] V. Tiwari, S. Malik, A. Wolfe, Instruction Level Power Analysis and Optimization of Software, Kluwer 

Academic Publishers, Journal of VLSI Signal Processing, pp. 1–18, 1996. 

[13] Ch.Ta Hsieh, M. Pedram, G. Mehta, F. Rastgar, Profile-Driven Program Synthesis for Evaluation of System 

Power Dissipation, IEEE Proc. of 34th. Design Automation Conference (DAC97), pp.576-581, 1997. 

[14] P.-W. Ong, R.-H. Yan, Power-Conscious Software Design – a framework for modeling software on hardware, 

IEEE Proc. of Symposium on Low Power Electronics, pp. 36–37, 1994. 

[15] T. Sato, M. Nagamatsu, H. Tago, Power and Performance Simulator: ESP and its Application for 100 MIPS/W 

Class RISC Design, IEEE Proc. of Symposium on Low Power Electronics, pp. 46–47, 1994. 

[16] A.V. Aho, R. Sethi and J.D. Ullman, Compilers: Principles, Techniques, and Tools, Bell Telephone 

Laboratories, 1987.

[17] M. D. Hill, J. R. Larus, A. R. Lebeck et al., WARTS: Wisconsin Architectural Research Tool Set, Computer 

Science Department, University of Wisconsin. 

[18] P. Landman and J. Rabaey, Architectural Power Analysis: The Dual Bit Type Method, IEEE Transactions on 

VLSI Systems, Vol.3, No.2, June 1995.

DAC'99, pages 128-133 

Synthesis of Low-Overhead Interfaces for Power-Efficient Communication 

over Wide Buses 

L. Benini 1, A. Macii 2, E. Macii 2, M. Poncino 2, R. Scarsi 2 

1 Università di Bologna, Bologna, ITALY 40136 

2 Politecnico di Torino, Torino, ITALY 10129 

Abstract 

In this paper we present algorithms for the synthesis of encoding and decoding interface logic 

that minimizes the average number of transitions on heavily-loaded global bus lines. The 

approach automatically constructs low-transition activity codes and hardware implementation of 

encoders and decoders, given information on word-level statistics. We present an accurate 

method that is applicable to low-width buses, as well as approximate methods that scale well 

with bus width. Furthermore, we introduce an adaptive architecture that automatically adjusts 

encoding to reduce transition activity on buses whose word-level statistics are not known a priori. 

Experimental results demonstrate that our approach well outperforms low-power encoding 

schemes presented in the past. 
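
As a point of reference for the transition-count objective described above, the following minimal sketch implements bus-invert coding, the earlier scheme of reference [1]. It is not the synthesis algorithm of this paper. The function names, the 8-bit default width, and the all-zero initial bus state are assumptions made for illustration.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width=8):
    """Encode a word stream with bus-invert coding.

    Returns a list of (bus_value, invert_flag) pairs. A word is sent
    inverted whenever sending it unchanged would toggle more than half
    of the data lines.
    """
    mask = (1 << width) - 1
    prev = 0  # assumed initial state of the data lines
    encoded = []
    for w in words:
        if hamming(prev, w) > width // 2:
            bus, inv = ~w & mask, 1
        else:
            bus, inv = w, 0
        encoded.append((bus, inv))
        prev = bus
    return encoded


def bus_invert_decode(encoded, width=8):
    """Recover the original words from (bus_value, invert_flag) pairs."""
    mask = (1 << width) - 1
    return [(~bus & mask) if inv else bus for bus, inv in encoded]


def data_transitions(values):
    """Total toggles on the data lines, starting from all zeros."""
    total, prev = 0, 0
    for v in values:
        total += hamming(prev, v)
        prev = v
    return total
```

The extra invert line also toggles, so a fair comparison would add its transitions to the encoded total.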

References 

[1] M. R. Stan, W. P. Burleson, “Bus-Invert Coding for Low-Power I/O," 

[2] L. Benini, G. De Micheli, E. Macii, D. Sciuto, C. Silvano, “Address Bus Encoding Techniques for System-Level 

Power Optimization," DATE-98, pp. 861-866, Feb. 1998. 

[3] E. Musoll, T. Lang, J. Cortadella, “Working-Zone Encoding for Reducing the Energy in Microprocessor Address 

Buses," IEEE Trans. on VLSI Systems, Vol. 6, No. 4, pp. 568-572, Dec. 1998. 

[4] M. R. Stan, W. P. Burleson, “Low-Power Encodings for Global Communication in CMOS VLSI," IEEE Trans. 

on VLSI Systems, Vol. 5 No. 4, pp. 444-455, Dec. 1997. 

[5] H. Mehta, R. M. Owens, M. J. Irwin, “Some Issues in Gray Code Addressing," GLS-VLSI-96, pp. 178-180, Mar. 

1996. 

[6] L. Benini, G. De Micheli, E. Macii, M. Poncino, S. Quer, “Reducing Power Consumption of Core-Based Systems 

By Address Bus Encoding", IEEE Trans. on VLSI Systems, Vol. 6, No. 4, pp. 554-562, Dec. 1998. 

[7] S. Ramprasad, N. R. Shanbhag, I. N. Hajj, “Achievable Bounds on Signal Transition Activity," ICCAD-97, pp. 

126-131, Nov. 1997. 

[8] S. Ramprasad, N. R. Shanbhag, I. N. Hajj, “Signal Coding for Low Power: Fundamental Limits and Practical 

Realizations," ISCAS-98, pp. 1-4, Jun. 1998.

DAC'99, pages 134-139 

Power Conscious Fixed Priority Scheduling for Hard Real-Time Systems 

Youngsoo Shin and Kiyoung Choi 

School of Electrical Engineering, Seoul National University, Seoul 151-742, Korea 

Abstract 

Power efficient design of real-time systems based on programmable processors becomes more 

important as system functionality is increasingly realized through software. This paper presents a 

power-efficient version of a widely used fixed priority scheduling method. The method yields a 

power reduction by exploiting slack times, both those inherent in the system schedule and those 

arising from variations of execution times. The proposed run-time mechanism is simple enough 

to be implemented in most kernels. Experimental results show that the proposed scheduling 

method obtains a significant power reduction across several kinds of applications. 
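
To make the notion of slack in a fixed priority schedule concrete, the sketch below computes worst-case response times with the classical iteration of reference [3]. It is not the run-time mechanism proposed in this paper. It assumes independent periodic tasks with deadlines equal to periods, listed in decreasing priority order; the function name and the task tuples are illustrative.

```python
def response_time(tasks, i):
    """Worst-case response time of tasks[i], or None if it misses its deadline.

    tasks is a list of (wcet, period) integer pairs sorted by decreasing
    priority; each deadline is assumed equal to the period.
    """
    c_i, t_i = tasks[i]
    r = c_i
    while True:
        # Interference from every higher-priority task released within r.
        # -(-r // t) is an exact integer ceiling of r / t.
        interference = sum(-(-r // t_j) * c_j for c_j, t_j in tasks[:i])
        r_next = c_i + interference
        if r_next > t_i:
            return None  # deadline exceeded, task set not schedulable
        if r_next == r:
            return r  # fixed point reached
        r = r_next
```

For a schedulable task, the difference between its period and the returned value is slack that is inherent in the schedule, which is one of the two kinds of slack the abstract mentions.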

References 

[1] C. L. Liu and J. W. Layland, “Scheduling algorithms for multiprogramming in a hard real time environment,” J. 

ACM, vol. 20, pp. 46–61, Jan. 1973. 

[2] J. Lehoczky, L. Sha, and Y. Ding, “The rate monotonic scheduling algorithm: exact characterization and average 

case behavior,” in Proc. IEEE Real-Time Systems Symposium, pp. 166–171, Dec. 1989. 

[3] M. Joseph and P. Pandya, “Finding response times in a real-time system,” The Computer J., vol. 29, pp. 390– 

395, Oct. 1986. 

[4] N. Audsley, A. Burns, M. Richardson, and A. Wellings, “Hard real-time scheduling: The deadline-monotonic 

approach,” in Proc. IEEE Workshop on Real-Time Operating Systems and Software, pp. 133–137, May 1991. 

[5] C. Park and A. C. Shaw, “Experiments with a program timing tool based on source-level timing schema,” IEEE 

Computer, pp. 48–57, May 1991. 

[6] S. Lim, Y. Bae, G. Jang, B. Rhee, S. Min, C. Park, H. Shin, K. Park, and C. Kim, “An accurate worst case timing 

analysis for RISC processors,” in Proc. IEEE Real-Time Systems Symposium, pp. 97–108, Dec. 1994. 

[7] Y. S. Li, S. Malik, and A. Wolfe, “Performance estimation of embedded software with instruction cache 

modeling,” in Proc. Int’l Conf. on Computer Aided Design, pp. 380–387, Nov. 1995. 

[8] R. Ernst and W. Ye, “Embedded program timing analysis based on path clustering and architecture 

classification,” in Proc. Int’l Conf. on Computer Aided Design, pp. 598–604, Nov. 1997. 

[9] S. Gary, “PowerPC: A microprocessor for portable computers,” IEEE Design & Test of Computers, pp. 14–23, 

Dec. 1994. 

[10] M. B. Srivastava, A. P. Chandrakasan, and R. W. Brodersen, “Predictive system shutdown and other 

architectural techniques for energy efficient programmable computation,” IEEE Trans. on VLSI Systems, vol. 4, pp. 

42–55, Mar. 1996. 

[11] C. Hwang and A. Wu, “A predictive system shutdown method for energy saving of event-driven computation,” 

in Proc. Int’l Conf. on Computer Aided Design, pp. 28–32, Nov. 1997. 

[12] M. Weiser, B. Welch, A. Demers, and S. Shenker, “Scheduling for reduced CPU energy,” in Proc. USENIX 

Symposium on Operating Systems Design and Implementation, pp. 13–23, 1994. 

[13] K. Govil, E. Chan, and H. Wasserman, “Comparing algorithms for dynamic speed-setting of a low-power 

CPU,” in Proc. ACM Int’l Conf. on Mobile Computing and Networking, pp. 13–25, Nov. 1995. 

[14] F. Yao, A. Demers, and S. Shenker, “A scheduling model for reduced CPU energy,” in Proc. IEEE Annual 

Foundations of Computer Science, pp. 374–382, 1995. 

[15] I. Hong, D. Kirovski, G. Qu, M. Potkonjak, and M. B. Srivastava, “Power optimization of variable voltage 

core-based systems,” in Proc. Design Automat. Conf., pp. 176–181, June 1998. 

[16] T. Ishihara and H. Yasuura, “Voltage scheduling problem for dynamically variable voltage processors,” in 

Proc. Int’l Symposium on Low Power Electronics and Design, pp. 197–202, Aug. 1998. 

[17] D. Katcher, H. Arakawa, and J. Strosnider, “Engineering and analysis of fixed priority schedulers,” IEEE 

Trans. on Software Eng., vol. 19, pp. 920–934, Sept. 1993. 

[18] A. Burns, K. Tindell, and A. Wellings, “Effective analysis for engineering real-time fixed priority schedulers,” 

IEEE Trans. on Software Eng., vol. 21, pp. 475–480, May 1995.

[19] T. Burd and R. Brodersen, “Processor design for portable systems,” Journal of VLSI Signal Processing, vol. 13, 

pp. 203–222, Aug. 1996. 

[20] T. Pering, T. Burd, and R. Brodersen, “The simulation and evaluation of dynamic voltage scaling algorithms,” 

in Proc. Int’l Symposium on Low Power Electronics and Design, pp. 76–81, Aug. 1998. 

[21] C. Locke, D. Vogel, and T. Mesler, “Building a predictable avionics platform in Ada: a case study,” in Proc. 

IEEE Real-Time Systems Symposium, Dec. 1991. 

[22] J. Liu, J. Redondo, Z. Deng, T. Tia, R. Bettati, A. Silberman, M. Storch, R. Ha, and W. Shih, “PERTS: A 

prototyping environment for real-time systems,” Tech. Rep. UIUCDCS-R-93-1802, University of Illinois, 1993. 

[23] N. Kim, M. Ryu, S. Hong, M. Saksena, C. Choi, and H. Shin, “Visual assessment of a real-time system design: a 

case study on a CNC controller,” in Proc. IEEE Real-Time Systems Symposium, Dec. 1996.

DAC'99, pages 140-145 

Memory Exploration for Low Power, Embedded Systems 

Wen-Tsong Shiue 

Arizona State University, Department of Electrical Engineering, Tempe, AZ 85287-5706 

Chaitali Chakrabarti 

Arizona State University, Department of Electrical Engineering, Tempe, AZ 85287-5706 

ABSTRACT 

In embedded system design, the designer has to choose an on-chip memory configuration that is 

suitable for a specific application. To aid in this design choice, we present a memory exploration 

strategy based on three performance metrics, namely, cache size, the number of processor cycles 

and the energy consumption. We show how the performance is affected by cache parameters 

such as cache size, line size, set associativity and tiling, and the off-chip data organization. We 

show the importance of including energy in the performance metrics, since an increase in the 

cache line size, cache size, tiling and set associativity reduces the number of cycles but does not 

necessarily reduce the energy consumption. These performance metrics help us find the 

minimum energy cache configuration if time is the hard constraint, or the minimum time cache 

configuration if energy is the hard constraint. 

Keywords: Design automation, Low power design, Memory hierarchy, Low power embedded 

systems, Memory exploration and optimization, Cache simulator, Off-chip data assignment. 
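
The selection step described at the end of the abstract can be sketched as a filter followed by a minimum. The code below is an illustration only: the CacheConfig record, the function names, and all numbers are invented, and the cycle and energy figures would in practice come from a cache simulator and an energy model.

```python
from typing import NamedTuple, Optional, Sequence


class CacheConfig(NamedTuple):
    size_bytes: int
    line_bytes: int
    assoc: int
    cycles: int        # processor cycles, e.g. from a trace-driven simulator
    energy_nj: float   # energy, e.g. from an analytical model


def min_energy_config(configs: Sequence[CacheConfig],
                      max_cycles: int) -> Optional[CacheConfig]:
    """Lowest-energy configuration that meets the cycle bound, or None."""
    feasible = [c for c in configs if c.cycles <= max_cycles]
    return min(feasible, key=lambda c: c.energy_nj, default=None)


def min_time_config(configs: Sequence[CacheConfig],
                    max_energy_nj: float) -> Optional[CacheConfig]:
    """Fastest configuration that meets the energy bound, or None."""
    feasible = [c for c in configs if c.energy_nj <= max_energy_nj]
    return min(feasible, key=lambda c: c.cycles, default=None)


# Invented numbers, for illustration only.
CANDIDATES = [
    CacheConfig(64, 8, 1, cycles=9000, energy_nj=410.0),
    CacheConfig(128, 16, 2, cycles=7000, energy_nj=520.0),
    CacheConfig(256, 32, 4, cycles=6500, energy_nj=780.0),
]
```

In this invented data the largest cache is the fastest but not the most energy-efficient, which is the kind of trade-off the abstract points out.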

REFERENCES 

[1] P. R. Panda, N. D. Dutt, and A. Nicolau. “Data Cache Sizing for Embedded Processor Applications.” Technical 

Report ICS-TR-97-31, University of California, Irvine, June 1997. 

[2] P. R. Panda, N. D. Dutt, and A. Nicolau. “Architectural Exploration and Optimization of Local Memory in 

Embedded Systems.” International Symposium on System Synthesis (ISSS 97), Antwerp, Sept. 1997. 

[3] M. B. Kamble and K. Ghose, “Analytical Energy Dissipation Models for Low Power Caches”, International 

Symposium on Low Power Electronics and Design, 1997. 

[4] S. E. Wilton and N. Jouppi, “An Enhanced Access and Cycle Time Model for On-chip Caches”, Digital 

Equipment Corporation Western Research Lab, Tech. Report 93/5, 1994. 

[5] C. Su and A. Despain, “Cache Design Trade-offs for Power and Performance Optimization: A Case Study”, 

International Symposium on Low Power Electronics and Design, pages 63-68, 1995. 

[6] P. Hicks, M. Walnock, R. M. Owens, “Analysis of Power Consumption in Memory Hierarchies”, International 

Symposium on Low Power Electronics and Design, pages 239-242, 1997. 

[7] A. Thordarson, “Comparison of Manual and Automatic Behavioral Synthesis of MPEG Algorithm”, Master’s 

thesis, University of California, Irvine, 1995. 

[8] D. Kirovski, C. Lee, M. Potkonjak, and W. Mangione-Smith, “Application-Driven Synthesis of Core-based 

Systems”, In Proceedings of the IEEE/ACM International Conference on Computer Aided Design, pages 104-107, 

San Jose, CA, November 1997. 

[9] M. E. Wolf and M. Lam. “A Data Locality Optimizing Algorithm.” In Proceedings of the SIGPLAN’91 

Conference on Programming Language Design and Implementation, pages 30-44, June 1991. 

[10] J. L. Hennessy and D. A. Patterson, “Computer Architecture: A Quantitative Approach”, 2nd edition, Morgan 

Kaufmann Publishers, 1996. 

[11] J. Edler and M. D. Hill, “Dinero IV Trace-Driven Uniprocessor Cache Simulator”, web site: 

http://www.neci.nj.nec.com/homepages/edler/d4 or http://www.cs.wisc.edu/~markhill/DineroIV.

DAC'99, pages 146-150 

Distributed Application Development with Inferno 

Ravi Sharma 

Inferno Network Software Solutions 

Bell Laboratories, Lucent Technologies, Freehold, NJ 07728 

ABSTRACT 

Distributed computing has taken on new importance in order to meet the requirements of users 

demanding information "anytime, anywhere." Inferno facilitates the creation and support of 

distributed services in the new and emerging world of network environments. These 

environments include a world of varied terminals, network hardware, and protocols. The 

Namespace is a critical Inferno concept that enables the participants in this network environment 

to deliver resources to meet the many needs of diverse users. This paper discusses the elements 

of the Namespace technology. Its simple programming model and network transparency are 

demonstrated through the design of an application that can have components in several different 

nodes in a network. The simplicity and flexibility of the solution are highlighted. 

Keywords: Inferno, InfernoSpaces, distributed applications, Styx, networking protocols. 

REFERENCES 

[1] Inferno Home Page. http://www.lucent.com/inferno. 

[2] Dorward, Sean M., et al, “The Inferno Operating System”, Bell Labs Technical Journal, Volume 2, Number 1 

(Winter 1997), pp. 5-18. 

[3] Mooken, Thomas, “Inferno, InfernoSpaces, and Distributed Computing”, Proceedings of the Embedded Systems 

Conference, Spring 1999, Chicago, IL. 

[4] Rau, Larry, “Inferno: One Hot OS”, BYTE, Volume 22, Issue 6 (June 1997), pp. 53-54. 

[5] Sharma, Ravi, “Inferno, Limbo take Java to coding task,” EE Times, January 1, 1997, p.60.

DAC'99, pages 151-156 

Embedded Application Design Using a Real-Time OS 

David Stepner, Nagarajan Rajan, David Hui 

Integrated Systems, Inc., Sunnyvale, CA 

You read about it everywhere: distributed computing is the next revolution, perhaps relegating 

our desktop computers to the museum. But in fact the age of distributed computing has been 

around for quite a while. Every time we withdraw money from an ATM, start our car, use our 

cell phone, or microwave our dinner, microprocessors are at work performing dedicated 

functions. These are examples of just a very few of the thousands of "embedded systems." 

Until recently the vast majority of these embedded systems used 8- and 16-bit microprocessors, 

requiring little in the way of sophisticated software development tools, including an Operating 

System (OS). But the breaking of the $5 threshold for 32-bit processors is now driving an 

explosion in high-volume embedded applications. And a new trend towards integrating a full 

system-on-a-chip (SOC) promises a further dramatic expansion for 32-bit embedded applications 

as we head into the 21st century…

DAC'99, pages 157-162 

The Jini Architecture: Dynamic Services in a Flexible Network 

Ken Arnold 

Sun Microsystems, Inc., Burlington, MA 01804 

ABSTRACT 

This paper gives an overview of the Jini™ architecture, which provides a federated infrastructure 

for dynamic services in a network. Services may be large or small. 

Keywords: Jini, Java, networks, distribution, distributed computing 

REFERENCES 

[1] The Jini Architecture Team, http://sun.com/jini/specs/. See also Arnold, K., O’Sullivan, B., Scheifler, R.W., 

Waldo, J., and Wollrath, A. The Jini Specification, Addison-Wesley, in press. 

[2] Arnold, K. and Gosling, J., The Java Programming Language, Second Edition, Addison-Wesley, ISBN 0-201-31006-6. 

[3] Gosling, J., Joy, W., and Steele, G., The Java Language Specification, Addison-Wesley, ISBN 0-201-63451-1. 

[4] Lindholm, T. and Yellin, F., The Java Virtual Machine Specification, Addison-Wesley, ISBN 0-201-63452-X. 

[5] Carriero, N. and Gelernter, D., How to Write Parallel Programs: A Guide to the Perplexed, ACM Computing 

Surveys, Sept., 1989 

[6] The Object Management Group, Common Object Request Broker: Architecture and Specification, OMG 

Document Number 91.12.1 (1991) 

[7] Rogerson, D., Inside COM, Microsoft Press (1997)

DAC'99, pages 163-168 

Verifying Large-Scale Multiprocessors Using an Abstract Verification Environment 

Dennis Abts, Mike Roberts 

Silicon Graphics Inc., Vector Systems Division, Chippewa Falls, WI 

Abstract 

The complexity of large-scale multiprocessors has burdened the design and verification process, 

making complexity-effective functional verification an elusive goal. We propose a solution to the 

verification of complex systems by introducing an abstracted verification environment called 

Raven. We show how Raven uses standard C/C++ to extend the capability of contemporary 

discrete-event logic simulators. We introduce new data types and a diagnostic programming 

interface (DPI) that provide the basis for Raven. Finally, we show results from an interconnect 

router ASIC used in a large-scale multiprocessor. 

References 

[1] James Laudon and Daniel Lenoski, “The SGI Origin: A cc-NUMA Highly Scalable Server,” Proceedings of the 

24th Annual International Symposium on Computer Architecture (ISCA-97), p. 241–251. 

[2] Ásgeir Th. Eiríksson, John Keen, Alex Silbey, Swami Venkataraman, and Michael Woodacre, “Origin System 

Design Methodology and Experience: 1M-gate ASICs and Beyond,” COMPCON-97. 

[3] John Keen and Jon Michelson, “How to Use the KML Language,” SGI Internal Report. 

[4] “Spec-based Verification: A New Methodology for Functional Verification of Systems/ASICs,” white paper, 

Verisity Design web page: www.verisity.com 

[5] Mehdi Mohtashemi, “High-Performance Functional Validation,” white paper, System Science Inc company web 

page: www.systems.com/products/vera/vera.htm 

[6] K.D. Jones and J.P. Privitera, “The Automatic Generation of Functional Test Vectors for Rambus Designs,” 

Proceedings of the 33rd Annual Design Automation Conference, June 1996, p. 415-420. 

[7] “The Verilog-XL Reference Manual,” Cadence Design Systems, 1991. 

[8] K. Robbins and S. Robbins, “Practical UNIX Programming,” Prentice Hall, 1996. p. 347-364. 

[9] “Synopsys VCS Reference Manual,” Synopsys, Inc., July, 1997. 

[10] Summit Design, Inc. web page: http://www.sd.com

DAC'99, pages 169-174 

Functional Verification of the Equator MAP1000 Microprocessor 

Jian Shen, Jacob Abraham, 

Computer Engineering Research Center, The University of Texas at Austin, Austin, TX 

Dave Baker, Tony Hurson, Martin Kinkade, Gregorio Gervasio, Chen-Chau Chu, Guanghui Hu 

Equator Technologies Inc., Austin, TX 

Abstract 

The Advanced VLIW architecture of the Equator MAP1000 processor has many features that 

present significant verification challenges. We describe a functional verification methodology to 

address this complexity. In particular, we present an efficient method to generate directed 

assembly tests and a novel technique using the processor itself to control self-tests and check the 

results at speed using native instructions only. We also describe the use of emulation in both pre-silicon 

and post-silicon verification stages. 

References 

[1] T. B. Alexander, K. A. Dickey, D. N. Goldberg, R. V. La Fetra, J. R. McGee, N. Noordeen, and A. Prakash. 

Verification, characterization, and debugging of the HP PA 7200 processor. In Hewlett-Packard Journal, pages 1– 

12, February 1996. 

[2] M. Kantrowitz and L.M. Noack. I’m Done Simulating; Now What? Verification Coverage Analysis and 

Correctness Checking of the DECchip 21164 Alpha microprocessor. In Proc. of the Design Automation Conf., pages 

325–333, June 1996. 

[3] S. T. Mangelsdorf, R. P. Gratias, R. M. Blumberg, and R. Bhatia. Functional verification of the HP PA 8000 

processor. In Hewlett-Packard Journal, pages 1–13, August 1997. 

[4] A. Aharon, D. Goodman, M. Levinger, Y. Lichtenstein, Y. Malka, C. Metzger, M. Molcho, and G. Shurek. Test 

Program Generation for Functional Verification of PowerPC Processors in IBM. In Proc. of the Design Automation 

Conf., pages 279–285, June 1995. 

[5] J. Shen and J. A. Abraham. Native Mode Functional Test Generation for Microprocessors with Applications to 

Self Test and Design Validation. In Proc. Intl. Test Conf., pages 990–999, 1998. 

[6] C. Hinchcliff. Simplified Microprocessor Test Generation. In Proc. Intl. Test Conf., pages 176–180, 1982. 

[7] A.J. van de Goor and O. Jansen. Self Test for the Intel 8085. In Microprocessing and Microprogramming, 

29:165–175, 1990.

DAC'99, pages 175-180 

Micro Architecture Coverage Directed Generation of Test Programs 

Shmuel Ur, Yoav Yadin 

IBM Haifa Research Lab 

Abstract 

In this paper, we demonstrate a method for generation of assembler test programs that 

systematically probe the micro architecture of a PowerPC superscalar processor. We show 

innovations such as ways to make small models for large designs, to predict, with cycle accuracy, 

the movement of instructions through the pipes (taking into account stalls and dependencies), and 

to generate test programs such that each reaches a new micro architectural state. We compare 

our method to the established practice of massive random generation and show that the quality of 

our tests, as measured by transition coverage, is much higher. The main contribution of this 

paper is not in theory, as the theory has been discussed in previous papers, but in describing how 

to translate this theory into practice, a task that was far from trivial. 
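
The transition coverage metric used for the comparison above can be illustrated with a short sketch. It assumes that a test is summarized as the sequence of model states it visits and that the set of legal transitions of the model is known; both the representation and the function names are assumptions, not the tooling described in the paper.

```python
def trace_transitions(trace):
    """Set of (state, next_state) pairs visited by one test trace."""
    return set(zip(trace, trace[1:]))


def transition_coverage(traces, model_transitions):
    """Fraction of the model's transitions exercised by a set of traces."""
    if not model_transitions:
        return 0.0
    covered = set()
    for trace in traces:
        # Ignore pairs that are not transitions of the model.
        covered |= trace_transitions(trace) & model_transitions
    return len(covered) / len(model_transitions)
```

A generator that targets uncovered transitions raises this fraction with every new test, whereas random tests tend to revisit transitions that are already covered.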

Bibliography 

1. B. Beizer, “The Pentium Bug, an Industry Watershed”, Testing Techniques Newsletter On-Line Edition, 

September 1995 

2. A. Aharon, D. Goodman, M. Levinger, Y. Lichtenstein, Y. Malka, C. Metzger, M. Molcho, G. Shurek “Test 

Program Generation for Functional Verification of PowerPC Processors in IBM”, In Proceedings of ACM/IEEE 

Design Automation Conference 1995 

3. E. Buchnik, S. Ur. “Compacting Regression Suites On-The-Fly” APSEC, December 1997 

4. Y. Lichtenstein, Y. Malka, A. Aharon “Model Based Test Generation for Processor Design Verification”, In 

Innovative Applications of Artificial Intelligence (IAAI) AAAI Press 1994 

5. A. M. Ahi, G.D. Burroughs, A.B. Gore, S.W. LaMar, C.R. Lin, A.L. Wieman, “Design Verification of the HP9000 

Series 7000 PA-RISC Workstations”, Hewlett-Packard-Journal num. 8 vol. 14 August 1992 

6. A. Chandra, V. Iyengar, D. Jameson, R. Jawalekar, I. Nair, B. Rosen, M. Mullen, J. Yoor, R. Armoni, D. Geist, Y. 

Wolfsthal “AVPGEN - A Test Case Generator for Architecture Verification”, IEEE Transactions on VLSI Systems 

6(6) June 1995 

7. D. Geist, M. Farkas, A. Landver, Y. Lichtenstein, S. Ur, Y. Wolfsthal “Coverage Directed Generation Using 

Symbolic Techniques”, FMCAD Conference November 96 

8. G. J. Holzmann, “Design and Validation of Computer Protocols”, Prentice Hall, Englewood Cliffs, NJ 1991 

9. K.L. McMillan “Symbolic Model Checking” Kluwer Academic Press, Norwell MA 1993 

10. K.L. McMillan “The SMV System DRAFT”, Carnegie Mellon University, Pittsburgh PA 1992 

11. A.K. Chandra, V.S. Iyengar, R.V. Jawalekar, M.P. Mullen, I. Nair, B.K. Rosen “Architectural Verification of 

Processors Using Symbolic Instruction Graphs”, In Proceedings of the International Conference on Computer 

Design, October 1994 

12. R. C. Ho, C. Han Yang, M. A. Horowitz, D. L. Dill “Architecture Validation for Processors” In ACM ISCA 1995 

13. H. Iwashita, S. Kowatari, T. Nakata, F. Hirose “Automatic Test Program Generation for Pipelined Processors”, 

In Proceedings of the International Conference on Computer Aided Design, November 1994 

14. I. Beer, M. Yoeli, S. Ben-David, D. Geist and R. Gewirtzman, “Methodology and System for Practical Formal 

Verification of Reactive Systems”, CAV94 Conference, LNCS818, pp 182-193 

15. T. A. Diep, J. P. Shen “Systematic Validation of Pipeline Interlock for Superscalar Microarchitectures” In 

Proceedings of the 25th Annual International Symposium on Fault Tolerance, June 1995 

16. C. May, E. Silha, R. Simpson, H. Warren editors “The PowerPC Architecture”, Morgan Kaufmann, 1994 

17. S. Weiss, J. E. Smith “POWER and PowerPC”, Morgan Kaufmann, 1994 

18. D. Lewin, D. Lorenz, S. Ur “A Methodology for Processor Implementation Verification”, FMCAD conference 

November 96 

19. A. Hosseini, D. Mavroidis and P. Konas “Code Generation and Analysis for the Functional Verification of 

Microprocessors”, In Proceedings of the 33rd Design Automation Conference, June, 1996.

20. M. Kantrowitz and L.M. Noack, “I’m Done Simulating: Now What? Verification Coverage Analysis and 

Correctness Checking of the DECchip 21164 Alpha Microprocessor”, In Proceedings of the 33rd Design Automation 

Conference, June, 1996.

DAC'99, pages 181-184 

Verification of a Microprocessor Using Real World Applications 

You-Sung Chang, Seungjong Lee, In-Cheol Park, and Chong-Min Kyung 

Dept. of EE, KAIST, Taejon, Korea 

Abstract 

In this paper, we describe a fast and convenient verification methodology for microprocessors 

using large-size, real application programs as test vectors. The verification environment is based 

on automatic consistency checking between the golden behavioral reference model and the target 

HDL model, which are run in a handshaking fashion. In conjunction with the automatic 

comparison facility, a new HDL saver is proposed to accelerate the verification process. The 

proposed saver allows 'restart' from the nearest checkpoint before the point of inconsistency 

detection, regardless of whether the source code has been modified. This is in 

contrast with a conventional saver, which does not allow a restart after a design change or 

debugging modification. We demonstrate the effectiveness of the environment by applying it to 

a real-world example, the design of a Pentium-compatible processor. It was shown that the 

HDL verification with the proposed saver can be faster and more flexible than the hardware 

emulation approach. In short, it was demonstrated that restartability with source code 

modification capability is very important in obtaining a short debugging turnaround time by 

eliminating a large number of redundant simulations. 
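
The general idea of lock-step consistency checking with periodic checkpoints can be sketched as follows. This is a toy illustration under strong assumptions, not the proposed HDL saver: both models are plain Python objects with a hypothetical step() method, checkpoints are deep copies, and restarting after a source code modification, which is the point of the paper, is not modeled.

```python
import copy


class ToyModel:
    """Hypothetical stand-in for a processor model: a single accumulator."""

    def __init__(self, buggy_at=None):
        self.acc = 0
        self.count = 0
        self.buggy_at = buggy_at  # step index at which to inject an error

    def step(self, value):
        self.acc += value
        if self.count == self.buggy_at:
            self.acc += 1  # injected design error
        self.count += 1
        return self.acc


def run_lockstep(ref, dut, program, checkpoint_every=1000):
    """Step both models together and compare their results after each step.

    Returns (mismatch_index, checkpoint), where checkpoint is a tuple
    (index, ref_copy, dut_copy) taken at the nearest checkpoint at or before
    the mismatch, or (None, None) if the models agree on the whole program.
    """
    checkpoint = (0, copy.deepcopy(ref), copy.deepcopy(dut))
    for i, instr in enumerate(program):
        if i % checkpoint_every == 0:
            checkpoint = (i, copy.deepcopy(ref), copy.deepcopy(dut))
        if ref.step(instr) != dut.step(instr):
            return i, checkpoint
    return None, None
```

Resuming from the returned checkpoint replays only the steps between the checkpoint and the mismatch, rather than the whole program.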

References 

[1] R. C. Ho, C. H. Yang, M. A. Horowitz, and D. L. Dill. “Architecture Validation for Processors”. Proceedings of 

the 22nd Annual International Symposium on Computer Architecture, pp. 404–413, 1995. 

[2] G. Ganapathy, R. Narayan, G. Jorden, and D. Fernandez. “Hardware Emulation for Functional Verification of 

K5”. Proceedings of 33rd Design Automation Conference, pp. 315–318, 1996. 

[3] V. Popescu and B. McNamara. “Innovative Verification Strategy Reduces Design Cycle Time For High-End 

SPARC Processor”. Proceedings of 33rd Design Automation Conference, pp. 311–314, 1996. 

[4] S. Mehta, S. Al-Ashari, D. Chen, D. Chen, S. Cokmez, P. Desai, R. Eltejaein, P. Fu, J. Gee, T. Granvold, A. Iyer, 

K. Lin, G. Maturana, D. McConn, H. Mohammed, J. Mostoufi, A. Moudgal, S. Nori, N. Parveen, G. Peterson, M. 

Splain, and T. Yu. “Verification of the UltraSPARC™ Microprocessor”. COMPCON, pp. 452–461, 1995. 

[5] J.-S. Yim, Y.-H. Hwang, C.-J. Park, H. Choi, W.-S. Yang, H.-S. Oh, I.-C. Park, and C.-M. Kyung. “A C-Based 

RTL Design Verification Methodology for Complex Microprocessor”. Proceedings of 34th Design Automation 

Conference, pp. 83–88, 1997. 

[6] S. Lee, Y.-S. Chang, S.-I. Park, I.-C. Park, and C.-M. Kyung. “An Efficient Approach to Functional Verification 

of Complex Processors”. Proceedings of International Conference on Computer Systems Technology for Industrial 

Applications - Chip Technology, pp. 87–92, 1998. 

[7] Verilog-XL Reference Manual Volume 1,2. Cadence Design Systems, 1995. 

[8] VCS User's Guide. Chronologic Simulation, 1996. 

[9] Programming Language Interface Reference Manual Volume 1,2. Cadence Design Systems, 1992. 

[10] W. R. Stevens. Advanced Programming in the UNIX Environment. Addison-Wesley Publishing Company, 

1992.

DAC'99, pages 185-188 

High-Level Test Generation for Design Verification of Pipelined Microprocessors 

David Van Campenhout, Trevor Mudge, and John P. Hayes 

Department of Electrical Engineering and Computer Science 

The University of Michigan, Ann Arbor, MI 48109-2122, USA 

Abstract 

This paper addresses test generation for design verification of pipelined microprocessors. To 

handle the complexity of these designs, our algorithm integrates high-level treatment of the 

datapath with low-level treatment of the controller, and employs a novel "pipe-frame" 

organization that exploits high-level knowledge about the operation of pipelines. We have 

implemented the proposed algorithm and used it to generate verification tests for design errors in 

a representative pipelined microprocessor. 

Keywords: design verification, sequential test generation, high-level test generation, pipelined 

microprocessors. 

References 

[1] M. S. Abadir, J. Ferguson, and T. E. Kirkland. “Logic design verification via test generation.” IEEE TCAD, vol. 

7, no. 1, pp. 138–148, Jan. 1988. 

[2] M. Abramovici. Digital systems testing and testable design. Computer Science Press, New York, 1990. 

[3] A. Aharon et al. “Verification of the IBM RISC System/6000 by dynamic biased pseudo-random test program 

generator.” IBM Systems Journal, pp. 527–538, 1991. 

[4] H. Al-Asaad and J. P. Hayes. “Design verification via simulation and automatic test pattern generation.” In Proc. 

ICCAD, 1995, pp. 174–180. 

[5] B. Beizer. Software testing techniques. Van Nostrand Reinhold, New York, 2nd edition, 1990. 

[6] V. Bhagwati and S. Devadas. “Automatic verification of pipelined microprocessors.” In Proc. DAC, 1994, pp. 

603–608. 

[7] D. Bhattacharya and J. P. Hayes. “High-level test generation using bus faults.” In Dig. FTCS, 1985, pp. 65–70. 

[8] J. Burch and D. L. Dill. “Automatic verification of pipelined microprocessor control.” In Proc. CAV, June 1994, 

pp. 68–80. 

[9] A. K. Chandra et al. “AVPGEN - a test generator for architecture verification.” IEEE Trans. on VLSI, pp. 188– 

200, 1995. 

[10] K.-T. Cheng. “Gate-level test generation for sequential circuits.” ACM TODAES, vol. 1, no. 4, pp. 405–442, 

1996. 

[11] F. Fallah, S. Devadas, and K. Keutzer. “OCCOM: Efficient computation of observability-based code coverage 

metric for functional simulation.” In Proc. DAC, 1998, pp. 152–157. 

[12] A. Gupta, S. Malik, and P. Ashar. “Toward formalizing a validation methodology using simulation coverage.” 

In Proc. DAC, 1997, pp. 740–745. 

[13] J. Hennessy and D. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann Publishers, 

San Mateo, Calif., 1990. 

[14] R. C. Ho and M. A. Horowitz. “Validation coverage analysis for complex digital designs.” In Proc. ICCAD, 

1996, pp. 146–151. 

[15] A. Hosseini, D. Mavroidis, and P. Konas. “Code generation and analysis for the functional verification of 

microprocessors.” In Proc. DAC, 1996, pp. 305–310. 

[16] H. Iwashita et al. “Automatic test program generation for pipelined processors.” In Proc. ICCAD, 1994, pp. 

580–583. 

[17] J. Lee and J. H. Patel. “An architectural level test generator based on nonlinear equation solving.” Journal of 

Electronic Testing: Theory and Applications, vol. 4, no. 2, pp. 137–150, 1993. 

[18] J. Lee and J. H. Patel. “Architectural level test generation for microprocessors.” IEEE TCAD, pp. 1288–1300, 

1994. 

[19] J. Levitt and K. Olukotun. “Verifying correct pipeline implementation for microprocessors.” In Proc. ICCAD, 

1997, pp. 162–169.

[20] D. Lewin, D. Lorenz, and S. Ur. “A methodology for processor implementation verification.” In Proc. FMCAD, 

1996, pp. 126–142. 

[21] D. Moundanos, J. A. Abraham, and Y. V. Hoskote. “Abstraction techniques for validation coverage analysis 

and test generation.” IEEE Trans. Computers, vol. 47, no. 1, pp. 2–14, Jan. 1998. 

[22] T. Niermann and J. H. Patel. “HITEC: A test generation package for sequential circuits.” In Proc. EDAC, 

1991, pp. 214–218. 

[23] S. Taylor et al. “Functional verification of a multiple-issue, out-of-order, superscalar Alpha processor - the 

DEC Alpha 21264 microprocessor.” In Proc. DAC, 1998, pp. 638–643. 

[24] D. Van Campenhout et al. “High-level design verification of microprocessors via error modeling.” ACM 

TODAES, vol. 3, no. 4, pp. 581–599, 1998.

DAC'99, pages 189-194 

Developing an Architecture Validation Suite: Application to the PowerPC Architecture 

Laurent Fournier, Anatoly Koyfman, Moshe Levinger 

IBM Research Lab in Haifa 

Abstract 

This paper describes the efforts made and the results of creating an Architecture Validation Suite 

for the PowerPC architecture. Although many functional test suites are available for multiple 

architectures, little has been published on how these suites are developed and how their quality 

should be measured. This work provides some insights for approaching the difficult problem of 

building a high quality functional test suite for a given architecture. By defining a set of generic 

coverage models that combine program-based, specification-based, and sequential bug-driven 

models, it establishes the groundwork for the development of architecture validation suites for 

any architecture. 

Bibliography 

[1] Y. Lichtenstein, Y. Malka and A. Aharon, Model-Based Test Generation For Processor Design Verification, 

Innovative Applications of Artificial Intelligence (IAAI), AAAI Press, 1994. 

[2] A. Aharon, D. Goodman, M. Levinger, Y. Lichtenstein, Y. Malka, C. Metzger, M. Molcho, and G. Shurek,

Test-Program Generation for Functional Verification of PowerPC Processors in IBM, DAC 95, San Francisco,

pp. 279-285. 

[3] M. Scheitrum and A. Smith, Behavioral Verification and its application to Pentium Class Processors, PCI 

Developers’ Conference, 1995. 

[4] Y. Arbetman, L. Fournier, M. Levinger, Functional Verification of Microprocessors Using the Genesys Test 

Program Generation - Application to the X86 Microprocessor family. DATE 99. 

[5] D.A. Patterson and J.L. Hennessy, Computer Organization & Design The Hardware/Software Interface, Morgan 

Kaufmann, San Francisco, 1994. 

[6] E. Buchnick, S. Ur, On Minimizing Regression-Suites using On-Line Set-Cover, EuroStar 97. 

[7] Y. Abarbanel-Vinov, S. Ur, Processor Bug Classification and Modelling, Internal IBM Haifa publication. 

[8] S. Ur, A. Ziv, R. Grinwald, E. Harel, M. Orgad, User defined coverage - A Tool Supported Methodology for 

Design Verification, DAC 98. 

[9] S. Ur and A. Ziv, Off-The-Shelf Vs. Custom Made Coverage Models, Which is the one for You? STAR98. May 

1998.

DAC'99, pages 195-200 

Passive Reduced-Order Models for Interconnect Simulation and their Computation via 

Krylov-Subspace Algorithms 

Roland W. Freund 

Bell Laboratories, Lucent Technologies, Murray Hill, NJ 07974–0636, USA 

Abstract 

This paper studies a projection technique based on block Krylov subspaces for the computation 

of reduced-order models of multiport RLC circuits. We show that these models are always 

passive, yet they still match at least half as many moments as the corresponding reduced-order 

models based on matrix-Padé approximation. For RC, RL, and LC circuits, the reduced-order 

models obtained by projection and matrix-Padé approximation are identical. For general RLC 

circuits, we show how the projection technique can easily be incorporated into the SyMPVL 

algorithm to obtain passive reduced-order models, in addition to the high-accuracy matrix-Padé 

approximations that characterize SyMPVL, at essentially no extra computational costs. 

Connections between SyMPVL and the recently proposed reduced-order modeling algorithm 

PRIMA are also discussed. Numerical results for interconnect simulation problems are reported. 

References 

[1] J.I. Aliaga, D.L. Boley, R.W. Freund, and V. Hernández, “A Lanczos-type algorithm for multiple starting 

vectors,” revised version, Numerical Analysis Manuscript No. 98–3–05, Bell Laboratories, Murray Hill, NJ, Sep. 

1998. Also available online from http://cm.bell-labs.com/cs/doc/98. 

[2] B.D.O. Anderson and S. Vongpanitlerd, Network Analysis and Synthesis, Englewood Cliffs, NJ: Prentice-Hall, 

1973. 

[3] W.E. Arnoldi, “The principle of minimized iterations in the solution of the matrix eigenvalue problem,” Quart. 

Appl. Math., vol. 9, pp. 17–29, 1951. 

[4] Z. Bai, P. Feldmann, and R.W. Freund, “How to make theoretically passive reduced-order models passive in 

practice,” In Proc. IEEE 1998 CICC, May 1998. 

[5] G.A. Baker, Jr. and P. Graves-Morris, Padé Approximants, 2nd Edition, New York: Cambridge University Press, 

1996. 

[6] I.M. Elfadel and D.D. Ling, “Zeros and passivity of Arnoldi-reduced-order models for interconnect networks,” in 

Proc. 34th ACM/IEEE DAC, Jun. 1997. 

[7] P. Feldmann and R.W. Freund, “Efficient linear circuit analysis by Padé approximation via the Lanczos 

process,” IEEE Trans. Computer-Aided Design, vol. 14, pp. 639–649, May 1995. 

[8] P. Feldmann and R.W. Freund, “Reduced-order modeling of large linear subcircuits via a block Lanczos 

algorithm,” in Proc. 32nd ACM/IEEE DAC, June 1995. 

[9] P. Feldmann and R.W. Freund, “Interconnect-delay computation and signal-integrity verification using the 

SyMPVL algorithm,” in Proc. 1997 ECCTD, Sep. 1997. 

[10] R.W. Freund, “Computation of matrix Padé approximations of transfer functions via a Lanczos-type process,” 

in Approximation Theory VIII, Vol. 1, C.K. Chui and L.L. Schumaker, eds., World Scientific Publishing Co., 1995. 

[11] R.W. Freund, “Passive reduced-order models for interconnect simulation and their computation via Krylov-subspace 

algorithms,” Numerical Analysis Manuscript No. 98–3–06, Bell Laboratories, Murray Hill, NJ, Oct. 1998. 

Also available online from http://cm.bell-labs.com/cs/doc/98. 

[12] R.W. Freund and P. Feldmann, “Reduced-order modeling of large passive linear circuits by means of the 

SyPVL algorithm,” in Tech. Dig. 1996 IEEE/ACM ICCAD, Nov. 1996. 

[13] R.W. Freund and P. Feldmann, “The SyMPVL algorithm and its applications to interconnect simulation,” in 

Proc. SISPAD’97, IEEE, Sep. 1997. 

[14] R.W. Freund and P. Feldmann, “Reduced-order modeling of large linear passive multiterminal circuits using 

matrix-Padé approximation,” in Proc. DATE’98, Feb. 1998. 

[15] C. Lanczos, “An iteration method for the solution of the eigenvalue problem of linear differential and integral 

operators,” J. Res. Nat. Bur. Standards, vol. 45, pp. 255–282, 1950. 

[16] A. Odabasioglu, Provably passive RLC circuit reduction, M.S. thesis, Carnegie Mellon University, 1996.

[17] A. Odabasioglu, M. Celik, L.T. Pileggi, “PRIMA: passive reduced-order interconnect macromodeling 

algorithm,” in Tech. Dig. 1997 IEEE/ACM ICCAD, Nov. 1997. 

[18] L.T. Pillage and R.A. Rohrer, “Asymptotic waveform evaluation for timing analysis,” IEEE Trans.

Computer-Aided Design, vol. 9, pp. 352–366, Apr. 1990. 

[19] L.M. Silveira, M. Kamon, I. Elfadel, and J. White, “A coordinate-transformed Arnoldi algorithm for generating 

guaranteed stable reduced-order models of RLC circuits,” in Tech. Dig. 1996 IEEE/ACM ICCAD, Nov. 1996. 

[20] M.R. Wohlers, Lumped and Distributed Passive Networks, New York: Academic Press, 1969.

DAC'99, pages 201-206 

Model Order-Reduction of RC(L) Interconnect including Variational Analysis 

Ying Liu, Lawrence T. Pileggi and Andrzej J. Strojwas 

Department of Electrical and Computer Engineering 

Carnegie Mellon University, Pittsburgh, PA 15213 

ABSTRACT 

As interconnect feature sizes continue to scale to smaller dimensions, long interconnect can 

dominate the IC timing performance, but the interconnect parameter variations make it difficult 

to predict these dominant delay extremes. This paper presents a model order-reduction technique 

for RLC interconnect circuits that includes variational analysis to capture manufacturing 

variations. Matrix perturbation theory is combined with dominant-pole-analysis and Krylov-subspace-analysis 

methods to produce reduced-order models with direct inclusion of statistically 

independent manufacturing variations. The accuracy of the resulting variational reduced-order 

models is demonstrated on several industrial examples. 

REFERENCES 

[1] Boning, D.S. and J.E. Chung, “Statistical metrology-measurement and modeling of variation for advanced 

process development and design rule generation”, Proc. 1998 Int. Conf. on Characterization and Metrology for 

ULSI Technology, March 1998. 

[2] Director, S.W. and R.A. Rohrer, “The generalized adjoint network and network sensitivities”, IEEE Trans. Circuit 

Theory, vol. CT-16, No. 3, August 1969. 

[3] Feldmann, P. and R.W. Freund, “Efficient linear circuit analysis by Padé approximation via the Lanczos 

process”, IEEE Trans. CAD, vol. 14, May 1995. 

[4] Golub, G.H. and C.F. Van Loan, Matrix computations, 3rd ed., The Johns Hopkins University Press, Baltimore 

1996. 

[5] Harkness, C.L. and D.P. Lopresti, “Interval methods for modeling uncertainty in RC timing analysis”, IEEE 

Trans. CAD, vol. 11, No. 11, November 1992. 

[6] Kato, T., Perturbation theory for linear operators, 2nd ed., Springer-Verlag, 1995. 

[7] Kemble, E.C., The fundamental principles of quantum mechanics, Dover, 1958. 

[8] Kerns, K.J. and A.T. Yang, “Stable and efficient reduction of large, multiport RC networks by pole analysis via 

congruence transformations”, IEEE Trans. CAD, vol. 16, 1997. 

[9] Odabasioglu, A., M. Celik and L.T. Pileggi, “PRIMA: passive reduced-order interconnect macromodeling 

algorithm”, IEEE Trans. CAD, August 1998. 

[10] Pillage, L.T. and R.A. Rohrer, “Asymptotic waveform evaluation for timing analysis”, IEEE Trans. CAD, vol. 

9, April 1990. 

[11] Progler, C., H. Du and G. Wells, “Potential causes of across field CD variation”, SPIE, vol. 3051, 1997. 

[12] Rubinstein, J., P. Penfield Jr. and M.A. Horowitz, “Signal delay in RC tree networks”, IEEE Trans. CAD, vol. 

CAD-2, July 1983. 

[13] Silveira, L.M., M. Kamon and J. White, “Efficient reduced-order modeling of frequency-dependent coupling 

inductances associated with 3-D interconnect structures”, Proc. 32nd ACM/IEEE Design Automation Conference, 

June 1995. 

[14] Stewart, G.W. and J.-G. Sun, Matrix perturbation theory, Academic Press, Inc., San Diego, 1990. 

[15] Stine, B.E. et al., “The physical and electrical effects of metal filling patterning practices for oxide chemical 

mechanical polishing processes”, IEEE Trans. Electron Devices, vol. 45, No. 3, March 1998. 

[16] Stine, B.E. et al., “Rapid characterization and modeling of spatial variation: a CMP case study”, CMP Metrology 

Session, Semicon West ‘97, July 1997.

DAC'99, pages 207-212 

Robust Rational Function Approximation Algorithm for Model Generation 

Carlos P. Coelho (1), Joel R. Phillips (2), L. Miguel Silveira (1) 

(1) INESC / Cadence European Laboratories, Dept. of Electrical and Computer Engineering, 

Instituto Superior Técnico, Lisboa, 1000 Portugal 

(2) Cadence Design Systems, San Jose, CA, 95134 

Abstract 

The problem of computing rational function approximations to tabulated frequency data is of 

paramount importance in the modeling arena. In this paper we present a method for generating a 

state space model from tabular data in the frequency domain that solves some of the numerical 

difficulties associated with the traditional fitting techniques used in linear least squares 

approximations. An extension to the MIMO case is also derived. 

References 

[1] L. Miguel Silveira, Ibrahim M. Elfadel, and Jacob K. White. Efficient Frequency-Domain Modeling and Circuit 

Simulation of Transmission Lines. In Proceedings of the 31st Design Automation Conference, pages 634–639, San 

Diego, CA, June 1994. 

[2] Tuyen V. Nguyen, Jing Li, and Zhaojun Bai. Dispersive coupled transmission line simulation using an adaptive 

block Lanczos algorithm. In International Custom Integrated Circuits Conference, pages 457–460, 1996. 

[3] Guowu Zheng, Qi-Jun Zhang, Michel Nakhla, and Ramachandra Achar. An efficient approach to moment-matching 

simulation of linear subnetworks with measured or tabulated data. In International Conference on 

Computer-Aided Design, pages 20–23, San Jose, California, November 1996. 

[4] A. Deutsch et al. When are transmission line effects important for on-chip interconnections? IEEE Trans. 

Microwave Theory and Techniques, 45:1836–1846, October 1997. 

[5] J. R. Phillips, E. Chiprout, and D. D. Ling. Efficient full-wave electromagnetic analysis via model-order 

reduction of fast integral transforms. In Proceedings 33rd Design Automation Conference, Las Vegas, Nevada, June 

1996. 

[6] J. E. Dennis, Jr. and Robert B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear 

Equations. Series in Computational Mathematics. Prentice Hall, 1983. 

[7] Gene H. Golub and Charles F. Van Loan. Matrix Computations. Series in the Mathematical Sciences. The Johns 

Hopkins University Press, Baltimore, Maryland, third edition, 1996. 

[8] Yousef Saad. Iterative Methods for Sparse Linear Systems. PWS Publishing Co., 1996. 

[9] Lloyd N. Trefethen and David Bau. Numerical Linear Algebra. SIAM, 1999. 

[10] Andreas Antoniou. Digital filters analysis, design and applications. McGraw-Hill International Editions, 2nd 

edition, 1993. 

[11] Thomas Kailath. Linear Systems. Information and System Science Series. Prentice-Hall, Englewood Cliffs, 

New Jersey, First edition, 1980. 

[12] R. S. Varga. Matrix Iterative Analysis. Automatic Computation Series. Prentice-Hall Inc, Englewood Cliffs, 

New Jersey, 1962. 

[13] Zhaojun Bai, Peter Feldmann, and Roland W. Freund. Stable and passive reduced order models based on partial 

Padé approximation via the Lanczos process. Technical Report Numerical Analysis Manuscript No.97-3-10, Bell 

Laboratories, Lucent Technologies, Murray Hill, New Jersey, October 1997. 

[14] Eli Chiprout and Michael S. Nakhla. Analysis of interconnect networks using complex frequency hopping 

(CFH). IEEE Trans. CAD, 14(2):186–200, February 1995.

DAC'99, pages 213-218 

Behavioral Network Graph: Unifying the Domains of High-Level and Logic Synthesis 

Reinaldo A. Bergamaschi 

IBM T. J. Watson Research Center, NY, USA 

Abstract 

High-level synthesis operates on internal models known as control/data flow graphs (CDFG) and 

produces a register-transfer-level (RTL) model of the hardware implementation for a given 

schedule. For high-level synthesis to be efficient it has to estimate the effect that a given 

algorithmic decision (e.g., scheduling, allocation) will have on the final hardware 

implementation (after logic synthesis). Currently, this effect cannot be measured accurately 

because the CDFGs are very distinct from the RTL/gate-level models used by logic synthesis, 

precluding interaction between high-level and logic synthesis. This paper presents a solution to 

this problem consisting of a novel internal model for synthesis which spans the domains of high-level 

and logic synthesis. This model is an RTL/gate-level network capable of representing all 

possible schedules that a given behavior may assume. This representation allows high-level 

synthesis algorithms to be formulated as logic transformations and effectively interleaved with 

logic synthesis. 

References 

[1] P. G. Paulin and J. P. Knight, “Force-directed scheduling for the behavioral synthesis of ASIC's,” IEEE 

Transactions on Computer-Aided Design, vol. CAD-8, pp. 661-679, June 1989. 

[2] R. Bergamaschi and S. Raje, “Control-flow versus data-flow-based scheduling: Combining both approaches in an 

adaptive scheduling system,” IEEE Transactions on VLSI Systems, vol. 5, March 1997. 

[3] M. C. McFarland, “The Value Trace: A data base for automated digital design,” Tech. Rep. DRC-01-4-80, 

Design Research Center, Carnegie-Mellon University, December 1978. 

[4] R. Camposano and R. M. Tabet, “Design representation for the synthesis of behavioral VHDL models,” in 

Proceedings 9th International Symposium on Computer Hardware Description Languages and Their Applications, 

(Washington, D.C.), pp. 49-58, Elsevier Science Publishers B.V., June 1989. 

[5] J. Darringer, W. Joyner, C. Berman, and L. Trevillyan, “Logic synthesis through local transformations,” IBM 

Journal of Research and Development, vol. 25, July 1981. 

[6] R. Rudell, “Tutorial: Design of a logic synthesis system,” in Proceedings of the 33rd ACM/IEEE Design 

Automation Conference, (Las Vegas, NV), pp. 191-196, ACM/IEEE, June 1996. 

[7] L. Stok, D. S. Kung, A. D. Brand, A. J. Sullivan, L. N. Reddy, N. Hieter, D. J. Geiger, H. H. Chao, and P. J. 

Osler, “Boole-Dozer: Logic synthesis for ASICs,” IBM Journal of Research and Development, vol. 40, pp. 407-430, 

July 1996. 

[8] G. Goossens, J. Vandewalle, and H. De Man, “Loop optimization in register-transfer scheduling for DSP systems,” 

in Proceedings of the 26th ACM/IEEE Design Automation Conference, pp. 826-831, ACM/IEEE, June 1989. 

[9] A. Aho, R. Sethi, and J. Ullman, Compilers, Principles, Techniques and Tools. Reading, MA: Addison-Wesley, 

1986. 

[10] R. A. Bergamaschi and D. J. Allerton, “A graph-based silicon compiler for concurrent VLSI systems,” in 

Proceedings of the IEEE CompEuro Conference, (Brussels), pp. 36-47, IEEE, April 1988. 

[11] R. Camposano, “Path-based scheduling for synthesis,” IEEE Transactions on Computer-Aided Design, vol. 

CAD-10, pp. 85-93, January 1991. 

[12] K. O'Brien, M. Rahmouni, and A. Jerraya, “A VHDL-based scheduling algorithm for control-flow dominated 

circuits,” in Sixth International Workshop on High-Level Synthesis, (Dana Point, CA), ACM, November 1992.

DAC'99, pages 219-224 

Soft Scheduling in High Level Synthesis 

Jianwen Zhu, Daniel D. Gajski 

CECS, Information and Computer Science, 

University of California, Irvine, CA 92717-3425, USA 

Abstract 

In this paper, we establish a theoretical framework for a new concept of scheduling called soft 

scheduling. In contrast to traditional schedulers, referred to as hard schedulers, soft schedulers 

make soft decisions, that is, decisions that can be adjusted later. Soft scheduling has the 

potential to alleviate the phase coupling problem that has plagued traditional high level synthesis 

(HLS), HLS for deep submicron design and VLIW code generation. We then develop a specific 

soft scheduling formulation, called threaded schedule, under which a linear, optimal (in the sense 

of online optimality) algorithm is guaranteed. 

References 

[1] D. Gajski, N. Dutt, A. Wu, S. Lin. High Level Synthesis: Introduction to Chip and System Design, Kluwer 


[2] J. Nestor and D.E. Thomas. Behavioral Synthesis with Interfaces. Proceedings of the IEEE Conference on 

Computer Aided Design, November 1986. 

[3] P.G. Paulin, J.P. Knight. Force-Directed Scheduling for the Behavioral Synthesis of ASIC’s. IEEE Transactions 

on Computer-Aided Design of Integrated Circuits and Systems, June 1989. 

[4] D. Ku, G. De Micheli. Relative Scheduling under Timing Constraints: Algorithms for High-Level Synthesis of 

Digital Circuits. IEEE Transactions on CAD/ICAS, Vol. 11, No. 6, April 1992. 

[5] R. Camposano. Path-Based Scheduling for Synthesis. IEEE Transactions on CAD/ICAS, Vol. 10, No. 1, January, 

1991. 

[6] C.H. Gebotys, M.I. Elmasry. Simultaneous Scheduling and Allocation for Cost Constrained Optimal 

Architectural Synthesis. Proceedings of 28th DAC, 1991. 

[7] B. Landwehr, P. Marwedel, R. Dömer. Optimum Simultaneous Scheduling, Allocation and Resource Binding 

Based on Integer Programming. Proceedings of Euro-DAC, 1994. 

[8] J. Weng, A.C. Parker. 3D Scheduling: High-Level Synthesis with Floorplanning. Proceedings of DAC, 1991. 

[9] C. Ewering. Automatic High-Level Synthesis of Partitioned Busses. Proceedings of EuroDAC, 1990. 

[10] M. Xu, F.J. Kurdahi. Layout-driven RTL Binding Techniques for High-Level Synthesis. Proceedings of 9th ISSS, 

1996. 

[11] R. Cytron, J. Ferrante, B.K. Rosen, M.N. Wegman, F.K. Zadeck. Efficiently Computing Static Single 

Assignment Form and the Control Dependence Graph. ACM Transactions on Programming Languages and 

Systems, October, 1991. 

[12] J. Zhu, D.D. Gajski. Soft Scheduling in High Level Synthesis. Technical Report ICS-98-37, Information and 

Computer Science, UC, Irvine, August, 1998.

DAC'99, pages 225-230 

Graph Coloring Algorithms for Fast Evaluation of Curtis Decompositions 

Marek Perkowski, Rahul Malvi, Stan Grygiel, Mike Burns, Alan Mishchenko 

Portland State University, Department of Electrical and Computer Engineering, Portland, OR 

Abstract 

Finding the minimum column multiplicity for a bound set of variables is an important problem in 

Curtis decomposition. To investigate this problem, we compared two graph-coloring programs: 

one exact, and another one based on heuristics which can give, however, provably exact results 

on some types of graphs. These programs were incorporated into the multi-valued decomposer 

MVGUD. We proved that the exact graph coloring is not necessary for high-quality functional 

decomposers. Thus we improved by orders of magnitude the speed of the column multiplicity 

problem, with very little or no sacrifice of decomposition quality. Comparison of our 

experimental results with competing decomposers shows that for nearly all benchmarks our 

solutions are the best and the run time is usually not too high. 
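The link between column multiplicity and graph coloring mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each column of the decomposition chart is a tuple of values with `None` marking a don't-care, that two columns are incompatible when they differ on some specified row, and that a largest-degree-first greedy coloring stands in for the heuristic colorer. The function names `incompatible` and `greedy_column_multiplicity` are hypothetical, and this is not the MVGUD implementation; the result is only an upper bound on the minimum column multiplicity.

```python
from itertools import combinations


def incompatible(col_a, col_b):
    """Two columns conflict if some row holds different specified values.

    None is used here (as an assumption) to mark a don't-care entry.
    """
    return any(a is not None and b is not None and a != b
               for a, b in zip(col_a, col_b))


def greedy_column_multiplicity(columns):
    """Greedy coloring of the column incompatibility graph.

    Returns (number_of_colors, coloring). Columns sharing a color are
    pairwise compatible, so the color count bounds the column
    multiplicity from above.
    """
    n = len(columns)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if incompatible(columns[i], columns[j]):
            adj[i].add(j)
            adj[j].add(i)

    color = {}
    # Visit vertices by decreasing degree, a common generic heuristic.
    for v in sorted(range(n), key=lambda v: -len(adj[v])):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return (max(color.values()) + 1 if color else 0), color


# Hypothetical chart with four columns over two rows.
example_columns = [(0, 1), (0, None), (1, 1), (None, 1)]
num_colors, coloring = greedy_column_multiplicity(example_columns)
```

An exact colorer would return the chromatic number of the same graph; the greedy pass trades that guarantee for speed, which is the kind of trade-off the abstract evaluates.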

REFERENCES 

[1] A.V. Anisimov, “Local Optimization of Colorings of Graphs,” Cybernetics, Vol. 22, No. 6, pp. 683-692, 1986. 

[2] L. Babel, “Finding Maximum Cliques in Arbitrary and in Special Graphs,” Comp. Vol 46, pp. 321-341, 1991. 

[3] E.A. Bender and H.S. Wilf, “A Theoretical Analysis of Backtracking in the Graph Coloring Problem,” J. of Alg., 

Vol. 6, pp. 275-282. 

[4] A. Blum, “New approximation algorithms for graph coloring,” JACM, Vol. 41, No. 3, pp. 470-516, 1994. 

[5] P. Briggs, K. Cooper, K. Kennedy, and L. Torczon, “Coloring Heuristics for Register Allocation,” ASCM Conf. 

on Progr. Lan. Des. Impl, pp. 275-284, ACM, 1989. 

[6] M. Burns, M. Perkowski, L. Jozwiak, “An Efficient Approach to Decomposition of Multi-Output Boolean 

Functions with Large Sets of Bound Variables,” Proc. 1998 Euromicro, pp. 16-23, Vasteras, Sweden, August 25-27, 

1998. 

[7] A.N. Chebotarev, “Approach to Functional Specification of Automata,” Kibernetika i Sistemnyj Analiz, No. 3, 

pp. 31-42, 1991 (in Russian). 

[8] M.J. Ciesielski, S. Yang, and M.A. Perkowski, “Multiple-Valued Minimization Based on Graph Coloring,” 

Proc. of ICCD’89, pp. 262 - 265, Oct. 1989. 

[9] O. Coudert, “Coloring of Real-Life Graphs is Easy,” Proc. DAC’97, 1997. 

[10] H.A. Curtis, “A New Approach to the Design of Switching Circuits,” Van Nostrand, Princeton, N.J., 1962. 

[11] R. Diestel, “Graph Theory,” Springer, 1997. 

[12] E. C. Freuder, “A Sufficient Condition of Backtrack-Free Search,” JACM, Vol. 29, No. 1, pp. 24-32, 1982. 

[13] M.R. Garey, and D.S. Johnson, “The Complexity of Near-Optimal Graph Coloring,” JACM, Vol. 23, No. 1, 

Jan. 1976, pp. 43-49. 

[14] I.M. Gessel, “A coloring problem,” The Am. Math. Monthly, Vol. 98, pp. 530-533, 1991. 

[15] S. Grygiel, M. Perkowski, M. Marek-Sadowska, T. Luba, L. Jozwiak, “Cube Diagram Bundles: A New 

Representation of Strongly Unspecified Multiple-Valued Functions and Relations,” Proc. ISMVL’97, pp. 287-292. 

[16] S. Grygiel, M. Perkowski, “New Compact Representation of Multiple-Valued Functions, Relations, and 

Non-Deterministic State Machines,” Proc. ICCD’98, Oct. 5, 1998. 

[17] T.R. Jensen, and B. Toft, “Graph Coloring Problems,” Wiley, 1995. 

[18] T. Luba, M. Mochocki, J. Rybnik, “Decomposition of Information Systems Using Decision Tables,” Bull. Pol. 

Acad. Sci., Techn. Sci., Vol. 41, No. 3, 1993. 

[19] A.A. Mishchenko, “A CAD System for Automated Synthesis of Controlling Automata,” Cybernetics and 

System Analysis, Plenum Press, No. 3, pp. 23-30, 1997. 

[20] C. Morgenstern, “Improved Implementations of Dynamic Sequential Coloring Algorithms,” Dept. Comp. Sci, 

Texas Christ. Univ, 1991. 

[21] L. Nguyen, M. Perkowski, and N. Goldstein, “PALMINI – Fast Boolean Minimizer for Personal Computers,” 

Proc. of DAC, pp. 615-621, 1987.

[22] M.A. Perkowski, “A New Representation of Strongly Unspecified Switching Functions and its Application to 

Multi-Level AND/OR/EXOR Synthesis.” Proc. RM’95 Work, 1995, pp. 143-151. 

[23] M. Perkowski, M. Marek-Sadowska, L. Jozwiak, M. Nowicka, R. Malvi, Z. Wang, J. Zhang, “Decomposition 

of Multiple-Valued Relations,” Proc. ISMVL’97, pp. 13-18. 

[24] T.D. Ross, M.J. Noviskey, T.N. Taylor, D.A. Gadd, “Pattern Theory: An Engineering Paradigm for Algorithm 

Design,” Final Technical Report WL-TR-91-1060, Wright Laboratories, USAF, WL/AART/WPAFB, OH 45433-6543, 

August 1991. 

[25] W. Wan, M.A. Perkowski, “A New Approach to the Decomposition of Incompletely Specified Multi-Output 

Functions Based on Graph-Coloring and Local Transformations and its Application to FPGA mapping,” Proc. of 

EURO-DAC’92, pp. 230-235, 1992.

DAC'99, pages 231-236 

Maximizing Performance by Retiming and Clock Skew Scheduling 

Xun Liu, Marios C. Papaefthymiou 

Department of Electrical Engineering and Computer Science, University of Michigan 

Ann Arbor, Michigan 48109 

Eby G. Friedman 

Department of Electrical and Computer Engineering, University of Rochester 

Rochester, New York 14627 

Abstract 

The application of retiming and clock skew scheduling for improving the operating speed of 

synchronous circuits under setup and hold constraints is investigated in this paper. It is shown 

that when both long and short paths are considered, circuits optimized by the simultaneous 

application of retiming and clock scheduling can achieve shorter clock periods than optimized 

circuits generated by applying either of the two techniques separately. A mixed-integer linear 

programming formulation and an efficient heuristic are given for the problem of simultaneous 

retiming and clock skew scheduling under setup and hold constraints. Experiments with 

benchmark circuits demonstrate the efficiency of this heuristic and the effectiveness of the 

combined optimization. All of the test circuits show improvement. For more than half of them, 

the maximum operating speed increases by more than 21% over the optimized circuits obtained 

by applying retiming or clock skew scheduling separately. 
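The clock skew scheduling half of this problem can be sketched on its own as a system of difference constraints. The sketch below is not the mixed-integer formulation or the heuristic of the paper, and it does no retiming. It assumes the usual textbook form of the constraints: for a combinational path from register i to register j with delays between d_min and d_max, setup requires s_i + d_max + t_setup <= s_j + T and hold requires s_i + d_min >= s_j + t_hold, where s_i is the clock arrival time at register i and T is the clock period. Feasibility for a given period is checked with Bellman-Ford. The function name `feasible_skews` and the tuple layout of `paths` are hypothetical.

```python
def feasible_skews(n, paths, period, t_setup=0.0, t_hold=0.0):
    """Return clock arrival times satisfying setup and hold, or None.

    n      : number of registers, indexed 0..n-1
    paths  : list of (i, j, d_min, d_max), one per combinational path i -> j
    period : candidate clock period T
    """
    # An edge (u, v, w) encodes the difference constraint s_v - s_u <= w.
    edges = []
    for i, j, d_min, d_max in paths:
        # Setup: s_i - s_j <= T - d_max - t_setup
        edges.append((j, i, period - d_max - t_setup))
        # Hold:  s_j - s_i <= d_min - t_hold
        edges.append((i, j, d_min - t_hold))

    # Bellman-Ford from an implicit source tied to every register at 0.
    dist = [0.0] * n
    for _ in range(n):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return dist  # converged: the constraints are consistent
    return None  # still relaxing after n passes: negative cycle, infeasible


# Hypothetical two-register loop: a slow path 0 -> 1 and a fast path 1 -> 0.
example_paths = [(0, 1, 2.0, 6.0), (1, 0, 2.0, 2.0)]
skews_at_4 = feasible_skews(2, example_paths, 4.0)
skews_at_3_9 = feasible_skews(2, example_paths, 3.9)
```

In this hypothetical loop, zero skew would need a period of at least 6 time units because of the slow path, while the skewed schedule is feasible at a period of 4 and infeasible at 3.9. A binary search over `period` would locate the minimum feasible value; the abstract's point is that also moving registers by retiming can reduce the period further than skew alone.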

References 

[1] S. Chakradhar and S. Dey. Resynthesis and retiming for optimum partial scan. In Proceedings of the 31st 

ACM/IEEE Design Automation Conf., pages 87–93, June 1994. 

[2] L.-F. Chao and E. H.-M. Sha. Retiming and clock skew for synchronous systems. In Proc. International Symp. 

on Circuits and Systems, pages 283–286, June 1994. 

[3] R. B. Deokar and S. S. Sapatnekar. A graph-theoretic approach to clock skew optimization. In Proc. 

International Symp. on Circuits and Systems, pages 407–410, May 1995. 

[4] S. Dey and S. Chakradhar. Retiming sequential circuits to enhance testability. In Proc. 12th IEEE VLSI Test 

Symp., pages 28–33, April 1994. 

[5] J. P. Fishburn. Clock skew optimization. IEEE Trans. on Computers, 39(7):945–951, July 1990. 

[6] E. G. Friedman. Clock Distribution Networks in VLSI Circuits and Systems. IEEE Press, 1995. 

[7] A. T. Ishii, C. E. Leiserson, and M. C. Papaefthymiou. Optimizing two-phase, level-clocked circuitry. Journal of 

the ACM, 41(1):148–199, January 1997. 

[8] K. N. Lalgudi and M. C. Papaefthymiou. DELAY: an efficient tool for retiming with realistic delay modeling. In 

Proc. 32nd ACM/IEEE Design Automation Conf., June 1995. 

[9] C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry. Algorithmica, 6(1), 1991. Also available as 

MIT/LCS/TM-372. 

[10] X. Liu, M. C. Papaefthymiou, and E. G. Friedman. Optimal clock skew scheduling tolerant to process 

variations. In Design, Automation, and Test in Europe, pages 643–649, March 1999. 

[11] B. Lockyear and C. Ebeling. Optimal retiming of multi-phase, level-clocked circuits. In Advanced Research in 

VLSI and Parallel Systems: Proc. 1992 Brown/MIT Conf. MIT Press, March 1992. 

[12] H.-G. Martin. Retiming by combination of relocation and clock delay adjustment. In Proc. European Design 

Automation Conf., pages 384–389, September 1993. 

[13] J. Monteiro, S. Devadas, and A. Ghosh. Retiming sequential circuits for low power. In Digest of Technical 

Papers of the 1993 IEEE International Conf. on CAD, pages 398–402, November 1993. 

[14] J. L. Neves and E. G. Friedman. Optimal clock skew scheduling tolerant to process variations. In Proc. 33rd 

ACM/IEEE Design Automation Conf., pages 623–628, June 1996.

[15] M. C. Papaefthymiou and K. H. Randall. TIM: a timing package for two-phase, level-clocked circuitry. In Proc. 

30th ACM/IEEE Design Automation Conf., June 1993. Also available as an MIT VLSI Memo 92–693, October 

1992. 

[16] N. Shenoy, R. K. Brayton, and A. Sangiovanni-Vincentelli. Retiming of circuits with single phase level-sensitive 

latches. In International Conf. on Computer Design, October 1991. 

[17] T. Soyata, E. G. Friedman, and J. H. Mulligan, Jr. Incorporating interconnect, register, and clock distribution 

delays into the retiming process. IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, 

16(1):105–120, January 1997.

DAC'99, pages 237-242 

A Practical Approach to Multiple-Class Retiming 

Klaus Eckl 

Institute of EDA, Technical Univ. of Munich, 80290 Munich, Germany 

Jean Christophe Madre 

Synopsys, Inc., 38610 Gieres, France 

Peter Zepter 

Synopsys Inc., Mountain View, CA-94043 

Christian Legl 

Institute of EDA, Technical Univ. of Munich, 80290 Munich, Germany 

Abstract 

Retiming is an optimization technique for synchronous circuits introduced by Leiserson and Saxe 

in 1983. Although powerful, retiming is not widely used because it does not satisfactorily 

handle circuits whose registers have load-enable, synchronous set/clear, or asynchronous set/clear 

inputs. We propose an extension of retiming whose basis is the characterization of registers into 

register classes. The new approach called multiple-class retiming handles circuits with an 

arbitrary number of register classes. We present results on a set of industrial FPGA designs 

showing the effectiveness and efficiency of multiple-class retiming. 

References 

[1] R. Camposano and P. G. Plöger. Retiming and high-level synthesis. In International Workshop on High-Level- 

Synthesis, pages 191–201, Nov. 1992. 

[2] G. De Micheli. Synchronous logic synthesis: Algorithms for cycle-time minimization. IEEE Transactions on 

Computer-Aided Design of Integrated Circuits and Systems, 10(1):63–73, Jan. 1991. 

[3] S. Dey, M. Potkonjak, and S. G. Rothweiler. Performance optimization of sequential circuits by eliminating 

retiming bottlenecks. In IEEE/ACM International Conference on Computer-Aided Design (ICCAD), pages 504–509, 

Nov. 1992. 

[4] G. Even, I. Y. Spillinger, and L. Stok. Retiming revisited and reversed. IEEE Transactions on Computer-Aided 

Design of Integrated Circuits and Systems, 15(3):348–357, Mar. 1996. 

[5] S. Hassoun and C. Ebeling. Experiments in the iterative application of resynthesis and retiming. In ACM/IEEE 

International Workshop on Timing Issues in the Specification and Synthesis of Digital Systems, Dec. 1997. 

[6] A. T. Ishii, C. E. Leiserson, and M. C. Papaefthymiou. Optimizing two-phase, level-clocked circuitry. In T. 

Knight and J. Savage, editors, Advanced Research in VLSI and Parallel Systems: Proceedings of the 1992 

Brown/MIT Conference, pages 245–264. MIT Press, 1992. 

[7] C. Legl, P. Vanbekbergen, and A. Wang. Retiming of edge-triggered circuits with multiple clocks and load 

enables. In International Workshop on Logic Synthesis (IWLS), volume 1, May 1997. 

[8] C. E. Leiserson and J. B. Saxe. Optimizing synchronous systems. Journal of VLSI and Computer Systems, 

1(1):41–67, Spring 1983. 

[9] C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry. Algorithmica, 6(1):5–35, 1991. 

[10] B. Lockyear and C. Ebeling. Optimal retiming of multi-phase, level-clocked circuits. In T. Knight and J. 

Savage, editors, Advanced Research in VLSI and Parallel Systems: Proceedings of the 1992 Brown/MIT 

Conference, pages 265–280. MIT Press, 1992. 

[11] N. Maheshwari and S. Sapatnekar. Efficient retiming of large circuits. IEEE Transactions on VLSI Systems, 

6(1):74–83, Mar. 1998. 

[12] N. Maheshwari and S. S. Sapatnekar. An improved algorithm for minimum-area retiming. In ACM/IEEE 

Design Automation Conference (DAC), pages 2–7, June 1997. 

[13] N. Maheshwari and S. S. Sapatnekar. Minimum area retiming with equivalent initial states. In IEEE/ACM 

International Conference on Computer-Aided Design (ICCAD), pages 216–219, Nov. 1997.

[14] S. Malik, E. M. Sentovich, R. K. Brayton, and A. Sangiovanni-Vincentelli. Retiming and resynthesis: 

Optimizing sequential networks with combinational techniques. IEEE Transactions on Computer-Aided Design of 

Integrated Circuits and Systems, 10(1):74–84, Jan. 1991. 

[15] P. Pan and C. L. Liu. Optimal clock period FPGA technology mapping for sequential circuits. In ACM/IEEE 

Design Automation Conference (DAC), pages 720–725, June 1996. 

[16] N. Shenoy and R. Rudell. Efficient implementation of retiming. In IEEE/ACM International Conference on 

Computer-Aided Design (ICCAD), pages 226–233, Nov. 1994. 

[17] N. V. Shenoy, K. J. Singh, R. K. Brayton, and A. L. Sangiovanni-Vincentelli. On the temporal equivalence of 

sequential circuits. In ACM/IEEE Design Automation Conference (DAC), pages 405–409, June 1992. 

[18] V. Singhal, S. Malik, and R. K. Brayton. The case for retiming with explicit reset circuitry. In IEEE/ACM 

International Conference on Computer-Aided Design (ICCAD), pages 618–625, Nov. 1996. 

[19] H. J. Touati and R. K. Brayton. Computing the initial states of retimed circuits. IEEE Transactions on 

Computer-Aided Design of Integrated Circuits and Systems, 12(1):157–162, Jan. 1993. 

[20] Xilinx Inc., San Jose, California 95124. The Programmable Logic Data Book, 1996.

DAC'99, pages 243-246 

Performance-driven Integration of Retiming and Resynthesis 

Peichen Pan 

Strategic CAD Labs, Intel Corporation, Hillsboro, OR 97124 

Abstract 

We present a novel approach to performance optimization by integrating retiming and 

resynthesis. The approach is oblivious of register boundaries during resynthesis. In addition, it 

guides resynthesis by a criterion that is directly tied to the performance target. The proposed 

approach obtains provable results. Experimental results further demonstrate the effectiveness of 

our approach. 

References 

[1] S. Bommu, M. Ciesielski, N. O'Neill, and P. Kalla. Retiming-based factorization for multi-level logic 

optimization. In Intl. Workshop on Logic Synthesis, 1997. 

[2] S. T. Chakradhar, S. Dey, M. Potkonjak, and S. G. Rothweiler. Sequential circuit delay optimization using global 

path delays. In ACM/IEEE Design Automation Conf. (DAC), pages 483-489, 1993. 

[3] K. C. Chen and S. Muroga. Timing optimization for multi-level combinational circuits. In ACM/IEEE Design 

Automation Conf. (DAC), pages 339-344, 1990. 

[4] G. DeMicheli. Synchronous logic synthesis: algorithms for cycle-time minimization. IEEE Trans. on Computer- 

Aided Design, 10:63-73, 1991. 

[5] S. Dey, M. Potkonjak, and S. G. Rothweiler. Performance optimization of sequential circuits by eliminating 

retiming bottlenecks. In Intl. Conf. on Computer-Aided Design (ICCAD), pages 504-509, 1992. 

[6] S. Hassoun and C. Ebeling. Experiments in the iterative application of resynthesis and retiming. In Intl. 

Workshop on Timing Issues in the Specification and Synthesis of Digital Systems, 1997. 

[7] C. E. Leiserson and J. B. Saxe. Retiming synchronous circuitry. Algorithmica, 6:5-35, 1991. 

[8] B. Lin. Restructuring of synchronous logic circuits. In European Conf. on Design Automation, pages 205-209, 

1993. 

[9] S. Malik, K. J. Singh, R. Brayton, and A. L. Sangiovanni-Vincentelli. Performance optimization of pipelined 

logic circuits using peripheral retiming and resynthesis. IEEE Trans. on Computer-Aided Design, 12:568-578, 1993. 

[10] P. Pan and C.L. Liu. Optimal clock period FPGA technology mapping for sequential circuits with retiming. 

ACM Trans. on Design Automation of Electronic Systems, 3(3), 1998. 

[11] K. J. Singh, A. R. Wang, R. Brayton, and A. L. Sangiovanni-Vincentelli. Timing optimization of combinational 

logic. In Intl. Conf. on Computer-Aided Design (ICCAD), pages 282-285, 1988. 

[12] H. J. Touati, H. Savoj, and R. K. Brayton. Delay optimization of combinational logic circuits by clustering and 

partial collapsing. In Intl. Conf. on Computer-Aided Design (ICCAD), pages 188-191, 1991.

DAC'99, pages 247-252 

Kernel-Based Power Optimization of RTL Components: 

Exact and Approximate Extraction Algorithms 

L. Benini 1, G. De Micheli 2, E. Macii 3, G. Odasso 3, M. Poncino 3

1 Università di Bologna, Bologna, ITALY 40136 

2 Stanford University, Stanford, CA 94305 

3 Politecnico di Torino, Torino, ITALY 10129 

Abstract 

Sequential logic optimization based on the extraction of computational kernels has proved to be 

very promising when the target is power minimization. Efficient extraction of the kernels is at 

the basis of the optimization paradigm; the extraction procedures proposed so far exploit 

common logic synthesis transformations, and thus assume the availability of a gate-level 

description of the circuit being optimized. In this paper we present exact and approximate 

algorithms for the automatic extraction of computational kernels directly from the functional 

specification of an RTL component. We show the effectiveness of such algorithms by reporting 

the results of an extensive experimentation we have carried out on a large set of standard 

benchmarks, as well as on some designs with known functionality. 

References 

[1] G. D. Hachtel, E. Macii, A. Pardo, F. Somenzi, “Markovian Analysis of Large Finite State Machines," IEEE 

Transactions on CAD, Vol. 15, No. 12, pp. 1479-1493, December 1996. 

[2] L. Benini, G. De Micheli, A. Lioy, E. Macii, G. Odasso, M. Poncino, “Computational Kernels and their 

Application to Sequential Power Optimization", DAC-35: ACM/IEEE 1998 Design Automation Conference, pp. 

764-769, San Francisco, CA, June 1998. 

[3] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, “Sequential Circuit Verification Using Symbolic Model 

Checking," DAC-27: ACM/IEEE Design Automation Conference, pp. 46-51, Orlando, FL, June 1990. 

[4] O. Coudert, J. C. Madre, “A Unified Framework for the Formal Verification of Sequential Circuits," ICCAD-90: 

IEEE International Conference on Computer-Aided Design, pp. 126-129, Santa Clara, CA, November 1990. 

[5] H. Touati, H. Savoj, B. Lin, R. K. Brayton, A. Sangiovanni-Vincentelli, “Implicit Enumeration of Finite State 

Machines Using BDDs," ICCAD-90: IEEE International Conference on Computer-Aided Design, pp. 130-133, 

Santa Clara, CA, November 1990. 

[6] H. Cho, G. D. Hachtel, S. W. Jeong, B. Plessier, E. Schwarz, F. Somenzi, “ATPG Aspects of FSM Verification," 

ICCAD-90: IEEE International Conference on Computer-Aided Design, pp. 134-137, Santa Clara, CA, November 

1990. 

[7] K. Ravi, F. Somenzi, “High-Density Reachability Analysis," ICCAD-95: IEEE/ACM International Conference 

on Computer-Aided Design, pp. 154-158, San Jose, CA, November 1995. 

[8] H. Cho, G. D. Hachtel, E. Macii, B. Plessier, F. Somenzi, “Algorithms for Approximate FSM Traversal Based on 

State Space Decomposition," IEEE Transactions on CAD, Vol. 15, No. 12, pp. 1465-1478, December 1996. 

[9] H. Cho, G. D. Hachtel, E. Macii, M. Poncino, F. Somenzi, “Automatic State Space Decomposition for 

Approximate FSM Traversal Based on Circuit Structural Analysis," IEEE Transactions on CAD, Vol. 15, No. 12, 

pp. 1451-1464, December 1996. 

[10] M. Alidina, J. Monteiro, S. Devadas, A. Ghosh, M. Papaefthymiou, “Precomputation-Based Sequential Logic 

Optimization for Low Power," IEEE Transactions on VLSI Systems, Vol. 2, No. 4, pp. 426-436, December 1994. 

[11] J. Monteiro, S. Devadas, A. Ghosh, “Sequential Logic Optimization for Low Power Using Input-Disabling 

Precomputation Architectures," IEEE Transactions on CAD, Vol. 17, No. 3, pp. 279-284, March 1998. 

[12] L. Benini, P. Siegel, G. De Micheli, “Automatic Synthesis of Gated Clocks for Power Reduction in Sequential 

Circuits," IEEE Design and Test of Computers, Vol. 11, No. 4, pp. 32-40, December 1994. 

[13] L. Benini, G. De Micheli, “Transformation and Synthesis of FSMs for Low Power Gated Clock 

Implementation," IEEE Transactions on CAD, Vol. 15, No. 6, pp. 630-643, June 1996.

[14] L. Benini, G. De Micheli, E. Macii, M. Poncino, R. Scarsi, “Symbolic Synthesis of Clock-Gating Logic for 

Power Optimization of Synchronous Controllers," ACM Transactions on Design Automation of Electronic Systems, 

To Appear. 

[15] S. H. Chow, Y. C. Ho, T. Hwang, C. L. Liu, “Lower Power Realization of Finite State Machines - A 

Decomposition Approach," ACM Transactions on Design Automation of Electronic Systems, Vol. 1, No. 3, pp. 315- 

340, July 1996. 

[16] J. Monteiro, A. Oliveira, “Finite State Machine Decomposition for Low Power," DAC-35: ACM/IEEE 1998 

Design Automation Conference, pp. 763-768, San Francisco, CA, June 1998. 

[17] L. Benini, G. De Micheli, Dynamic Power Management: Design Techniques and CAD Tools. Kluwer 


[18] E. Macii, M. Pedram, F. Somenzi, “High-Level Power Modeling, Estimation, and Optimization", IEEE 

Transactions on CAD, Vol. 17, No. 11, November 1998. 

[19] R. I. Bahar, C. Gaona, E. Frohm, G. D. Hachtel, E. Macii, A. Pardo, F. Somenzi, “Algebraic Decision Diagrams 

and Their Applications", Formal Methods in System Design, Vol. 10, pp. 171-206, 1997. 

[20] E. M. Sentovich, K. J. Singh, C. W. Moon, H. Savoj, R. K. Brayton, A. Sangiovanni-Vincentelli, “Sequential 

Circuits Design Using Synthesis and Optimization," ICCD-92: IEEE International Conference on Computer Design, 

pp. 328-333, Cambridge, MA, October 1992. 

[21] F. Somenzi, CUDD: University of Colorado Decision Diagram Package, Release 2.3.0, Technical Report, Dept. 

of ECE, University of Colorado, Boulder, CO, September 1998. 

[22] F. Brglez, D. Bryan, K. Kozminski, “Combinational Profiles of Sequential Benchmark Circuits," ISCAS-89: 

IEEE International Symposium on Circuits and Systems, pp. 1929-1934, Portland, OR, May 1989. 

[23] A. Salz, M. Horowitz, “IRSIM: An Incremental MOS Switch-Level Simulator," DAC-26: ACM/IEEE Design 

Automation Conference, pp. 173-178, Las Vegas, NV, June 1989. 

[24] C. Y. Tsui, J. Monteiro, M. Pedram, S. Devadas, A. M. Despain, B. Lin, “Power Estimation Methods for 

Sequential Logic Circuits," IEEE Transactions on VLSI Systems, Vol. 3, No. 3, pp. 404-416, September 1995. 

[25] G. Berry, H. Touati, “Optimized Controller Synthesis using Esterel," IWLS-93: ACM/IEEE International 

Workshop on Logic Synthesis, Paper 5b, Lake Tahoe, CA, May 1993. 

[26] S.-I. Minato, “Generation of BDDs from Hardware Algorithm Description," ICCAD-96: IEEE/ACM 

International Conference on Computer-Aided Design, pp. 644-649, San Jose, CA, November 1996.

DAC'99, pages 253-257 

Customized Instruction-Sets For Embedded Processors 

Joseph A. Fisher 

Hewlett-Packard Laboratories Cambridge, Cambridge, MA 02142 

ABSTRACT 

It is generally believed that there will be little more variety in CPU architectures, and thus the 

design of Instruction-set Architectures (ISAs) will have no role in the future of embedded CPU 

design. Nonetheless, it is argued in this paper that architectural variety will soon again become 

an important topic, with the major motivation being increased performance due to the 

customization of CPUs to their intended use. Five major barriers that could hinder customization 

are described, including the problems of existing binaries, toolchain development and 

maintenance costs, lost savings/higher chip cost due to the lower volumes of customized 

processors, added hardware development costs, and some factors related to the product 

development cycle for embedded products. Each is discussed, along with potential, sometimes 

surprising, solutions. 

Keywords: Embedded processors, custom processors, instruction-level parallelism, VLIW, mass 

customization of toolchains 

REFERENCES 

[1] Fisher, J. A. Walk-Time Techniques: Catalyst for Architectural Change. Computer, 30, 9 (September 1997), 40- 

42. 

[2] Fisher, J. A., Faraboschi, P., and Desoli, G. Custom-Fit Processors: Letting Applications Define Architectures. 

International Symposium on Microarchitecture, Micro-29, Paris, France, 1996, 324-335. 

[3] John Markoff, New Computer Dazzles a Jaded Industry Crowd. The New York Times, October 4, 1995, D6. 

[4] Erick Schonfeld, The Customized, Digitized, Have-It-Your-Way Economy. Fortune Magazine, 138, 6, 

September 28, 1998.

DAC'99, pages 258-259 

System-Level Hardware/Software Trade-offs 

Samuel P. Harbison 

Texas Instruments, Monroeville, PA 15146 

ABSTRACT 

Operating systems and development tools can impose overly general requirements that prevent 

an embedded system from achieving its hardware performance entitlement. It is time for 

embedded processor designers to become more involved with system software and tools. 

Keywords: Digital signal processors, instruction set architecture, compiler, real-time operating 

system, software configuration. 

REFERENCES 

[1] DSP/BIOS General Overview. URL http://www.ti.com/sc/docs/dsps/tools/dspbios/index.htm. 

[2] TMS320C6000 product information. URL http://www.ti.com/sc/docs/dsps/products/c6000/index.htm. 

[3] RTDX. URL http://www.ti.com/sc/docs/dsps/tools/c5000/c54x/rtdx.htm. 

[4] “Emulation Fundamentals for TI’s DSP Solutions.” URL http://www.ti.com/sc/docs/psheets/abstract/apps/spra439.htm.

DAC'99, pages 260-261 

Panel: Functional Verification: Real Users, Real Problems, Real Opportunities 

Chair: Jonah Mcleod – Silicon Strategies, Mountain View, CA 

Panel Members: Nozar Azarakhsh, Glen Ewing, Paul Gingras, Scott Reedstrom, Chris Rowen 

Abstract 

Achieving timely and comprehensive functional design verification is a ubiquitous problem in 

electronics. This panel offers perspectives on verification from designers of cardiac pacemakers, 

communications satellites, compute servers, networking equipment, and IP. 

The panelists will begin by dissecting the bottlenecks in their verification processes. For 

example, are simulators too slow? Or do test vector generation and coverage analysis consume 

the most time? The panelists will present their ideas for new EDA products which might 

accelerate verification. Finally, the panelists will discuss what compromises they would accept in 

order to achieve this acceleration. Would they learn a new HDL? Restrict their design styles? 

Forsake legacy designs?

DAC'99, pages 262-267 

A Timing-Driven Soft-Macro Resynthesis Method in Interaction with Chip Floorplanning 

Hsiao-Pin Su 1,2, Allen C.-H. Wu 1 and Youn-Long Lin 1 

1 Department of Computer Science, Tsing Hua University, Hsin-Chu, Taiwan, ROC 

2 Taiwan Semiconductor Manufacturing Company, Ltd., Hsin-Chu, Taiwan, ROC 

Abstract 

In this paper, we present a complete chip design method which incorporates a soft-macro 

resynthesis method in interaction with chip floorplanning for area and timing improvements. We 

develop a timing-driven design flow to exploit the interaction between HDL synthesis and 

physical design tasks. During each design iteration, we resynthesize soft macros with either a 

relaxed or a tightened timing constraint which is guided by the post-layout timing information. 

The goal is to produce area-efficient designs while satisfying the timing constraints. Experiments 

on a number of industrial designs have demonstrated that by effectively relaxing the timing 

constraint of the non-critical modules and tightening the timing constraint of the critical modules, 

a design can achieve 13% to 30% timing improvements with little to no increase in chip area. 

References 

[1] B. T. Preas and M. J. Lorenzetti, Physical Design Automation of VLSI Systems, Benjamin Cummings, Menlo 

Park, CA., 1988. 

[2] N. Sherwani, Algorithms for VLSI Physical Design Automation, 2nd ed., Kluwer Academic Publishers, 1995. 

[3] C.J. Alpert and A. B. Kahng, “Recent Directions in Netlist Partitioning: A Survey," INTEGRATION: the VLSI 

Journal, N19, pp. 1-81, 1995. 

[4] M. Pedram and N. Bhat, “Layout Driven Technology Mapping," Proc. of the 28th Design Automation 

Conference, pp. 99-105, 1991. 

[5] S. Liu, K. Pan, M. Pedram, and A. M. Despain, “Alleviating Routing Congestion by Combining Logic Resynthesis 

and Linear Placement," Proc. of European Conference on Design Automation, pp. 578-582, 1993. 

[6] G. Stenz, B. M. Riess, B. Rohfleisch, F. M. Johannes, “Timing Driven Placement in Interaction with Netlist 

Transformations," Proc. of Int. Symp. on Physical Design, pp. 36-41, 1997. 

[7] G. Holt and A. Tyagi, “Minimizing Interconnect Energy Through Integrated Low-Power Placement and 

Combinational Logic Synthesis," Proc. of Int. Symp. on Physical Design, pp. 48-53, 1997. 

[8] C. M. Fiduccia and R. M. Mattheyses, “A Linear Time Heuristic for Improving Network Partitions," Proc. of the 

19th Design Automation Conference, pp. 175-181, 1982. 

[9] D. M. Schuler and E. G. Ulrich, “Clustering and linear placement," Proc. of the 9th Design Automation 

Conference, pp.412-419, 1972. 

[10] H.-P. Su, A. C.-H. Wu, Y.-L. Lin, “Performance-Driven Soft-Macro Clustering and Placement by Preserving 

HDL Design Hierarchy," Proc. of Int. Symp. on Physical Design, pp. 12-17, 1998. 

[11] “HDL Compiler for Verilog Reference Manual Version 3.4b", Synopsys, 1996. 

[12] “Silicon Ensemble Reference Manual Version 5.0", Cadence, 1996. 

[13] “Aquarius XO Reference Manual Version 2.1.2", AVANT!, 1998. 

[14] “STAR-RC Reference Manual Version 2.2", AVANT!, 1997. 

[15] “STAR-DC Reference Manual Version 2.1.2", AVANT!, 1996. 

[16] “TSMC ASIC Data Book TCB670", Taiwan Semiconductor Manufacturing Company, Ltd. 1997 

[17] “TSMC DSD Data Book ACB872", Taiwan Semiconductor Manufacturing Company, Ltd. 1998

DAC'99, pages 268-273 

An O-Tree Representation of Non-Slicing Floorplan and Its Applications 

Pei-Ning Guo, Chung-Kuan Cheng 

Mentor Graphics Corp., San Jose, CA 95131, U.S.A. 

Takeshi Yoshimura 

NEC Corp., 4-1-1 Miyazaki, Miyamae-Ku, Kawasaki 216, Japan 

ABSTRACT 

We present an ordered tree structure, the O-tree, to represent non-slicing floorplans. The O-tree uses

only $n(2 + \lceil \lg n \rceil)$ bits for a floorplan of $n$ rectangular blocks. We define an admissible placement

as a placement compacted in both the x and y directions. For each admissible placement, we can find

an O-tree representation. We show that the number of possible O-tree combinations is

$O(n!\, 2^{2n-2} / n^{1.5})$. This is very concise compared to a sequence pair representation, which has

$O((n!)^2)$ combinations. The approximate ratio of sequence pair to O-tree combinations is

$O(n^2 (n/4e)^n)$. The complexity of the O-tree is even smaller than that of a binary tree structure for

slicing floorplans, which has $O(n!\, 2^{5n-3} / n^{1.5})$ combinations. Given an O-tree, it takes only linear

time to construct the placement and its constraint graph. We have developed a deterministic

floorplanning algorithm utilizing the structure of the O-tree. Empirical results on MCNC benchmarks

show promising performance, with an average 16% improvement in wire length and 1% less dead space

than the previous CPU-intensive cluster refinement method. 
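
To give a feel for the sizes quoted above, the following sketch evaluates the O-tree encoding length and compares the sequence-pair count with the O-tree bound for a few block counts. The function names are hypothetical, and the bound is evaluated with its hidden constant taken as 1, so the ratio column only indicates growth, not exact counts.

```python
import math


def otree_bits(n):
    """Bits for an O-tree over n blocks: n * (2 + ceil(lg n)), as stated in the abstract."""
    ceil_lg = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1
    return n * (2 + ceil_lg)


def sequence_pair_count(n):
    """Number of sequence pairs over n blocks: (n!)^2."""
    return math.factorial(n) ** 2


def otree_bound(n):
    """The abstract's bound expression n! * 2^(2n-2) / n^1.5, constant factor ignored."""
    return math.factorial(n) * 2 ** (2 * n - 2) / n ** 1.5


for n in (8, 16, 32):
    ratio = sequence_pair_count(n) / otree_bound(n)
    print(n, otree_bits(n), f"{ratio:.3e}")
```

The ratio grows super-exponentially with n, in line with the $O(n^2 (n/4e)^n)$ estimate.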

REFERENCES 

[1] K. Keeler and J. Westbrook, Short Encoding of Planar Graphs and Maps, Discrete Applied Mathematics, vol. 

58, pp. 239-252, 1995 

[2] D.E. Knuth, The Art of Computer Programming, 2nd Ed., Vol. 1, Addison-Wesley Publishing Co., pp. 385-395, 

1973 

[3] H. Murata, K. Fujiyoshi, S. Nakatake, and Y. Kajitani, Rectangular-Packing-Based Module Placement, ICCAD, 

pp. 472-479, 1995 

[4] S. Nakatake, K. Fujiyoshi, H. Murata, and Y. Kajitani, Module Placement on BSG-Structure and IC Layout 

Applications, ICCAD, pp. 484-491, 1996 

[5] H. Onodera, Y. Taniguchi, K. Tamaru, Branch-and-Bound Placement for Building Block Layout, DAC, pp. 433- 

439, 1991 

[6] R. H. J. M. Otten, Automatic Floorplan Design, Proc. ACM/IEEE Design Automation Conf., pp. 261-267, 1982 

[7] P. Pan and C.L. Liu, Area Minimization for Floorplans, IEEE Transactions on Computer-Aided Design of 

Integrated Circuits and Systems, pp. 123-132, January 1995 

[8] B. T. Preas and W. M. VanCleemput, Placement Algorithms for Arbitrarily Shaped Blocks, DAC, pp. 474-480, 

1979 

[9] T. Takahashi, An Algorithm for Finding a Maximum-Weight Decreasing Sequence in a Permutation, Motivated 

by Rectangle Packing Problem, IEICE, vol. VLD96, pp. 31-35, 1996 

[10] T.-C. Wang, and D. F. Wong, An Optimal Algorithm for Floorplan Area Optimization, DAC, pp. 180-186, 

1990 

[11] D. F. Wong, and C. L. Liu, A New Algorithm for Floorplan Design, DAC, pp. 101-107, 1986 

[12] J. Xu, P.-N. Guo, and C.-K. Cheng, Cluster Refinement for Block Placement, DAC, pp. 762-765, 1997 


DAC'99, pages 274-279 

Module Placement for Analog Layout Using the Sequence-Pair Representation 

Florin Balasa, Koen Lampaert 

Conexant Systems, Newport Beach, CA 92660 

Abstract 

This paper addresses the problem of device-level placement for analog layout. Unlike

most existing approaches, which essentially employ simulated annealing optimization

algorithms operating on flat Gellat-Jepsen spatial representations [2], we use a more recent

topological representation called sequence-pair [7], which has the advantage of not being

restricted to slicing floorplan topologies. In this paper, we explain how specific features

essential to analog placement, such as the ability to deal with symmetry and device matching

constraints, can be easily handled by employing the sequence-pair representation. Several analog 

examples substantiate the effectiveness of our placement tool, which is already in use in an 

industrial environment. 

References 

[1] J. Cohn, D. Garrod, R. Rutenbar, L. Carley, Analog Device-Level Automation, Kluwer Academic Publishers, 

1994. 

[2] D.W. Jepsen, C.D. Gellat Jr., “Macro placement by Monte Carlo Annealing", Proc. IEEE Int. Conf. on Comp. 

Design, pp. 495-498, Nov. 1984. 

[3] M. Kayal, S. Piguet, M. Declerq, B. Hochet, “SALIM: a layout generation tool for analog ICs," Proc. IEEE 

Custom Integrated Circuits Conf., pp. 7.5.1-4, 1988. 

[4] K. Lampaert, G. Gielen, W. Sansen, “A performance-driven placement tool for analog integrated circuits," IEEE 

J. of Solid-State Circ., Vol. SC-30, No. 7, pp. 773-780, July 1995. 

[5] E. Malavasi, E. Charbon, G. Jusuf, A. Sangiovanni-Vincentelli, “Virtual symmetry axes for the layout of analog 

IC's," Proc. 2nd ICVC, pp. 195-198, Seoul, Korea, Oct. 1991. 

[6] E. Malavasi, E. Charbon, E. Felt, A. Sangiovanni-Vincentelli, “Automation of IC layout with analog 

constraints," IEEE Trans. on Comp.-Aided Design of IC's and Systems, Vol. 15, No. 8, pp. 923-942, Aug. 1996. 

[7] H. Murata, K. Fujiyoshi, S. Nakatake, Y. Kajitani, “VLSI module placement based on rectangle-packing by the 

sequence-pair," IEEE Trans. on Comp.-Aided Design of IC's and Systems, Vol. 15, No. 12, pp. 1518-1524, Dec. 

1996. 

[8] S.W. Mehranfar, “STAT: a schematic to artwork translator for custom analog cells," Proc. 1990 IEEE Custom 

Integrated Circuits Conf., pp. 30.2.1-3, 1990. 

[9] H. Onodera, Y. Taniguchi, K. Tamaru, “Branch-and-bound placement for building block layout," Proc. 28th 

ACM/IEEE Design Automation Conf., pp. 433-439, 1991. 

[10] R. Otten, “Complexity and diversity in IC layout design," Proc. IEEE Intn'l Symp. Circuits and Computers, 

1980. 

[11] J. Rijmenants, J.B. Litsios, T.R. Schwarz, M. Degrauwe, “ILAC: an automated layout tool for analog CMOS 

circuits," IEEE J. of Solid-State Circuits, Vol. SC-24, No. 2, pp. 417-425, April 1989. 

[12] W.-J. Sun, C. Sechen, “Efficient and effective placement for very large circuits," IEEE Trans. on Comp.-Aided 

Design of IC's and Systems, Vol. 14, No. 3, pp. 349-359, March 1995. 

[13] S. Sutanthavibul, E. Shragowitz, J.B. Rosen, “An analytical approach to floorplan design and optimization," 

IEEE Trans. on Comp.-Aided Design of IC's and Systems, Vol. 10, No. 6, pp. 761-769, June 1991. 

[14] D.F. Wong, C.L. Liu, “A new algorithm for floorplan design," Proc. 23rd ACM/IEEE Design Automation 

Conf., pp. 101-107, 1986.

DAC'99, pages 280-285 

Genetic List Scheduling Algorithm for Scheduling and Allocation on a Loosely Coupled 

Heterogeneous Multiprocessor System 

Martin Grajcar 

University of Passau 

Abstract 

Our problem consists of a partially ordered set of tasks communicating over a shared bus which 

are to be mapped to a heterogeneous multiprocessor system. The goal is to minimize the 

makespan, while satisfying constraints implied by data dependencies and exclusive resource 

usage. 

We present a new efficient heuristic approach based on list scheduling and genetic algorithms, 

which finds the optimum in a few seconds on average even for large examples (up to 96 tasks) 

taken from [3]. The superiority of our algorithm compared to some other algorithms is 

demonstrated. 

Keywords: heterogeneous system design, heuristic, genetic algorithms, list scheduling 
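
For orientation, here is a minimal sketch of plain list scheduling on heterogeneous processors, the building block named in the abstract. It assumes an acyclic task graph and ignores bus communication; the function name, data layout, and tie-breaking are hypothetical. A genetic algorithm could, for instance, search over the `priority` order, but how the paper combines the two is not stated in the abstract.

```python
def list_schedule(priority, preds, exec_time, num_procs):
    """Greedy list scheduling: repeatedly take the first ready task in `priority`
    and place it on the processor where it finishes earliest."""
    finish = {}                      # task -> finish time
    proc_free = [0] * num_procs      # time at which each processor becomes idle
    placement = {}                   # task -> (processor, start time)
    remaining = list(priority)
    while remaining:
        # First task in priority order whose predecessors are all scheduled.
        task = next(t for t in remaining if all(p in finish for p in preds[t]))
        ready = max((finish[p] for p in preds[task]), default=0)
        best = min(
            range(num_procs),
            key=lambda k: max(ready, proc_free[k]) + exec_time[task][k],
        )
        start = max(ready, proc_free[best])
        finish[task] = start + exec_time[task][best]
        proc_free[best] = finish[task]
        placement[task] = (best, start)
        remaining.remove(task)
    return max(finish.values(), default=0), placement


# Illustrative data: exec_time[task][k] is the run time of `task` on processor k.
preds = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
exec_time = {"a": [2, 3], "b": [3, 1], "c": [2, 4], "d": [1, 1]}
makespan, placement = list_schedule(["a", "b", "c", "d"], preds, exec_time, 2)
```

Different priority orders generally yield different makespans, which is what makes the order a natural target for a search heuristic.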

References 

[1] T. L. Adam, K. M. Chandy, J. R. Dickson: A comparison of list schedules for parallel processing systems; in 

Communications ACM, 1974, Vol. 17, p. 685 

[2] Armin Bender: Design of an Optimal Loosely Coupled Heterogeneous Multiprocessor System; in European 

Design&Test Conference 1996, Paris 1996, p. 275–281 

[3] Armin Bender: Ein praktikables und optimales Einplanungsverfahren für heterogene Mehrprozessorsysteme; 

PhD-thesis, Shaker, Aachen 1997 

[4] Tobias Blickle: Theory of Evolutionary Algorithms and Application to System Synthesis; 

http://www.tik.ee.ethz.ch/~blickle/diss.html, 1996 

[5] Edward G. Coffman, R. L. Graham: Optimal scheduling for two-processor systems; in Acta Informatica, 1972, 1, 

p. 200 

[6] Muhammad K. Dhodhi, Intiaz Ahmad, Robert Storer: SHEMUS: Synthesis of heterogeneous multiprocessor 

systems; in Microprocessors and Microsystems, 1995, Vol. 19, No. 6, p. 311 

[7] Kemal Efe: Heuristic Models of task Assignment Scheduling in Distributed Systems; in Computer, 1982, p. 50– 

56 

[8] Michael R. Garey, David S. Johnson: Computers and intractability - a guide to the theory of NP-completeness; 

Freeman, 1979 

[9] David E. Goldberg: Genetic Algorithms in search, optimization, and machine learning; Addison-Wesley, 1989 

[10] Hironori Kasahara, Seinosuke Narita: Practical Multiprocessor Scheduling Algorithms for Efficient Parallel 

Processing; in IEEE Trans. on Computers, 1984, Vol. C-33, No. 11, p. 1023 

[11] Yu-Kwong Kwok, Ishfaq Ahmad: Dynamic Critical-Path Scheduling: An Effective Technique for Allocating 

Task graphs to Multiprocessors; in IEEE Trans. on Parallel and Distributed Systems, 1996, Vol. 7, p. 506 

[12] Zbigniew Michalewicz: Genetic Algorithms + Data Structures = Evolution Programs; Springer, 1996 

[13] Stella C. S. Porto, Celso C. Ribeiro: A Tabu Search Approach to Task Scheduling on Heterogeneous Processors 

under Precedence Constraints; ftp://ftp.inf.puc-rio.br/pub/docs/techreports/93 03 porto.ps.gz, 1994 

[14] Shiv Prakash, Alice C. Parker: SOS: Synthesis of Application-Specific Heterogeneous Multiprocessor Systems; 

in Journal of Parallel and Distributed Computing 16, 1992, p. 338–351 

[15] V. Sarkar: Partitioning and Scheduling Parallel Programs for Multiprocessors; Cambridge, MIT Press, 1989 

[16] Vadim G. Timkovsky: A polynomial-time algorithm for the two-machine unit-time release-date job-shop 

schedule-length problem; Discrete Applied Mathematics 7, 1997, p. 185 

[17] M. Y. Wu, D. D. Gajski: Hypertool: A Programming Aid for Message-Passing Systems; in IEEE Trans. on 

Parallel and Distributed Systems, 1990, Vol. 1, No. 3, p. 330–343

DAC'99, pages 286-291 

Performance-Driven Scheduling with Bit-Level Chaining 

Sanghun Park and Kiyoung Choi 

School of Electrical Engineering, Seoul National University, Seoul 151-742, Korea 

Abstract 

This paper presents a new scheduling algorithm that maximizes the performance of a design 

under resource constraints in high-level synthesis. The algorithm tries to achieve the maximal 

utilization of resources and the minimal waste of clock slack time. Moreover, it exploits the 

technique of bit-level chaining to target high-speed designs. The algorithm tries non-integer 

multiple-cycling and chaining, which allows multiple cycle execution of chained operations, to 

further increase the performance at the cost of a small increase in the complexity of the control 

unit. Experimental results on several datapath-intensive designs show significant improvement in 

execution time over conventional scheduling algorithms. 
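
The benefit of bit-level chaining can be seen in a simplified delay model, sketched below. It assumes ripple-carry adders with one unit of delay per full adder and a carry-out that arrives together with the sum bit; this model and the function names are illustrative assumptions, not taken from the paper.

```python
def word_level_delay(width, num_adders):
    """Word-level chaining: each adder waits for the full result of the previous one."""
    return width * num_adders


def bit_level_delay(width, num_adders):
    """Bit-level chaining: bit i of an adder starts once bit i of its input
    and its own carry-in are ready."""
    ready = [0] * width              # arrival time of each operand bit at the first adder
    for _ in range(num_adders):
        carry = 0                    # arrival time of the carry into bit 0
        out = []
        for i in range(width):
            t = max(ready[i], carry) + 1   # one unit delay per full adder
            out.append(t)
            carry = t                # carry-out assumed ready together with the sum bit
        ready = out
    return ready[-1]                 # arrival time of the most significant sum bit


for k in (1, 2, 3):
    print(k, word_level_delay(8, k), bit_level_delay(8, k))
```

Under this model, k chained adders of a given width take width × k units at word level but only width + k − 1 units at bit level, because the low-order bits of a later adder overlap with the high-order bits of the earlier one.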

References 

[1] K. S. Hwang, A. E. Casavant, C. T. Chang, and M. A. d’Abreu, “Scheduling and hardware sharing in pipelined 

data paths,” in Proc. Int’l Conf. on Computer Aided Design, 1989, pp. 24–27. 

[2] N. Park and A. C. Parker, “Sehwa: A software package for synthesis of pipelines from behavioral 

specifications,” IEEE Trans. on Computer-Aided Design, pp. 356–370, Mar. 1988. 

[3] S. Devadas and A. R. Newton, “Data path synthesis from behavioral description: An algorithmic approach,” in 

Proc. Int’l Symposium on Circuits and Systems, 1987, pp. 298–401. 

[4] M. R. Corazao, M. A. Khalaf, L. M. Guerra, M. Potkonjak, and J. Rabaey, “Performance optimization using 

template mapping for datapath-intensive high-level synthesis,” IEEE Trans. on Computer-Aided Design, vol. 15, no. 

8, pp. 877–888, Aug. 1996. 

[5] P. Kanthamanon, G. R. Hellestrand, and R. L.K. Chan, “A context sensitive scheduling technique under resource 

constraints,” in Proc. Asia Pacific Conf. on Hardware Description Language, 1997, pp. 92–99. 

[6] S. Narayan and D. D. Gajski, “System clock estimation based on clock slack minimization,” in Proc. European 

Design & Test Conf., 1992, pp. 66–71. 

[7] S. Parameswaran, P. Jha, and N. Dutt, “Resynthesizing controllers for minimum execution time,” in Proc. Asia 

Pacific Conf. on Hardware Description Language, 1994, pp. 111–117. 

[8] H.P. Juan, D. D. Gajski, and V. Chaiyakul, “Clock-driven performance optimization in interactive behavioral 

synthesis,” in Proc. Int’l Conf. on Computer Aided Design, 1996, pp. 154–157. 

[9] S. Park and K. Choi, “Latency minimisation by system clock optimisation,” IEE Electronics Letters, vol. 34, no. 

9, pp. 862–864, Apr. 1998. 

[10] P. G. Paulin and J. P. Knight, “Force-directed scheduling for the behavioral synthesis of asic’s,” IEEE Trans. 

on Computer-Aided Design, vol. 8, no. 6, pp. 661–679, June 1989. 

[11] W. F. J. Verhaegh, P. E. R. Lippens, E. H. L. Aarts, J. H. M. Korst, J. L. van Meerbergen, and A. van der Werf, 

“Improved force-directed scheduling in high-throughput digital signal processing,” IEEE Trans. on Computer-Aided 

Design, vol. 14, no. 8, pp. 945–960, Aug. 1995. 

[12] R. Camposano, “Path-based scheduling for synthesis,” IEEE Trans. on Computer-Aided Design, vol. 10, no. 1, 

pp. 85–93, Jan. 1991. 

[13] C.T. Hwang, J.H. Lee, and Y.C. Hsu, “A formal approach to the scheduling problem in high level synthesis,” 

IEEE Trans. on Computer-Aided Design, vol. 10, no. 4, pp. 464–475, Apr. 1991. 

[14] J. Rabaey, C. Chu, P. Hoang, and M. Potkonjak, “Fast prototyping of datapath-intensive architectures,” IEEE 

Design & Test of Computers, pp. 40–51, June 1991. 

[15] K. Hwang, “Computer arithmetic: Principles, architecture, and design,” John Wiley & Sons, 1979. 

[16] S. Wu, “Hyper’s hardware library,” M.S. thesis, EECS Department, U.C. Berkeley, 1993–1995. 

[17] O. Bentz, “A hardware mapper for the hyper high level synthesis system,” M.S. thesis, EECS Department, U.C. 

Berkeley, 1993. 

[18] S. Note, F. Catthoor, G. Goossens, and H. De Man, “Combined hardware selection and pipelining in high 

performance data-path design,” in Proc. Int’l Conf. on Computer Design, 1990, pp. 328–331.

[19] M. Potkonjak and J. Rabaey, “Retiming for scheduling,” in Proc. IEEE Workshop on VLSI Signal Processing, 

1990. 

[20] E. M. Sentovich et al., “Sequential circuit design using synthesis and optimization,” in Proc. Int’l Conf. on 

Computer Aided Design, 1992, pp. 328–333. 

[21] A. Aziz, F. Balarin, R. Brayton and A. Sangiovanni-Vincentelli, “Sequential synthesis using sis,” in Proc. Int’l 

Conf. on Computer Aided Design, 1995, pp. 612–617. 

[22] S. Park and K. Choi, “Sequential circuit optimization by fsm transformation,” in Proc. Asia Pacific Conf. on 

Hardware Description Language, 1998, pp. 53–58.

DAC'99, pages 292-295 

A Model for Scheduling Protocol-Constrained Components and Environments 

Steve Haynal, Forrest Brewer 

Department of Electrical and Computer Engineering, 

University of California, Santa Barbara, U.S.A. 

ABSTRACT 

This paper presents a technique for highly constrained event sequence scheduling. System 

resource protocols as well as an external interface protocol are described by non-deterministic 

finite automata (NFA). All valid schedules which adhere to interfacing constraints and resource 

bounds for flow graph described behavior are determined exactly. A model and scheduling 

results are presented for an extensive design example. 

Keywords: Interface protocols, protocol-constrained scheduling, automata. 
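
As a rough illustration of how a protocol can be described by an NFA, the sketch below simulates a small automaton over a sequence of events and reports whether the sequence is valid. The request/grant protocol, the event names, and the function `accepts` are hypothetical and are not taken from the paper; the paper's exact scheduling technique is not reproduced here.

```python
def accepts(transitions, start, accepting, events):
    """Simulate an NFA by tracking the set of states it may be in."""
    current = {start}
    for event in events:
        nxt = set()
        for state in current:
            # Each (state, event) pair may lead to several successor states.
            nxt |= transitions.get((state, event), set())
        current = nxt
        if not current:
            return False  # no state can consume this event
    return bool(current & accepting)


# Hypothetical resource protocol: a request must be followed by a grant
# before the resource is idle again; "wait" may repeat while busy.
PROTOCOL = {
    ("idle", "req"): {"busy"},
    ("busy", "wait"): {"busy"},
    ("busy", "grant"): {"idle"},
}

valid = accepts(PROTOCOL, "idle", {"idle"}, ["req", "wait", "grant"])
invalid = accepts(PROTOCOL, "idle", {"idle"}, ["req", "req"])
```

A scheduler built on this idea would only keep event orderings for which every protocol automaton accepts; here that check is shown for a single automaton and a single candidate sequence.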

REFERENCES 

[1] R. Camposano, “Path-Based Scheduling for Synthesis”, IEEE Trans. CAD/ICAS, vol. 10, no. 1, pp. 85-93, Jan. 

1991. 

[2] C. N. Coelho Jr, G. De Micheli, “Dynamic Scheduling and Synchronization Synthesis of Concurrent Digital 

Systems under System-Level Constraints”, Proc. IEEE Int. Conf. Computer-Aided Design, pp. 175-181, 1994. 

[3] C. H. Gebotys and M. I. Elmasry, “Global Optimization Approach for Architectural Synthesis”, IEEE Trans. 

CAD/ICAS, vol. 12, no. 9, pp. 1266-1278, Sep. 1993. 

[4] S. Haynal and F. Brewer, “Efficient Encoding for Exact Symbolic Automata-Based Scheduling”, Proc. IEEE Int. 

Conf. Computer-Aided Design, to appear, 1998. 

[5] H. Hulgaard, S. M. Burns, T. Amon, G. Borriello, “An Algorithm for Exact Bounds on the Time Separation of 

Events in Concurrent Systems”, IEEE Transactions on Computers, vol. 44, no.11, pp. 1306-1317, Nov. 1995. 

[6] C.-T. Hwang and Y.-C. Hsu, “A Formal Approach to the Scheduling Problem in High Level Synthesis”, IEEE 

Trans. CAD/ICAS, vol. 10, no. 4, pp. 464-475, Apr. 1991. 

[7] C. Monahan and F. Brewer, “Scheduling and Binding Bounds for RT-Level Symbolic Execution”, Proc. IEEE 

Int. Conf. Computer-Aided Design, pp. 230-235, 1997. 

[8] I. Radivojevic and F. Brewer, “A New Symbolic Technique for Control-Dependent Scheduling”, IEEE Trans. 

CAD/ICAS, vol. 15, no. 1, pp. 45-57, Jan. 1996. 

[9] A. Seawright and F. Brewer, “Clairvoyant: A Synthesis System for Production-Based Specification”, IEEE 

Trans. on VLSI Systems, vol. 2, no. 2, pp. 172-185, June 1994. 

[10] K. Wakabayashi and H. Tanaka, “Global Scheduling Independent of Control Dependencies Based on Condition 

Vectors”, Proc. 29th ACM/IEEE Design Automation Conf., pp. 112-115, 1992. 

[11] J. C.-Y. Yang, G. De Micheli, and M. Damiani, “Scheduling and Control Generation with Environmental 

Constraints based on Automata Representations”, IEEE Trans. CAD/ICAS, vol. 15, no. 2, pp. 166-183, Feb. 1996.

DAC'99, pages 296-299 

A Reordering Technique for Efficient Code Motion 

Luiz C. V. dos Santos, Jochen A. G. Jess 

Design Automation Section, Eindhoven University of Technology, Eindhoven, The Netherlands 

Abstract 

Emerging design problems are prompting the use of code motion and speculative execution in 

high-level synthesis to shorten schedules and meet tight time-constraints. However, some code 

motions are not worth doing from a worst-case execution perspective. We propose a technique 

that selects the most promising code motions, thereby increasing the density of optimal solutions 

in the search space. 

References 

[1] A. Aiken et al., “Resource-Constrained Software Pipelining", IEEE Trans. Parallel and Distributed Syst., vol. 

6(12), pp. 1248-1270, Dec. 1995. 

[2] U. Banerjee et al., “Automatic Program Parallelization", Proc. of the IEEE, vol. 81(2), pp. 211-243, Feb. 1993. 

[3] R. Bergamaschi et al., “Control-Flow Versus Data-Flow Based Scheduling: Combining Both Approaches in an 

Adaptive Scheduling System", IEEE Trans. on VLSI Systems, vol. 5, no.1, pp.82-100, March 1997. 

[4] J. van Eijndhoven and L. Stok, “A Data Flow Exchange Standard", Proc. Europ. Conf. Design Automation, pp. 

193-199, 1992. 

[5] S. Huang et al., “A tree-based scheduling algorithm for control dominated circuits", Proc. ACM/IEEE Design 

Automation Conference, pp. 578-58, 1993. 

[6] S.-M. Moon and K. Ebcioglu, “An Efficient Resource-Constrained Global Scheduling Technique for Superscalar 

and VLIW Processors", Proc. Int. Symp. on Microarchitecture, pp. 55-71, 1992. 

[7] L. C. V. dos Santos, “Exploiting instruction-level parallelism: a constructive approach", PhD Thesis, Eindhoven 

University of Technology, The Netherlands, November, 1998. 

[8] M. Smith et al., “Efficient Superscalar Performance Through Boosting", Proc. Int. Conf. Archit. Support for 

Prog. Lang. and Operating Syst., pp. 248-259, 1992.

DAC'99, pages 300-305 

Coverage Estimation for Symbolic Model Checking 

Yatin Hoskote*, Timothy Kam*, Pei-Hsin Ho**, Xudong Zhao* 

*Strategic CAD Labs, Design Technology, Intel Corp. 

**Advanced Technology Group, Synopsys, Inc. 

Abstract 

Although model checking is an exhaustive formal verification method, a bug can still escape 

detection if the erroneous behavior does not violate any verified property. We propose a 

coverage metric to estimate the "completeness" of a set of properties verified by model checking. 

A symbolic algorithm is presented to compute this metric for a subset of the CTL property 

specification language. It has the same order of computational complexity as a model checking 

algorithm. Our coverage estimator has been applied in the course of some real-world model 

checking projects. We uncovered several coverage holes including one that eventually led to the 

discovery of a bug that escaped the initial model checking effort. 

References 

[1] K. L. McMillan, “Symbolic Model Checking: An Approach to the State Explosion Problem,” Kluwer Academic, 

1993. 

[2] E. Clarke, E. Emerson and A. Sistla, “Automatic Verification of Finite-State Concurrent Systems Using 

Temporal Logic Specifications,” ACM Transactions on Programming Languages and Systems, vol. 8, no. 2, 

pp. 244-263, April 1986. 

[3] K.-T. Cheng, A. Krishnakumar, “Automatic Functional Test Generation Using the Extended Finite State Machine 

Model,” Proceedings of DAC, pp. 86-91, June 1993. 

[4] R. Ho, C. Yang, M. Horowitz, D. Dill, “Architecture Validation for Processors,” Proceedings of the 22nd Annual 

Symposium on Computer Architecture, June 1995. 

[5] Y. Hoskote, D. Moundanos, J. Abraham, “Automatic Extraction of the Control Flow Machine and Application to 

Evaluating Coverage of Verification Vectors,” Proceedings of ICCD, pp. 532-537, October 1995. 

[6] M. Kantrowitz, L. Noack, “I’m Done Simulating: Now What? Verification Coverage Analysis and Correctness 

Checking of the DEC chip 21164 ALPHA Microprocessor,” Proceedings of DAC, pp. 325-330, June 1996. 

[7] R. Bryant, “Graph-based Algorithms for Boolean Function Manipulation,” IEEE Transactions on Computers, 

vol. C-35, no. 8, 1986. 

[8] H. Cho, G. Hachtel, F. Somenzi, “Redundancy Identification and Test Generation for Sequential Circuits Using 

Implicit State Enumeration,” IEEE Transactions on CAD, vol. 12, no. 7, pp. 935-945, 1993. 

[9] P.-H. Ho, A. Isles, T. Kam, “Formal Verification of Pipeline Control using Controlled Token Nets and Abstract 

Interpretation,” Proceedings of ICCAD, pp. 529-536, November 1998.

DAC'99, pages 306-311 

Improving Symbolic Traversals by means of Activity Profiles 

Gianpiero Cabodi, Paolo Camurati, Stefano Quer 

Politecnico di Torino, Dip. di Automatica e Informatica, Turin, ITALY 

Abstract 

Symbolic techniques have undergone major improvements in the last few years. Nevertheless 

they are still limited by the size of the involved BDDs, and extending their applicability to larger 

and real circuits is a key issue. 

Within this framework, we introduce "activity profiles" as a novel technique to characterize 

transition relations. In our methodology a learning phase is used to collect activity measures, 

related to time and space cost, for each BDD node of the transition relation. We use inexpensive 

reachability analysis as the learning technique, and we operate within inner steps of image 

computations involving the transition relation and state sets. 

This information can be used for several purposes. In particular, we present an application 

of activity profiles in the field of reachability analysis itself. We propose transition relation 

subsetting and partial traversals of the state transition graph. We show that a sequence of partial 

traversals is able to complete a reachability analysis problem with a smaller memory requirement 

and improved time performance. 

References 

[1] K. Ravi and F. Somenzi. High–Density Reachability Analysis. In Proc. IEEE/ACM ICCAD’95, pages 154–158, 

San Jose, California, November 1995. 

[2] K. Ravi, K. L. McMillan, T. R. Shiple, and F. Somenzi. Approximation and Decomposition of Binary Decision 

Diagram. In Proc. EDA/SIGDA/ACM/IEEE DAC’98, pages 445–450, San Francisco, California, June 1998. 

[3] G. Cabodi, P. Camurati, and S. Quer. Efficient State Space Pruning in Symbolic Backward Traversal. In Proc. 

IEEE ICCD’94, pages 230–235, Cambridge, Massachusetts, October 1994. 

[4] G. Cabodi, P. Camurati, L. Lavagno, and S. Quer. Disjunctive Partitioning and Partial Iterative Squaring: an 

effective approach for symbolic traversal of large circuits. In Proc. EDA/SIGDA/ACM/IEEE DAC’97, pages 728– 

733, Anaheim, California, June 1997. 

[5] A. Narayan, A. J. Isles, J. Jain, R. K. Brayton, and A. Sangiovanni-Vincentelli. Reachability Analysis Using 

Partitioned–ROBDDs. In Proc. IEEE/ACM ICCAD’97, pages 388–393, San Jose, California, November 1997. 

[6] M. Ganai and A. Aziz. Efficient Coverage Directed State Space Search. In IWLS’98: IEEE International 

Workshop on Logic Synthesis, Lake Tahoe, California, June 1998. 

[7] F. Somenzi. CUDD: CU Decision Diagram Package – Release 2.3.0. Technical report, Dept. of Electrical and 

Computer Engineering, University of Colorado, Boulder, Colorado, October 1998. 

[8] http://www.polito.it/~{cabodi,quer}. 

[9] R. K. Brayton et al. VIS. In Proc. FMCAD’96, Lecture Notes in Computer Science 1166, Springer Verlag, pages 

248–256, Palo Alto, California, November 1996.

DAC'99, pages 312-316 

Improved Approximate Reachability using Auxiliary State Variables 

Shankar G. Govindaraju, David L. Dill and Jules P. Bergmann 

Computer Systems Laboratory, Stanford University, Stanford, CA 94305 

Abstract 

Approximate reachability techniques trade off accuracy for the capacity to deal with bigger 

designs. Cho et al. [4] proposed partitioning the set of state bits into mutually disjoint subsets and 

doing symbolic forward reachability on the individual subsets to obtain an over-approximation of 

the reachable state set. Recently [7] this was improved upon by dividing the set of state bits into 

various subsets that could possibly overlap, and doing symbolic reachability over the 

overlapping subsets. In this paper, we further improve on this scheme by augmenting the set of 

state variables with auxiliary state variables. These auxiliary state variables are added to capture 

some important internal conditions in the combinational logic. Approximate symbolic forward 

reachability on overlapping subsets of this augmented set of state variables yields much tighter 

approximations than earlier methods. 
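
The effect of disjoint versus overlapping subsets can be sketched on a toy example. The code below is only an explicit-state illustration under stated assumptions: the 3-bit machine in `step`, the helper names, and the chosen subsets are hypothetical, sets of tuples stand in for BDDs, and the auxiliary state variables proposed in the paper are not modelled. It tracks only the projections of reached states onto each subset of bits, which yields an over-approximation of the reachable set.

```python
from itertools import product

N_BITS = 3


def step(state):
    # Hypothetical 3-bit machine: (a, b, c) -> (not a, a, b).
    a, b, c = state
    return [(1 - a, a, b)]


def exact_reachable(init):
    """Plain forward reachability over full states."""
    seen, frontier = {init}, [init]
    while frontier:
        s = frontier.pop()
        for t in step(s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return seen


def project(state, subset):
    return tuple(state[i] for i in subset)


def approx_reachable(init, subsets):
    """Over-approximate reachability by tracking only projections."""
    reached = [{project(init, sub)} for sub in subsets]
    while True:
        # States consistent with every projection seen so far.
        candidates = [s for s in product((0, 1), repeat=N_BITS)
                      if all(project(s, sub) in r
                             for sub, r in zip(subsets, reached))]
        changed = False
        for s in candidates:
            for t in step(s):
                for sub, r in zip(subsets, reached):
                    p = project(t, sub)
                    if p not in r:
                        r.add(p)
                        changed = True
        if not changed:
            return set(candidates)


INIT = (0, 0, 0)
exact = exact_reachable(INIT)
disjoint = approx_reachable(INIT, [(0,), (1,), (2,)])
overlapping = approx_reachable(INIT, [(0, 1), (1, 2)])
```

On this toy machine the exact reachable set is contained in the overlapping approximation, which in turn is contained in the disjoint one, so the overlapping subsets retain some of the correlation between bits that a disjoint partition discards.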

References 

[1] Abadi, M. and Lamport, L., “The Existence of Refinement Mappings," LICS, pp. 165-177, July 1988. 

[2] Bryant, R. E., “Graph-Based Algorithms for Boolean Function Manipulation," IEEE Transactions on Computers, 

Vol. C-35, No. 8, pp. 677-691, August 1986. 

[3] Burch, J. R., Clarke, E. M., McMillan, K. L., Dill, D. L., and Hwang, L. J., “Symbolic Model Checking: 10^20 

States and Beyond," LICS, pp. 428-439, 1990. 

[4] Cho, H., Hachtel, G., Macii, E., Plessier, B., and Somenzi, F., “Algorithms for Approximate FSM Traversal 

Based on State Space Decomposition," IEEE TCAD, Vol. 15, No. 12, pp. 1465-1478, December 1996. 

[5] Cho, H., Hachtel, G., Macii, E., Poncino, M., and Somenzi, F., “Automatic State Space Decomposition for 

Approximate FSM Traversal Based on Circuit Analysis," IEEE 

TCAD, Vol. 15, No. 12, pp. 1451-1464, December 1996. 

[6] Coudert, O., and Madre, J. C., “A Unified Framework for the Formal Verification of Sequential Circuits," 

ICCAD, pp. 126-129, 1990. 

[7] Govindaraju, S. G., Dill, D. L., Hu, A. J., and Horowitz, M. A., “Approximate Reachability with BDDs Using 

Overlapping Projections," DAC, pp. 451-456, 1998. 

[8] Govindaraju, S. G. and Dill, D. L., “Verification by Approximate Forward and Backward Reachability," ICCAD, 

pp. 366-370, 1998. 

[9] Kuskin, J. et al, “The Stanford FLASH Multiprocessor," ISCA, pp. 301-313, April 1994.

DAC'99, pages 317-320 

Symbolic Model Checking using SAT procedures instead of BDDs 

A. Biere 1,2, A. Cimatti 3, E. M. Clarke 1,2, M. Fujita 4, Y. Zhu 1,2 

1 Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A. 

2 Verysys Design Automation, Inc., Fremont, CA 94538, U.S.A. 

3 Istituto per la Ricerca Scientifica e Tecnologica (IRST), 38055 Povo (TN), Italy 

4 Fujitsu Laboratories of America, Inc., Sunnyvale, CA 94086-3922, U.S.A. 

Abstract 

In this paper, we study the application of propositional decision procedures in hardware 

verification. In particular, we apply bounded model checking, as introduced in [1], to 

equivalence and invariant checking. We present several optimizations that reduce the size of 

generated propositional formulas. In many instances, our SAT-based approach can significantly 

outperform BDD-based approaches. We observe that SAT-based techniques are particularly 

efficient in detecting errors in both combinational and sequential designs. 
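
The idea of bounded invariant checking can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 2-bit counter, the predicate names `init`, `trans` and `prop`, and the brute-force enumeration (used here in place of a real SAT procedure) are all assumptions made for the example. The transition relation is unrolled k times and the search looks for an assignment in which the last state violates the invariant.

```python
from itertools import product

BITS = 2  # hypothetical 2-bit counter, low bit first


def init(s):
    return s == (0, 0)


def trans(s, t):
    # t must encode (value of s) + 1 modulo 4.
    value = s[0] + 2 * s[1]
    nxt = (value + 1) % 4
    return t == (nxt % 2, nxt // 2)


def prop(s):
    # Invariant to check: the counter never reaches 3.
    return s != (1, 1)


def bounded_check(k):
    """Return a counterexample path with k transitions, or None.

    Enumerates assignments to the unrolled formula
    init(s0) & trans(s0, s1) & ... & trans(s[k-1], sk) & not prop(sk).
    A real flow would hand this formula to a SAT solver instead.
    """
    states = list(product((0, 1), repeat=BITS))
    for path in product(states, repeat=k + 1):
        if (init(path[0])
                and all(trans(path[i], path[i + 1]) for i in range(k))
                and not prop(path[-1])):
            return list(path)
    return None


def find_violation(max_bound):
    """Increase the bound until a violation is found or the limit is hit."""
    for k in range(max_bound + 1):
        path = bounded_check(k)
        if path is not None:
            return k, path
    return None


result = find_violation(5)
```

Because the bound grows from zero, the first counterexample found is a shortest one, which fits the abstract's remark that SAT-based techniques are particularly efficient at detecting errors.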

References 

[1] BIERE, A., CIMATTI, A., CLARKE, E. M., AND ZHU, Y. Symbolic model checking without BDDs. In 

TACAS’99 (1999). to appear. 

[2] BORÄLV, A. The industrial success of verification tools based on Stålmarck’s Method. In 

International Conference on Computer-Aided Verification (CAV’97) (1997), O. Grumberg, Ed., no. 1254 in LNCS, 

Springer-Verlag. 

[3] BRYANT, R. E. Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers 

35, 8 (1986), 677–691. 

[4] BURCH, J. R., CLARKE, E. M., AND MCMILLAN, K. L. Symbolic model checking: 10^20 states and beyond. 

Information and Computation 98 (1992), 142–170. 

[5] CLARKE, E., AND EMERSON, E. A. Design and synthesis of synchronization skeletons using branching time 

temporal logic. In Proceedings of the IBM Workshop on Logics of Programs (1981), vol. 131 of LNCS, Springer- 

Verlag, pp. 52–71. 

[6] DAVIS, M., AND PUTNAM, H. A computing procedure for quantification theory. Journal of the Association 

for Computing Machinery 7 (1960), 201–215. 

[7] DEHARBE, D. Using induction and BDDs to model check invariants. In CHARME’97 (1997), D. Probst, Ed., 

Chapman & Hall. 

[8] KUNZ, W. HANNIBAL: An efficient tool for logic verification based on recursive learning. In ICCAD’93 

(1993), pp. 538–543. 

[9] MARTIN, A. J. The design of a self-timed circuit for distributed mutual exclusion. In Proceedings of the 1985 

Chapel Hill Conference on Very Large Scale Integration (1985), H. Fuchs, Ed. 

[10] MCMILLAN, K. L. Symbolic Model Checking: An Approach to the State Explosion Problem. Kluwer 


[11] MCMILLAN, K. L. A conjunctively decomposed boolean representation for symbolic model checking. In 

CAV’96 (1996), vol. 1102 of LNCS, Springer-Verlag, pp. 13–25. 

[12] MUKHERJEE, R., JAIN, J., TAKAYAMA, K., FUJITA, M., ABRAHAM, J. A., AND FUSSELL, D. S. 

FLOVER: Filtering oriented combinational verification approach. In Proc. of International Workshop on Logic 

Synthesis (1995). 

[13] PLAISTED, D., AND GREENBAUM, S. A structure-preserving clause form translation. Journal of Symbolic 

Computation 2 (1986), 293–304. 

[14] SENTOVICH, E. M., SINGH, K. J., LAVAGNO, L., MOON, C., MURGAI, R., SALDANHA, A., SAVOJ, H., 

STEPHAN, P. R., BRAYTON, R. K., AND SANGIOVANNI-VINCENTELLI, A. SIS: A System for Sequential 

Circuit Synthesis. Memorandum No. UCB/ERL M92/41, Electronics Research Laboratory, College of Engineering, 

University of California, Berkeley, 1992.

[15] STÅLMARCK, G. A system for determining propositional logic theorems by applying values and rules to 

triplets that are generated from a formula, 1989. Swedish patent no. 467 076 (1992), U.S. patent no. 5 276 897 (1994), 

European patent no. 0404 454 (1995). 

[16] ZHANG, H. SATO: An efficient propositional prover. In International Conference on Automated Deduction 

(CADE’97) (1997), no. 1249 in LNAI, Springer-Verlag, pp. 272–275.

DAC'99, pages 321-326 

Power Efficient Mediaprocessors: Design Space Exploration 

Johnson Kin*, Chunho Lee**, William H. Mangione-Smith* and Miodrag Potkonjak** 

*Department of Electrical Engineering, UCLA 

**Department of Computer Science, UCLA 

Abstract 

We present a framework for rapidly exploring the design space of low power application-specific 

programmable processors (ASPP), in particular media processors. We focus on a category of 

processors that are programmable yet optimized to reduce power consumption for a specific set 

of applications. 

The key components of the framework presented in this paper are a retargetable instruction level 

parallelism (ILP) compiler, processor simulators, a set of complete media applications written in 

a high level language and an architectural component selection algorithm. The fundamental idea 

behind the framework is that with the aid of a retargetable ILP compiler and simulators it is 

possible to arrange architectural parameters (e.g., the issue width, the size of cache memory 

units, the number of execution units, etc.) to meet low power design goals under area constraints. 

REFERENCES 

[1] S. Banerjia, W. A. Havanki, and T. M. Conte. Treegion scheduling for highly parallel processors. In Euro-Par, 

pages 1074–1078, Passau, Germany, 1997. 

[2] A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, and R. Brodersen. Optimizing power using 

transformations. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 14(1):12, 1995. 

[3] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen. Low-power CMOS digital design. IEEE Journal of Solid- 

State Circuits, 27(4):473–484, 1992. 

[4] A. P. Chandrakasan, M. Srivastava, and R. Brodersen. Energy efficient programmable computation. In VLSI 

Design Conference, pages 261–264, 1994. 

[5] P. P. Chang, S. A. Mahlke, W. Y. Chen, N. J. Warter, and W. m. W. Hwu. IMPACT: An architectural 

framework for multiple-instruction-issue processors. In International Symposium on Computer Architecture, 1991. 

[6] A. Chatterjee and R. Roy. Synthesis of low power DSP circuits using activity metrics. In VLSI Design 

Conference, pages 265–270, 1994. 

[7] R. P. Colwell, R. P. Nix, J. J. O’Donnell, D. B. Papworth, and P. K. Rodman. A VLIW architecture for a trace 

scheduling compiler. In Proceedings of ASPLOSII, pages 180–192, 1982. 

[8] J. A. Fisher. Trace scheduling: A technique for global microcode compaction. IEEE Transactions on Computers, 

C-30:478–490, 1981. 

[9] M. J. Flynn. Computer Architecture: Pipelined and Parallel Processor Design. Jones and Bartlett, 1996. 

[10] M. R. Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. 

H. Freeman and Company, New York, NY, 1979. 

[11] L. Goodby, A. Orailoglu, and P. Chau. Microarchitectural synthesis of performance-constrained low power 

VLSI designs. In International Conference on Computer Design, pages 323–326, 1994. 

[12] C. Hansen. MicroUnity’s MediaProcessor architecture. IEEE Micro, 17:34–41, 1997. 

[13] J. L. Hennessy and D. A. Patterson. Computer Architecture: A Quantitative Approach. Morgan Kaufmann, San 

Francisco, CA, 1993. 

[14] I. Hong and M. M. Potkonjak. Power optimization using divide-and-conquer techniques for minimization of the 

number of operations. In ICCAD-97 IEEE/ACM International Conference on Computer-Aided Design, 1997. 

[15] P. Y. Hsu. Highly concurrent scalar processing. Technical Report CSG-49, Coordinated Science Laboratory, 

University of Illinois at Urbana-Champaign, 1986. 

[16] R. Jain. The Art of Computer Systems Performance Analysis. Wiley, 1991. 

[17] P. Kalapathy. Hardware-software interactions on MPACT. IEEE Micro, 17:20–26, 1997. 

[18] A. Kalavade and E.A. Lee. Complexity management in system-level design. Journal of VLSI Signal 

Processing, 14(2):157–169, 1996.

[19] M. B. Kamble and K. Ghose. Analytical energy dissipation models for low power caches. In Proceedings 1997 

International Symposium on Low Power Electronics and Design, pages 143–148, 1997. 

[20] J. Kin, M. Gupta, and W.H. Mangione-Smith. The filter cache: An energy efficient memory structure. In 

Proceedings of 30th Annual International Symposium on Microarchitecture, 1997. 

[21] C. Lee, M. Potkonjak, and W. H. Mangione-Smith. Mediabench: A tool for evaluating and synthesizing 

multimedia and communications systems. In International Symposium on Microarchitectures, 1997. 

[22] R.B. Lee and M.D. Smith. Media processing: A new design target. IEEE Micro, 17:6–9, 1997. 

[23] W. m. W. Hwu, S. A. Mahlke, W. Y. Chen, P. P. Chang, N. J. Warter, R. A. Bringmann, R. G. Ouellette, R. E. 

Hank, T. Kiyohara, G. E. Haab, J. G. Holm, and D. M. Lavery. The superblock: An effective technique for VLIW 

and superscalar compilation. Journal of Supercomputing, 1993. 

[24] S. A. Mahlke, D. C. Lin, W. Y. Chen, R. E. Hank, and R. A. Bringmann. Effective compiler support for 

predicated execution using the Hyperblock. In International Symposium on Microarchitecture, 1992. 

[25] J. Montanaro et al. A 160-MHz, 32-b, 0.5-W CMOS RISC microprocessor. IEEE Journal of Solid-State 

Circuits, 31(11):1703–1714, November 1996. 

[26] A. Peleg and U. Weiser. MMX technology extension to the Intel architecture. IEEE Micro, 16(4):42–50, 

August 1996. 

[27] A. Raghunathan and N. Jha. Behavioral synthesis for low power. In International Conference on Computer 

Design, pages 318–322, 1994. 

[28] D. Singh, J. Rabaey, M. Pedram, F. Catthoor, S. Rajgopal, N. Sehgal, and T. Mozdzen. Power conscious CAD 

tools and methodologies: A perspective. Proceedings of IEEE, 83(4):570–594, 1995. 

[29] M. Srivastava, A. P. Chandrakasan, and R. Brodersen. Predictive system shutdown and other architectural 

techniques for energy efficient programmable computation. IEEE Transactions on VLSI Systems, 4(1):42–55, 1996. 

[30] V. Tiwari, S. Malik, and A. Wolfe. Power analysis of embedded software: A first step towards software power 

minimization. IEEE Transactions on VLSI Systems, 2(4):437–445, 1994. 

[31] J. Turley and H. Hakkarainen. TI’s new ‘C6x DSP screams at 1,600 MIPS. The Microprocessor Report, 11:14– 

17, 1997.

DAC'99, pages 327-332 

Global Multimedia System Design Exploration using Accurate Memory Organization Feedback 

Arnout Vandecappelle, Miguel Miranda, Erik Brockmeyer, Francky Catthoor, Diederik Verkest 

IMEC vzw, Kapeldreef 75, 3001 Heverlee, Belgium 

Abstract 

Successful exploration of system-level design decisions is impossible without fast and accurate 

estimation of the impact on the system cost. In most multimedia applications, the dominant cost 

factor is related to the organization of the memory architecture. This paper presents a systematic 

approach which allows effective system-level exploration of memory organization design 

alternatives, based on accurate feedback from our previously developed tools. The effectiveness 

of this approach is illustrated on an industrial application. Applying our approach, a substantial 

part of the design search space has been explored in a very short time, resulting in a cost-efficient 

solution which meets all design constraints. 

References 

[1] F. Balasa, F. Catthoor, and H. De Man. Dataflow-driven memory allocation for multi-dimensional signal 

processing systems. In Proc. IEEE Int. Conf. Comp. Aided Design, pages 32–34, San Jose, CA, Nov. 1994. 

[2] F. Balasa, F. Catthoor, and H. De Man. Background memory area estimation for multi-dimensional signal 

processing systems. IEEE Trans. on VLSI Systems, 3(2):157–172, June 1995. 

[3] F. Catthoor, F. Franssen, S. Wuytack, L. Nachtergaele, and H. De Man. Global communication and memory 

optimizing transformations for low power signal processing systems. In J. Rabaey, P. Chau, and J. Eldon, editors, 

VLSI Signal Processing VII, pages 178–187. IEEE Press, New York, 1994. 

[4] F. Catthoor, S. Wuytack, E. De Greef, F. Balasa, L. Nachtergaele, and A. Vandecappelle. Custom Memory 

Management Methodology, Exploration of memory organization for embedded multimedia system design. Kluwer 

Academic Publishers, Boston, MA, 1998. 

[5] E. De Greef, F. Catthoor, and H. De Man. Program transformation strategies for memory size and power 

reduction of pseudoregular multimedia subsystems mapped on multi-processor architectures. IEEE Trans. on 

Circuits and Systems for Video Technology, 8(6):719–723, Oct. 1998. 

[6] P. Ellervee, M. Miranda, F. Catthoor, and A. Hemani. Exploiting data transfer locality in memory mapping. In 

25th EUROMICRO Conference (submitted), Milan, Italy, Sept. 1999. 

[7] T. H. Meng, B. Gordon, E. Tsern, and A. Hung. Portable video-on-demand in wireless communication. 

Proceedings of the IEEE, special issue on “Low power electronics”, 83(4):659–680, Apr. 1995. 

[8] L. Nachtergaele, D. Moolenaar, B. Vanhoof, F. Catthoor, and H. De Man. System-level power optimization of 

video codecs on embedded cores: a systematic approach. Journal of VLSI Signal Processing, special issue on 

“Future directions in the design and implementation of DSP systems” (eds. W. Burleson, K. Konstantinos), 

18(2):89–110, Feb. 1998. 

[9] L. Ramachandran, D. Gajski, and V. Chaiyakul. An algorithm for array variable clustering. In Proc. 5th 

ACM/IEEE Europ. Design and Test Conf., pages 262–266, Paris, France, Feb. 1994. 

[10] J. Robinson. Efficient general-purpose image compression with binary tree predictive coding. IEEE Trans. on 

Image Processing, 6(4):601–608, Apr. 1997. 

[11] H. Schmit and D. Thomas. Synthesis of application-specific memory designs. IEEE Trans. on VLSI Systems, 

5(1):101–111, Mar. 1997. 

[12] P. Slock, S. Wuytack, F. Catthoor, and G. de Jong. Fast and extensive system-level memory exploration for 

ATM applications. In Proc. 10th ACM/IEEE Int. Symp. on System Synthesis, pages 74–81, 1997. 

[13] J. Van Meerbergen, P. Lippens, W. Verhaegh, and A. van der Werf. PHIDEO: high-level synthesis for high 

throughput applications. Journal of VLSI Signal Processing, special issue on “Design environments for DSP” 

(eds. I. Verbauwhede, J. Rabaey), 9(1/2):89–104, Jan. 1995. 

[14] I. Verbauwhede, F. Catthoor, J. Vandewalle, and H. De Man. Background memory management for the 

synthesis of algebraic algorithms on multi-processor dsp chips. In Proc. VLSI’89, Int. Conf. on VLSI, pages 209– 

218, Munich, Germany, Aug. 1989.

[15] I. Verbauwhede, C. Scheers, and J. Rabaey. Memory estimation for high-level synthesis. In Proc. 31st 

ACM/IEEE Design Automation Conf., pages 143–148, San Diego, CA, June 1994. 

[16] W. Verhaegh, P. Lippens, E. Aarts, J. van Meerbergen, and A. van der Werf. Multidimensional periodic 

scheduling: A solution approach. In Proc. European Design Automation Conf., pages 468–474, Paris, France, Mar. 

1997. 

[17] S. Wuytack, F. Catthoor, G. de Jong, and H. De Man. Minimizing the required memory bandwidth in VLSI 

system realizations. Accepted for IEEE Trans. on VLSI Systems, 7, 1999. 

[18] S. Wuytack, J.-P. Diguet, F. Catthoor, and H. De Man. Formalized methodology for data reuse exploration for 

low-power hierarchical memory mappings. IEEE Trans. on VLSI Systems, special issue on “Low-power systems and 

designs”, 6(4):529–537, Dec. 1998.

DAC'99, pages 333-336 

Implementation of a scalable MPEG-4 wavelet-based visual texture compression system 

L. Nachtergaele, B. Vanhoof, M. Peón, G. Lafruit, J. Bormans, I. Bolsens 

IMEC, Kapeldreef 75, B3000 Leuven, Belgium 

ABSTRACT 

The realization of new MPEG-4 functionality, applicable to 3D graphics texture compression 

and image database access over the Internet, is demonstrated in a PC-based compression system. 

Applying our system-level design methodologies effectively removes all implementation 

bottlenecks. A first-of-a-kind ASIC, called Ozone, accelerates the Embedded Zero Tree based 

encoding and is capable of compressing 30 color CIF images per second. 

REFERENCES 

[1] ISO/IEC JTC1/SC29/WG11, Coding of audio-visual objects, ISO/IEC 14496, ‘98. 

[2] “Episode I trailer movie”, http://www.starwars.com 

[3] Catthoor F., et al., “Proposal for unified system design meta flow in task-level and instruction-level design 

technology research for multi-media applications”, ISSS'98, Hsinchu, Taiwan, December 1998. 

[4] Catthoor F., et al., “Custom Memory Management Methodology - Exploration of Memory Organisation for 

Embedded Multimedia System Design”, Kluwer Academic Publishers, Boston, ‘98. 

[5] Chakrabarti C., et al., “Architectures for Wavelet Transforms: A Survey”, Journal of VLSI Signal Processing 

Systems for Signal Image and Video Technology, Vol. 14, No. 2, November ’96, 171-192. 

[6] Clarke P., “MPEG-4 project in Europe achieves wavelet silicon”, EE Times, 28 November ‘98, 

http://www.eetimes.com/story/OEG19981125S0008. 

[7] Knowles G., “A single chip wavelet zero-tree processor for video compression and decompression”, DATE ’98, 

February ’98, 61-65. 

[8] Lafruit G., et. al., “Optimal memory organisation for scalable texture codecs in MPEG-4”, IEEE Tr. on Circuits 

and Systems for Video Technologies, in press. 

[9] Lafruit G., et. al., "The Local Wavelet Transform: a memory-efficient, high-speed architecture for a Region- 

Oriented ZeroTree coder," Journal of Integrated Computer-Aided Engineering, ‘99, in press. 

[10] R. Lang, “Parallel VLSI architectures for one-, two-, and three-dimensional discrete wavelet transforms”, PhD

thesis, Department of Electrical and Computer Engineering, The University of Newcastle New South Wales, 2308 

Australia, March 1996. 

[10] Peón M., et al., “Design of an arithmetic coder for a hardware wavelet compression engine”, IEEE Signal

Processing Symposium, March 1998, Leuven, Belgium, 151-154. 

[11] Schaumont P., et al., “A Programming Environment for the Design of Complex High Speed ASICs”, DAC,

June ’98, 315-320. 

[12] Shapiro J.M., “Embedded image coding using the zerotrees of wavelet coefficients”, IEEE Tr. on Image 

Processing, Vol. 41, No. 12, Dec. ’93, 3445-3462.

[13] Sweldens W., “The Lifting Scheme: A new Philosophy in Biorthogonal Wavelets constructions,” Proc. of the 

SPIE conference, Vol. 2569, 1995, 68-79. 

[14] Vanhoof B., et al., “A Scalable Architecture for MPEG-4 Embedded Zero Tree Coding”, CICC’99, in press.

[15] Vishwanath M., et al., “VLSI Architectures for the Discrete Wavelet transform”, IEEE Tr. on Circuits and

Systems-II, Vol. 42, No. 5, May ’95, 305-316.

DAC'99, pages 337-340 

A 10 Mbit/s Upstream Cable Modem with Automatic Equalization 

Patrick Schaumont, Radim Cmar, Serge Vernalde, Marc Engels 

IMEC vzw, B-3001 Leuven Belgium 

Abstract 

A fully digital QAM16 burst receiver ASIC is presented. The BO4 receiver demodulates at 10 

Mbit/s and uses an advanced signal processing architecture that performs per burst automatic 

equalization. It is a critical building block in a broadband access system for HFC networks. The 

chip was designed using a C++ based flow and is implemented as an 80 Kgate 0.7u CMOS

standard cell design. 

References 

[1] W. Geurts, F. Catthoor, S. Vernalde, and H. Deman. Accelerator Data-Path Synthesis for High-Throughput 

Signal Processing Applications. Kluwer Publishing, 1997. 

[2] Siemens Atea R&D Technology Homepage. http://www.siemens.be/atea/products services/rd technology/rd- 

frames.htm. 

[3] H. S. Jun and S. Y. Hwang. Design of a pipelined datapath synthesis system for digital signal processing. IEEE 

Trans. VLSI Syst., 2(3):292-303, September 1994. 

[4] W. Pugh and G. Boyer. Broadband access: Comparing alternatives. IEEE Communications Magazine, pages 34 - 

46, August 1995. 

[5] P. Schaumont, S. Vernalde, L. Rijnders, M. Engels, and I. Bolsens. A programming environment for the design 

of complex high speed asics. In Proceedings 35th Design Automation Conference, pages 315 - 320, San Francisco, 

CA, 1998.

DAC'99, pages 341-342

Panel: Cell Libraries - Build vs. Buy; Static vs. Dynamic

Chair: Kurt Keutzer – University of California at Berkeley, Berkeley, CA 

Panel Members: Kurt Wolf, David Pietromonaco, Jay Maxey, Jeff Lewis, Martin Lefebvre, 

Jeff Burns 

Cell libraries determine the final density, performance, and power of most IC designs much as 

the construction materials determine the quality of a building. Nevertheless, the importance of 

libraries has often been a tertiary consideration in design projects – falling behind both design 

skill and tool quality. Choosing the right cell library for your project can have a significant 

impact on the characteristics of the circuit you design, and thus, the success of your product. 

Design teams need to consider a host of technical and business factors when selecting a library. 

Technical considerations include density, speed, power, design for reliability, and support for the 

designer's tools and flow. Business considerations include price, risk, time to market, and control 

of one's own destiny. 

This panel examines technical, as well as current business issues, associated with cell libraries. 

On the technical front, the advantages and disadvantages of static libraries versus "on the fly" or

dynamic libraries will be discussed and quantified. On the business front, while designers have 

traditionally used the cell libraries provided by their silicon source (internal division or 

semiconductor vendor), recent changes in technology and business practices make several cell-library

sources available to design groups: silicon vendors, third party library vendors, and

internally created. This panel will explore the business issues associated with the library choice 

and debate when designers should use each available source of cell libraries.

DAC'99, pages 343-348 

Multilevel k-way Hypergraph Partitioning 

George Karypis and Vipin Kumar 

Department of Computer Science & Engineering, University of Minnesota, 

Minneapolis, MN 55455 

Abstract 

In this paper, we present a new multilevel k-way hypergraph partitioning algorithm that 

substantially outperforms the existing state-of-the-art K-PM/LR algorithm for multi-way

partitioning, for optimizing both local and global objectives. Experiments on the ISPD98

benchmark suite show that the partitionings produced by our scheme are on the average 15% to 

23% better than those produced by the K-PM/LR algorithm, both in terms of the hyperedge cut 

as well as the (K – 1) metric. Furthermore, our algorithm is significantly faster, requiring 4 to 5 

times less time than that required by K-PM/LR. 
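
The two objectives named in the abstract can be illustrated with a small sketch. It assumes the usual definitions from the partitioning literature (the abstract itself does not spell them out): the hyperedge cut counts hyperedges that span more than one part, and the (K – 1) metric sums, over all hyperedges, the number of parts spanned minus one. The function name `cut_metrics` and the toy hypergraph are hypothetical and are not taken from the paper.

```python
from typing import Dict, Hashable, Iterable, Tuple


def cut_metrics(
    hyperedges: Iterable[Iterable[Hashable]],
    part_of: Dict[Hashable, int],
) -> Tuple[int, int]:
    """Return (hyperedge_cut, k_minus_1) for a partitioned hypergraph.

    Assumed definitions (standard, but not quoted from the paper):
      hyperedge cut  = number of hyperedges touching more than one part
      (K - 1) metric = sum over hyperedges of (parts spanned - 1)
    """
    hyperedge_cut = 0
    k_minus_1 = 0
    for edge in hyperedges:
        spanned = {part_of[v] for v in edge}
        if not spanned:  # ignore empty hyperedges
            continue
        if len(spanned) > 1:
            hyperedge_cut += 1
        k_minus_1 += len(spanned) - 1
    return hyperedge_cut, k_minus_1


# Hypothetical 3-way partition of a six-vertex hypergraph.
hyperedges = [("a", "b", "c"), ("c", "d"), ("a", "d", "e"), ("e", "f")]
part_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2, "f": 2}
print(cut_metrics(hyperedges, part_of))
```

For a bisection the two values coincide; they differ only when some hyperedge spans three or more parts, as the third hyperedge does in this toy instance.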

References 

[1] B. W. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. The Bell System Technical 

Journal, 49(2):291–307, 1970. 

[2] C. M. Fiduccia and R. M. Mattheyses. A linear time heuristic for improving network partitions. In Proc. 19th

IEEE Design Automation Conference, pages 175–181, 1982. 

[3] L. A. Sanchis. Multiple-way network partitioning. IEEE Transactions on Computers, pages 62–81, 1989. 

[4] C.W. Yeh, C. K. Cheng, and T. T. Lin. A general purpose multiple-way partitioning algorithm. In Proc. of the

Design Automation Conference, pages 421–426, 1991. 

[5] P. Chan, M. Schlag, and J. Zien. Spectral k-way ratio-cut partitioning and clustering. In Proc. of the Design 

Automation Conference, pages 749–754, 1993. 

[6] L. A. Sanchis. Multiple-way network partitioning with different cost functions. IEEE Transactions on 

Computers, pages 1500–1504, 1993. 

[7] Horst D. Simon and Shang-Hua Teng. How good is recursive bisection? Technical Report RNR-93-012, NAS 

Systems Division, NASA, Moffett Field, CA, 1993.

[8] C. J. Alpert and A. B. Kahng. Multi-way partitioning via space-filling curves and dynamic programming. In 

Proc. of the Design Automation Conference, pages 652–657, 1994. 

[9] Charles J. Alpert and Andrew B. Kahng. Recent directions in netlist partitioning. Integration, the VLSI Journal, 

19(1-2):1–81, 1995. 

[10] S. Hauck and G. Borriello. An evaluation of bipartitioning technique. In Proc. Chapel Hill Conference on 

Advanced Research in VLSI, 1995. 

[11] J. Cong, W. Labio, and N. Shivakumar. Multi-way VLSI circuit partitioning based on dual net representation. 

IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, pages 396–409, 1996. 

[12] B. Mobasher, N. Jain, E.H. Han, and J. Srivastava. Web mining: Pattern discovery from world wide web 

transactions. Technical Report TR-96-050, Department of Computer Science, University of Minnesota, 

Minneapolis, 1996. 

[13] S. Shekhar and D. R. Liu. Partitioning similarity graphs: A framework for declustering problems. Information

Systems Journal, 21(4), 1996. 

[14] C. J. Alpert, J. H. Huang, and A. B. Kahng. Multilevel circuit partitioning. In Proc. of the 34th ACM/IEEE 

Design Automation Conference, 1997. 

[15] George Karypis and Vipin Kumar. A coarse-grain parallel multilevel k-way partitioning algorithm. In 

Proceedings of the eighth SIAM conference on Parallel Processing for Scientific Computing, 1997. 

[16] C. J. Alpert. The ISPD98 circuit benchmark suite. In Proc. of the Intl. Symposium of Physical Design, pages 

80–85, 1998. 

[17] Jason Cong and Sung Kyu Lim. Multiway Partitioning with Pairwise Movement. In Intl. Conference on 

Computer Aided Design, 1998. 

[18] G. Karypis and V. Kumar. hMETIS 1.5: A hypergraph partitioning package. Technical report, Department of 

Computer Science, University of Minnesota, 1998. Available on the WWW at URL http://www.cs.umn.edu/~metis.

[19] G. Karypis and V. Kumar. Multilevel algorithms for multi-constraint graph partitioning. In Proceedings of 

Supercomputing, 1998. Also available on WWW at URL http://www.cs.umn.edu/~karypis.

[20] G. Karypis and V. Kumar. Multilevel k-way hypergraph partitioning. Technical Report TR 98-036, Department 

of Computer Science, University of Minnesota, 1998. 

[21] Sverre Wichlund and Einar J. Aas. On Multilevel Circuit Partitioning. In Intl. Conference on Computer Aided 

Design, 1998. 

[22] C. Berge. Graphs and Hypergraphs. American Elsevier, New York, 1976.

[23] Michael R. Garey and David S. Johnson. Computers and Intractability: A Guide to the Theory of

NP-Completeness. W.H. Freeman, San Francisco, CA, 1979.

[24] George Karypis, Rajat Aggarwal, Vipin Kumar, and Shashi Shekhar. Multilevel hypergraph partitioning: 

Application in VLSI domain. IEEE Transactions on VLSI Systems, 1998 (to appear). A short version appears in the

proceedings of DAC 1997.

DAC'99, pages 349-354 

Hypergraph Partitioning for VLSI CAD: Methodology for Heuristic Development, 

Experimentation and Reporting 

Andrew E. Caldwell, Andrew B. Kahng, Andrew A. Kennings† and Igor L. Markov 

UCLA Computer Science Department, Los Angeles, CA 90095-1596 

†Cypress Semiconductor, Beaverton, OR 97008 

Abstract 

We illustrate how technical contributions in the VLSI CAD partitioning literature can fail to 

provide one or more of: (i) reproducible results and descriptions, (ii) an enabling account of the 

key understanding or insight behind a given contribution, and (iii) experimental evidence that is 

not only contrasted with the state-of-the-art, but also meaningful in light of the driving 

application. Such failings can lead to reporting of spurious and misguided conclusions. For 

example, new ideas may appear promising in the context of a weak experimental testbed, but in 

reality do not advance the state of the art. The resulting inefficiencies can be detrimental to the 

entire research community. We draw on several models (chiefly from the metaheuristics 

community) [5] for experimental research and reporting in the area of heuristics for hard 

problems, and suggest that such practices can be adopted within the VLSI CAD community. Our 

focus is on hypergraph partitioning. 

References 

[1] C. J. Alpert, “Partitioning Benchmarks for the VLSI CAD Community”,

http://vlsicad.cs.ucla.edu/~cheese/benchmarks.html 

[2] C. J. Alpert, “The ISPD-98 Circuit Benchmark Suite”, Proc. ACM/IEEE International Symposium on Physical 

Design, April 98, pp. 80-85. See errata at http://vlsicad.cs.ucla.edu/~cheese/errata.html 

[3] C. J. Alpert, J.-H. Huang and A. B. Kahng,“Multilevel Circuit Partitioning”, ACM/IEEE Design Automation 

Conference, pp. 530-533. 

[4] C. J. Alpert and A. B. Kahng, “Recent Directions in Netlist Partitioning: A Survey”, Integration, 19(1995) 1-81. 

[5] R. S. Barr, B. L. Golden, J. P. Kelly, M. G. C. Resende and W. R. Stewart, “Designing and Reporting on

Computational Experiments with Heuristic Methods”, technical report (extended version of J. Heuristics paper), 

June 27, 1995. 

[6] F. Brglez, “ACM/SIGDA Design Automation Benchmarks: Catalyst or Anathema?”, IEEE Design and Test, 

10(3) (1993), pp. 87-91. 

[7] F. Brglez, “Design of Experiments to Evaluate CAD Algorithms: Which Improvements Are Due to Improved 

Heuristic and Which are Merely Due to Chance?”, technical report CBL-04-Brglez, NCSU Collaborative 

Benchmarking Laboratory, April 1998. 

[8] T. Bui, S. Chaudhuri, T. Leighton and M. Sipser, “Graph Bisection Algorithms with Good Average Behavior”, 

Combinatorica 7(2), 1987, pp. 171-191. 

[9] A. E. Caldwell, A. B. Kahng and I. L. Markov, “Hypergraph Partitioning With Fixed Vertices”, in Proc. 

ACM/IEEE Design Automation Conf., June 1999. 

[10] A. E. Caldwell, A. B. Kahng and I. L. Markov, “Design and Implementation of the Fiduccia-Mattheyses 

Heuristic for VLSI Netlist Partitioning”, Proc. Workshop on Algorithm Engineering and Experimentation 

(ALENEX), Baltimore, Jan. 1999. 

[11] P. K. Chan, M. D. F. Schlag and J. Y. Zien, “Spectral K-Way Ratio-Cut Partitioning and Clustering”, IEEE

Transactions on Computer-Aided Design, vol. 13 (8), pp. 1088-1096. 

[12] J. Cong, H. P. Li, S. K. Lim, T. Shibuya and D. Xu, “Large Scale Circuit Partitioning with Loose/Stable Net 

Removal and Signal Flow Based Clustering”, Proc. IEEE International Conference on Computer-Aided Design, 

1997, pp. 441-446. 

[13] W. Deng, personal communication, July 1998. 

[14] A. E. Dunlop and B.W. Kernighan, “A Procedure for Placement of Standard Cell VLSI Circuits”, IEEE 

Transactions on Computer-Aided Design 4(1) (1985), pp. 92-98

[15] S. Dutt and W. Deng, “VLSI Circuit Partitioning by Cluster-Removal Using Iterative Improvement 

Techniques”, Proc. IEEE International Conference on Computer-Aided Design, 1996, pp. 194-200 

[16] S. Dutt and H. Theny, “Partitioning Using Second-Order Information and Stochastic Gain Function”, Proc. 

IEEE/ACM International Symposium on Physical Design, 1998, pp. 112-117

[17] C. M. Fiduccia and R. M. Mattheyses, “A Linear Time Heuristic for Improving Network Partitions”, Proc. 

ACM/IEEE Design Automation Conference, 1982, pp. 175-181. 

[18] M. R. Garey and D. S. Johnson, “Computers and Intractability, a Guide to the Theory of NP-completeness”, W. 

H. Freeman and Company: New York, 1979, pp. 223 

[19] I. P. Gent, S. A. Grant, E. MacIntyre, P. Prosser, P. Shaw, B. M. Smith and T. Walsh, “How Not To Do It”, 

research report 97-27, Univ. of Leeds School of Computer Studies, May 1997. 

[20] S. Hauck and G. Borriello, “An Evaluation of Bipartitioning Techniques”, IEEE Transactions on Computer- 

Aided Design 16(8) (1997), pp. 849-866. 

[21] L.W. Hagen, D. J. Huang and A. B. Kahng, “On Implementation Choices for Iterative Improvement 

Partitioning Methods”, Proc. European Design Automation Conference, 1995, pp. 144-149. 

[22] A. B. Kahng, “Futures for Partitioning in Physical design”, Proc. IEEE/ACM International Symposium on 

Physical Design, April 1998, pp. 190-193. 

[23] G. Karypis and V. Kumar, “Analysis of Multilevel Graph Partitioning”, draft, 1995 

[24] G. Karypis and V. Kumar, “Multilevel k-way Partitioning Scheme For Irregular Graphs”, Technical Report 95- 

064, University of Minnesota, Computer Science Department. 

[25] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar, “Multilevel Hypergraph Partitioning: Applications in 

VLSI Design”, Proc. ACM/IEEE Design Automation Conference, 1997, pp. 526-529. Additional publications and 

benchmark results for hMetis-1.5 are available at http://www-users.cs.umn.edu/~karypis/metis/hmetis/main.html 


VLSI Domain”, technical report, University of Minnesota Computer Science Department, March 27, 1998. 

[27] G. Karypis and V. Kumar, “Multilevel Algorithms for Multi-Constraint Graph Partitioning”, Technical Report 

98-019, University of Minnesota, Department of Computer Science.

[28] G. Karypis and V. Kumar, “hMetis: A Hypergraph Partitioning Package Version 1.5”, user manual, June 23, 

1998. 

[29] B. W. Kernighan and S. Lin, “An Efficient Heuristic Procedure for Partitioning Graphs”, Bell System Tech. 

Journal 49 (1970), pp. 291-307. 

[30] B. Krishnamurthy, “An Improved Min-cut Algorithm for Partitioning VLSI Networks”, IEEE Transactions on 

Computers, vol. C-33, May 1984, pp. 438-446. 

[31] L. T. Liu, M. T. Kuo, S. C. Huang and C. K. Cheng, “A Gradient Method on the Initial Partition of Fiduccia- 

Mattheyses Algorithm”, Proc. IEEE International Conference on Computer-Aided Design, 1995, pp. 229-234. 

[32] L. Sanchis, “Multiple-way network partitioning with different cost functions”, IEEE Transactions on 

Computers, Dec. 1993, vol.42, (no.12):1500-4. 

[33] G. R. Schreiber and O. C. Martin, “Procedure for Ranking Heuristics Applied to Graph Partitioning”, Proc. 2nd 

International Conference on Metaheuristics, July 1997, pp. 1-19. 

[34] G. R. Schreiber and O. C. Martin, “Cut Size Statistics of Graph Bisection Heuristics”, manuscript in submission

to SIAM J. Optimization, 1997. 

[35] P. R. Suaris and G. Kedem, “Quadrisection: A New Approach to Standard Cell Layout”, Proc. IEEE/ACM 

International Conference on Computer-Aided Design, 1987, pp. 474-477. 

[36] W. Sun and C. Sechen, “Efficient and Effective Placements for Very Large Circuits”, Proc. IEEE/ACM 


[37] Y. C. Wei and C. K. Cheng, “Towards Efficient Design by Ratio-cut Partitioning”, Proc. IEEE International 

Conference on Computer-Aided Design, 1989, pp. 298-301.

DAC'99, pages 355-359 

Hypergraph Partitioning With Fixed Vertices 

Andrew E. Caldwell, Andrew B. Kahng and Igor L. Markov 

UCLA Computer Science Department, Los Angeles, CA 90095-1596 

Abstract 

We empirically assess the implications of fixed terminals for hypergraph partitioning heuristics. 

Our experimental testbed incorporates a leading-edge multilevel hypergraph partitioner [14] [3] 

and IBM-internal circuits that have recently been released as part of the ISPD-98 Benchmark 

Suite [2, 1]. We find that the presence of fixed terminals can make a partitioning instance 

considerably easier (possibly to the point of being "trivial"): much less effort is needed to stably 

reach solution qualities that are near best-achievable. Toward development of partitioning 

heuristics specific to the fixed-terminals regime, we study the pass statistics of flat FM-based 

partitioning heuristics. Our data suggest that with more fixed terminals, the improvements in a 

pass are more likely to occur near the beginning of the pass. Restricting the length of passes,

which degrades solution quality in the classic (free-hypergraph) context, is relatively safe for the

fixed-terminals regime and considerably reduces run time of our FM-based heuristic 

implementations. We believe that the distinct nature of partitioning in the fixed-terminals regime 

has deep implications (i) for the design and use of partitioners in top-down placement, (ii) for the 

context in which VLSI hypergraph partitioning research is pursued, and (iii) for the development 

of new benchmark instances for the research community. 
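
The observation about pass statistics can be made concrete with a short sketch. It assumes the usual structure of an FM pass (not described in the abstract): the pass produces a sequence of tentative moves, each with a gain, and is rolled back to the prefix with the largest cumulative gain. The function `best_prefix`, its `max_moves` parameter, and the gain values are hypothetical illustrations, not data or code from the paper.

```python
from itertools import accumulate
from typing import Optional, Sequence, Tuple


def best_prefix(
    move_gains: Sequence[int],
    max_moves: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (prefix_length, cumulative_gain) of the best prefix of a pass.

    move_gains[i] is the gain of the i-th tentative move in the pass.
    max_moves, if given, truncates the pass after that many moves
    (a hypothetical pass-length limit). A result of (0, 0) means no
    prefix improves the cut, so the whole pass is undone.
    """
    gains = move_gains if max_moves is None else move_gains[:max_moves]
    best_len, best_gain = 0, 0
    for length, total in enumerate(accumulate(gains), start=1):
        if total > best_gain:  # keep the earliest best prefix
            best_len, best_gain = length, total
    return best_len, best_gain


# Made-up gains for one pass: the best prefix ends after three moves.
gains = [3, -1, 2, -2, -1, 1, -3]
print(best_prefix(gains))               # full pass
print(best_prefix(gains, max_moves=4))  # truncated pass, same best prefix
print(best_prefix(gains, max_moves=2))  # cut too short, loses some gain
```

When the best prefix tends to end early, as the abstract reports for instances with many fixed terminals, a modest `max_moves` limit returns the same prefix while skipping the remaining moves.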

References 

[1] C. J. Alpert, “Partitioning Benchmarks for VLSI CAD Community”,

http://vlsicad.cs.ucla.edu/~cheese/benchmarks.html



[3] C. J. Alpert, J.-H. Huang and A. B. Kahng,“Multilevel Circuit Partitioning”, ACM/IEEE Design Automation 

Conference, pp. 530-533. 

[4] C. J. Alpert and A. B. Kahng, “Recent Directions in Netlist Partitioning: A Survey”, Integration, 19(1995) 1-81. 

[5] J. A. Davis, V. K. De and J. D. Meindl, “A Stochastic Wire-Length Distribution for Gigascale Integration (GSI) - 

Part I: Derivation and Validation”, IEEE Transactions on Electron Devices, vol. 45(3), pp. 580-589. 

[6] A. E. Dunlop and B. W. Kernighan, “A Procedure for Placement of Standard Cell VLSI Circuits”, IEEE 

Transactions on Computer-Aided Design 4(1) (1985), pp. 92-98 

[7] S. Dutt and W. Deng, “VLSI Circuit Partitioning by Cluster-Removal Using Iterative Improvement Techniques”,

Proc. IEEE International Conference on Computer-Aided Design, 1996, pp. 194-200 



[9] M. R. Garey and D. S. Johnson, “Computers and Intractability, a Guide to the Theory of NP-completeness”, W. 

H. Freeman and Company: New York, 1979, pp. 223 

[10] M. K. Goldberg and M. Burstein, “Heuristic Improvement Technique for Bisection of VLSI Networks”, IEEE 

Transactions on Computer-Aided Design, 1983, pp. 122-125. 

[11] S. Hauck and G. Borriello, “An Evaluation of Bipartitioning Techniques”, IEEE Transactions on Computer- 

Aided Design 16(8) (1997), pp. 849-866. 

[12] D. J. Huang and A. B. Kahng, “Partitioning-Based Standard Cell Global Placement with an Exact Objective”, 

Proc. ACM/IEEE International Symposium on Physical Design, 1997, pp. 18-25. 


Journal 49 (1970), pp. 291-307. 


VLSI Design”, Proc. ACM/IEEE Design Automation Conference, 1997, pp. 526-529.

[15] B. Landman and R. Russo, “On a Pin Versus Block Relationship for Partitioning of Logic Graphs”, IEEE 

Transactions on Computers C-20(12) (1971), pp. 1469-1479. 

[16] P. R. Suaris and G. Kedem, “Quadrisection: A New Approach to Standard Cell Layout”, Proc. IEEE/ACM 


[17] D. Sylvester and K. Keutzer, “Getting to the Bottom of Deep-Submicron”, to appear in Proc. IEEE Intl. 

Conference on Computer-Aided Design, November 1998.

DAC'99, pages 360-366 

Relaxation and Clustering in a Local Search Framework: Application to Linear Placement 

Sung-Woo Hur and John Lillis 

Dept. of Electrical Eng. and Comp. Sci., University of Illinois at Chicago 

Abstract 

This paper presents two primary results relevant to physical design problems in CAD/VLSI 

through a case study of the linear placement problem. First, a local search mechanism which

incorporates a neighborhood operator based on constraint relaxation is proposed. The strategy 

exhibits many of the desirable features of analytical placement while retaining the flexibility and 

non-determinism of local search. The second and orthogonal contribution is in netlist clustering. 

We characterize local optima in the linear placement problem through a simple visualization tool,

the displacement graph. This characterization reveals the relationship between clusters and

local optima and motivates a dynamic clustering scheme designed specifically for escaping such 

local optima. Promising experimental results are reported. 

References 

[1] C. K. Cheng and E. S. Kuh, “Module Placement Based on Resistive Network Optimization," IEEE Transactions 

on CAD, pp. 218-225, 1984. 

[2] G. Sigl, K. Doll, and F. Johannes, “Analytical Placement: A Linear or a Quadratic Objective Function?," in 28th

ACM/IEEE DAC, pp. 427-432, 1991. 

[3] M. B. Jackson and E. S. Kuh, “Performance-Driven Placement of Cell Based IC's," in 25th DAC, pp. 370-375, 

1988. 

[4] J. Frankle and R. M. Karp, “Circuit Placements and Cost Bounds by Eigenvector Decomposition," in ICCAD, 

pp. 414-417, 1986. 

[5] M. A. Breuer, “A Class of Min-cut Placement Algorithms for the Placement of Standard Cells," in DAC, pp. 

284-290, 1977. 

[6] P. R. Suaris and G. Kedem, “Quadrisection: A New Approach to Standard Cell Layout," in ICCAD, pp. 474-477, 

1987. 

[7] C. Sechen and A. Sangiovanni-Vincentelli, “TimberWolf3.2: A New Standard Cell Placement and Global 

Routing Package," in 23rd DAC, pp. 432-439, 1986. 

[8] D. Mitra, F. Romeo, and A. Sangiovanni-Vincentelli, “Convergence and Finite-Time Behavior of Simulated 

Annealing," Advances in Applied Probability, pp. 747-771, 1986. 

[9] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar, “Multilevel Hypergraph Partitioning: Application in VLSI 

Domain," in DAC, pp. 526-529, 1997. 

[10] C. Alpert, J.-H. Huang, and A. B. Kahng, “Multilevel Circuit Partitioning," in DAC, pp. 530-533, 1997. 

[11] Y. G. Saab, “An Improved Linear Placement Algorithm Using Node Compaction," IEEE Trans. on CAD of 

Integrated Circuits and Systems, vol. 15, no. 8, pp. 952-958, 1996.

[12] J. Li, J. Lillis, L.-T. Liu, and C. K. Cheng, “New Spectral Linear Placement and Clustering Approach," in 33rd 

DAC, pp. 88-93, 1996. 

[13] S. Sato, “Simulated Quenching: New Placement Method for Module Generation," in ICCAD, pp. 538-541, 

IEEE Computer Society Press, Nov. 1997. 

[14] “ftp://ftp.es.ele.tue.nl/pub/lp_solve."

[15] S.-W. Hur and J. Lillis, “Relaxation and Clustering in a Local Search Framework: Application to Linear 

Placement," in Technical Report UIC-EECS-99-2, 1999. 

[16] H. Yang and D. F. Wong, “Efficient Network Flow Based Min-Cut Balanced Partitioning," in ICCAD, pp. 50- 

55, IEEE Computer Society Press, Nov. 1994. 

[17] H. Liu and D. F. Wong, “Network Flow Based Multi-Way Partitioning with Area and Pin Constraints," in 

ISPD, pp. 12-17, ACM/IEEE, Apr. 1997. 

[18] C. J. Alpert and A. B. Kahng, “A General Framework for Vertex Orderings, with Applications to Netlist 

Clustering," in ICCAD, pp. 63-69, IEEE Computer Society Press, Nov. 1994.

DAC'99, pages 367-372 

An α-approximate algorithm for delay-constraint technology mapping

Sumit Roy, Krishna Belkhale, Prithviraj Banerjee* 

Cadence Design Systems, Santa Clara, CA 95054, USA 

*Electrical and Comp. Engineering, Northwestern University, Evanston, IL-60208, USA 

Abstract 

Variants of delay-cost functions have been used in a class of technology mapping algorithms [1, 

2, 3, 4]. We illustrate that in an industrial environment the delay-cost function can grow 

unboundedly and lead to very large run-times. The key contribution of this work is a novel 

bounded compression algorithm. We introduce the concept of a delay-cost curve (α-DC-curve)

that requires up to exponentially fewer delay-cost points to be stored than are stored by

the delay function. We prove that the solution obtained by this exponential compaction of the 

delay-function is within α% of the optimal solution. We also suggest a large set of CAD

applications which may benefit from using the α-DC-curve. Finally, we demonstrate the

effectiveness of our compaction scheme on one such application, namely technology mapping 

for low power. Experimental results in an industrial environment show that we are more than 17

times faster than [2] on certain MCNC circuits.
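
The idea of storing fewer delay-cost points while staying within a bounded factor of the optimum can be sketched as follows. This is only one plausible reading of such a compaction, not the paper's algorithm, which the abstract does not describe: points are scanned in order of increasing delay, and a point is kept only if its cost is lower than the last kept cost by more than a factor of (1 + alpha). The names `compress_curve` and `best_cost`, the parameter `alpha` (a fraction, so 0.1 means 10%), and the sample points are all hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (delay, cost)


def compress_curve(points: List[Point], alpha: float) -> List[Point]:
    """Prune a delay-cost curve so each dropped point is covered by a kept
    point with no larger delay and cost at most (1 + alpha) times higher.
    Illustrative pruning rule only, not the paper's algorithm."""
    kept: List[Point] = []
    for delay, cost in sorted(points):
        # Always keep the fastest point; afterwards keep a point only if
        # it is cheaper than the last kept point by more than (1 + alpha).
        if not kept or cost * (1.0 + alpha) < kept[-1][1]:
            kept.append((delay, cost))
    return kept


def best_cost(curve: List[Point], required_delay: float) -> Optional[float]:
    """Cheapest cost among points meeting the delay bound, assuming the
    curve is sorted by delay with non-increasing cost."""
    delays = [d for d, _ in curve]
    i = bisect_right(delays, required_delay)
    return None if i == 0 else curve[i - 1][1]


# Made-up non-dominated curve: cost falls as the allowed delay grows.
points = [(1, 100), (2, 98), (3, 95), (4, 80), (5, 79), (6, 60)]
compressed = compress_curve(points, alpha=0.1)
print(compressed)
print(best_cost(points, 3), best_cost(compressed, 3))
```

Under this rule successive kept costs shrink geometrically, so the number of stored points grows with the logarithm of the cost range rather than with the number of original points.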

References 

[1] K. Chaudhary, M. Pedram, and A. M. Despain, “A near-optimal algorithm for technology mapping minimizing 

area under delay constraints," in Proceedings of the Design Automation Conference, pp. 492-498, June 1992. 

[2] C.-Y. Tsui, M. Pedram, and A. M. Despain, “Technology decomposition and mapping targeting low power 

dissipation," in Proceedings of the Design Automation Conference, pp. 68-73, June 1993. 

[3] J. Lou, A. H. Salek, and M. Pedram, “An exact solution to simultaneous technology mapping and linear 

placement problem," in Proceedings of the International Conference on Computer-Aided Design, pp. 671-675, Nov. 

1997. 

[4] A. H. Salek, J. Lou, and M. Pedram, “A dsm design flow: Putting floorplanning, technology mapping, and gate placement

together," in Proceedings of the Design Automation Conference, pp. 128-133, June 1998.

[5] K. Keutzer, “DAGON: Technology binding and local optimization by DAG matching," in Proceedings of the 

Design Automation Conference, pp. 341-347, June 1987. 

[6] H. J. Taouti, C. W. Moon, R. K. Brayton, and A. Wang, “Performance-oriented technology mapping," in 

Proceedings of the 6th MIT Conference, Advanced Research in VLSI, pp. 79-97, 1990. 

[7] V. Tiwari, S. Malik, and P. Ashar, “Technology mapping for low power," in Proceedings of the Design 

Automation Conference, pp. 74-79, June 1993. 

[8] BuildGates. Ambit Design Group, Cadence Design Systems, Santa Clara, CA, 1999. 

[9] S. Roy, Low-Power-Driven Synthesis Algorithms for Sequential and Combinational Circuits. PhD thesis, 

University of Illinois, Urbana-Champaign, IL, 1998. 

[10] F. Najm, “Transition density: a new measure of activity in digital circuits," IEEE Transactions on Computer

Aided Design, vol. 12, pp. 310-323, Feb. 1993.

DAC'99, pages 373-378 

Technology Mapping for FPGAs with Nonuniform Pin Delays and Fast Interconnections 

Jason Cong, Yean-Yow Hwang, Songjie Xu 

Department of Computer Science, University of California, Los Angeles, CA 90095 

Abstract 

In this paper we study the technology mapping problem for FPGAs with nonuniform pin delays 

and fast interconnects. We develop the PinMap algorithm to compute the delay optimal mapping 

solution for FPGAs with nonuniform pin delays in polynomial time based on efficient cut

enumeration. Compared with FlowMap [5], which does not consider nonuniform pin delays,

PinMap is able to reduce the circuit delay by 15% without any area penalty. For mapping with 

fast interconnects, we present two algorithms, an iterative refinement based algorithm, named 

ChainMap, and a Boolean matching based algorithm, named HeteroBM, which combines the 

Boolean matching techniques proposed in [2] and [3] and the heterogeneous technology mapping 

mechanism presented in [1]. It is shown that both ChainMap and HeteroBM are able to 

significantly reduce the circuit delay by making efficient use of the FPGA fast interconnect

resources.

References 

[1] J. Cong and S. Xu, "Delay-Optimal Technology Mapping for FPGAs with Heterogeneous LUTs", Proc. 35th 

ACM/IEEE Design Automation Conf., San Francisco, CA, June, 1998, pp. 704-707. 

[2] J. Cong and Y.-Y. Hwang, "Partially-Dependent Functional Decomposition with Applications in FPGA 

Synthesis and Mapping", Proc. ACM 5th Int'l Symposium on FPGA, Feb. 1997, pp. 35-42. 

[3] J. Cong and Y.-Y. Hwang, "Boolean Matching for Complex PLBs in LUT-based FPGAs with Application to 

Architecture Evaluation", Proc. ACM 6th Int'l Symposium on FPGA, Feb. 1998, pp. 27-34. 

[4] K. Chung and J. Rose, "TEMPT: Technology Mapping for the Exploration of FPGA Architectures with Hard- 

Wired Connections", 29th ACM/IEEE Design Automation Conference, 1992, pp. 361-367. 

[5] J. Cong and Y. Ding, "FlowMap: An Optimal Technology Mapping Algorithm for Delay Optimization in 

Lookup-Table Based FPGA Designs", IEEE Transactions on Computer-Aided Design, Feb. 1994, Vol. 13, No. 1, 

pp. 1-12. 

[6] J. Cong and Y. Ding, "Tutorial and Survey Paper - Combinational Logic Synthesis for LUT Based Field

Programmable Gate Arrays", ACM Transactions on Design Automation of Electronic Systems, Vol. 1, No. 2, April 

1996, pp. 145-204. 

[7] J. Cong, Y. Ding, and C. Wu, "Cut Ranking and Pruning: Enabling A General And Efficient FPGA Mapping 

Solution" Proc. ACM 4th International Symposium on FPGA, Feb. 1999, pp. 29-35. 

[8] J. Cong, Y.-Y. Hwang, and S. Xu, "Technology Mapping for FPGAs with Nonuniform Pin Delays and Fast 

Interconnection", UCLA Computer Science Department Technical Report CSD-990018. 

[9] J. R. Hauser and J. Wawrzynek, "Garp: A MIPS Processor with a Reconfigurable Coprocessor", Proc. of IEEE

Symposium on Field-Programmable Custom Computing Machines, 1997, pp. 24-33,

http://www.cs.berkeley.edu/projects/brass/documents/GarpArchitecture.html.

[10] Advanced Micro Devices, "VANTIS VF1 FPGA Data Sheet", Advanced Micro Devices, Inc., Sunnyvale, CA, 

1998.

DAC'99, pages 379-384 

Automated Phase Assignment for the Synthesis of Low Power Domino Circuits 

Priyadarshan Patra 

Strategic CAD Labs, Intel Corporation, Hillsboro, OR 97124-5961 

Unni Narayanan 

Design Technology, Intel Corporation, Santa Clara, CA 95052-8119 

Abstract 

High performance circuit techniques such as domino logic have migrated from the 

microprocessor world into more mainstream ASIC designs. The problem is that domino logic 

comes at a heavy cost in terms of total power dissipation. For mobile and portable devices such 

as laptops and cellular phones, a high power dissipation is an unacceptable price to pay for high 

performance. Hence, we study synthesis techniques that allow designers to take advantage of the 

speed of domino circuits while at the same time minimizing total power consumption.

Specifically, in this paper we present three results related to automated phase assignment for the 

synthesis of low power domino circuits: (1) we demonstrate that the choice of phase assignment

at the primary outputs of a circuit can significantly impact power dissipation in the domino block;

(2) we propose a method for efficiently estimating power dissipation in a domino circuit; and (3)

we apply the method to determine a phase assignment that minimizes power consumption in the

final circuit implementation. Preliminary experimental results on a mixture of public domain 

benchmarks and real industry circuits show potential power savings as high as 34% over the 

minimum area realization of the logic. Furthermore, the low power synthesized circuits still meet 

timing constraints. 
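
To illustrate result (1), the following is a minimal sketch and not the estimation method of the paper. It assumes independent primary inputs, a zero-delay model, and that a domino gate dissipates switching power in a cycle exactly when its output evaluates to 1 (it discharges and must then be precharged again), so the expected switching is the sum of the gates' signal probabilities. The two-level network, the input probabilities, and the names POSITIVE, NEGATIVE and expected_switching are hypothetical, and the cost of the input inverters required by the negative phase is ignored.

```python
from itertools import product

# Hypothetical network: f = (a AND b) OR (c AND d).
# Each gate is (name, op, fanin names); all ops are non-inverting, as domino requires.
POSITIVE = [("g1", "and", ("a", "b")),
            ("g2", "and", ("c", "d")),
            ("f", "or", ("g1", "g2"))]

# Negative phase: NOT f realized through De Morgan's laws, fed by the
# complemented inputs na = NOT a, nb = NOT b, and so on.
NEGATIVE = [("h1", "or", ("na", "nb")),
            ("h2", "or", ("nc", "nd")),
            ("nf", "and", ("h1", "h2"))]


def expected_switching(gates, input_prob):
    """Sum over all gates of P(gate output = 1), by exhaustive enumeration."""
    names = sorted(input_prob)
    total = 0.0
    for bits in product([0, 1], repeat=len(names)):
        values = dict(zip(names, bits))
        # Probability of this input vector under the independence assumption.
        weight = 1.0
        for n in names:
            weight *= input_prob[n] if values[n] else 1.0 - input_prob[n]
        # Gates are listed in topological order, so fanins are already known.
        for name, op, fanin in gates:
            ins = [values[x] for x in fanin]
            values[name] = int(all(ins)) if op == "and" else int(any(ins))
            total += weight * values[name]
    return total


# Assumed input statistics: every input is 1 with probability 0.9.
p = {"a": 0.9, "b": 0.9, "c": 0.9, "d": 0.9}
pos = expected_switching(POSITIVE, p)
neg = expected_switching(NEGATIVE, {"n" + k: 1.0 - v for k, v in p.items()})
```

Under these assumed statistics the negative-phase realization switches far less often (about 0.42 expected gate discharges per cycle against about 2.58), although both realizations use three gates. With inputs that are mostly 0 the comparison reverses, which is why the choice has to be made per output from the signal statistics.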

References 

[1] R. Bryant. Graph-based algorithms for boolean manipulation. IEEE Transactions on Computers, C-35(8):677– 

691, 1986. 

[2] S. T. Chakradhar, A. Balakrishnan, and V. D. Agrawal. An exact algorithm for selecting partial scan flip-flops. 

In Design Automation Conference, pages 81–86, 1994. 

[3] S. Chakravarty. On the complexity of using bdds for the synthesis and analysis of boolean circuits. In Allerton 

Conference on Communication, Control and Computing, pages 730–739, 1989. 

[4] A. Chandrakasan and R. Brodersen. Low Power Digital CMOS Design. Kluwer Academic Publishers, 1995. 

[5] H. Y. Chen and S. M. Kang. Performance optimization for domino cmos circuit modules. In ICCD, pages 522– 

525, 1997. 

[6] J. C. Costa, J. Monteiro, and S. Devadas. Switching activity estimation using limited depth reconvergent path 

analysis. In International Symposium on low power electronics and design, pages 184–189, 1997. 

[7] S. M. Kang. Data shifting and rotating apparatus. US Patent 4,396,994, August 1983. 

[8] U. K. Narayanan, H. Leong, K. Chung, and C. L. Liu. Low power multiplexer decomposition. In International 

Symposium on low power electronics and design, pages 269–274, 1997. 

[9] U. K. Narayanan and C. L. Liu. Low power logic synthesis for xor based circuits. In International Conference on 

Computer-Aided Design, 1997. 

[10] U. K. Narayanan, P. Pan, and C. L. Liu. Low power logic synthesis under a general delay model. In 

International Symposium on low power electronics and design, 1998. 

[11] R. Panda and F. Najm. Technology decomposition for low-power synthesis. In IEEE Custom Integrated 

Circuits Conference, pages 627–630, 1995. 

[12] P. Patra. Approaches to Design of Circuits for Low-Power Computation. PhD thesis, The University of Texas at 

Austin, 1995. 

[13] P. Patra and D. Fussell. Power-efficient delay-insensitive codes for data transmission. In Proc. of 28th Hawaii 

International Conference on System Sciences, Jan 1995.

[14] M. Pedram. Power minimization in IC design: Principles and applications. ACM Transactions on Design 

Automation of Electronic Systems, 1(1):3–56, 1996. 

[15] R. Puri, A. Bjorksten, and T. Rosser. Logic optimization by output phase assignment in dynamic logic 

synthesis. In International Conference on Computer Aided Design, pages 2–8, 1996. 

[16] N. Weste and K. Eshraghian. Principles of CMOS VLSI Design: A Systems Perspective. Addison-Wesley, 1993.

DAC'99, pages 385-390 

Enhancing Simulation with BDDs and ATPG 

Malay K. Ganai, Adnan Aziz 

Electrical and Computer Engineering, The University of Texas at Austin 

Andreas Kuehlmann 

IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA 

Abstract 

We introduce SImulation Verification with Augmentation (SIVA), a tool for checking safety 

properties on digital hardware designs. SIVA integrates simulation with symbolic techniques for 

vector generation. Specifically, the core algorithm uses a combination of ATPG and BDDs to 

generate input vectors which cover behavior not excited by simulation. Experimental results 

demonstrate considerable improvement in state space coverage compared with either simulation 

or formal verification in isolation. 

Keywords: Formal verification, ATPG, simulation, BDDs, coverage. 

References 

[1] Felice Balarin and A. L. Sangiovanni-Vincentelli. An Iterative Approach to Language Containment. In Proc. of 

the Computer Aided Verification Conf., June 1993. 

[2] R. Kurshan. Formal Verification in a Commercial Setting. In Proc. of the Design Automation Conf., June 1997. 

[3] J. Yuan, J. Shen, J. Abraham, and A. Aziz. On Combining Formal and Informal Verification. In Proc. of the 

Computer Aided Verification Conf., July 1997. 

[4] R. K. Brayton et al. VIS: A System for Verification and Synthesis. In Proc. of the Computer Aided Verification 

Conf., July 1996. 

[5] B. Chen, M. Yamazaki, and M. Fujita. Bug Identification of a Real Chip Design by Symbolic Model Checking. 

In Proc. European Conf. on Design Automation, March 1994. 

[6] H. Cho, G. Hachtel, E. Macii, M. Poncino, and F. Somenzi. A State Space Decomposition Algorithm for 

Approximate FSM Traversal Based on Circuit Structural Analysis. Technical report, ECE/VLSI, Univ. of Colorado 

at Boulder, 1993. 

[7] S. Devadas, A. Ghosh, and K. Keutzer. An Observability-Based Code Coverage Metric for Functional 

Simulation. In Proc. Intl. Conf. on Computer-Aided Design, November 1996. 

[8] David L. Dill. Embedded Tutorial: What's between Simulation and Formal Verification? In Proc. of the Design 

Automation Conf., San Francisco, CA, June 1998. 

[9] A. El-Maleh, T. Marchok, J. Rajski, and W. Maly. Behavior and Testability Preservation Under the Retiming 

Transformation. IEEE Transactions on Computer-Aided Design of Integrated Circuits, May 1997. 

[10] D. Geist, M. Farkas, A. Landver, Y. Lichtenstein, S. Ur, and Y. Wolfsthal. Coverage Directed Test Generation 

Using Formal Verification. In Proc. of the Formal Methods in CAD Conf., November 1996. 

[11] Daniel Geist and Ilan Beer. Efficient Model Checking by Automated Ordering of Transition Relation Partitions. 

In Computer Aided Verification, volume 818 of Lecture Notes in Computer Science, pages 52-71. Springer-Verlag, 

1994. 

[12] P. Goel. An Implicit Enumeration Algorithm to Generate Tests for Combinational Logic Circuits. IEEE 

Transactions on Computers, 1981. 

[13] R. Ho and M. Horowitz. Validation Coverage Analysis for Complex Digital Designs. In Proc. Intl. Conf. on 

Computer-Aided Design, November 1996. 

[14] Richard C. Ho, C. Han Yang, Mark A. Horowitz, and David L. Dill. Architectural Validation for Processors. In 

Proceedings of the International Symposium on Computer Architecture, June 1995. 

[15] Y. Hoskote, D. Moundanos, and J. Abraham. Automatic Extraction of the Control Flow Machine and 

Application to Evaluating Coverage of Verification Vectors. In Proc. Intl. Conf. on Computer Design, Austin, TX, 

October 1995. 

[16] Andreas Kuehlmann and Florian Krohm. Equivalence Checking Using Cuts and Heaps. In Proc. of the Design 

Automation Conf., June 1997.

[17] W. Lee, A. Pardo, G. D. Hachtel, J. Jang, and F. Somenzi. Tearing Based Automatic Abstraction for 

CTL Model Checking. In Proc. Intl. Conf. on Computer-Aided Design, 1996. 

[18] Kenneth L. McMillan. Symbolic Model Checking. Kluwer Academic Publishers, 1993. 

[19] K. L. McMillan. Verification of an Implementation of Tomasulo's Algorithm by Compositional Model 

Checking. In Proc. of the Computer Aided Verification Conf., Vancouver, BC, Canada, June 1998. 

[20] D. Moundanos, J. Abraham, and Y. Hoskote. A Unified Framework for Design Validation and Manufacturing 

Test. In Proc. Intl. Test Conf., 1996. 

[21] R. Mukherjee, J. Jain, K. Takayama, M. Fujita, J. A. Abraham, and D. S. Fussell. Efficient Combinational 

Verification Using Cuts and Overlapping BDDs. In Proc. Intl. Workshop on Logic Synthesis, May 1997. 

[22] K. Ravi and F. Somenzi. High Density Reachability Analysis. In Proc. Intl. Conf. on Computer-Aided Design, 

Santa Clara, CA, November 1995. 

[23] R. Rudell. Dynamic Variable Ordering for Binary Decision Diagrams. In Proc. Intl. Conf. on Computer-Aided 

Design, November 1993. 

[24] P. Stephan, R. K. Brayton, and A. L. Sangiovanni-Vincentelli. Combinational Test Generation using 

Satisfiability. IEEE Transactions on Computer-Aided Design of Integrated Circuits, September 1996. 

[25] U. Stern and D. L. Dill. Using Magnetic Disk instead of Main Memory in the Murphi Verifier. In Proc. of the 

Computer Aided Verification Conf., June 1998. 

[26] D. Xiang, S. Venkataraman, W. K. Fuchs, and J. H. Patel. Partial Scan Design Based on Circuit State 

Information. In Proc. of the Design Automation Conf., Las Vegas, NV, June 1996. 

[27] C. H. Yang and D. L. Dill. Validation with Guided Search of the State Space. In Proc. of the Design 

Automation Conf., June 1998. 

[28] UC Berkeley. www.cad.eecs.berkeley.edu/~vis.

DAC'99, pages 391-396 

Cycle-based Symbolic Simulation of Gate-level Synchronous Circuits 

Valeria Bertacco† Maurizio Damiani‡ Stefano Quer‡ 

†Vera Group, Synopsys, Inc., Palo Alto, CA 94303 

‡Advanced Technology Group, Synopsys, Inc., Mountain View, CA 94043 

ABSTRACT 

Symbolic methods are often considered the state-of-the-art technique for validating digital 

circuits. Due to their complexity and unpredictable run-time behavior, however, their potential is 

currently limited to small-to-medium circuits. Logic simulation privileges capacity: it is nicely 

scalable and flexible, and it has predictable run-time behavior. For this reason, it is the common 

choice for validating large circuits. Simulation, however, typically visits only a small fraction of 

the state space: The discovery of bugs heavily relies on the expertise of the designer of the test 

stimuli. 

In this paper we consider a symbolic simulation approach to the validation problem. Our 

objective is to trade off between formal and numerical methods in order to simulate a circuit 

with a "very large number" of input combinations and sequences in parallel. We demonstrate 

larger capacity with respect to symbolic techniques and better efficiency with respect to cycle-based 

simulation. We show that it is possible to symbolically simulate very large trace sets in 

parallel (over 100 symbolic inputs) for the largest ISCAS benchmark circuits, using 96 Mbytes 

of memory. 
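
The idea of simulating many input sequences in one pass can be sketched as follows. This is an illustration under simplifying assumptions, not the representation used in the paper: every signal is stored as the truth table of a Boolean function of the symbolic inputs, packed into a Python integer, which is exponential in the number of symbols and therefore only workable for a handful of them. The 2-bit counter with enable and the names var and step are hypothetical.

```python
N = 3                  # number of symbolic inputs: one enable symbol per cycle
ROWS = 1 << N          # one truth-table row per concrete input sequence


def var(i):
    """Truth table (as an integer bitmask) of symbolic input i."""
    return sum(1 << row for row in range(ROWS) if (row >> i) & 1)


def step(q0, q1, en):
    """One clock cycle of a hypothetical 2-bit counter with enable.

    Bitwise operators act on all truth-table rows at once, so a single
    call advances the circuit for every input sequence in parallel.
    """
    n0 = q0 ^ en
    n1 = q1 ^ (q0 & en)
    return n0, n1


q0, q1 = 0, 0          # reset state: both state bits are the constant-0 function
for cycle in range(N):
    q0, q1 = step(q0, q1, var(cycle))

# Set of input sequences (as a bitmask over rows) that drive the counter to state 3.
reach3 = q0 & q1
```

After three cycles, bit r of reach3 tells whether the enable sequence encoded by row r reaches state 3; only the all-ones sequence does. One symbolic run thus replaces the eight concrete simulations a cycle-based simulator would need for the same coverage.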

References 

[1] O. Coudert, C. Berthet, and J. C. Madre. Verification of Sequential Machines Based on Symbolic Execution. In 

Lecture Notes in Computer Science 407, Springer Verlag, pages 365–373, Berlin, Germany, 1989. 

[2] H. Touati, H. Savoj, B. Lin, R. K. Brayton, and A. Sangiovanni-Vincentelli. Implicit state enumeration of finite 

state machines using BDD’s. In Proc. ICCAD, pages 130–133, November 1990. 

[3] J. Burch, E. Clarke, D. Long, K. McMillan, and D. Dill. Symbolic Model Checking for Sequential Circuit 

Verification. IEEE Transactions on CAD, 13(4):401–424, April 1994. 

[4] Z. Barzilai, J. L. Carter, B. K. Rosen, and J. D. Rutledge. Hss- a high-speed simulator. IEEE Trans. on 

CAD/ICAS, pages 601–617, July 1987. 

[5] C. Hansen. Hardware logic simulation by compilation. In Proc. DAC, pages 712–715, June 1987. 

[6] L.T. Wang, N. E. Hoover, E. H. Porter, and J. J. Zasio. Ssim: A software levelized compiled-code simulator. In 

Proc. DAC, June 1987. 

[7] C.J. DeVane. Efficient circuit partitioning to extend cycle simulation beyond synchronous circuits. In Proc. 

ICCAD, pages 154–161, November 1997. 

[8] P. Jain and G. Gopalakrishnan. Efficient symbolic simulation-based verification using the parametric form of 

boolean expressions. IEEE Trans. on CAD/ICAS, 13:1005–1015, August 1994. 

[9] R. E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Trans. on Computers, 

35(8):677–691, August 1986. 

[10] R. E. Bryant. Symbolic Boolean Manipulation with Ordered Binary–Decision Diagrams. ACM Computing 

Surveys, 24(3):293–318, September 1992. 

[11] H. Cho, G. Hachtel, S. Jeong, B. Plessier, E. Shwarz, and F. Somenzi. Atpg aspects of fsm verification. In Proc. 

ICCAD, pages 134–137, November 1990. 

[12] P. McGeer, K. McMillan, A. Saldanha, A. Sangiovanni-Vincentelli, and P. Scaglia. Fast discrete function 

evaluation using decision diagrams. In Proc. ICCAD, pages 402–407, November 1995. 

[13] P. Ashar and S. Malik. Fast Functional Simulation using Branching Programs. In Proc. ICCAD, pages 408– 

412, San Jose, California, November 1995. 

[14] Y. Luo, T. Wongsonegoro, and A. Aziz. Hybrid Techniques for Fast Functional Simulation. In Proc. 

IEEE/ACM DAC’98, pages 664–667, San Francisco, California, June 1998.

[15] A. Hu and D. Dill. Reducing bdd size by exploiting functional dependencies. In Proc. DAC, pages 266–271, 

June 1993. 

[16] C.A.J. van Eijk and J. A. G. Jess. Exploiting functional dependencies in fsm verification. In Proc. EDAC, pages 

9–14, February 1996. 

[17] F. Brglez, D. Bryan, and K. Kozminski. Combinational Profiles of Sequential Benchmark Circuits. In Proc. 

IEEE ISCAS’89, pages 1929–1934, May 1989.

DAC'99, pages 397-401 

Exploiting Positive Equality and Partial Non-Consistency 

in the Formal Verification of Pipelined Microprocessors 

Miroslav N. Velev*, Randal E. Bryant‡, * 

*Department of Electrical and Computer Engineering 

‡School of Computer Science 

Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A. 

Abstract 

We study the applicability of the logic of Positive Equality with Uninterpreted Functions (PEUF) 

[2][3] to the verification of pipelined microprocessors with very large Instruction Set 

Architectures (ISAs). Abstraction of memory arrays and functional units is employed, while the 

control logic of the processors is kept intact from the original gate-level designs. PEUF is an 

extension of the logic of Equality with Uninterpreted Functions, introduced by Burch and Dill 

[4], that allows us to use distinct constants for the data operands and instruction addresses needed 

in the symbolic expression for the correctness criterion. We present several techniques that make 

PEUF scale very efficiently for the verification of pipelined microprocessors with large ISAs. 

These techniques are based on allowing a limited form of non-consistency in the uninterpreted 

functions, representing initial memory state and ALU behaviors. Our tool required less than 30 

seconds of CPU time and 5 MB of memory to verify a 5-stage MIPS-like pipelined processor 

that implements 191 instructions of various classes. The verification was done by 

correspondence checking - a formal method, where a pipelined microprocessor is compared 

against a non-pipelined specification. 

References 

[1] R.E. Bryant, “Symbolic Boolean Manipulation with Ordered Binary-Decision Diagrams,” ACM Computing 

Surveys, Vol. 24, No. 3 (September 1992), pp. 293-318. 

[2] R.E. Bryant, S. German, and M.N. Velev, “Exploiting Positive Equality in a Logic of Equality with 

Uninterpreted Functions,” Computer-Aided Verification, LNCS, Springer-Verlag, June 1999. 

[3] R.E. Bryant, S. German, and M.N. Velev, “Processor Verification Using Efficient Reductions of the Logic of 

Uninterpreted Functions to Propositional Logic,” Technical Report CMU-CS-99-115, Carnegie Mellon University, 

1999. 

[4] J.R. Burch, and D.L. Dill, “Automated Verification of Pipelined Microprocessor Control,” CAV’94, D.L. Dill, 

ed., LNCS 818, Springer-Verlag, June 1994, pp. 68-80. 

[5] J.R. Burch, “Techniques for Verifying Superscalar Microprocessors,” 33rd Design Automation Conference 

(DAC’96), June 1996, pp. 552-557. 

[6] Y.-A. Chen, “Arithmetic Circuit Verification Based on Word-Level Decision Diagrams,” Ph.D. thesis, School of 

Computer Science, Carnegie Mellon University, May 1998. 

[7] A. Goel, K. Sajid, H. Zhou, A. Aziz, and V. Singhal, “BDD Based Procedures for a Theory of Equality with 

Uninterpreted Functions,” CAV’98, Springer-Verlag, June 1998. 

[8] A.J. Isles, R. Hojati, and R.K. Brayton, “Computing Reachable Control States of Systems Modeled with 

Uninterpreted Functions and Infinite Memory,” CAV’98, Springer-Verlag, June 1998. 

[9] G. Kane, and J. Heinrich, MIPS RISC Architecture, Prentice Hall, Englewood Cliffs, NJ, 1992. 

[10] G. Nelson, and D.C. Oppen, “Simplification by Cooperating Decision Procedures,” ACM Transactions on 

Programming Languages and Systems, Vol. 1, No. 2, October 1979, pp. 245-257. 

[11] M. Pandey, “Formal Verification of Memory Arrays,” Ph.D. thesis, School of Computer Science, Carnegie 

Mellon University, May 1997. 

[12] D.A. Patterson, and J.L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, 2nd 

edition, Morgan Kaufmann Publishers, San Francisco, CA, 1998. 

[13] M.N. Velev, R.E. Bryant, and A. Jain, “Efficient Modeling of Memory Arrays in Symbolic Simulation,” 

CAV’97, O. Grumberg, ed., LNCS 1254, Springer-Verlag, June 1997, pp. 388-399.

[14] M.N. Velev, and R.E. Bryant, “Efficient Modeling of Memory Arrays in Symbolic Ternary Simulation,” 

TACAS’98, B. Steffen, ed., LNCS 1384, Springer-Verlag, March-April 1998, pp. 136-150. 

[15] M.N. Velev, and R.E. Bryant, “Bit-Level Abstraction in the Verification of Pipelined Microprocessors by 

Correspondence Checking,” FMCAD’98, G. Gopalakrishnan and P. Windley, eds., LNCS 1522, Springer-Verlag, 

November 1998, pp. 18-35.

DAC'99, pages 402-407 

Formal Verification Using Parametric Representations of Boolean Constraints 

Mark D. Aagaard, Robert B. Jones, Carl-Johan H. Seger 

Strategic CAD Labs, Intel Corporation, Hillsboro, OR 97124, USA 

Abstract 

We describe the use of parametric representations of Boolean predicates to encode data-space 

constraints and significantly extend the capacity of formal verification. The constraints are used 

to decompose verifications by sets of case splits and to restrict verifications by validity 

conditions. Our technique is applicable to any symbolic simulator. We illustrate our technique on 

state-of-the-art Intel® designs, without removing latches or modifying the circuits in any way. 
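
A parametric representation of a predicate can be sketched as follows: instead of testing a constraint on free data bits, the data bits are expressed as functions of fresh parameter bits whose range is exactly the set of assignments that satisfy the constraint. The code below is a brute-force illustration of that idea, not the procedure of the paper; it assumes a satisfiable constraint and enumerates assignments explicitly where a real implementation would operate on symbolic representations. The names parametrize, extendable, to_data and the one-hot constraint are hypothetical.

```python
from itertools import product


def parametrize(constraint, n):
    """Return a function mapping n parameter bits to n data bits.

    Assumes `constraint` is satisfiable. Data bits are fixed in order:
    a bit follows its parameter only when both values can still be
    extended to a satisfying assignment; otherwise it takes the one
    value that keeps the constraint satisfiable.
    """
    def extendable(prefix):
        rest = n - len(prefix)
        return any(constraint(prefix + tail)
                   for tail in product((0, 1), repeat=rest))

    def to_data(params):
        x = ()
        for i in range(n):
            can0 = extendable(x + (0,))
            can1 = extendable(x + (1,))
            x += ((params[i] if can0 and can1 else int(can1)),)
        return x

    return to_data


def one_hot(x):
    """Hypothetical data-space constraint: exactly one of the bits is set."""
    return sum(x) == 1


to_data = parametrize(one_hot, 3)
# Every parameter vector yields a legal input, and every legal input is reached.
image = {to_data(p) for p in product((0, 1), repeat=3)}
```

Feeding to_data(p) with symbolic parameters p into a symbolic simulator restricts the run to the constrained data space without changing the simulator itself, which matches the claim that the technique applies to any symbolic simulator. A case split corresponds to one such parametrization per case.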

References 

[1] M. D. Aagaard, R. B. Jones, and C.-J. H. Seger. Combining theorem proving and trajectory evaluation in an 

industrial environment. In ACM/IEEE Design Automation Conference, pages 538–541. ACM/IEEE, July 1998. 

[2] G. Boole. The Mathematical Analysis of Logic. Macmillan 1847. Reprinted 1948, B. Blackwell, 1847. 

[3] R. E. Bryant. On the complexity of VLSI implementations and graph representations of boolean functions with 

applications to integer multiplication. IEEE Transactions on Computers, C-40(2):205–213, Feb. 1991. 

[4] J. Burch, E. Clarke, and K. McMillan. Sequential circuit verification using symbolic model checking. In 

ACM/IEEE Design Automation Conference, pages 46–51. ACM/IEEE, 1990. 

[5] Y.-A. Chen and R. Bryant. Verification of floating-point adders. In A. J. Hu and M. Y. Vardi, editors, Workshop 

on Computer-Aided Verification, pages 488–499, July 1998. 

[6] O. Coudert, C. Berthet, and J. C. Madre. Verification of sequential machines using Boolean functional vectors. 

In Proceedings of the IMEC-IFIP Workshop on Applied Formal Methods for Correct VLSI Design, pages 179–196, 

Nov. 1989. 

[7] O. Coudert and J. C. Madre. A unified framework for the formal verification of sequential circuits. In 

International Conference on Computer-Aided Design, pages 78–82, Nov. 1990. 

[8] J. M. Feldman and C. T. Retter. Computer Architecture. McGraw-Hill, 1994. 

[9] S. Hazelhurst and C.-J. H. Seger. Symbolic trajectory evaluation. In T. Kropf, editor, Formal Hardware 

Verification, chapter 1, pages 3–78. Springer Verlag; New York, 1997. 

[10] IEEE. IEEE Standard for binary floating-point arithmetic. ANSI/IEEE Std 754-1985, 1985. 

[11] Intel. Pentium Processor User’s Manual, Volume 3: Architecture and Programming Manual. Intel Corporation, 

1993. 

[12] P. Jain and G. Gopalakrishnan. Efficient symbolic simulation-based verification using the parametric form of 

boolean expressions. IEEE Transactions on Computer Aided Design, 1994. 

[13] C.-J. H. Seger and R. E. Bryant. Formal verification by symbolic evaluation of partially-ordered trajectories. 

Formal Methods in System Design, 6(2):147–189, Apr. 1994.

DAC'99, pages 408-413 

Vertical Benchmarks for CAD 

Christopher Inacio, Herman Schmit, David Nagle, Andrew Ryan, Donald E. Thomas, 

Yingfai Tong, Ben Klass 

Dept. of Electrical and Computer Engineering, Carnegie Mellon University 

Pittsburgh, PA 15213, USA 

ABSTRACT 

Vertical benchmarks are complex system designs represented at multiple levels of abstraction. 

More effective than component-based CAD benchmarks, vertical benchmarks enable 

quantitative comparison of CAD techniques within or across design flows. This work describes 

the notion of vertical benchmarks and presents our benchmark, which is based on a commercial 

DSP, by comparing two alternative design flows. 

References 

[1] F. Brglez, D. Bryan, K. Kozminski. “Combinational Profiles of Sequential Benchmark Circuits”, ISCAS ‘89, pp. 

1929-1934, 1989. 

[2] J. Darnauer and W. Dai, “A Method for Generating Random Circuits and Its Application to Routability 

Measurement”, in 4th ACM/SIGDA Int’l Symp. on FPGAs, pp. 66-72, Feb. 1996. 

[3] N. Dutt. “Current Status of HLSW Benchmarks and Guidelines for Benchmark Submission”, HLSynth ’92 

Benchmark, Sept. 1992. 

[4] M. D. Hutton, J. P. Grossman, J. S. Rose, and D. G. Corneil, “Characterization and Parameterized Random 

Generation of Digital Circuits,” in 33rd ACM/SIGDA Design Automation Conference (DAC), pp. 94-99, June, 1996. 

[5] Motorola Corporation, DSP56000 Digital Signal Processor Family Manual. 1995. 

[6] Programmable Electronics Performance Corporation, URL: http://www.prep.org/synth.htm. 

[7] System Performance Evaluation Corporation (SPEC), SPEC CPU95 Version 1.1, URL: http://www.spec.org, 

August 21, 1995. 

[8] S. Yang. “Logic Synthesis and Optimization Benchmarks User Guide, Version 3.0”, Microelectronics Center of 

North Carolina, Research Triangle Park, NC, Jan. 1991.

DAC'99, pages 414-419 

A Framework for User Assisted Design Space Exploration 

X. Hu (a), G. W. Greenwood (b), S. Ravichandran (b), G. Quan (a) 

(a) Dept. of Computer Science & Engineering, University of Notre Dame, Notre Dame, IN 46556 

(b) Dept. of Electrical & Computer Engineering, Western Michigan University, 

Kalamazoo, MI 49008 

Abstract 

Much effort in hardware/software co-design has been devoted to developing "push-button" types 

of tools for automatic hardware/software partitioning. However, given the highly complex nature 

of embedded system design, user guided design exploration can be more effective. In this paper, 

we propose a framework for designer assisted partitioning that can be used in conjunction with 

any given search strategy. A key component of this framework is the visualization of the design 

space, without enumerating all possible design configurations. Furthermore, this design space 

representation provides a straightforward way for a designer to identify promising partitions and 

hence guide the subsequent exploration process. Experiments have shown the effectiveness of 

this approach. 

References 

[1] W. H. Wolf. Hardware-software co-design of embedded systems. Proc. IEEE, 82:967-989, 1994. 

[2] M. Chiodo, P. Giusto, A. Jurecska, H. Hsieh, A. Sangiovanni-Vincentelli, and L. Lavagno. Hardware-software 

codesign of embedded systems. IEEE Micro, 14:26-36, 1994. 

[3] R. Ernst, J. Henkel, and T. Benner. Hardware-software cosynthesis for microcontrollers. IEEE Design & Test of 

Computers, 10:64-75, 1993. 

[4] R. Gupta and G. De Micheli. Hardware-software cosynthesis for digital systems. IEEE Design & Test of 

Computers, 10:29-40, 1993. 

[5] E. Barros, W. Rosenstiel, and X. Xiong. A method for partitioning unity language to hardware and software. 

Proc. European Design Automation Conf., pages 220-225, 1994. 

[6] S. Prakash and A. Parker. Sos: Synthesis of application-specific heterogeneous multiprocessor systems. J. Para. 

& Dist. Computers, 16:338-351, 1992. 

[7] S. Kumar, J. Aylor, B. Johnson, and W. Wulf. Object-oriented techniques in hardware design. IEEE Computer, 

27:64-70, 1994. 

[8] B. Dave, G. Lakshminarayana, and N. Jha. Cosyn: Hardware-software co-synthesis of embedded systems. Proc. 

Design Automation Conf., pages 703-708, 1997. 

[9] J. Teich, T. Blickle, and L. Thiele. An evolutionary approach to system-level synthesis. Proc. Int'l Workshop 

Hardware/Software Codesign, pages 167-171, 1997. 

[10] R. Dick and N. Jha. Mogac: A multiobjective genetic algorithm for the co-synthesis of hardware-software 

embedded systems. IEEE/ACM Int'l Conf. on CAD, pages 522-529, 1997. 

[11] L. Garber and D. Sims. In pursuit of hardware-software codesign. IEEE Computer, 31:12-14, 1998. 

[12] X. Hu and G. Greenwood. Evolutionary approach to hardware/software partitioning. IEE Proc.-Comput. Digit. 

Tech., 145:203-209, 1998. 

[13] W. Chapman and J. Rozenblit. The system design problem is np-complete. IEEE. Conf. Sys., Man, & Cyber., 

pages 1880-1884, 1994. 

[14] Z. Michalewicz and M. Schoenauer. Evolutionary algorithms for constrained parameter optimization problems. 

Evolutionary Comp., 4:1-32, 1996. 

[15] E. Weinberger. Correlated and uncorrelated landscapes and how to tell the difference. J. Biol. Cybern., 63:325- 

336, 1990. 

[16] G. Greenwood and X. Hu. Are landscapes for constrained optimization problems statistically isotropic? Physica 

Scripta, 57:321-323, 1998. 

[17] Y. Saad and M. H. Schultz. Topological properties of hypercube. IEEE Trans. on Computers, 37:867-870, 

1988. 

[18] G. Greenwood and S. Ravichandran. Fitness landscapes on torus.

[19] R. Sambandam and X. Hu. Predicting timing behavior in architectural design exploration of real-time 

embedded systems. Proceedings of the 34th IEEE/ACM Design Automation Conference, pages 157-160, 1997.

DAC'99, pages 420-424 

Fast Prototyping: a system design flow applied to a complex System-On-Chip 

multiprocessor design 

Benoit Clement, Richard Hersemeule, 

STMicroelectronics, 5bis, Chemin de la Dhuy, F-38240 Meylan France 

Etienne Lantreibecq 

STMicroelectronics, 850 rue Jean Monnet - BP 16, F-38926 Crolles Cedex, France 

Bernard Ramanadin 

STMicroelectronics, STAR US RnD c/o Hitachi HMSI, San Jose, CA 95134, USA 

Pierre Coulomb, Francois Pogodalla 

STMicroelectronics, 5bis, Chemin de la Dhuy, F-38240 Meylan France 

ABSTRACT 

This paper describes a new design flow that significantly reduces time-to-market for highly 

complex multiprocessor-based System-On-Chip designs. This flow, called Fast Prototyping, 

enables concurrent hardware and software development, early verification and productive re-use 

of intellectual property. We describe how, using this innovative system design flow, which 

combines different technologies such as C modeling, emulation, hard Virtual Component re-use 

and CoWare N2C™, we achieve better productivity on a multi-processor SOC design. 

Keywords: System design, Hardware/Software (HW/SW) co-design, Virtual Component (VC) 

re-use, Fast Prototyping, system verification, system modeling. 

REFERENCES 

[1] M. Genoe, Alcatel, “Requirements capturing and specification of Systems-on-Chip”, MEDEA/ESPRIT 

conference on HW/SW codesign, 1998 

[2] S. Tsasakou, C. Dre, H. Kharatanasis, A. Birbas, Univ. of Patras/Intracom SA, “Combined assessment of an 

industrial current practice and CoWare’s methodology to the codesign/cosimulation problem”, MEDEA/ESPRIT 

conference on HW/SW codesign, 1998 

[3] C. Berthet, G. Mas, F. Pogodalla & al., STMicroelectronics, “Functional verification methodology of Chameleon 

processor”, 33rd DAC, 1996 

[4] K. Hashmi, A. C. Bruce, “Design and use of a system-level specification and verification methodology”, EURO- 

DAC 95 

[5] J. Monaco, D. Holloway, R. Raina, “Functional verification methodology for the PowerPC 604 microprocessor”, 

33rd DAC, 1996 

[6] P. Paulin, “A flexible hardware/software development environment and its application to consumer multimedia 

products designs”, CODES/CASHE’98 

[7] A. Sangiovanni-Vincentelli, J. Liu, M. Lajolo, “Software timing analysis using HW/SW cosimulation and 

instruction set simulator”, CODES/CASHE’98 

[8] A.A. Jerraya, J.M. Daveau, G. Marchioro, “hardware/software codesign of an ATM network interface card: a 

case study”, CODES/CASHE’99 

[9] M. Benjamin, D. Geist, A. Hartman, G. Mas, R. Smeets, Y. Wolfsthal, STMicroelectronics and IBM Science and 

Technology, Haifa Research Lab. "A Study in Coverage-Driven Test Generation", DAC’99

DAC'99, pages 425-428 

Verification and Management of a multimillion-gate embedded core design 

Johann Notbauer, Thomas Albrecht, Georg Niedrist 

Siemens, Austria, A-1030 Vienna, Austria 

Stefan Rohringer 

Siemens Semiconductors, A-8020 Graz, Austria 

ABSTRACT 

Verification is one of the most critical and time-consuming tasks in today's design processes. 

This paper demonstrates the verification process of an 8.8 million gate design using HW-simulation 

and cycle simulation-based HW/SW-coverification. The main focuses are overall 

methodology, testbench management, the verification task itself and defect management. The 

chosen verification process was a real success: the quality of the designed hard- and software 

was increased and furthermore the time needed for integration and test of the design in the 

context of the overall system was greatly reduced. 

REFERENCES 

[1] Albrecht, Notbauer, Rohringer: "HW/SW Coverification Performance Estimation & Benchmark for a 24 

Embedded RISC Core Design", 35th ACM/IEEE Design Automation Conference, pages 808-811, 1998 

[2] Albrecht, "Concurrent Design Methodology and Configuration Management of the Siemens EWSD-CCS7E 

Processor System Simulation", 32nd ACM/IEEE Design Automation Conference, pages 222-227, 1995 

[3] Cohen: "VHDL, Answers to Frequently Asked Questions", Kluwer Academic Publishers, 1997 

[4] Jantsch, Notbauer, Albrecht, "Testcase Development for large Telecom Systems", 2nd IEEE International High 

Level Design Validation and Test Workshop, 1997 

[5] Pure Software, "Distributed Defect Tracking System (PureDDTS), Administrator's Manual", 1995 

[6] Synopsys Inc. "Cyclone VHDL Coding Style Guide V1.1b", 1997

DAC'99, page 429 

Panel: Parasitic Extraction Accuracy: How Much Is Enough? 

Chair: Paul Franzon – North Carolina State University, Raleigh, NC 

Panel Members: Mark Basel, Aki Fujimura, Sharad Mehrotra, Ron Preston, Robin C. Sarma, 

Marty Walker 

The effect of parasitic elements on chip performance is well known; however, the relative 

importance of this effect is becoming more critical to a chip's performance. To cope with this 

new design hazard there are a number of parasitic extraction tools and methodology approaches 

available to the circuit designer, some developed for the general market and some developed for 

internal use. With each product having its own claims and approaches, deciding on a tool or 

extraction strategy is a confusing exercise. The purpose of this panel is to help the designer and 

CAD manager determine how to properly compare extractors and how to put them to use. The 

panel will address a number of questions, including: What is the best way to accurately handle 

parasitic extraction while dealing with increasingly large and complex (SOC, mixed signal) 

chips? How to determine and achieve the required extraction accuracy for a particular design 

situation? How is extraction accuracy measured? How can each extractor be compared and 

contrasted with some degree of confidence? Can circuit design techniques and/or tool 

methodologies be used to reduce the extraction effort? How should process variations or 

inductive effects be handled? What's the best way to deal with the data volume problem?

DAC'99, pages 430-435 

Mixed-Vth (MVT) CMOS Circuit Design Methodology for Low Power Applications 

Liqiong Wei, Zhanping Chen, and Kaushik Roy 

School of Electrical and Computer Engineering, Purdue University, W. Lafayette, IN 47907 

Yibin Ye and Vivek De 

Intel Corp., Hillsboro, OR 97124 

Abstract 

The dual threshold technique has been proposed to reduce leakage power in low voltage and low 

power circuits by applying a high threshold voltage to some transistors in non-critical paths, 

while a low threshold is used in critical path(s) to maintain performance. The Mixed-Vth (MVT) 

static CMOS design technique allows different thresholds within a logic gate, thereby increasing 

the number of high threshold transistors compared to the gate-level dual threshold technique. In 

this paper, a methodology for MVT CMOS circuit design is presented. Different MVT CMOS 

circuit schemes are considered and three algorithms are proposed for the transistor-level 

threshold assignment under performance constraints. Results indicate that the MVT CMOS design 

technique can provide about 20% more leakage reduction compared to the corresponding gate-level 

dual threshold technique. 

References 

[1] J. M. C. Stork, “Technology Leverage for Ultra-Low Power Information Systems”, Proceedings of the IEEE, 

Vol.83, No.4, pp. 607-618, 1995. 

[2] A. P. Chandrakasan, S. Sheng and R. W. Brodersen, “Low-Power CMOS Digital Design”, IEEE Journal of 

Solid-State Circuits, Vol.27, No.4, pp.473, 1992. 

[3] S. Mutoh, et al., “1-V Power Supply High-Speed Digital Circuit Technology with Multithreshold-Voltage 

CMOS”, IEEE Journal of Solid-State Circuits, Vol.30, No.8, pp. 847-854, 1995. 

[4] Z. Chen, C. Diaz, J. Plummer, M. Cao and W. Greene, “0.18um Dual Vt MOSFET Process and Energy-Delay 

Measurement”, IEDM Digest, pp. 851, 1996. 

[5] L. Wei, Z. Chen, K. Roy, M.C. Johnson, Y. Ye and V. De, “Design and Optimization of Dual Threshold Circuits 

for Low Voltage Low Power Applications”, IEEE Transactions on VLSI Systems, Vol.7, No. 1, pp. 16-24, 1999 

[6] N. Weste and K. Eshraghian, Principles of CMOS VLSI Design: a system perspective, Addison-Wesley 

Publishing Company, pp. 221-223, 1992 

[7] Q. Wang and S. Vrudhula, “Static Power Optimization of Deep Submicron CMOS Circuits for Dual Vt 

Technology”, International Conference on Computer-Aided Design, pp. 490-494, 1998. 

[8] B.J. Sheu, D.L. Scharfetter, P.K. Ko, and M.C. Teng, “BSIM: Berkeley Short-Channel IGFET Model for MOS 

Transistors”, IEEE J. Solid-State Circuits, SC-22, No.4, pp. 558-566, 1987. 

[9] M. Johnson, D. Somasekhar, and K. Roy, “Deterministic Estimation of Minimum and Maximum Leakage 

Conditions in CMOS Logic,” IEEE Transactions on Computer-Aided Design of IC's, accepted for publication.

DAC'99, pages 436-441 

Stand-by Power Minimization through Simultaneous 

Threshold Voltage Selection and Circuit Sizing 

Supamas Sirichotiyakul, Tim Edwards, Chanhee Oh, Jingyan Zuo, Abhijit Dharchoudhury, 

Rajendran Panda, and David Blaauw 

Advanced Tools, Motorola Inc., Austin, TX 

Abstract 

We present a new approach for estimation and optimization of the average stand-by power 

dissipation in large MOS digital circuits. To overcome the complexity of state dependence in 

average leakage estimation, we introduce the concept of "dominant leakage states" and use state 

probabilities. Our method achieves speed-ups of 3 to 4 orders of magnitude over exhaustive 

SPICE simulations while maintaining accuracies within 9% of SPICE. This accurate estimation 

is used in a new sensitivity-based leakage and performance optimization approach for circuits 

using dual Vt processes. In tests on a variety of industrial circuits, this approach was able to 

obtain 81-100% of the performance achievable with all low Vt transistors, but with 1/3 to 1/6 the 

stand-by current. 

Keywords: Low-power-design, Dual-Vt, Leakage 

References 

[1] N. Rohrer, et al. “A 480MHz RISC microprocessor in a 0.12 um Leff CMOS technology with copper 

interconnects”, IEEE International Solid-State Circuits Conference, 1998. 

[2] Y. Oowaki, et al., “A Sub-0.1um Circuit Design with Substrate-over-Biasing”, ISSCC, page 88, February 1998. 

[3] J. Kao, A. Chandrakasan, D. Antoniadis, “Transistor sizing issues and tool for multi-threshold CMOS 

technology”, Proc. Design Automation Conference, 1997 

[4] L. Wei, et al. “Design and Optimization of Low Voltage High Performance Dual Threshold CMOS Circuits”, 

35th Design Automation Conference, 1998 

[5] Qi Wang, et al. “Static power optimization of deep submicron CMOS circuits for dual Vt technology,” ICCAD 

1998. 

[6] Z. Chen, et al. “Estimation of Standby Leakage Power in CMOS Circuits Considering Accurate Modeling of 

Transistor Stacks”, ISLPED, 1998. 

[7] J. Halter and F.N. Najm, “A gate-level leakage power reduction method for ultra-low-power CMOS circuits,” 

Custom Integrated Circuit Conference, 1997. 

[8] P. Pant, et al., “Device-circuit optimization for minimal energy and power consumption in CMOS random logic 

networks,” 34th Design Automation Conference, June 1997. 

[9] S. Ercolani, M. Favalli, M. Damiani, P. Olivo, B. Ricco. “Testability Measures in Pseudorandom Testing”, IEEE 

Trans. on CAD, 1992, v.11, n.6, pp.794-800. 

[10] J.P. Fishburn, et al., “TILOS: A posynomial programming approach to transistor sizing,” ICCAD, Nov 1985 

[11] A. Dharchoudhury, et al., “Fast and accurate timing simulation with regionwise quadratic models of MOS I-V 

characteristics,” ICCAD, Nov. 1994, pp. 190-194 

[12] A. Dharchoudhury, et al., “Transistor-level sizing and timing verification of domino circuits in the PowerPC™ 

microprocessor,” ICCD, October 1997.

DAC'99, pages 442-445 

Leakage Control With Efficient Use of Transistor Stacks in Single Threshold CMOS 

Mark C. Johnson 

Rose-Hulman Institute of Technology, Terre Haute, IN 47803-3999, USA 

Dinesh Somasekhar, Kaushik Roy 

Purdue University, West Lafayette, IN 47907-1285, USA 

ABSTRACT 

The state dependence of leakage can be exploited to obtain modest leakage savings in CMOS 

circuits. However, one can modify circuits considering state dependence and achieve larger 

savings. We identify a low leakage state and insert leakage control transistors only where 

needed. Leakage levels are on the order of 35% to 90% lower than those obtained by state 

dependence alone. 

REFERENCES 

[1] Chen, Z., Johnson, M., Wei, L., and Roy, K. Estimation of standby leakage power in CMOS circuits considering 

accurate modeling of transistor stacks. Proceedings of the Symposium on Low Power Design and Electronics 

(1998), 239-244. 

[2] Cormen, T.H., Leiserson, C.E., and Rivest, R.L. Introduction to Algorithms, The MIT Press, Cambridge, MA, 

1990. 

[3] Gil, J., Je, M., Lee, J., and Shin, H. A high speed and low power SOI inverter using active body bias. 

Proceedings of the Symposium on Low Power Electronics and Design (1998), 59-63. 

[4] Halter, J.P., and Najm, F. A gate-level leakage power reduction method for ultra-low-power CMOS circuits. 

Proceedings of the IEEE Custom Integrated Circuits Conference (1997), 475-478. 

[5] Johnson, M.C., Somasekhar, D., and Roy, K. A model for leakage control by MOS transistor stacking. Tech. 

Rep. TRECE 97-12, Purdue University, School of Electrical and Computer Engineering, 1997. 

[6] Kobayashi, T., and Sakurai, T. Self-adjusting threshold-voltage scheme (SATS) for low-voltage high-speed 

operation. Proceedings IEEE Custom Integrated Circuits Conference (1994), 271-274. 

[7] Kuroda, T., et al. A 0.9 V 150 MHz 10 mW 4 mm² 2-D discrete cosine transform core processor with variable-threshold-voltage 

scheme. Proceedings IEEE International Solid-State Circuits Conference (1996), 166-167. 

[8] Maxwell, P.C., and Rearick, J.R. A simulation-based method for estimating defect-free IDDQ. Digest of Papers, 

IEEE International Workshop on IDDQ Testing (1997), 80-84. 

[9] Mutoh, S., et al. 1-v power supply high-speed digital circuit technology with multithreshold-voltage CMOS. 

IEEE Journal of Solid-State Circuits, vol.30, no.8 (Aug. 1995), 847-853. 

[10] Shigematsu, S., et al. A 1-V high-speed MTCMOS circuit scheme for power-down applications. IEEE 

Symposium on VLSI Circuits Digest of Technical Papers (1995), 125-126. 

[11] Vieri, C., et al. SOIAS: Dynamically variable threshold SOI with active substrate. Proceedings of the 

Symposium on Low Power Electronics (1995), 86-87. 

[12] Wei, L., Chen, Z., Roy, K., Johnson, M.C., Ye, Y., and De, V. Design and optimization of dual threshold 

circuits for low voltage low power applications. IEEE Transactions on Very Large Scale Integration (VLSI) 

Systems, vol.7, no.1 (March 1999), 16-24.

DAC'99, pages 446-451 

A Practical Gate Resizing Technique Considering Glitch Reduction for Low Power Design 

Masanori Hashimoto, Hidetoshi Onodera and Keikichi Tamaru 

Department of Communications and Computer Engineering, Kyoto University 

Abstract 

We propose a method for power optimization that considers glitch reduction by gate sizing based 

on the statistical estimation of glitch transitions. Our method reduces not only the amount of 

capacitive and short-circuit power consumption but also the power dissipated by glitches, which 

has not been exploited previously. The effect of our method is verified experimentally using 8 

benchmark circuits with a 0.6 µm standard cell library. Our method reduces the power 

dissipation from the minimum-sized circuits further by 9.8% on average and 23.0% maximum. 

We also verify that our method is effective under manufacturing variation. 

References 

[1] A. Shen, A. Ghosh, S. Devadas and K. Keutzer, ‘‘On average power dissipation and random pattern testability of 

CMOS combinational logic networks,’’ Proc. ICCAD, pp. 402--407, 1992. 

[2] D. Brand and C. Visweswariah, ‘‘Inaccuracies in power estimation during logic synthesis,’’ Proc. ICCAD, pp. 

388--394, 1996. 

[3] F. N. Najm and M. Y. Zhang, ‘‘Extreme delay sensitivity and the worst-case switching activity in VLSI 

circuits,’’ Proc. DAC, pp. 623--627, 1995. 

[4] Y. Tamiya and Y. Matsunaga, ‘‘LP based cell selection with constraints of timing, area, and power 

consumption,’’ Proc. ICCAD, pp. 378--381, 1994. 

[5] M. Borah, R. M. Owens, and M. J. Irwin, ‘‘Transistor sizing for minimizing power consumption of CMOS 

circuits under delay constraint,’’ Proc. ISLPD, pp. 167--172, 1995. 

[6] H.-R. Lin and T. T. Hwang, ‘‘Power reduction by gate sizing with path-oriented slack calculation,’’ Proc. ASP- 

DAC, pp. 7--12, 1995. 

[7] S. S. Sapatnekar and W. Chuang, ‘‘Power vs. delay in gate sizing: Conflicting objectives?,’’ Proc. ICCAD, pp. 

463--466, 1995. 

[8] Y. Je Lim and M. Soma, ‘‘Statistical estimation of delay-dependent switching activities in embedded CMOS 

combinational circuits,’’ IEEE Trans. on VLSI Systems, vol. 5, no. 3, pp. 309--319, September 1997. 

[9] M. Hashimoto, H. Onodera, and K. Tamaru, ‘‘A power optimization method considering glitch reduction by gate 

sizing,’’ Proc. ISLPED, pp. 221--226, 1998. 

[10] F. N. Najm, ‘‘Transition density, a stochastic measure of activity in digital circuits,’’ Proc. DAC, pp. 644--649, 

1991. 

[11] M. Berkelaar, ‘‘Statistical delay calculation, a linear time method,’’ Proc. TAU, pp. 15--24, 1997. 

[12] Synopsys Inc., Design Compiler Reference Manual, 1998. 

[13] Synopsys Inc., PowerMill Reference Manual, 1998.

DAC'99, pages 452-459 

Gradient-Based Optimization of Custom Circuits Using a Static-Timing Formulation 

A. R. Conn*, I. M. Elfadel*, W. W. Molzen, Jr.*, P. R. O’Brien**, P. N. Strenski*, 

C. Visweswariah*, C. B. Whan* 

*IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598 

**IBM Electronic Design Automation, Austin, TX 78758 

Abstract 

This paper describes a method of optimally sizing digital circuits on a static-timing basis. All 

paths through the logic are considered simultaneously and no input patterns need be specified by 

the user. The method is unique in that it is based on gradient-based, nonlinear optimization and 

can accommodate transistor-level schematics without the need for pre-characterization. It 

employs efficient time-domain simulation and gradient computation for each channel-connected 

component. A large-scale, general-purpose, nonlinear optimization package is used to solve the 

tuning problem. A prototype tuner has been developed that accommodates combinational circuits 

consisting of parameterized library cells. Numerical results are presented. 

References 

[1] W. Nye, D. C. Riley, A. Sangiovanni-Vincentelli, and A. L. Tits, “DELIGHT.SPICE: An optimization-based 

system for the design of integrated circuits,” IEEE Transactions on Computer-Aided Design of ICs and Systems, vol. 

CAD-7, pp. 501–519, April 1988. 

[2] A. R. Conn, R. A. Haring, C. Visweswariah, and C. W. Wu, “Circuit optimization via adjoint Lagrangians,” IEEE 

International Conference on Computer-Aided Design, pp. 281–288, November 1997. 

[3] A. R. Conn, P. K. Coulman, R. A. Haring, G. L. Morrill, C. Visweswariah, and C. W. Wu, “JiffyTune: circuit 

optimization using time-domain sensitivities,” IEEE Transactions on Computer-Aided Design of ICs and Systems, 

vol. 17, pp. 1292–1309, December 1998. 

[4] J. P. Fishburn and A. E. Dunlop, “TILOS: A posynomial programming approach to transistor sizing,” IEEE 

International Conference on Computer-Aided Design, pp. 326–328, November 1985. 

[5] W. C. Elmore, “The transient response of damped linear networks with particular regard to wideband amplifiers,” 

Journal of Applied Physics, vol. 19, no. 1, pp. 55–63, 1948. 

[6] P. Penfield and J. Rubinstein, “Signal delay in RC tree networks,” in Proceedings of the 2nd Caltech VLSI 

Conference, pp. 269–283, March 1981. 

[7] S. S. Sapatnekar, V. B. Rao, P. M. Vaidya, and S. M. Kang, “An exact solution to the transistor sizing problem 

for CMOS circuits using convex optimization,” IEEE Transactions on Computer-Aided Design of ICs and Systems, 

vol. CAD-12, pp. 1621–1634, November 1993. 

[8] A. Srinivasan, K. Chaudhary, and E. S. Kuh, “RITUAL: A performance driven placement algorithm for small 

cell ICs,” IEEE International Conference on Computer-Aided Design, pp. 48–51, November 1991. 

[9] A. R. Conn, N. I. M. Gould, and Ph. L. Toint, LANCELOT: A Fortran Package for Large-Scale Nonlinear 

Optimization (Release A). Springer Verlag, 1992. 

[10] C. Visweswariah and R. A. Rohrer, “Piecewise approximate circuit simulation,” IEEE Transactions on 

Computer-Aided Design of ICs and Systems, vol. 10, pp. 861–870, July 1991. 

[11] C. Visweswariah and J. A. Wehbeh, “Incremental event-driven simulation of digital FET circuits,” Proc. 1993 

Design Automation Conference, pp. 737–741, June 1993. 

[12] P. Feldmann, T. V. Nguyen, S. W. Director, and R. A. Rohrer, “Sensitivity computation in piecewise 

approximate circuit simulation,” IEEE Transactions on Computer-Aided Design of ICs and Systems, vol. 10, pp. 

171–183, February 1991. 

[13] A. R. Conn, N. I. M. Gould, and Ph. L. Toint, “Global convergence of a class of trust region algorithms for 

optimization with simple bounds,” SIAM Journal on Numerical Analysis, vol. 25, pp. 433–460, 1988. See also same 

journal, pp. 764–767, volume 26, 1989. 

[14] A. R. Conn, N. I. M. Gould, and Ph. L. Toint, “A globally convergent augmented Lagrangian algorithm for 

optimization with general constraints and simple bounds,” SIAM Journal on Numerical Analysis, vol. 28, no. 2, pp. 

545–572, 1991.

[15] A. R. Conn, L. N. Vicente, and C. Visweswariah, “Two-step algorithms for nonlinear optimization with 

structured applications,” Research Report RC21198(94689), IBM Research Division, T. J. Watson Research Center, 

Yorktown Heights, NY 10598, June 1998. Submitted to SIAM Journal on Optimization. 

[16] M. Ohlrich, C. Ebeling, E. Ginting, and L. Sather, “SubGemini: identifying subcircuits using a fast subgraph 

isomorphism algorithm,” Proc. 1993 Design Automation Conference, pp. 31–37, June 1993. 

[17] J. P. M. Silva and K. A. Sakallah, “GRASP–A new search algorithm for satisfiability,” IEEE International 

Conference on Computer-Aided Design, pp. 220–227, November 1996. 

[18] C. C. Douglas, D. A. George, and M. E. Henderson, “Object classes for numerical analysis,” in Proceedings of 

the second annual object-oriented numerics conference, pp. 32–49, Rogue Wave Software, Inc., Corvallis, 

Oregon, April 1994. 

[19] A. R. Conn, R. A. Haring, and C. Visweswariah, “Noise considerations in circuit optimization,” IEEE 

International Conference on Computer-Aided Design, pp. 220–227, November 1998.

DAC'99, pages 460-465 

Simultaneous Circuit Partitioning/Clustering with Retiming for Performance Optimization 

Jason Cong, Honching Li, Chang Wu 

Department of Computer Science, University of California, Los Angeles, CA 90095 

Abstract 

Partitioning and clustering are crucial steps in circuit layout for handling large scale designs 

enabled by the deep submicron technologies. Retiming is an important sequential logic 

optimization technique for reducing the clock period by optimally repositioning flip flops [7]. In 

our exploration of a logical and physical co-design flow, we developed a highly efficient 

algorithm for combining retiming with circuit partitioning or clustering for clock period 

minimization. Compared with the recent result by Pan et al. [10] on quasi-optimal clustering with 

retiming, our algorithm is able to reduce both runtime and memory requirement by one order of 

magnitude without losing quality. Our results show that our algorithm can be over 1000X faster 

for large designs. 

References 

[1] J. Cong, H. Li, S. Lim, T. Shibuya, and D. Xu. Large Scale Circuit Partitioning With Loose/Stable Net Removal 

And Signal Flow Based Clustering. In IEEE International Conference on CAD, pages 441-446, 1997. 

[2] J. Cong, H. Li, and C. Wu. Simultaneous Circuit Partitioning/Clustering with Retiming for Performance 

Optimization. UCLA-CSD 990019, Technical Report, March 1999. 

[3] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms, chapter 25. The MIT Press, 1990. 

[4] C. Fiduccia and R. Mattheyses. A Linear-Time Heuristic for Improving Network Partitions. In ACM/IEEE 

Design Automation Conference, pages 175-181, 1982. 

[5] L. Hagen and A. B. Kahng. New Spectral Methods for Ratio Cut Partitioning and Clustering. IEEE Trans. on 

Computer-Aided Design of Integrated Circuits And Systems, 11(9):1074-1085, 1992. 

[6] E. L. Lawler, K. N. Levitt, and J. Turner. Module Clustering to Minimize Delay in Digital Networks. IEEE 

Trans. on Computers, 18:47-57, 1969. 

[7] C. E. Leiserson and J. B. Saxe. Retiming Synchronous Circuitry. Algorithmica, 6:5-35, 1991. 

[8] L. Liu, M. Kuo, C. K. Cheng, and T. C. Hu. Performance-Driven Partitioning using a Replication Graph 

Approach. In Proc. 32nd ACM/IEEE Design Automation Conference, pages 206-210, 1995. 

[9] R. Murgai, R. K. Brayton, and A. Sangiovanni-Vincentelli. On Clustering for Minimum Delay/Area. In IEEE 

International Conference on CAD, pages 6-9, 1991. 

[10] P. Pan, A. K. Karandikar, and C. L. Liu. Optimal Clock Period Clustering for Sequential Circuits with 

Retiming. IEEE Trans. on Computer-Aided Design of Integrated Circuits And Systems, 17(6):489-498, 1998. 

[11] Y. C. Wei and C. K. Cheng. Towards Efficient Hierarchical Designs by Ratio Cut Partitioning. In IEEE 

International Conference on CAD, pages 298-301, 1989. 

[12] H. Yang and D. F. Wong. Circuit Clustering for Delay Minimization under Area and Pin Constraints. In 

ED&TC, pages 65-70, 1995.

DAC'99, pages 466-471 

Wave Steering in YADDs: A Novel Non-Iterative Synthesis and Layout Technique 

Arindam Mukherjee¹, Ranganathan Sudhakar², Malgorzata Marek-Sadowska¹, Stephen I. Long¹ 

¹ Dept. of ECE, University of California, Santa Barbara, CA, USA 

² Dept. of ECE, Stanford University, Stanford, CA, USA 

ABSTRACT 

In this paper we present a new synthesis and layout approach that avoids the normal iterations 

between synthesis, technology mapping and layout, and increases routing by abutment. It 

produces shorter and more predictable delays, and sometimes even layouts with reduced areas. 

This scheme equalizes delays along different paths, which makes low granularity pipelining a 

reality, and hence we can clock these circuits at much higher frequencies, compared to what is 

possible in a conventionally designed circuit. Since any circuit can be clocked at a fixed rate, this 

method does not require timing-driven synthesis. We propose the logic and layout synthesis 

schemes and algorithms, discuss the physical layout part of the process, and support our 

methodology with simulation results. 

References 

[1] S. B. Akers, “A Rectangular Logic Array”, IEEE Trans. on Computers, vol. C-21, no.8, pp.848-856, August 

1972. 

[2] W. P. Burleson, M. Ciesielski, F. Klass and W. Liu, “Wave-Pipelining: A Tutorial and Research Survey”, IEEE 

Transactions on VLSI Systems, Vol.6, No.3, Sep. 1998. 

[3] L. Cotten, “Maximum Rate Pipelined Systems”, Proc. AFIPS Spring Joint Comp. Conf., 1969. 

[4] V. Bertacco et al, “Decision Diagrams and Pass Transistor Logic Synthesis”, Proc. of the ACM/IEEE Int’l 

Workshop on Logic Synthesis, pp. 1-5, May 1997. 

[5] R. E. Bryant, “Graph-based algorithms for Boolean function manipulation”, IEEE Trans. Computers, vol. C-35, 

pp. 677-691, Aug. 1986 

[6] P. Buch et al, “On Synthesizing Pass Transistor Networks”, Proc. of the ACM/IEEE Int’l Workshop on Logic 

Synthesis, pp. 1-8, May 1997. 

[7] M. Chrzanowska-Jeske, Z. Wang and Y. Xu, “A regular representation for mapping to fine-grain locally-connected 

FPGAs”, Proc. Int. Symposium on Circuits and Systems, 1997. 

[8] M. Chrzanowska-Jeske and Z. Wang, “Mapping of symmetric and partially-symmetric functions to the CA-type 

FPGAs”, Proc. Midwest’95, pp.290-293, 1995. 

[9] W. K. C. Lam, R. K. Brayton and A. L. Sangiovanni-Vincentelli, “Valid Clock Frequencies and Their Computation in 

Wavepipelined Circuits”, IEEE Transactions on CAD of IC and Systems, Vol. 15, No.7, July 1996. 

[10] P. S. Lassen, S. I. Long, and K. R. Nary, “Ultra-Low Power GaAs MESFET MSI Circuits Using Two-Phase 

Dynamic FET Logic”, IEEE J. Solid State Circuits, Vol.28, pp.1038-1045, October 1993. 

[11] M. Perkowski, E. Pierzchala and R. Drechsler, “Layout driven synthesis for a submicron technology: Mapping 

expansions to fat regular lattices”, Proc. Int. Symp. on Circuits and Systems, 1997. 

[12] M. Perkowski, L. Jozwiak and R. Drechsler, “Two hierarchies of generalized Kronecker trees, forms, decision 

diagrams and regular layouts”, Proc. 3rd International Workshop on Applications of the Reed-Muller Expansion in 

Circuit Design, (Reed-Muller’97), Sept. 19-20, 1997, Oxford, UK. 

[13] J.M.Rabaey, “Digital Integrated Circuits: A Design Perspective”, Section 3.3.3 and Chapter 8, Prentice Hall, 

1996. 

[14] M. Shamanna et al, “Multiple-input, Multiple-output Pass Transistor Logic”, Int’l J. Electronics, vol. 79, no. 1, 

pp. 33-45. 

[15] R.Sudhakar, “YADDA: Layout Synthesis using Pass Transistor Logic”, MS Thesis, UCSB, 1998. 

[16] K. Yano et al, “A 3.8ns CMOS 16x16b Multiplier using Complementary Pass-Transistor Logic”, IEEE J. Solid-State 

Circuits, vol.25, no.2, pp.388-395, April 1990. 

[17] Information Sciences Institute, MOS Implementation Service www.mosis.com Bloomington IN, 1995.

DAC'99, pages 472-478 

MERLIN: Semi-Order-Independent Hierarchical Buffered 

Routing Tree Generation Using Local Neighborhood Search 

Amir H. Salek, Jinan Lou, Massoud Pedram 

Department of Electrical Engineering – Systems, University of Southern California 

Los Angeles, California 90089 

ABSTRACT 

This paper presents a solution to the problem of performance-driven buffered routing tree 

generation in electronic circuits. Using a novel bottom-up construction algorithm and a local 

neighborhood search strategy, this method finds the best solution of the problem in an 

exponential size solution sub-space in polynomial time. The output is a hierarchical buffered 

rectilinear Steiner routing tree that connects the driver of a net to its sink nodes. The two variants 

of the problem, i.e. maximizing the driver required time subject to a total buffer area constraint 

and minimizing the total buffer area subject to a minimum driver required time constraint, are 

handled by propagating three-dimensional solution curves during the construction phase. 

Experimental results prove the effectiveness of this technique compared to the other solutions for 

this problem. 

REFERENCES 

[Be57] R. Bellman, Dynamic Programming, Princeton Univ. Press, 1957. 

[CHKM96] J. Cong, L. He, C. Koh, and P. Madden, “Performance optimization of VLSI interconnect layout,” In 

Integration, the VLSI Journal 21, pp. 1-94, 1996. 

[CLZ93] J. Cong, K. Leung, and D. Zhou, “Performance-driven interconnect design based on distributed RC delay 

model,” In Proceedings of the 30th Design Automation Conference, pp. 606-611, 1993. 

[El48] W. C. Elmore, “The transient response of damped linear networks with particular regard to wideband 

amplifiers,” In Journal of Applied Physics 19, pp. 55-63, 1948. 

[Gr92] L. K. Grover, “Local search and the local structure of NP-complete problems,” In Operations Research 

Letters 12, pp. 235-243, Oct. 1992. 

[Gi90] L.P.P.P. van Ginneken, “Buffer placement in distributed RC-tree networks for minimal Elmore delay,” In 

Proceedings of International Symposium on Circuits and Systems, pp. 865-868, 1990. 

[GJ79] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, 

W. H. Freeman, SF, CA, 1979. 

[Ha66] M. Hanan, “On Steiner’s problem with rectilinear distance,” SIAM Journal of Applied Mathematics, No. 14, 

pp. 255-265, 1966. 

[LCLH96] J. Lillis, C. K. Cheng, T. Y. Lin, and C. Ho, “New performance driven routing techniques with explicit 

area/delay tradeoff and simultaneous wire sizing,” In Proceedings of the 33rd Design Automation Conference, pp. 

395-400, 1996. 

[LSP98] J. Lou, A. H. Salek, and M. Pedram, “An integrated flow for technology remapping and placement of sub-half-micron 

circuits,” In Proceedings of Asia and South Pacific Design Automation Conference, pp. 295-300, 1998. 

[OC96a] T. Okamoto, and J. Cong, “Buffered Steiner tree construction with wire sizing for interconnect layout 

optimization,” In Proceedings of International Conference on Computer-Aided Design, pp. 44-49, 1996. 

[OC96b] T. Okamoto, and J. Cong, “Interconnect layout optimization by simultaneous Steiner tree construction and 

buffer insertion,” In Proceedings of the 5th ACM/SIGDA Physical Design Workshop, pp. 1-6, 1996. 

[SLP98] A. H. Salek, J. Lou, and M. Pedram, “A simultaneous routing tree construction and fanout optimization 

algorithm,” In Proceedings of International Conference on Computer-Aided Design, 1998. 

[SSLM92] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. 

K. Brayton, and A. Sangiovanni-Vincentelli, “SIS: A system for sequential circuit synthesis,” Memorandum No. 

UCB/ERL M92/41, Electronics Research Laboratory, College of Engineering, University of California, Berkeley, 

CA 94720, May 1992. 

[To90] H. Touati, “Performance-oriented technology mapping,” Ph.D. thesis, University of California, Berkeley, 

Technical Report UCB/ERL M90/109, November 1990.

[WM89] W.S. Wong, and R.J.T. Morris, “A new approach to choosing initial points in local search,” In Information 

Processing Letters 30, pp. 67-72, January 1989. 

[Ya92] M. Yannakakis, “The Analysis of Local Search Problems and Their Heuristics,” In Proceedings of 7th 

Annual Symposium on Theoretical Aspects of Computer Science, pp. 298-311, 1990.

DAC'99, pages 479-484 

Buffer Insertion With Accurate Gate and Interconnect Delay Computation 

Charles J. Alpert, Anirudh Devgan 

IBM Austin Research Laboratory, Austin, TX 78717 

Stephen T. Quay 

IBM Server Group, Austin, TX 78717 

Abstract 

Buffer insertion has become a critical step in deep submicron design, and several buffer 

insertion/sizing algorithms have been proposed in the literature. However, most of these methods 

use simplified interconnect and gate delay models. These models may lead to inferior solutions 

since the optimized objective is only an approximation for the actual delay. We propose to 

integrate accurate wire and gate delay models into Van Ginneken's buffer insertion algorithm 

[18] via the propagation of moments and driving point admittances up the routing tree. We have 

verified the effectiveness of our approach on an industry design. 

References 

[1] C. J. Alpert and A. Devgan, “Wire Segmenting For Improved Buffer Insertion”, IEEE/ACM DAC, 1997, pp. 588- 

593. 

[2] C. J. Alpert, A. Devgan, and S. T. Quay, “Buffer Insertion for Noise and Delay Optimization”, DAC, 1998, pp. 

362-367. 

[3] C. C. N. Chu and D. F. Wong, “Closed Form Solution to Simultaneous Buffer Insertion/Sizing and Wire Sizing”, 

International Symposium on Physical Design, 1997, pp. 192-197. 

[4] C. C. N. Chu and D. F. Wong, “A New Approach to Simultaneous Buffer Insertion and Wire Sizing”, 

IEEE/ACM Intl. Conference on Computer-Aided Design, 1997, pp. 614-621. 

[5] J. Cong and C.-K. Koh, “Interconnect Layout Optimization Under Higher-Order RLC Model”, ICCAD, 1997, 

pp. 713-720. 

[6] S. Dhar and M. A. Franklin, “Optimum Buffer Circuits for Driving Long Uniform Lines”, IEEE Journal of 

Solid-State Circuits, 26(1), 1991, pp. 32-40. 

[7] W. C. Elmore, “The Transient Response of Damped Linear Networks with Particular Regard to Wideband 

Amplifiers”, J. Applied Physics, 19, 1948, pp. 55-63. 

[8] R. Gupta, B. Krauter, B. Tutuianu, J. Willis and L. T. Pileggi, “The Elmore Delay as a Bound for RC Trees with 

Generalized Input Signals”, DAC, 1995, pp. 364-369. 

[9] J. Lillis, C.-K. Cheng and T.-T. Y. Lin, “Optimal Wire Sizing and Buffer Insertion for Low Power and a 

Generalized Delay Model”, IEEE J. Solid-State Circuits, 31(3), 1996, pp. 437-447. 

[10] S. Lin and M. Marek-Sadowska, “A Fast and Efficient Algorithm for Determining Fanout Trees in Large 

Networks”, Proc. Euro. Conf. on Design Automation, 1991, pp. 539-544. 

[11] F.-J. Liu, J. Lillis and C.-K. Cheng, “Design and Implementation of a Global Router Based on a New Layout- 

Driven Timing Model with Three Poles”, ISCAS, 1997, pp. 1548-1551. 

[12] P. R. O’Brien and T. L. Savarino, “Modeling the Driving-Point Characteristic of Resistive Interconnect for 

Accurate Delay Estimation”, IEEE/ACM ICCAD, 1989, pp. 512-515. 

[13] T. Okamoto and J. Cong, “Interconnect Layout Optimization by Simultaneous Steiner Tree Construction and 

Buffer Insertion”, ACM/SIGDA Physical Design Workshop, 1996, pp. 1-6. 

[14] L. T. Pillage and R. A. Rohrer, “Asymptotic Waveform Evaluation for Timing Analysis”, IEEE TCAD, 9(4), 1990, 

pp. 352-366. 

[15] J. Qian, S. Pullela, and L. Pillage, “Modeling the “Effective Capacitance” for the RC Interconnect of CMOS 

Gates”, IEEE Trans. CAD, 13(12), 1994, pp. 1526-1535. 

[16] C. Ratzlaff and L. T. Pillage, “RICE: Rapid Interconnect Circuit Evaluator using Asymptotic Waveform 

Evaluation”, IEEE Trans. on CAD, pp. 763-776, June 1994. 

[17] B. Tutuianu, F. Dartu, and L. Pileggi, “Explicit RC-Circuit Delay Approximation Based on the First Three 

Moments of the Impulse Response”, DAC, 1996, pp. 611-616.

[18] L. P. P. P. van Ginneken, “Buffer Placement in Distributed RC-tree Networks for Minimal Elmore Delay”, Intl. 

Symp.Circuits and Systems, 1990, pp. 865-868.

DAC'99, pages 485-490 

Reducing Cross-Coupling among Interconnect Wires in Deep-Submicron Datapath Design 

Joon-Seo Yim*, Chong-Min Kyung** 

*DSP Group, Information Technology Lab., LG Corporate Institute of Technology, 

16, Woomyeon-Dong, Seocho-Gu, Seoul, 137-140, Korea 

**Department of Electrical Engineering, KAIST, 373-1, Kusong-Dong, Yusong-Gu, 

Taejon, 305-701, Korea 

Abstract 

As the CMOS technology enters the deep submicron design era, the lateral inter-wire coupling 

capacitance becomes the dominant part of load capacitance and makes RC delay on the bus 

structures very data-dependent. Reducing the cross-coupling capacitance is crucial for achieving 

high-speed as well as lower power operation. In this paper, we propose two interconnect layout 

design methodologies for minimizing the coupling effect in the design of a full-custom datapath. 

First, we describe a control signal ordering scheme, which was shown to reduce the 

switching power consumption by 10% and wire delay by 15% for a given set of benchmark 

examples. Secondly, a track assignment algorithm based on evolutionary programming was used 

to minimize the cross-coupling capacitance. Experimental results have shown that the chip 

performance improvement as much as 40% can be obtained using the proposed interconnect 

schemes in various stages of the datapath layout optimization. 
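
The track assignment idea can be illustrated with a small sketch. It is not the authors' algorithm: the cost model (a hypothetical pairwise `weight` matrix summed over physically adjacent tracks) and the simple swap-mutation loop are assumptions chosen only to show how an evolutionary search might reorder wires so that strongly coupled pairs are not neighbors.

```python
import random

def coupling_cost(order, weight):
    """Sum the (assumed) coupling weight over physically adjacent tracks."""
    return sum(weight[a][b] for a, b in zip(order, order[1:]))

def evolve_order(weight, generations=2000, seed=0):
    """Swap-mutation search: keep a mutated ordering if it is no worse."""
    rng = random.Random(seed)
    n = len(weight)
    best = list(range(n))                 # initial assignment: wire i on track i
    best_cost = coupling_cost(best, weight)
    for _ in range(generations):
        child = best[:]
        i, j = rng.sample(range(n), 2)    # pick two distinct tracks to swap
        child[i], child[j] = child[j], child[i]
        cost = coupling_cost(child, weight)
        if cost <= best_cost:
            best, best_cost = child, cost
    return best, best_cost

# Hypothetical symmetric weights: wires 0/1 and wires 2/3 couple strongly.
weight = [
    [0, 9, 1, 1],
    [9, 0, 1, 1],
    [1, 1, 0, 9],
    [1, 1, 9, 0],
]
initial_cost = coupling_cost([0, 1, 2, 3], weight)
order, cost = evolve_order(weight)
```

In this toy instance the search separates the strongly coupled pairs; a realistic cost would also need wire overlap length, spacing, and switching activity, none of which are modeled here.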

References 

[1] Semiconductor Industry Association, National Technology Roadmap for Semiconductors, 1994 

[2] A.B.Kahng et al., “Interconnect Tuning Strategies for High-Performance ICs", Proc. DATE, pp.471-478, 1998 

[3] D.Li et al., “A Repeater Optimization Methodology for Deep Submicron, High-Performance Processors", Proc. 

ICCD, pp.726-731, 1997 

[4] C.D.Kibler, Personal communication on “Interconnect design", SandCraft Inc. 1997 

[5] K.Chaudhary, A.Onozawa, and E.S.Kuh, “A spacing algorithm for performance enhancement and cross-talk 

reduction", Proc. ICCAD, pp.697-702, 1993 

[6] T. Gao and C.L.Liu “Minimum Crosstalk Channel Routing", IEEE Trans. CAD-15, pp.465-474, May. 1996 

[7] K.S.Jhang et al., “COP: A Crosstalk OPtimizer for Gridded Channel Routing", IEEE Trans. CAD-15, pp.424- 

429, Apr. 1996 

[8] A.Vittal and M.Marek-Sadowska, “Crosstalk Reduction for VLSI", IEEE Trans. CAD-16, no.3, pp.290-298, 

March 1997 

[9] T.Xue et al., “Post global routing crosstalk synthesis", IEEE Trans. CAD-16, no.12, pp.1418-1430, Dec. 1997 

[10] A.Onozawa et al., “Performance driven Spacing Algorithm Using Attractive and Repulsive Constraints for 

Submicron LSI's", IEEE Trans. CAD-14, no.6 pp.707-719, Jun. 1995 

[11] H.Zhou and D.F.Wong, “Global Routing with crosstalk constraints", Proc. 35th DAC, pp.374-377, June, 1998 

[12] H.-P.Tseng, L.Scheffer, and C.Sechen, “Timing and Crosstalk Driven Area Routing", Proc. 35th DAC, pp.378- 

381, June, 1998 

[13] S.S.Lai and W.Hwang, “Design and Implementation of Differential Cascode Voltage Switch with Pass- 

Gate(DCVSPG) Logic for High-Performance Digital Systems", IEEE JSSC , Vol.32, No.4, pp.563-573, April, 1997 

[14] D.Carlson, et al., “Multimedia Extension for a 550-MHz RISC Microprocessor", IEEE JSSC , Vol.32, No.11, 

pp.1618-1624, Nov., 1997 

[15] A.Hashimoto and J.Stevens, “Wire Routing by Optimizing Channel Assignment within Large Apertures", Proc. 

8th DAC, pp.155-169, June, 1971 

[16] C.Sechen, “An improved simulated annealing algorithm for row-based placement", Proc. ICCAD, pp.478-481, 

1987 

[17] Z.Michalewicz, “Genetic Algorithms + Data Structures = Evolution Programs", Springer-Verlag, pp.16-17, 

1992

[18] C.M. Kyung et al., “HK386: An x86-Compatible 32bit CISC Microprocessor", Proc. ASP-DAC '97, pp.661- 

662, 1997 

[19] J.S.Yim et al., “A C-Based RTL Design Verification Methodology for Complex Microprocessor", Proc. 34th 

DAC, pp.83-88, June, 1997

DAC'99, pages 491-496 

A Novel VLSI Layout Fabric for Deep Sub-Micron Applications 

Sunil P. Khatri, Amit Mehrotra, Robert K. Brayton, Alberto Sangiovanni-Vincentelli, 

Ralph H.J.M. Otten 

Abstract 

We propose a new VLSI layout methodology which addresses the main problems faced in Deep 

Sub-Micron (DSM) integrated circuit design. Our layout "fabric" scheme eliminates the 

conventional notion of power and ground routing on the integrated circuit die. Instead, power 

and ground are essentially "pre-routed" all over the die. By a clever arrangement of 

power/ground and signal pins, we almost completely eliminate the capacitive effects between 

signal wires. Additionally, we get a power and ground distribution network with a very low 

resistance at any point on the die. Another advantage of our scheme is that the arrangement of 

conductors ensures that on-chip inductances are uniformly negligible. Finally, characterization of 

the circuit delays, capacitances and resistances becomes extremely simple in our scheme, and 

needs to be done only once for a design. 

We show how the uniform parasitics of our fabric give rise to a reliable and predictable design. 

We have implemented our scheme using public domain layout software. Preliminary results 

show that it holds much promise as the layout methodology of choice in DSM integrated circuit 

design. 

References 

[1] “The National Technology Roadmap for Semiconductors.” http://notes.sematech.org/97melec.htm, 1997. 

[2] P. D. Fisher, “Clock Cycle Estimation for Future Microprocessor Generations,” tech. rep., SEMATECH, 1997. 

[3] “Physical Design Modelling and Verification Project (SPACE Project).” http://cas.et.tudelft.nl/research/ 

space/html. 

[4] B. A. Gieseke et al., “A 600MHz Superscalar RISC Microprocessor with Out-of-Order Execution,” in Digest of 

Technical Papers, International Solid State Circuits Conference, 1997. 

[5] A. Rubio, N. Itazaki, and K. Kinoshita, “An approach to the analysis and detection of cross-talk faults in digital 

VLSI circuits,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 13, pp. 387– 

95, March 1994. 

[6] D. Kirkpatrick and A. Sangiovanni-Vincentelli, “Digital Sensitivity: Predicting signal interaction using 

functional analysis,” in Proceedings of the International Conference on Computer-Aided Design, pp. 536–41, Nov 

1996. 

[7] D. Kirkpatrick and A. Sangiovanni-Vincentelli, “Techniques for cross-talk avoidance in the physical design of 

high-performance digital systems,” in Proceedings of the International Conference on Computer-Aided Design, pp. 

616–9, Nov 1994. 

[8] S. Y. Liao, Microwave Devices and Circuits. Prentice-Hall, 1980. 

[9] “Analysis of Silicon Inductors and Transformers for ICs.” http://kabuki.eecs.berkeley.edu/_niknejad/doc/asitic 

doc.html. 

[10] R. K. Brayton, “Logic Synthesis for Ultra Deep Sub-Micron (UDSM),” in Proceedings of the 35th Design 

Automation Conference, 1998. 

[11] J. Reed, M. Santomauro, and A. Sangiovanni-Vincentelli, “A new gridless channel router: Yet another channel 

router the second (YACR-II),” in Digest of Technical Papers International Conference on Computer-Aided Design, 

1984. 

[12] G. T. Hamachi, R. N. Mayo, and J. K. Ousterhout, “Magic: A VLSI Layout system,” in 21st Design Automation 

Conference Proceedings, 1984. 

[13] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. K. 

Brayton, and A. L. Sangiovanni-Vincentelli, “SIS: A System for Sequential Circuit Synthesis,” Tech. Rep. 

UCB/ERL M92/41, Electronics Research Laboratory, Univ. of California, Berkeley, CA 94720, May 1992.

[14] A. Casotto, ed., Octtools-5.1 Manuals, (Electronics Research Laboratory, College of Engineering, University of 

California, Berkeley, CA 94720), University of California at Berkeley, Sept. 1991. 

[15] C. Sechen and A. Sangiovanni-Vincentelli, “The TimberWolf Placement and Routing Package,” IEEE Journal 

of Solid-State Circuits, 1985. 

[16] D. Sylvester and K. Keutzer, “Getting to the bottom of deep submicron,” in Proceedings of the International 

Conference on Computer-Aided Design, 1998. To Appear. 

[17] S. Yamashita, H. Sawada, and A. Nagoya, “A new method to express functional permissibilities for LUT based 

FPGAs and its applications,” in Proceedings of the International Conference on Computer-Aided Design, 1996. 

[18] R. Brayton, “Understanding SPFDs: A new method for specifying flexibility,” in Workshop Notes, 

International Workshop on Logic Synthesis, 1997.

DAC'99, pages 497-501 

Improved Delay Prediction for On-Chip Buses. 

Real G. Pomerleau, Paul D. Franzon, Griff L. Bilbro 

North Carolina State University, Raleigh, NC, 27695-7914 

ABSTRACT 

In this paper, we introduce a simple procedure to predict wiring delay in bi-directional buses and 

a way of properly sizing the driver for each of its ports. In addition, we propose a simple 

calibration procedure to improve its delay prediction over the Elmore delay of the RC tree. The 

technique is fast, accurate, and ideal for implementation in a floorplanner during behavioral 

synthesis. 

Keywords: RC wiring delay, High-Level Synthesis, Floorplanning, Buffer Optimization, 

Interconnect optimization. 
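
For reference, the Elmore delay that serves as the baseline above can be computed in two passes over an RC tree. The sketch below is a generic illustration, not the authors' calibrated procedure; the array-based tree encoding (`parent`, `resistance`, `capacitance`) and the numeric values are assumptions made for the example.

```python
def elmore_delays(parent, resistance, capacitance):
    """Elmore delay from the source to every node of an RC tree.

    Assumed encoding: node 0 is the driver output, parent[i] < i for i > 0,
    resistance[i] is the resistance of the branch entering node i
    (resistance[0] is the driver resistance), and capacitance[i] is the
    capacitance to ground at node i.
    """
    n = len(parent)
    # Pass 1 (leaves to root): total capacitance at or below each node.
    downstream = list(capacitance)
    for i in range(n - 1, 0, -1):
        downstream[parent[i]] += downstream[i]
    # Pass 2 (root to leaves): accumulate R_branch * C_downstream along the path.
    delay = [0.0] * n
    delay[0] = resistance[0] * downstream[0]
    for i in range(1, n):
        delay[i] = delay[parent[i]] + resistance[i] * downstream[i]
    return delay

# Hypothetical 4-node tree: 0 -> 1, then 1 -> 2 and 1 -> 3 (arbitrary units).
parent = [-1, 0, 1, 1]
resistance = [1.0, 2.0, 3.0, 4.0]
capacitance = [1.0, 1.0, 2.0, 3.0]
delays = elmore_delays(parent, resistance, capacitance)
```

With resistances in ohms and capacitances in farads the result is in seconds; a bi-directional bus would need one such evaluation per driving port.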

REFERENCES 

[1] Bakoglu H. B., “Circuits, Interconnections, and Packaging for VLSI”, Addison-Wesley Publishing Company, 

1990. 

[2] Choi J-S, and Lee K. “Design of CMOS Buffer for Minimum Power-Delay Product”, IEEE Journal of Solid- 

State Circuits, 29(9):1142-1145, 1994. 

[3] Cong J., He L., Khoo K-Y, Koh C-K, Pan Z. “Interconnect Design for Deep Submicron ICs”, IEEE/ACM 

International Conference on Computer-Aided Design, 478-485, 1997 

[4] Deutsch A., et al, “When are Transmission-Line Effects Important for On-Chip Interconnections”, 47th 

Electronic Components & Technology Conference, 704-711, 1997. 

[5] Li N., Haviland G., Tuszynski A., “CMOS Tapered Buffer”, IEEE Journal of Solid-State Circuits, 25(4):1005- 

1008, 1990. 

[6] Lynch W., “Power Supply Distribution and Other Wiring Issues for Deep-Submicron ICs”, NCSU VLSI Seminar, 

1997. 

[7] Prabhakaran P., and Banerjee P, “Simultaneous Scheduling, Binding, and Floorplanning in High-Level 

Synthesis”, Proc of the 11th International Conference on VLSI Design, 428-434, 1998. 

[8] Pedram M., “Panel: Physical Design and Synthesis Merge or Die”, Proc. 32nd Design Automation Conference, 

238-239,1995. 

[9] Penfield P., and Rubinstein J., “Signal Delay in RC Tree Networks”, Proc. 18th Design Automation Conference, 

613-617, 1981. 

[10] Sai-Halasz G., “Performance Trends in High-end Processors”, IEEE Proceeding, 83(-):20-36, 1995. 

[11] Sakurai T., “Closed-Form Expressions for Interconnection Delay, Coupling, and Crosstalk in VLSI's”, IEEE 

Transactions on Electron Devices, 40(1):118-124, 1993. 

[12] Semiconductor Industry Association, “The National Technology Roadmap for Semiconductors”, 1997 Edition.

DAC'99, pages 502-506 

Noise-aware Repeater Insertion and Wire Sizing for On-chip Interconnect Using 

Hierarchical Moment-Matching 

Chung-Ping Chen and Noel Menezes 

Strategic CAD Labs, Design Technology, Intel Corporation, Hillsboro, OR 97124 

Abstract 

Recently, several algorithms for interconnect optimization via repeater insertion and wire sizing 

have appeared based on the Elmore delay model. Using the Devgan noise metric [6] a noise-aware 

repeater insertion technique has also been proposed recently. Recognizing the 

conservatism of these delay and noise models, we propose a moment-matching based technique 

for interconnect optimization that allows for much higher accuracy while preserving the 

hierarchical nature of Elmore-delay-based techniques. We also present a novel approach to noise 

computation that accurately captures the effect of several attackers in linear time with respect to 

the number of attackers and wire segments. Our practical experiments with industrial nets 

indicate that the corresponding reduction in error afforded by these more accurate models 

justifies the increase in runtime for aggressive designs, which are our targeted domain. Our 

algorithm yields delay and noise estimates within 5% of circuit simulation results. 

References 

[1] W. C. Elmore, “The transient response of damped linear networks with particular regard to wide-band 

amplifiers,” Journal of Applied Physics, vol. 19, no. 1, 1948. 

[2] P. Penfield and J. Rubinstein, “Signal delay in RC tree networks,” IEEE Trans. Computer-Aided Design, vol. 

CAD-2, pp. 202-211, July 1983. 

[3] J. Lillis, C.-K. Cheng, and T.-T. Lin, “Optimal and efficient buffer insertion and wire sizing,” Proc. Custom 

Integrated Circuits Conference, pp. 259–262, May 1995. 

[4] L. P. P. P. van Ginneken, “Buffer placement in distributed RC-tree networks for minimal Elmore Delay,” Proc. 

International Symposium on Circuits and Systems, pp. 865-868, 1990. 

[5] C. J. Alpert, A. Devgan, and S. T. Quay, “Buffer insertion for noise and delay optimization,” Proc. 35th 

ACM/IEEE Design Automation Conference, pp. 362-367, June 1998. 

[6] A. Devgan, “Efficient coupled noise estimation for on-chip interconnects,” Proc. of the Intl. Conf. on 

Computer-Aided Design, pp. 147-151, Nov. 1997. 

[7] J. Culetu, C. Amir, and J. MacDonald, “A practical repeater insertion method in high speed VLSI circuits,” Proc. 

35th ACM/IEEE Design Automation Conference, pp. 392-395, June 1998. 

[8] D. Li, A. Pua, P. Srivastava, and U. Ko, “A repeater optimization methodology for deep sub-micron, high-performance 

processors,” Proc. of the Intl. Conf. on Computer Design, pp. 726-731, Oct. 1997. 

[9] N. Menezes and C.-P. Chen, “Spec-based repeater insertion and wire sizing for on-chip interconnect,” Proc. of 

the 12th Intl. Conf. on VLSI Design, pp. 476-483, Jan. 1999. 

[10] K. Rahmat, J. Neves, and J.-F. Lee, “Methods for calculating coupling noise in early design: a comparative 

analysis,” Proc. of the Intl. Conf. Computer Design, pp. 76-81, Oct. 1998. 

[11] P. Feldmann and R.W. Freund, “Efficient linear circuit analysis by Padé approximation via the Lanczos 

process,” IEEE Trans. Computer-Aided Design., vol. 14, no. 5, pp. 639-649, May 1995. 

[12] F. Dartu, N. Menezes, J. Qian, and L.T. Pillage, “A gate-delay model for high-speed CMOS circuits,” Proc. 

31st ACM/IEEE Design Automation Conference, pp. 576–580, June 1994. 

[13] J. Qian, S. Pullela and L. T. Pillage, “Modeling the effective capacitance for the RC interconnect of CMOS 

gates,” IEEE Trans. Computer-Aided Design., vol. 13, no. 12, pp. 1526-1535, Dec. 1994. 

[14] L. T. Pillage and R. A. Rohrer, “Asymptotic waveform evaluation for timing analysis,” IEEE Trans. Computer- 

Aided Design, vol. 9, no. 4, pp. 352-366, April 1990. 

[15] K. L. Shepard, et al. “Global harmony: coupled noise analysis for full-chip RC interconnect networks,” Proc. of 

the Intl. Conf. on Computer-Aided Design, pp. 139-146, Nov. 1997. 

[16] A. B. Kahng, and S. Muddu, “New efficient algorithms for computing effective capacitance,” Proc. of the 1998 

Intl. Symposium on Physical Design, pp. 147-151, April 1998.

DAC'99, pages 507-510 

Interconnect Estimation and Planning for Deep Submicron Designs 

Jason Cong, David Zhigang Pan 

Department of Computer Science, University of California, Los Angeles, CA 

Abstract 

This paper reports two sets of important results in our exploration of an interconnect-centric 

design flow for deep submicron (DSM) designs: (i) We obtain efficient yet accurate wiring area 

estimation models for optimal wire sizing (OWS). We also propose a simple metric to guide 

area-efficient performance optimization; (ii) Guided by our interconnect estimation models, we 

study the interconnect architecture planning problem for wire-width designs. We achieve a rather 

surprising result which suggests that two pre-determined wire widths per metal layer are 

sufficient to achieve near-optimal performance. This result will greatly simplify the routing 

architecture and tools for DSM designs. We believe that our interconnect estimation and 

planning results will have a significant impact on DSM designs. 

References 

[1] J. Cong, L. He, K.-Y. Khoo, C.-K. Koh, and D. Z. Pan, “Interconnect design for deep submicron ICs," in Proc. 

Int. Conf. on Computer Aided Design, pp. 478-485, 1997. 

[2] J. Cong and D. Z. Pan, “Interconnect delay estimation models for synthesis and design planning," in Proc. Asia 

and South Pacific Design Automation Conf., pp. 97-100, Jan., 1999. 

[3] J. Cong and K. S. Leung, “Optimal wiresizing under the distributed Elmore delay model," in Proc. Int. Conf. on 

Computer Aided Design, pp. 634-639, 1993. 

[4] J. Cong and D. Z. Pan, “Interconnect estimation and planning for deep submicron designs," Tech. Rep. 980035, 

UCLA CS Dept, 1998. http://cadlab.cs.ucla.edu/~pan/publications/. 

[5] J. Cong, L. He, C.-K. Koh, and D. Z. Pan, “Global interconnect sizing and spacing with consideration of 

coupling capacitance," in Proc. Int. Conf. on Computer Aided Design, pp. 628-633, 1997. 

[6] Semiconductor Industry Association, National Technology Roadmap for Semiconductors, 1997. 

[7] J. Davis and J. Meindl, “Is interconnect the weak link?," IEEE Circuits and Devices Magazine, vol. 14, no. 2, pp. 

30-36, 1998. 

[8] R. Otten and R. K. Brayton, “Planning for performance," in Proc. Design Automation Conf, pp. 122-127, June 

1998. 

[9] P. Fisher and R. Nesbitt, “The test of time. clock-cycle estimation and test challenges for future 

microprocessors," IEEE Circuits and Devices Magazine, vol. 14, pp. 37-44, March 1998. 

[10] J. Cong, L. He, A. B. Kahng, D. Noice, N. Shirali, and S. H.-C. Yen, “Analysis and justification of a simple, 

practical 2 1/2-d capacitance extraction methodology," in Proc. ACM/IEEE Design Automation Conf., pp. 40.1.1- 

40.1.6, June, 1997. 

[11] W. C. Elmore, “The transient response of damped linear networks with particular regard to wide-band 

amplifiers," Journal of Applied Physics, vol. 19, pp. 55-63, Jan. 1948. 

[12] C.-K. Koh, VLSI Interconnect Layout Optimization. PhD thesis, University of California, Los Angeles, 1998. 

[13] J. Davis, V. De, and J. Meindl, “A stochastic wire-length distribution for gigascale integration (GSI) i. 

derivation and validation," IEEE Transactions on Electron Devices, vol. 45, no. 3, pp. 580-9, 1998.

DAC'99, pages 511-516 

ECL: A Specification Environment for System-Level Design 

Luciano Lavagno, Ellen Sentovich 

Cadence Berkeley Laboratories, Berkeley, CA 94704-1103, USA 

Abstract 

We propose a new specification environment for system-level design called ECL. It combines 

the Esterel and C languages to provide a more versatile means for specifying heterogeneous 

designs. It can be viewed as the addition to C of explicit constructs from Esterel for waiting, 

concurrency and pre-emption, and thus makes these operations easier to specify and more 

apparent. An ECL specification is compiled into a reactive part (an extended finite state machine 

representing most of the ECL program), and a pure data looping part, thus nicely supporting a 

mix of control and data. The reactive part can be robustly estimated and synthesized to hardware 

or software, while the data looping part is implemented in software as specified. 

References 

[1] F. Balarin, E. Sentovich, M. Chiodo, P. Giusto, H. Hsieh, B. Tabbara, A. Jurecska, L. Lavagno, C. Passerone, K. 

Suzuki, and A. Sangiovanni-Vincentelli. Hardware-Software Co-design of Embedded Systems – The POLIS 

experience. Kluwer Academic Publishers, 1997. 

[2] G. Berry. The Constructive Semantics of Pure Esterel. 1996. To Appear, available now at 

ftp://www.inria.fr/meije/esterel/papers/constructiveness.ps.gz. 

[3] G. Berry. The Foundations of Esterel. 1998. See http://www.inria.fr/meije/Esterel. 

[4] F. Boussinot, G. Doumenc, and J.-B. Stefani. Reactive objects. Annales des Telecommunications, 51(9-10):459– 

473, September 1996. 

[5] P. Clarke. Felix tools pushed in research project. Electronic Engineering Times, October 1998. See 

http://www.eetimes.com/news/98/1029news/felix.html. 

[6] S. Edwards, L. Lavagno, E.A. Lee, and A. Sangiovanni-Vincentelli. Design of embedded systems: formal 

models, validation, and synthesis. Proceedings of the IEEE, 85(3):366–390, March 1997. 

[7] N. Halbwachs. Synchronous Programming of Reactive Systems. Kluwer Academic Publishers, 1993. 

[8] D. Harel, H. Lachover, A. Naamad, A. Pnueli, et al. STATEMATE: a working environment for the 

development of complex reactive systems. IEEE Transactions on Software Engineering, 16(4), April 1990. 

[9] E.A. Lee and D.G. Messerschmitt. Static scheduling of synchronous data flow graphs for digital signal 

processing. IEEE Transactions on Computers, January 1987. 

[10] S. Liao, S. Tjiang, and R. Gupta. An efficient implementation of reactivity for modeling hardware in the Scenic 

design environment. In Proceedings of the Design Automation Conference, pages 70–75, June 1997. 

[11] System Level Design Language Home page, 1998. See http://www.inmet.com/SLDL/.

DAC'99, pages 517-522 

Representation of Function Variants for Embedded System Optimization and Synthesis 

K. Richter, D. Ziegenbein, R. Ernst 

IDA / TU Braunschweig, D-38106 Braunschweig, Germany 

L. Thiele 

TIK / ETH Zürich, CH-8092 Zürich, Switzerland 

J. Teich 

DATE / UNI Paderborn, D-33098 Paderborn, Germany 

Abstract 

Many embedded systems are implemented with a set of alternative function variants to adapt the 

system to different applications or environments. This paper proposes a novel approach for the 

coherent representation and selection of function variants in the different phases of the design 

process. In this context, the modeling of re-configuration of system parts is supported in a natural 

way. Using a real example from the video processing domain, the approach is explained and 

validated. 

References 

[1] P. Chou and G. Borriello. An analysis-based approach to composition of distributed embedded systems. In 

Proceedings Codes/CASHE ’98, pages 3–7, Seattle, USA, March 1998. 

[2] R. Ernst. System architectures. Talk at NATO ASI on System Level Synthesis, August 1998. 

[3] R. Ernst, K. Henriss, and P. Rueffer. Software signal processing on image-engine platforms. Technical report, 

TU Braunschweig, August 1998. 

[4] A. Jerraya et al. Hardware/Software Co-Design: Principles and Practice, chapter Languages for System-Level 

Specification and Design. Kluwer Academic Publishers, Boston, USA, October 1997. 

[5] A. Kalavade and P. Subrahmanyam. Hardware/software partitioning for multi-function systems. In Proceedings 

ICCAD ’97, pages 516–521, San Jose, USA, November 1997. 

[6] K. Kim, R. Karri, and M. Potkonjak. Synthesis of application specific programmable processors. In Proceedings 

DAC ’97, pages 353–358, Anaheim, USA, June 1997. 

[7] Philips Semiconductors. TriMedia Processor. http://www.semiconductors.philips.com/trimedia/. 

[8] D. Ziegenbein, R. Ernst, K. Richter, J. Teich, and L. Thiele. Combining multiple models of computation for 

scheduling and allocation. In Proceedings Codes/CASHE ’98, pages 9–13, Seattle, USA, March 1998. 

[9] D. Ziegenbein, K. Richter, R. Ernst, J. Teich, and L. Thiele. Representation of process mode correlation for 

scheduling. In Proceedings ICCAD ’98, San Jose, USA, November 1998.

DAC'99, pages 523-528 

Vex - A CAD toolbox 

Jules P. Bergmann and Mark A. Horowitz 

Computer Systems Laboratory, Stanford University, Stanford, CA 94305 

Abstract 

The increasing size and complexity of designs is making the use of hardware description 

languages (HDLs), such as Verilog and VHDL, more prevalent. They are able to describe both 

the initial design and intermediate representations of the design as it is readied for fabrication. 

For large designs, there inevitably are problems with the tool flow that require custom tools to be 

created. These tools must be able to access and modify the HDL for the design, requirements that 

often dwarf the tools’ actual functionality, making them difficult to create without a large effort 

or cutting corners. During the FLASH project at Stanford we created Vex -- a toolbox of 

components for dealing with Verilog, tied together with an interactive scripting language -- that 

simplifies the creation of these tools. It was used to create a number of tools that were critical to 

our design's tape-out and has also been useful in creating design exploration and research tools. 

Bibliography 

[1] R.E. Bryant, “Graph-Based Algorithms for Boolean Function Manipulation,” IEEE Transactions on Computers, 

Vol. C-35, No. 8, August 1986. 

[2] J.R. Burch, E.M. Clarke, K.L. McMillan, D.L. Dill, L.J. Hwang, “Symbolic Model Checking: 10^20 States and 

Beyond,” Proceedings of the Conference on Logic in Computer Science, 1990. 

[3] J. Ellson, E. Gansner, E. Koutsofios, S. North, “Graphviz -- Tools for Viewing and Interacting with Graph 

Diagrams,” URL: http://www.research.att.com/sw/tools/graphviz. 

[4] A. Greiner, et al. “Alliance: a Complete Set of CAD Tools for Teaching VLSI Design”, Proceedings of 

EUROCHIP Workshop on VLSI Circuits, September 1992. URL: http://wwwasim.lip6.fr/alliance. 

[5] R.C. Ho, M.A. Horowitz, “Validation Coverage Analysis for Complex Digital Designs,” Proceedings of the 

International Conference on Computer Aided Design, November 1996. 

[6] H. Kapadia, M.A. Horowitz, “Using Partitioning to Help Convergence in the Standard-cell Design Automation 

Methodology,” Proceedings of the Design Automation Conference, June 1999. 

[7] J. Kuskin, D. Ofelt, M. Heinrich, J. Heinlein, R. Simoni, K. Gharachorloo, J. Chapin, D. Nakahira, J. Baxter, M. 

Horowitz, A. Gupta, M. Rosenblum, and J. Hennessy, “The Stanford FLASH Multiprocessor”, Proceedings of the 

International Symposium on Computer Architecture, June 1994. 

[8] J.K. Ousterhout, “Scripting: Higher Level Programming for the 21st Century”, IEEE Computer, March 1998. 

[9] D.E. Thomas, P.R. Moorby, The Verilog Hardware Description Language (2nd Edition), Kluwer Academic 

Publishers, 1995 

[10] L. Wall, T. Christiansen, R.L. Schwartz, Programming Perl (2nd Edition), O’Reilly & Associates, 1996. 

[11] T. Wang, T. Edsall, “Practical FSM Analysis for Verilog”, Proceedings of the International Verilog HDL 

Conference and VHDL International Users Forum, March 1998. URL: http://www.employees.org/~ciscofsm.

DAC'99, pages 529-534 

Constraint Management for Collaborative Electronic Design 

Juan Antonio Carballo 

EECS Department, University of Michigan, Ann Arbor, MI 48109, USA 

Stephen W. Director 

College of Engineering, University of Michigan, Ann Arbor, MI 48109, USA 

ABSTRACT 

Today's complex design processes feature large numbers of varied, interdependent constraints, 

which often cross interdisciplinary boundaries. Therefore, a computer-supported constraint 

management methodology that automatically detects violations early in the design process, 

provides useful violation notification to guide redesign efforts, and can be integrated with 

conventional CAD software can be a great aid to the designer. We present such a methodology 

and describe its implementation in the Minerva II design process manager, along with an 

example design session. 

REFERENCES 

[1] C. Bessiere and J. Regin, “Arc consistency for general constraint networks: preliminary results”, Proc. IJCAI’97: 

398-404. 

[2] F.L. Chan, M.D. Spiller, and A.R. Newton, “Weld - an environment for web-based electronic design”, Proc. 

35th DAC, June 1998. 

[3] T.F. Chiueh and R.H. Katz, “Intelligent VLSI Design Object Management”, in Proc. EDAC, pp. 410-414, 1992. 

[4] J. Cohn et al., “KOAN/ANAGRAM II: New tools for device-level analog placement and routing”, IEEE JSSC, 

26(3):330-342, March 1991. 

[5] J. D’Ambrosio, ConstrLib: An Interval Constraint Propagation Library, AI Lab, The University of Michigan, 

1998. 

[6] M. Fitting, First-Order Logic and Automated Theorem Proving, Springer Verlag, New York, second edition, 

1996. 

[7] S.M. Fohn et al., “A Constraint-system Shell to Support Concurrent Engineering Approaches to Design”, AI in 

Engineering, (9):1–17, 1994. 

[8] S.T. Frezza, S.P. Levitan, and P.C. Chrysanthis, “Requirements-based Design Evaluation”, Proc. 32nd DAC, 

June 1995. 

[9] D. Kuokka et al., “A parametric design assistant for concurrent engineering”, AI-EDAM, no. 9: 135-144, 1995. 

[10] M.Jacome and S.Director, “A formal basis for design process planning and management”, IEEE Trans. CAD, 

15(10):1197–1211, Oct. 1996. 

[11] V. Kumar, “Algorithms for Constraint Satisfaction”, AI Magazine, 13(1):32–44, 1992. 

[12] A. Silberschatz et al., Database System Concepts, McGraw-Hill, 1996. 

[13] P. R. Sutton, J.B. Brockman, and S.W. Director, “Design Management Using Dynamically Defined Flows”, 

Proc. DAC: 648–653, 1993. 

[14] P. R. Sutton and S. W. Director, “A Description Language for Design Process Management”, Proc. 33rd DAC, 

1996. 

[15] P. R. Sutton and S. W. Director, “Framework Encapsulations: A New Approach to CAD Tool Interoperability”, 

Proc. 35th DAC, June 1998. 

[16] K.O. ten Bosch et al., “Design Flow Management in the Nelsis CAD Framework”, Proc. 28th DAC: 711–716, 

June 1991. 

[17] M. Zaman, MISTIC User’s Guide, Univ. of Michigan, 1997.

DAC'99, pages 535-536 

Panel: MEMS CAD Beyond Multi-Million Transistors 

Chair: Kris Pister - University of California Berkeley, Berkeley, CA 

Panel Members: Albert P. Pisano, Nicholas Swart, Mike Horton, John Rychcik, 

John R. Gilbert, Gerry K. Fedder 

Existing MEMS products boast more than a million electrical and mechanical components on a 

single chip. With several billion dollars in sales in 1997 and exponential growth it is clear that 

MEMS fabrication technology has leveraged decades of IC expertise to great advantage. As a 

result, the fabrication capabilities far outstrip the design capabilities in both industry and 

university environments. MEMS CAD tools are only now beginning to leverage corresponding 

decades of IC CAD expertise to address the exciting and unique electro-mechanical co-design 

problems from the physical through system level design. Perspectives represented in this panel 

include the industry needs at the system and device levels, tool developers at all levels, and the 

government research vision. 

Questions to be addressed include: 

- How do the current tools help or impede the development of new products? 

- What are the breakthrough markets for MEMS and how will they challenge the existing CAD 

tools? 

- What are the challenges for the next generation?

DAC'99, pages 537-542 

A Multiscale Method for Fast Capacitance Extraction 

Johannes Tausch 

Dept. of Mathematics, Southern Methodist University, Dallas, TX 75275-0156 

Jacob White 

Department of EECS, MIT, Cambridge, MA 02139 

Abstract 

The many levels of metal used in aggressive deep submicron process technologies have made fast 

and accurate capacitance extraction of complicated 3-D geometries of conductors essential, and 

many novel approaches have been recently developed. In this paper we present an accelerated 

boundary-element method, like the well-known FASTCAP program, but instead of using an 

adaptive fast multipole algorithm we use a numerically generated multiscale basis for 

constructing a sparse representation of the dense boundary-element matrix. Results are presented 

to demonstrate that the multiscale method can be applied to complicated geometries, generates a 

sparser boundary-element matrix than the adaptive fast multipole method, and provides an 

inexpensive but effective preconditioner. Examples are used to show that the better sparsification 

and the effective preconditioner yield a method that can be 25 times faster than FASTCAP while 

still maintaining accuracy in the smallest coupling capacitances. 

References 

[1] W. Shi, J. Liu, N. Kakani, and T. Yu, A Fast Hierarchical Algorithm for 3-D Capacitance Extraction Proceeding 

of the 29th Design Automation Conference, San Francisco, CA, June, 1997, pp. 212-217. 

[2] V. Veremey and R. Mittra, A Technique for Fast Calculation Of Capacitance Matrices of Interconnect Structures 

IEEE Transactions of Components, Packaging, and Manufacturing Technology, Part B: Advanced Packaging, Vol 

21, No 3, pp. 241-249. 

[3] W. Hong, W. K. Sun, Z. H. Zhu, H. Ji, B. Song, and W. Dai, A Novel Dimension-Reduction Technique for the 

Capacitance Extraction of 3-D VLSI Interconnects MTT, Vol. 46, No. 8, pp 1037-1044. 

[4] J. R. Phillips and J. K. White, “A Precorrected-FFT method for Electrostatic Analysis of Complicated 3-D 

Structures,” IEEE Trans. on Computer-Aided Design, October 1997, Vol. 16, No. 10, pp. 1059-1072. 

[5] S. Kapur and J. Zhao, “A fast method of moments solver for efficient parameter extraction of MCMs,” Design 

Automation Conference, 1997 pp. 141–146. 

[6] G. Beylkin, R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical algorithms. Comm. Pure Appl. 

Math., XLIV:141–183, 1991. 

[7] K. Nabors and J. White, “FastCap: A Multipole-Accelerated 3-D Capacitance Extraction Program,” IEEE 

Transactions on Computer-Aided Design, vol. 10 no. 10, November 1991, p1447-1459. 

[8] R. F. Harrington, Field Computation by Moment Methods. New York: MacMillan, 1968. 

[9] A. E. Ruehli and P. A. Brennan, “Efficient capacitance calculations for three-dimensional multiconductor 

systems,” MTT, vol. 21, pp. 76–82, February 1973. 

[10] Y. L. Le Coz and R. B. Iverson, “A stochastic algorithm for high speed capacitance extraction in integrated 

circuits,” Solid State Electronics, vol. 35, no. 7, pp. 1005–1012, 1992. 

[11] Leslie Greengard. The Rapid Evaluation of Potential Fields in Particle Systems. MIT Press, Cambridge, 

Massachusetts, 1988. 

[12] Harry Yserentant. On the multi-level splitting of finite element spaces. Numer. Math., 49:379–412, 1986. 

[13] K. Nabors, F. T. Korsmeyer, F. T. Leighton, and J. White. Preconditioned, adaptive, multipole-accelerated 

iterative methods for three-dimensional first-kind integral equations of potential theory. SIAM J. Sci. Statist. 

Comput., 15(3):713–735, 1994.

DAC'99, pages 543-548 

Efficient Capacitance Computation for Structures with Non-Uniform 

Adaptive Surface Meshes 

Vikram Jandhyala, Scott Savage, Eric Bracken, Zoltan Cendes 

Ansoft Corporation, Pittsburgh, PA 15219 

Abstract 

Circuit parasitic extraction problems are typically formulated using discretized integral equations 

that use basis functions defined over tessellated surface meshes. The Fast Multipole Method 

(FMM) accelerates the solution process by rapidly evaluating potentials and fields due to these 

basis functions. Unfortunately, the FMM suffers from the drawback that its efficiency degrades 

if the surface mesh has disparately-sized elements in close proximity to each other. Closely spaced 

non-uniformly sized elements can appear in realistic situations for a variety of reasons: 

owing to mesh refinement, due to accurate modeling requirements for fine structural features, 

and because of the presence of thin doubly-walled structures. In this paper, modifications to the 

standard multilevel FMM are presented that permit efficient potential and field evaluation over 

specific non-uniform meshes. The efficiency of the new technique is demonstrated through 

examples involving large surface meshes with non-uniformly sized elements in close proximity. 

References 

[1] R.F. Harrington, Field Computation by Moment Methods, Krieger, Malabar, FL, 1982. 

[2] Y. Saad, Iterative Methods for Sparse Systems, PWS Publishing Company, New York, NY, 1996. 

[3] C. R. Anderson, “An implementation of the fast multipole method without multipoles," SIAM J. Sci. Stat. 

Comput., vol. 16, pp. 1082-1091, July 1992. 

[4] L. Greengard and V. Rokhlin, “A fast algorithm for particle simulations," J. Comp. Phys., vol. 73, pp. 325-348, 

1987. 

[5] K. Nabors, F.T. Korsmeyer, F.T. Leighton, and J. White, “Preconditioned, adaptive, multipole-accelerated 

iterative methods for three-dimensional potential problems," SIAM J. Sci. Stat. Comput., vol. 15, pp. 713-735, May 

1994. 

[6] L. Greengard, The Rapid Evaluation of Potential Fields in Particle Systems, MIT Press, Cambridge, MA, 1988. 

[7] S. Rao, T. Sarkar, and R. Harrington, “The electrostatic field of conducting bodies in multiple dielectric media," 

IEEE Trans. Microwave Theory Tech., vol. 32, pp. 1441-1448, November 1984. 

[8] K. Nabors and J. White, “Fastcap : a multipole accelerated 3-d capacitance extraction program," IEEE Trans. 

Computer-Aided Design, vol. 10, pp. 1447-1459, November 1991. 

[9] V. Jandhyala, E. Michielssen, and R. Mittra, “Multipole-accelerated capacitance computation for 3-d structures 

in a stratified dielectric medium using a closed form green's function," Int. J. Microwave Millimeter-Wave 

Computer-Aided Engg., vol. 5, pp. 68-78, May 1995. 

[10] J.R. Phillips and J.K. White, “A precorrected-fft method for electrostatic analysis of complicated 3-d 

structures," IEEE Trans. Computer-Aided Design Integ. Circuits Syst., vol. 16, pp. 1059-1072, October 1997. 

[11] J. M. Song and W. C. Chew, “Multilevel fast-multipole algorithm for solving combined field integral equations 

of electro-magnetic scattering," Microwave Opt. Tech. Lett., vol. 10, pp. 

14-19, September 1995. 

[12] Z. Wang, Y. Yuan, and Q. Wu, “A parallel multipole accelerated 3-d capacitance simulator based on an 

improved model," IEEE Trans. Computer-Aided Design Integ. Circuits Syst., vol. 15, pp. 1441-1450, December 

1996.

DAC'99, pages 549-554 

Substrate Modeling and Lumped Substrate Resistance Extraction for CMOS ESD/Latchup 

Circuit Simulation 

Tong Li*, Ching-Han Tsai**, Elyse Rosenbaum**, Sung-Mo (Steve) Kang** 

*Silicon Perspective Corp., Santa Clara, CA 95054 

**Coordinated Science Laboratory, Department of Electrical and Computer Engineering 

University of Illinois at Urbana-Champaign, Urbana, IL 61802 

ABSTRACT 

Due to interactions through the common silicon substrate, the layout and placement of devices 

and substrate contacts can have significant impacts on a circuit's ESD (Electrostatic Discharge) 

and latchup behavior in CMOS technologies. Proper substrate modeling is thus required for 

circuit-level simulation to predict the circuit's ESD performance and latchup immunity. In this 

work we propose a new substrate resistance network model, and develop a novel substrate 

resistance extraction method that accurately calculates the distribution of injection current into 

the substrate during ESD or latchup events. With the proposed substrate model and resistance 

extraction, we can capture the three-dimensional layout parasitics in the circuit as well as the 

vertical substrate doping profile, and simulate these effects on circuit behavior at the circuit-level 

accurately. The usefulness of this work for layout optimization is demonstrated with an industrial 

circuit example. 

References 

[1] A. Amerasekera, C. Duvvury, V. Reddy and M. Rodder, “Substrate Triggering and Salicide Effects on ESD 

Performance and Protection Circuit Design in Deep Submicron CMOS Processes," International Electron Devices 

Meeting, pp. 547-550, 1995. 

[2] A. Amerasekera, S. Ramaswamy, M. Chang and C. Duvvury, “Modeling MOS Snapback and Parasitic Bipolar 

Action for Circuit-Level ESD and High Current Simulations," International Reliability Physics Symposium, pp. 

318-326, 1996. 

[3] J. Chen, A. Amerasekera and C. Duvvury, “Design Methodology for Optimizing Gate Driven ESD Protection 

Circuits in Submicron CMOS Processes," EOS/ESD Symposium, pp. 230-239. 

[4] C. Diaz, S. M. Kang and C. Duvvury, “Circuit-level Electrothermal Simulation of Electrical Over-stress Failures 

in Advanced MOS I/O Protection Devices," IEEE Trans. on CAD, vol. 13, no. 4, pp. 482-493, 1994. 

[5] C. Duvvury and R. Rountree, “A Synthesis of ESD Input Protection Scheme," EOS/ESD Symposium, pp. 88-97, 

1991. 

[6] C. Duvvury and A. Amerasekera, “ESD: A Pervasive Reliability Concern for IC Technologies," Proc. of the 

IEEE, vol. 81, no. 5, pp. 690-702, May 1993. 

[7] Y. Fong and C. Hu, “Internal ESD Transients in Input Protection Circuits," IEEE International Reliability 

Symposium, pp. 77-81, 1989. 

[8] R. Gharpurey and R. G. Meyer, “Modeling and Analysis of Substrate Coupling in Integrated Circuits," IEEE 

Custom Integrated Circuits Conference, pp. 125-128, 1995. 

[9] X. Guggenmos and R. Holzner, “A New ESD Protection Concept For VLSI CMOS Circuits Avoiding Circuit 

Stress," EOS/ESD Symposium, pp. 74-82, 1991. 

[10] M. J. Hargrove, S. Voldman, R. Gauthier, J. Brown, K. Duncan, and W. Craig, “Latchup in CMOS 

Technology," pp. 269-278, International Reliability Physics Symposium, 1998. 

[11] F. C. Hsu, P. K. Ko, S. Tam, C. Hu and R. S. Muller, “An Analytical Breakdown Model for Short-Channel 

MOSFET's", IEEE Trans. on Electron Devices, Vol. 29, No. 11, pp. 1735-1740, 1982. 

[12] J. Huang, Z. Liu, M. Jeng, K. Hui, M. Chan, P. Ko and C. Hu, “BSIM3 Manual (version 2)," University of 

California, Berkeley, 1994. 

[13] C. C. Johnson and T. J. Maloney, “Two Unusual HBM ESD Failure Mechanisms on a Mature CMOS Process," 

EOS/ESD Symposium, pp. 225-231, 1993.

[14] K. J. Kerns and A. T. Yang, “Stable and Efficient Reduction of Large, Multiport RC Networks by Pole 

Analysis via Congruence Transformations," IEEE Trans. on Computer-Aided Design, Vol. 16, No. 7, July 1997. 

[15] S. Laux and F. Gaensslen, “A Study of Channel Avalanche Breakdown in Scaled n-MOSFETs", IEEE Trans. 

on Electron Devices, Vol. ED-34, No. 5, pp. 1066-1073, 1987. 

[16] T. Li and S. M. Kang, “Layout Extraction and Verification Methodology for CMOS I/O Circuits," IEEE/ACM 


[17] T. Li, “Design Automation for Reliable CMOS Chip I/O Circuits" Ph.D. Dissertation, University of Illinois at 

Urbana-Champaign, 1998. 

[18] S. Ramaswamy, C. Duvvury, A. Amerasekera, V. Reddy and S. M. Kang, “EOS/ESD Analysis of High-Density 

Logic Chips," EOS/ESD Symposium, pp. 285-290, 1996. 

[19] Y. Saad and M. H. Schultz, “GMRES: A Generalized Minimum Residual Algorithm for Solving Nonsymmetric 

Linear Systems," SIAM Journal Scientific Statistical Comput., vol. 7, pp. 856-869, July 1986. 

[20] T. Smedes, N. P. van der Meijs and A. J. van Genderen, “Extraction of Circuit Models for Substrate Crosstalk," 

International Conference on Computer-Aided Design, 1995. 

[21] D. K. Su, J. Loinaz, S. Masui and B. A. Wooley, “ Experimental Results and Modeling Techniques for 

Substrate Noise in Mixed-Signal Integrated Circuits," IEEE Journal of Solid-State Circuits, pp. 420-430, Vol. 28, 

No. 4, April 1993. 

[22] Technology Modeling Associates, Inc., Palo Alto, California, MEDICI, Two Dimensional Device Simulation 

Program, 1992. 

[23] R. R. Troutman, “Latchup in CMOS Technology : The Problem and Its Cure," Kluwer Academic Publishers, 

1986. 

[24] R. R. Troutman and M. J. Hargrove, “Transmission Line Modeling of Substrate Resistances and CMOS 

Latchup," IEEE Trans. on Electron Devices, Vol. 33, No. 7, pp. 945-954, July 1986. 

[25] N. K. Verghese and David J. Allstot, “Rapid Simulation of Substrate Coupling Effects in Mixed-Mode ICs," 

IEEE Custom Integrated Circuits Conference, pp. 18.3.1-18.3.4, 1993. 

[26] Y. Wei, Y. Loh, C. Wang and C. Hu, “Effect of Substrate Contact on ESD Failure of Advanced CMOS 

Integrated Circuits," EOS/ESD Symposium, pp. 221-224, 1993. 

[27] P. Yang and J. Chern, “ Design for Reliability : the Major Challenge for VLSI," Proc. of the IEEE, vol.81, no.5, 

pp. 730-743, May 1993.

DAC'99, pages 555-561 

Dynamic Power Management Based On Continuous-Time Markov Decision Processes 

Qinru Qiu and Massoud Pedram 

Department of Electrical Engineering-Systems 

University of Southern California, Los Angeles, California, USA 

Abstract 

This paper introduces a continuous-time, controllable Markov process model of a power-managed 

system. The system model is composed of the corresponding stochastic models of the 

service queue and the service provider. The system environment is modeled by a stochastic 

service request process. The problem of dynamic power management in such a system is 

formulated as a policy optimization problem and solved using an efficient "policy iteration" 

algorithm. Compared to previous work on dynamic power management, our formulation allows 

better modeling of the various system components, the power-managed system as a whole, and its 

environment. In addition, it captures dependencies between the service queue and service 

provider status. Finally, the resulting power management policy is asynchronous, hence it is 

more power-efficient and more useful in practice. Experimental results demonstrate the 

effectiveness of our policy optimization algorithm compared to a number of heuristic (time-out 

and N-policy) algorithms. 
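
The "policy iteration" idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' continuous-time formulation: it assumes a discrete-time, discounted Markov decision process, and the two-state model (request pending or not), the two actions (provider on or off), and every probability and cost below are hypothetical values chosen only for illustration.

```python
def policy_iteration(P, cost, gamma=0.95, eval_sweeps=500):
    """Policy iteration for a small discounted MDP (illustrative sketch).

    P[a][s][t] is the probability of moving from state s to state t under
    action a; cost[a][s] is the expected one-step cost of action a in state s.
    """
    n_actions, n_states = len(P), len(P[0])

    def q_value(a, s, V):
        # One-step cost plus discounted expected future cost.
        return cost[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))

    policy = [0] * n_states
    while True:
        # Policy evaluation: fixed-point sweeps of V = c_pi + gamma * P_pi * V.
        V = [0.0] * n_states
        for _ in range(eval_sweeps):
            V = [q_value(policy[s], s, V) for s in range(n_states)]
        # Policy improvement: switch only when an action is strictly cheaper.
        new_policy = []
        for s in range(n_states):
            best = policy[s]
            for a in range(n_actions):
                if q_value(a, s, V) < q_value(best, s, V) - 1e-9:
                    best = a
            new_policy.append(best)
        if new_policy == policy:
            return policy, V
        policy = new_policy


# Hypothetical model. States: 0 = no pending request, 1 = request pending.
# Actions: 0 = keep the service provider on, 1 = turn it off.
P = [
    [[0.9, 0.1], [0.8, 0.2]],  # on: a pending request is usually served
    [[0.9, 0.1], [0.0, 1.0]],  # off: a pending request keeps waiting
]
cost = [
    [1.0, 2.0],  # on: power cost, plus a small waiting penalty when pending
    [0.1, 5.1],  # off: low power, but a large waiting penalty when pending
]

policy, V = policy_iteration(P, cost)
print(policy)
```

In this toy model the resulting policy turns the provider off when nothing is pending and keeps it on when a request is waiting, which is the kind of state-dependent decision a policy optimization produces in place of a fixed time-out.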

REFERENCES 

[1] A. Chandrakasan, R. Brodersen, Low Power Digital CMOS Design, Kluwer Academic Publishers, July 1995. 

[2] M. Horowitz, T. Indermaur, and R. Gonzalez, “Low-Power Digital Design”, IEEE Symposium on Low Power 

Electronics, pp.8-11, 1994. 

[3] A. Chandrakasan, V. Gutnik, and T. Xanthopoulos, “Data Driven Signal Processing: An Approach for Energy 

Efficient Computing”, 1996 International Symposium on Low Power Electronics and Design, pp. 347-352, Aug. 

1996. 

[4] J. Rabaey and M. Pedram, Low Power Design Methodologies, Kluwer Academic Publishers, 1996 

[5] L. Benini and G. De Micheli, Dynamic Power Management: Design Techniques and CAD Tools, Kluwer 


[6] Intel, Microsoft and Toshiba, “Advanced Configuration and Power Interface specification”, URL: 

http://www.intel.com/ial/powermgm/specs.html, 1996 

[7] U. Narayan Bhat, “Elements Of Applied Stochastic Processes”, John Wiley & Sons, Inc. 1984 

[8] B. Miller, “Finite State Continuous Time Markov Decision Processes With a Finite Planning Horizon.” SIAM J. 

Control, Vol. 5, No. 2, pp. 266-281, 1968. 

[9] B. Miller, “Finite State Continuous Time Markov Decision Processes With an Infinite Planning Horizon”. J. Of 

Mathematical Analysis and Applications, No. 22, pp. 552-569, 1968. 

[10] R.A.Howard, Dynamic Programming and Markov Processes, Wiley, New York, 1960 

[11] G. A. Paleologo, L. Benini, et al., “Policy Optimization for Dynamic Power Management”, Proceedings of 

Design Automation Conference, pp.182-187, Jun. 1998. 

[12] D. P. Heyman, M. J. Sobel, Stochastic Models in Operations Research, McGraw-Hill Book Company, 1982 

[13] L. Benini, A. Bogliolo, S. Cavallucci, B. Ricco, “Monitoring System Activity For OS-Directed Dynamic Power 

Management”, Proceedings of International Symposium of Low Power Electronics and Design Conference, pp. 185- 

190, Aug. 1998. 

[14] L. Benini, R. Hodgson, P. Siegel, “System-level Estimation And Optimization”, Proceedings of International 

Symposium of Low Power Electronics and Design Conference, pp. 173-178, Aug. 1998. 

[15] G. Bolch, S. Greiner, H. D. Meer and K. S. Trivedi, Queueing Networks and Markov Chains, John Wiley & 

Sons, Inc., 1998 

[16] M. Srivastava, A. Chandrakasan, R. Brodersen, “Predictive system shutdown and other architectural techniques 

for energy efficient programmable computation," IEEE Transactions on VLSI Systems, Vol. 4, No. 1 (1996), pages 

42-55.

[17] C.-H. Hwang and A. Wu, “A Predictive System Shutdown Method for Energy Saving of Event-Driven 

Computation,” Proc. of the Intl. Conference on Computer Aided Design, pages 28-32, November 1997. 

[18] Q. Qiu, Q. Wu and M. Pedram, “Dynamic Power management: A Continuous-Time Stochastic Approach”, 

USC EE-Systems Dept., CENG 99-02.

DAC'99, pages 562-567 

Parallel Mixed-Level Power Simulation Based on Spatio-Temporal Circuit Partitioning 

Mauro Chinosi, Roberto Zafalon, and Carlo Guardiani 

Advanced Research, Central R&D DAIS, SGS-THOMSON Agrate B. (MI), ITALY 

Abstract 

In this work we propose a technique for spatial and temporal partitioning of a logic circuit based 

on the node activity computed by using a simulation at a higher level of abstraction. Only 

those components that are activated by a given input vector are added to the detailed simulation 

netlist. The methodology is suitable for parallel implementation on a multi-processor 

environment and allows arbitrary switching between fast and detailed levels of abstraction 

during the simulation run. The experimental results obtained on a significant set of benchmarks 

show that it is possible to obtain a considerable reduction in both CPU time and memory 

occupation together with a considerable degree of accuracy. Furthermore the proposed technique 

easily fits in the existing industrial design flows. 

REFERENCES 

[1] R. Zafalon, C. Guardiani, “Power Estimation and Synthesis: An Industrial Perspective”, Invited talk at 

PATMOS-97 

[2] ELDO, “User Manual”, Mentor Graphics, Wilsonville, Oregon 

[3] A. Devgan and R. Rohrer, “Event Driven Adaptively Controlled Explicit Simulation of Integrated Circuits”, 

1993 

[4] C. X. Huang, et al., “The Design and Implementation of PowerMill”, ACM/IEEE International Symposium on 

Low Power Design, pp. 105-109, 1995 

[5] R. Lipsett, C. Shaefer and C. Ussery, “VHDL: Hardware Description and Design”, Kluwer, 1990, Boston, MA 

[6] DesignPower, “Reference Manual v1998.02”, Synopsys Inc., Mountain View, CA, 1998 

[7] WattWatcher, “User manual”, Sente, Inc., Acton, MA 

[8] M. Nemani, F. N. Najm, “Towards a High-Level Power Estimation Capability”, IEEE Transaction on CAD of 

Integrated Circuits and Systems, pp. 588-598, Vol. 15, No. 6, June 1996 

[9] L. Benini, G. De Micheli, E. Macii, M. Poncino, R. Scarsi, “Fast Power Estimation for Deterministic input 

Streams”, ICCAD-97, pp. 494-501 

[10] L. Benini, G. De Micheli, E. Macii, M. Poncino, R. Scarsi, “Quick Generation of Temporal Power Waveforms 

for RT-Level Hard Macros”, IEEE-97 International Conference on Innovative Systems in Silicon, ISIS-97, pp. 331- 

337 

[11] R. A. Saleh, B. A. A. Antao and J. Singh, “Multilevel and Mixed-Domain Simulation of Analog Circuits and 

Systems”, IEEE Transaction on CAD of Integrated Circuits and Systems, Vol. 15, No. 1, Jan. 1996 

[12] P. Vanoostende, P. Six, J. Vandewalle and H. J. De Man, “Estimation of Typical Power of Synchronous CMOS 

Circuits Using a Hierarchy of Simulators”, JSSC, Vol. 28, No. 1, Jan. 1993 

[13] F. M. Johannes, “Partitioning of VLSI Circuits and Systems”, 33rd DAC, pp. 83-87, 1996 

[14] D. Rabe, G. Jochens, L. Kruse, W. Nebel, “Power-Simulation of Cell Based ASICs: Accuracy and Performance 

Trade-Offs”, Proceedings of DATE-98, pp. 356-361 

[15] E. Naroska, “Parallel VHDL Simulation”, Proceedings of DATE-98, pp. 159-163 

[16] V. Kim, “Parallel Algorithms for CMOS Power Estimation”, Master Thesis, Northwestern University, 1997, 

available at http://www.ece.nwu.edu/cpdc/TechReports/ 

[17] V. Kim and P. Banerjee, “Parallel Algorithms for Power Estimation”, Proceedings of DAC-98 

[18] M. Kassab, E. Cerny, S. Aourid, T. Krodel, “Propagation of Last-Transition-Time Constraints in Gate Level 

Timing Analysis”, Proceedings of DATE-98, pp. 796-802 

[19] L. T. Pillage, R. A. Rohrer, C. Visweswariah, “Electronic Circuit and System Simulation Methods”, 

McGraw-Hill 

[20] VERILOG-XL, “Reference Manual”, CADENCE Design Systems

DAC'99, pages 568-573 

Low-Power Behavioral Synthesis Optimization Using Multiple Precision Arithmetic 

Milos Ercegovac, Darko Kirovski, George Potkonjak 

Computer Science Department, University of California, Los Angeles 

Abstract 

Many modern multimedia applications such as image and video processing are characterized by 

a unique combination of arithmetic and computational features: fixed-point arithmetic, a variety 

of short data types, high degree of instruction-level parallelism, strict timing constraints, and 

high computational requirements. Computationally intensive algorithms usually boost a device's 

power dissipation, which is often key to the efficiency of many communications and multimedia 

applications. Although recently virtually all general-purpose processors have been equipped with 

multiprecision operations, the current generation of behavioral synthesis tools for application-specific 

systems does not utilize this power/performance optimization paradigm. 

In this paper, we explore the potential of using multiple precision arithmetic units to effectively 

support synthesis of low-power application-specific integrated circuits. We propose a new 

architectural scheme for collaborative addition of sets of variable precision data. We have 

developed a novel resource allocation and computation assignment methodology for a set of 

multiple precision arithmetic units. The optimization algorithms explore the trade-off between allocating 

low-width bus structures and executing multiple-cycle operations. Experimental results indicate 

strong advantages of the proposed approach. 
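
The trade-off between narrow arithmetic units and multiple-cycle operations can be made concrete with a small sketch. It is not the paper's architecture: it only assumes a hypothetical adder of width `unit_bits` that processes a wider operand slice by slice, one slice per cycle, carrying between slices. The function name and parameters are illustrative.

```python
def multicycle_add(a, b, operand_bits, unit_bits):
    """Add two operand_bits-wide unsigned values on a unit_bits-wide adder.

    Returns (sum modulo 2**operand_bits, number of cycles used). Each loop
    iteration models one cycle of the narrow adder.
    """
    unit_mask = (1 << unit_bits) - 1
    cycles = -(-operand_bits // unit_bits)  # ceiling division
    carry = 0
    result = 0
    for i in range(cycles):
        shift = i * unit_bits
        # Add one slice of each operand plus the carry from the previous slice.
        partial = ((a >> shift) & unit_mask) + ((b >> shift) & unit_mask) + carry
        result |= (partial & unit_mask) << shift
        carry = partial >> unit_bits
    # Discard bits above the operand width (wrap-around on overflow).
    return result & ((1 << operand_bits) - 1), cycles


# A 32-bit addition on a 16-bit unit takes two cycles; on a 32-bit unit, one.
print(multicycle_add(0x0001FFFF, 0x00000001, 32, 16))
print(multicycle_add(0x0001FFFF, 0x00000001, 32, 32))
```

A narrower unit costs less area and switched capacitance per operation but needs more cycles for wide data, which is the kind of balance a resource allocation step has to weigh.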

References 

[Ber97] http://www-cad.eecs.berkeley.edu/Software/ 

[Bli97] J.F. Blinn. Fugue for MMX. IEEE Computer Graphics and Applications, vol.17, (no. 2), pp.88-93, 1997. 

[Cha92] A.P. Chandrakasan, M. Potkonjak, J. Rabaey, R.W. Brodersen. HYPER-LP: a system for power 

minimization using architectural transformations. International Conference on Computer-Aided Design, pp.300-3, 

1992. 

[Che96] W. Chen, et. al. Native signal processing on the Ultrasparc in the Ptolemy environment. Asilomar 

Conference on Signals, Systems and Computers, vol.2, pp.1368-72, 1997. 

[Erc96] M. Ercegovac, D. Kirovski, G. Mustafa, M. Potkonjak. Low-Power Behavioral Synthesis Optimization 

Using Multiple Precision Arithmetic. Technical Report, Computer Science Department, University of California, 

Los Angeles, 1996. 

[Gar79] M.R. Garey, D.S. Johnson. Computers and intractability: a guide to the theory of NP-completeness. W.H. 

Freeman, San Francisco, 1979. 

[Gol96] G. Goldman, P. Tirumalai. UltraSPARC-II: the advancement of ultracomputing. COMPCON, 1996. 

[Jah97] B. Jahne. SIMD image processing algorithms with the Intel multimedia extension instruction set. 

Automatisierungstechnik, vol.45, (no.10), pp.453-60, 1997. 

[Kar93] W. Karmer. Multiple-precision computations with result verification. Scientific computing with automatic 

result verification, Academic Press, pp.325-56, 1993. 

[Lak98] G. Lakshminarayana and N.K. Jha. Synthesis of power-optimized and area-optimized circuits from 

hierarchical behavioral descriptions. Design Automation Conference, pp.439-44, 1998. 

[Lee97] C. Lee, et. al. DSP Quant: Design, Validation, and Applications of DSP Hard Real-Time Benchmarking. 

ICASSP, 1997. 

[Lou95] M.E. Louie, M.D. Ercegovac. A variable-precision square root implementation for field programmable gate 

arrays. Journal of Supercomputing, vol.9, (no. 3), pp.315-36, 1995. 

[Mou96] Z.J.A. Mou, D.S. Rice, D. Wei. VIS-based native video processing on UltraSPARC. International 

Conference on Image Processing, pp.153-6, 1996. 

[Pel96] A. Peleg, U. Weiser. MMX technology extension to the Intel architecture. IEEE Micro, vol.16, (no. 4), 

pp.42-50, 1996.

[Pot92] M. Potkonjak, J. Rabaey. Maximally fast and arbitrarily fast implementation of linear computations (circuit 

layout CAD). International Conference on Computer-Aided Design, pp.304-8, 1992. 

[Pot95] M. Potkonjak, M.B. Srivastava. Behavioral synthesis of high performance, low cost, and low power 

application specific processors for linear computations. International Conference on Application Specific Array 

Processors, p.45-56, 1994. 

[Rag94] A. Raghunathan, N.K. Jha. Behavioral synthesis for low power. International Conference on Computer 

Design, pp.318-22, 1994. 

[Sal89] A. Salz, M. Horowitz. IRSIM: an incremental MOS switch-level simulator. Design Automation Conference, 

pp.173-8, 1989. 

[Sch95] M.J. Schulte, E.E. Swartzlander, Jr. Hardware design and arithmetic algorithms for a variable-precision, 

interval arithmetic coprocessor. 12th Symposium on Computer Arithmetic, 1995. 

[Sin95] D. Singh, et al. Power conscious CAD tools and methodologies: a perspective. Proc. of the IEEE, vol.83, 

(no.4), pp.570-94, 1995. 

[Smi96] D.M. Smith. A multiple-precision division algorithm. Mathematics of Computation, vol.65, (no. 213), 

pp.157-63, 1996. 

[Tak95] N. Takagi. A multiple-precision modular multiplication algorithm with triangle additions. IEICE 

Transactions on Information and Systems, vol.E78, 1995. 

[Zho95] C.G. Zhou, et. al. MPEG video decoding with the UltraSPARC visual instruction set. COMPCON,1995.

DAC'99, pages 574-579 

A Methodology For the Verification of a “System on Chip” 

Daniel Geist, Giora Biran, Tamara Arons, Michael Slavkin, Yvgeny Nustov, Monica Farkas, 

Karen Holtz 

IBM Haifa Research Lab, MATAM Advanced Technology Center, Haifa, Israel 

Andy Long, Dave King, Steve Barret 

IBM Field Design Center, Essex Junction, VT, U.S.A. 

ABSTRACT 

This paper summarizes the verification effort of a complex ASIC designated to be an "all in one" 

ISDN network router. This ASIC is unique because it actually consists of many independent 

components, called "cores" (including the processor). The integration of these components onto 

one chip results in an ISOC (Integrated System On a Chip). Verifying an 

ISOC is virtually impossible without a proper methodology. This paper presents the 

methodology developed for verifying the router. In particular, the verification method as well as 

the tools that were built to execute this method are presented. Finally, a summary of the 

verification results is given. 

Keywords: Systems on chip, verification, test and debugging. 

REFERENCES 

[1] A. Aharon, D. Goodman, M. Levinger, Y. Lichtenstein, Y. Malka, C. Metzger, M. Molcho, and G. Shurek. Test 

program generation for functional verification of powerpc processors in ibm. DAC, 1995. 

[2] A. Aharon, A. Bar-David, B. Dorfman, E. Gofman, M. Leibowitz, and V. Schwartzburd. Verification of the IBM 

RISC System/6000 by a dynamic biased pseudo-random test program generator. IBM Systems Journal, 30(4), April 

1991. 

[3] G. Biran. MAL Functional Spec.. HDG, Haifa, ISRAEL, 1997. 

[4] A. Chandra, V. Iyengar, D. Jameson, R. Jawalkelar, I. Nair, B. Rosen, M. Mullen, J. Yoon, R. Armoni, D. Geist, 

and Y. Wolfsthal. AVPGEN - A Test Case Generator for Architecture Verification. IEEE Transactions on VLSI 

Systems, 6(6), June 1995. 

[5] R. Grinwald, E. Harel, M. Orgad, S. Ur, A. Ziv. User defined coverage - a tool supported methodology for design 

verification. DAC 1998. 

[6] C. May, E. Silha, R. Simpson, and H. Warren, editors. The PowerPC Architecture. Morgan Kaufmann, 1994. 

[7] A. Mesh, EmacII Functional Spec., HDG, Haifa, ISRAEL, 1997. 

[8] M. Schaffer and E. Green. On-Chip Peripheral Bus Specification. PowerPC Embedded Processor Solutions, 

RTP, NC, Mar. 1996. 

[9] M. Schaffer and J. Revilla. PowerPC 4XX Local Bus Specification. PowerPC Embedded Processor Solutions, 

RTP, NC, Oct. 1996.

DAC'99, pages 580-585 

ICEBERG: An Embedded In-circuit Emulator Synthesizer for Microcontrollers 

Ing-Jer Huang and Tai-An Lu 

Institute of Computer and Information Engineering 

National Sun Yat-sen University, Kaohsiung, Taiwan, R. O. C. 

Abstract 

This paper presents ICEBERG, a synthesis tool for embedded in-circuit emulators (ICEs), which 

are part of the development environment for microcontroller (or microprocessor)-based systems 

(PIPER-II). The tool inserts and integrates the necessary in-circuit emulation circuitry into a given 

RTL core of a microcontroller, thus turning the core into an embedded ICE. The ICE, based 

on the IEEE 1149.1 JTAG architecture, provides standard debugging mechanisms, including 

boundary scan paths, partial scan paths, single stepping, internal resource monitoring and 

modification, breakpoint detection, and mode switching between debugging and free running 

modes. ICEBERG has been successfully applied to synthesize the embedded ICE for an 

industrial microcontroller HT48100 from its RTL core. 

Reference 

[1] HT48100 Development Data Book, Holtek Microelectronics, Dec. 1994. 

[2] I-J Huang and A. Despain, "Synthesis of Application Specific Instruction Sets," IEEE Trans. on Computer-Aided 

Design of Integrated Circuits and Systems, 1994. 

[3] Ing-Jer Huang, Li-Rong Wang, Yu-Min Wang, "Synthesis and Analysis of an Industrial Microcontroller," In 

Proceedings of Asia And South Pacific Design Automation Conference (ASP-DAC'97), 1997. 

[4] David W. Knapp, Behavioral Synthesis, Digital System Design Using Synopsys Behavioral Compiler, 1996. 

[5] "Concepts of Emulation And Analysis, Edition 1," Hewlett Packard, Nov. 1990 

[6] Gernot Koch, Udo Kebschull, Wolfgang Rosenstiel, "Breakpoints and breakpoint Detection in Source Level 

Emulation," International Symposium of System Synthesis, 1996. 

[7] IEEE Standard Test Access Port and Boundary-Scan Architecture, IEEE Std 1149.1a-1993 

[8] Colin M. Maunder and Rodham E. Tulloss, "The Test Access Port and Boundary-Scan Architecture," IEEE 

Computer Society Press Tutorial, 1990. 

[9] Nur A. Touba and Bahram Pouya," Testing Embedded Cores Using Partial Isolation Rings" 

[10] V. Fernandez and P. Sanchez, "Partial Scan High-Level Synthesis," IEEE ED&TC, 1996. 

[11] "The ARM7TDMI Debug Architecture," Application Note 28, Dec. 1995. 

[12] ARM 7TDMI Data Sheet, Advanced RISC Machines Ltd., 1995. 

[13] Neil H.E. Weste and Kamran Eshraghian, Principles of CMOS VLSI Design, 2nd ed., pp. 505-508 

[14] ICE Production Information, Microtek International, http://server3.microtek.com.tw/mice/product.html. 

[15] Design Ware, Synopsys Corp., 1998. 

[16] K. Sievert, et al., "On-chip Emulation and Debugging for Embedded Microcontrollers using the IMS 

ScanDebugger," European Design and Test Conference, pp. 229-232, 1995. 

[17] R. Zak Jr. and Jeffrey Hill, "An IEEE 1149.1 Compliant Testability Architecture with Internal Scan," 

Proceeding of Int'l Conference on Computer Design, 1992.

DAC'99, pages 586-591 

Microprocessor Based Testing for Core-Based System on Chip 

C. A. Papachristou, F. Martin, M. Nourani 

Computer Engineering Program, EECS Dept., Case Western Reserve University 

Cleveland, OH 44106 

Abstract 

The purpose of this paper is to develop a flexible design-for-test methodology for testing a core-based 

system on chip (SOC). The novel feature of the approach is the use of an embedded 

microprocessor/memory pair to test the remaining components of the SOC. Test data is 

downloaded using DMA techniques directly into memory while the microprocessor uses the test 

data to test the core. The test results are transferred to a MISR for evaluation. The approach has 

several important advantages over conventional ATPG such as achieving at-speed testing, not 

limiting the chip speed to the tester speed during test and achieving great flexibility since most of 

the testing process is based on software. Experimental results on an example system are 

discussed. 
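
The abstract states that test results are transferred to a MISR (multiple-input signature register) for evaluation. A minimal sketch of how such a register compacts a stream of response words into one signature follows. The 8-bit width, the feedback tap positions, and the response values are assumptions made for illustration; the paper does not specify them here.

```python
def misr_signature(responses, width=8, taps=(7, 5, 4, 3), seed=0):
    """Compact a sequence of response words with a software-modeled MISR.

    Each step shifts the register by one position, feeds the XOR of the
    tapped bits back into bit 0, then XORs the incoming response word into
    the register in parallel. The width and tap positions are hypothetical.
    """
    mask = (1 << width) - 1
    state = seed & mask
    for word in responses:
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = ((state << 1) & mask) | feedback
        state ^= word & mask
    return state


# Hypothetical core responses: a fault-free run and a run with one bad word.
good = [0x3A, 0x7F, 0x00, 0xC4, 0x19]
faulty = [0x3A, 0x7F, 0x08, 0xC4, 0x19]

expected = misr_signature(good)
observed = misr_signature(faulty)
print("pass" if observed == expected else "fail")
```

Only the final signature has to be compared against the expected value, so the processor can apply many test patterns at speed without storing every response. As with any signature compaction, multiple erroneous words can occasionally alias to the fault-free signature.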

References 

[1] F.P.M. Beenker, R.G. Bennetts and A.P. Thijssen, “Testability Concepts for Digital ICs, The Macro Test 

Approach," Kluwer Acad. Publishers, 1995. 

[2] L. Whetsel, “An IEEE 1149.1 Based Test Architecture for ICs with Embedded IP Cores," Intern. Test Conf. 

(ITC-97), Nov. 1997. 

[3] K. De, “Test methodology for embedded cores which protects intellectual property," VLSI Test Sym. (VTS-97), 

pp. 2-9, May 1997. 

[4] R. Chandramouli and S. Pateras, “Testing Systems on a Chip," IEEE Spectrum, pp. 42-47, Nov. 1996. 

[5] I. Ghosh, N. Jha and S. Dey “A Low Overhead Design for Testability and Test Generation Technique for Core- 

Based Systems" Intern. Test Conf. (ITC-97), Nov. 1997. 

[6] V. Immaneni and S. Raman, “Direct Access Test Scheme - Design of Block and Core Cells for Embedded 

ASICs," Intern. Test Conf. (ITC-90), pp. 488-492, Oct. 1990. 

[7] M. Nourani and C. Papachristou, “Parallelism in Structural Fault Testing of Embedded Cores," 16th VLSI Test 

Sym. (VTS-98), pp. 15-20, April 1998. 

[8] N. Touba and B. Pouya, “Testing embedded cores using partial isolation rings," VLSI Test Sym. (VTS-97), 

pp. 10-16, May 1997. 

[9] N. Touba and B. Pouya, “Modifying User-defined Logic for Test Access to Embedded Cores," Intern. Test Conf. 

(ITC-97), Nov. 1997. 

[10] “VSI Alliance", Architecture Document, Version 1.0, 1997. 

[11] A.J. van de Goor and Th. J. Verhallen, “Functional Testing of Current Microprocessors," Intern. Test 

Conference (ITC-92), pp. 684-695, Sept. 1992. 

[12] J. Aerts and E. J. Marinissen, “Scan Chain Design for Test Time Reduction in Core-Based ICs," Intern. Test 

Conference (ITC-98), Oct. 1998.

DAC'99, pages 592-597 

Using Partitioning to Help Convergence in the Standard-Cell Design Automation 

Methodology 

Hema Kapadia, Mark A. Horowitz 

Computer Systems Lab, Stanford University, Stanford, CA 

Abstract 

This paper explores a standard-cell design methodology based on netlist partitioning as a 

solution for the problem of lack of convergence in the conventional methodology in deep 

submicron technologies. A synthesized design block is partitioned along unpredictable nets that 

are identified from the netlist structure. The size of each partition is restricted so that the longest 

possible local net in a partition can be sufficiently driven by an average library gate, hence 

allowing statistical wire-load modeling for the local nets. The block is resynthesized using a 

hybrid wire-load model that takes into account accurate wire-load information on the 

unpredictable nets derived after floorplanning the partitions, and uses custom statistical wire-load 

models within each partition. Final placement is restricted to respect the initial floorplan. The 

methodology was implemented using existing commercial tools for synthesis and layout. 

Experimental results show high correlation between synthesis estimates and post-placement 

measurements of wire-loads and gate delays with the new methodology. The trade-offs of 

partitioning, current limitations of the methodology and future work to overcome these 

limitations are also discussed. 
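
The hybrid wire-load model described above can be illustrated with a minimal sketch. It assumes a net is flagged as unpredictable when it was cut by partitioning; the names `Net`, `LOCAL_WIRE_LOAD_FF` and `hybrid_wire_load`, and all capacitance values, are hypothetical and are not taken from the paper or from any commercial tool.

```python
from dataclasses import dataclass


@dataclass
class Net:
    name: str
    fanout: int
    unpredictable: bool  # True if the net crosses partition boundaries


# Hypothetical statistical wire-load table for local nets:
# fanout -> estimated wire capacitance in fF (illustrative numbers only).
LOCAL_WIRE_LOAD_FF = {1: 5.0, 2: 9.0, 3: 14.0, 4: 20.0}


def hybrid_wire_load(net, floorplan_cap_ff):
    """Return the wire load used for resynthesis of one net.

    Unpredictable nets use the value derived from the floorplan;
    local nets fall back to the statistical table.
    """
    if net.unpredictable:
        return floorplan_cap_ff[net.name]
    # Clamp to the largest tabulated fanout.
    key = min(net.fanout, max(LOCAL_WIRE_LOAD_FF))
    return LOCAL_WIRE_LOAD_FF[key]
```

For example, `hybrid_wire_load(Net("g1", 3, True), {"g1": 120.0})` returns the floorplan-derived value, while a local net with fanout 2 is looked up in the statistical table.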

References 

[1] K. Keutzer, A. R. Newton, and N. Shenoy, "The future of logic synthesis and physical design in deep-submicron 

process geometries," ISPD, April 1997, pp. 218-24. 

[2] W. Gosti et al., "Wireplanning in Logic Synthesis," ICCAD, Nov. 1998, pp. 26-33. 

[3] A. Salek, J. Lou, and M. Pedram, "A DSM design flow: putting floorplanning, technology mapping and gate 

placement together," DAC, June 1998, pp. 287-90. 

[4] W. Chuang and I. N. Hajj, "Delay and area optimization for compact placement by gate resizing and relocation," 

ICCAD, Nov. 1994, pp. 145-8. 

[5] S. Hojat and P. Villarrubia, "An integrated placement and synthesis approach for timing closure of Power PC 

microprocessors," IWLS, May 1997. 

[6] L. N. Kannan, P. R. Suaris, and Hong-Gee Fang, "A methodology and algorithms for post-placement delay 

optimization," DAC, June 1994, pp. 327-32. 

[7] K. Sato et al., "Post-layout optimization for deep submicron design," DAC, June 1996, pp. 740-5. 

[8] M. Lee et al., "Incremental timing optimization for physical design by interacting logic restructuring and layout," 

IWLS, May 1998, pp. 508-13. 

[9] G. Stenz et al., "Timing driven placement in interaction with netlist transformations," ISPD, April 1997, pp. 36- 

41. 

[10] H.-P. Su, A.C-H. Wu, and Y.-L. Lin, "Performance-driven softmacro clustering and placement by preserving 

HDL design hierarchy," ISPD, April 1998. 

[11] J. Cong and X. Dongmin, "Exploiting signal flow and logic dependency in standard cell placement," ASP-DAC, 

Aug. 1995, pp. 399-404. 

[12] I. Sutherland, B. Sproull, and D. Harris, Logical Effort: Designing Fast CMOS Circuits, Morgan Kaufmann, 

1999. 

[13] A. Sangiovanni-Vincentelli, G. De Micheli, and P. Antognetti, Design Systems for VLSI Circuits: Logic 

Synthesis and Silicon Compilation, Martinus Nijhoff Publishers, 1986. 

[14] H. B. Bakoglu, Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley Publishing Company, 

1990. 

[15] J. Kuskin et al., "The Stanford FLASH multiprocessor," International Symposium on Computer Architecture, 

April 1994, pp. 302-13.

[16] J. P. Bergmann and M. A. Horowitz, "Vex - A CAD Toolbox," DAC, June 1999.

DAC'99, pages 598-603 

Comparing RTL and Behavioral Design Methodologies in the Case of a 2M-Transistor 

ATM Shaper 

Imed Moussa 1 , Zoltan Sugar 1 , Rodolph Suescun 2 , Mario Diaz-Nava 3 , 

Marco Pavesi 4 , Salvatore Crudo 4 , Luca Gazi 4 and Ahmed Amine Jerraya 1 

1 TIMA laboratory, 38031 Grenoble France 

2 AREXSYS, Grenoble France 

3 STMicroelectronics, 38921 Crolles France 

4 Italtel, 20019 Settimo Milanese Italy 

ABSTRACT 

This paper describes the experience and the lessons learned during the design of an ATM traffic 

shaper circuit using behavioral synthesis. The experiment is based on the comparison of the 

results of two parallel design flows starting from the same specification. The first used a classical 

design method based on RTL synthesis. The second design flow is based on behavioral 

synthesis. The experiment has shown that behavioral synthesis is able to produce efficient designs 

in terms of gate count and timing while bringing a threefold reduction in design effort when 

compared to RTL design methodology. 

References 

[1] D.D. Gajski, N.D. Dutt, A.C.-H. Wu, and S.Y.-L. Lin. High-level Synthesis, Introduction to Chip and System 

Design. Kluwer Academic Publishers, Boston/London/Dordrecht, 1991. 

[2] D. Ku and G. DeMicheli. High-level Synthesis of ASICs under Timing and Synchronization Constraints. Kluwer 

Academic Publishers, Boston/London/Dordrecht, 1992. 

[3] E. Berrebi, P. Kission, S. Vernalde, S. De Troch, J.C. Herluison, J. Frehel, A.A. Jerraya, and I. Bolsens. 

Combined control flow dominated and data flow dominated high-level synthesis. 33rd ACM/IEEE Design 

Automation Conference DAC’96, June 1996. 

[4] T.E. Fuhrman. Industrial extensions to university high-level synthesis tools: Making it work in the real world. 

28th ACM/IEEE Design Automation Conference DAC’91, June 1991. 

[5] M. Genoe, P. Vanoostende, and G. Van Wauwe. On the use of vhdl-based behavioral synthesis for telecom asic 

design. In the Proceedings of the International Symposium on System Synthesis ISSS’95, February 1995. 

[6] M.T. Lee, Y. Hsu, Ben Chen, and M. Fujita. Domain-specific high-level modeling and synthesis for atm switch 

design using vhdl. In 33rd ACM/IEEE Design Automation Conference DAC’96, June 1996. 

[7] The ATM Forum Technical Committee. Traffic management specification v4.0. af-tm-0056.000 Letter Ballot, 

April 1996. 

[8] R. A. Walker and Gaetano Boriello. A Survey of High-Level Synthesis Systems. Kluwer Academic Publishers, 

Boston/London/Dordrecht, 1991. 

[9] D.D. Gajski and L. Ramachandran. Introduction to high level synthesis. IEEE Design and Test of Computers, 

October 1994. 

[10] A. Seawright and W. Meyer. Partitioning and optimizing controllers synthesized from hierarchical high-level 

descriptions. 35th ACM/IEEE Design Automation Conference, June 1998. 

[11] A.A. Jerraya, H. Ding, P. Kission, and M. Rahmouni. Behavioral Synthesis and Component Reuse with VHDL. 

Kluwer Academic Publishers, Boston/London/Dordrecht, 1997. 

[12] R.A. Bergamaschi. Productivity issues in high-level design: Are tools solving the real problems? 32nd 

ACM/IEEE Design Automation Conference DAC’95, June 1995.

DAC'99, pages 604-609 

Engineering Change: Methodology and Applications to Behavioral and System Synthesis 

Darko Kirovski, Miodrag Potkonjak 

Computer Science Department, University of California, Los Angeles 

Abstract 

Due to the unavoidable need for system debugging, performance tuning, and adaptation to new 

standards, the engineering change (EC) methodology has emerged as one of the crucial 

components in synthesis of systems-on-chip. We introduce a novel design methodology which 

facilitates design-for-EC and post-processing to enable EC with minimal perturbation. Initially, 

as a synthesis pre-processing step, the original design specification is augmented with additional 

design constraints which ensure flexibility for future correction. Upon alteration of the initial 

design, a novel post-processing technique achieves the desired functionality with a near-minimal 

perturbation of the initially optimized design. The key contribution we introduce is a constraint 

manipulation technique which enables reduction of an arbitrary EC problem into its 

corresponding classical synthesis problem. As a result, in both pre- and post-processing for EC, 

classical synthesis algorithms can be used to enable flexibility and perform the correction 

process. We demonstrate the developed EC methodology on a set of behavioral and system 

synthesis tasks. 

References 

[Bra94] D. Brand, et al. Incremental synthesis. ICCAD, pp.14-18, 1994. 

[Buc97] P. Buch, et al. EC for power optimization using global sensitivity and synthesis flexibility. Low Power 

Electronics and Design, pp.88-91, 1997. 

[Cha97] S.-C. Chang et al. Postlayout logic restructuring using alternative wires. TCAD, Vol.16, (no.6), pp.587-96, 

1997. 

[DeM94] G. De Micheli. Synthesis and optimization of digital circuits. McGraw-Hill, New York, NY, 1994. 

[Edw97] S. Edwards et al. Design of embedded systems: formal models, validation, and synthesis. Proceedings of 

the IEEE, Vol-85, (no.3), pp.366-90, 1997. 

[Fan97] W.-J. Fang, et al. A real time RTL engineering change method supporting online debugging for logic 

emulation applications. DAC, pp.101-6, 1997. 

[Gar79] M.R. Garey and D.S. Johnson. Computers and intractability: a guide to the theory of NP-completeness. 

W.H. Freeman, 1979. 

[Hwa91] C.-T. Hwang, J.-H. Lee, and Y.-C. Hsu. A formal approach to the scheduling problem in high level 

synthesis. TCAD, Vol.10, (no.4), pp.464-475, 1991. 

[Kha96] S.P. Khatri, et al. Engineering change in a non-deterministic FSM setting. DAC, pp.451-6, 1996. 

[Kur87] F.J. Kurdahi and A.C. Parker. REAL: a program for REgister ALlocation. DAC, pp.210-215, 1987. 

[Lak98] G. Lakshminarayana, et al. Incorporating speculative execution into scheduling of control-flow intensive 

behavioral descriptions. DAC, pp.108-13, 1998. 

[Lee87] E.A. Lee and D.G. Messerschmitt. Synchronous dataflow. Proc. of the IEEE, Vol.75, (no.9), pp.1235-45, 

1987. 

[Mad89] J.C. Madre, et al. Automating the diagnosis and the rectification of design errors with PRIAM. ICCAD, 

pp.30-3, 1989. 

[Pau89] P.G. Paulin, et al. Force-directed scheduling for the behavioral synthesis of ASICs. TCAD, Vol.8, (no.6), 

pp.661-679, 1989. 

[Rab91] J. Rabaey, et al. Fast prototyping of data path intensive architectures. Design & Test, Vol.8, (no. 2), pp.40- 

51, 1991. 

[Sha95] G.A. Shaw, et al. Assessing and improving current practice in the design of ASSPs. ICASSP, pp.2707-10, 

1995. 

[Sto89] L. Stok and R. van den Born. EASY: multiprocessor architecture optimisation. Logic and Arch. Synthesis 

for Silicon Compilers, pp.313-328, 1989. 

[Swa97] G. Swamy, et al. Minimal logic re-synthesis for engineering change. ISCAS, Vol.3, pp.1596-9, 1997.

[Wat91] Y. Watanabe and R.K. Brayton. Incremental synthesis for EC. ICCAD, pp.40-3, 1991.

DAC'99, pages 610-615 

Reconfigurable Computing: What, Why, and Implications for Design Automation 

André DeHon and John Wawrzynek 

Berkeley Reconfigurable Architectures, Software, and Systems 

Computer Science Division, University of California at Berkeley, Berkeley, CA 94720-1776 

Abstract 

Reconfigurable Computing is emerging as an important new organizational structure for 

implementing computations. It combines the post-fabrication programmability of processors with 

the spatial computational style most commonly employed in hardware designs. The result 

changes traditional "hardware" and "software" boundaries, providing an opportunity for greater 

computational capacity and density within a programmable medium. Reconfigurable Computing 

must leverage traditional CAD technology for building spatial designs. Beyond that, however, 

reprogrammability introduces new challenges and opportunities for automation, including 

binding-time and specialization optimizations, regularity extraction and exploitation, and 

temporal partitioning and scheduling. 

References 

[1] Duncan Buell, Jeffrey Arnold, and Walter Kleinfelder. Splash 2: FPGAs in a Custom Computing Machine. IEEE 

Computer Society Press, 10662 Los Vasqueros Circle, PO Box 3014, 

Los Alamitos, CA 90720-1264, 1996. 

[2] Kenneth David Chapman. Fast Integer Multipliers fit in FPGAs. EDN, 39(10):80, May 12 1993. Anonymous 

FTP www.ednmag.com:EDN/di_sig/DI1223Z.ZIP . 

[3] André DeHon. Reconfigurable Architectures for General-Purpose Computing. AI Technical Report 1586, MIT 

Artificial Intelligence Laboratory, 545 Technology Sq., Cambridge, MA 02139, October 1996. 

[4] André DeHon. Comparing Computing Machines. In Configurable Computing: Technology and Applications, 

volume 3526 of Proceedings of SPIE. SPIE, November 1998. 

[5] John R. Hauser and John Wawrzynek. Garp: A MIPS Processor with a Reconfigurable Coprocessor. In 

Proceedings of the IEEE Symposium on Field-Programmable Gate Arrays for Custom Computing Machines, pages 

12–21. IEEE, April 1997. 

[6] Daniel J. Magenheimer, Liz Peters, Karl Pettis, and Dan Zuras. Integer Multiplication and Division on the HP 

Precision Architecture. In Proceedings of the Second International Conference on the Architectural Support for 

Programming Languages and Operating Systems, pages 90–99. IEEE, 1987. 

[7] A. Peleg, S. Wilkie, and U. Weiser. Intel MMX for Multimedia PCs. Communications of the ACM, 40(1):24–38, 

January 1997. 

[8] Jan Rabaey. Reconfigurable Computing: The Solution to Low Power Programmable DSP. In Proceedings of the 

1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, April 1997. 

[9] Charlé Rupp, Mark Landguth, Tim Garverick, Edson Gomersall, Harry Holt, Jeffrey Arnold, and Maya Gokhale. 

The NAPA Adaptive Processing Architecture. In Proceedings of the IEEE Symposium on FPGAs for Custom 

Computing Machines, pages 28–37, April 1998. 

[10] Lawrence Snyder. An Inquiry into the Benefits of Multigauge Parallel Computation. In Proceedings of the 1985 

International Conference on Parallel Processing, pages 488–492. IEEE, August 1985. 

[11] Jean E. Vuillemin, Patrice Bertin, Didier Roncin, Mark Shand, Hervé Touati, and Philippe Boucard. 

Programmable Active Memories: Reconfigurable Systems Come of Age. IEEE Transactions on VLSI Systems, 

4(1):56–69, March 1996. Anonymous FTP pam.devinci.fr:pub/doc/To-Be-Published/PAMieee.ps.Z.

DAC'99, pages 616-622 

An Automated Temporal Partitioning and Loop Fission approach for FPGA based 

reconfigurable synthesis of DSP applications 

Meenakshi Kaul, Ranga Vemuri, Sriram Govindarajan and Iyad Ouaiss 

Digital Design Environments Laboratory, University of Cincinnati, Cincinnati, OH 45221-0030 

Abstract 

We present an automated temporal partitioning and loop transformation approach for developing 

dynamically reconfigurable designs starting from behavior level specifications. An Integer 

Linear Programming (ILP) model is formulated to achieve near-optimal latency designs. We 

also present a loop restructuring method to achieve maximum throughput for a class of DSP 

applications. This restructuring transformation is performed on the temporally partitioned 

behavior and results in near-optimization of throughput. We discuss efficient memory mapping 

and address generation techniques for the synthesis of reconfigurable designs. A case study on 

the Joint Photographic Experts Group (JPEG) image compression algorithm demonstrates the 

effectiveness of our approach. 
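
To make the temporal partitioning problem concrete, the following sketch assigns tasks to an ordered sequence of configurations under an area limit and precedence constraints. It is only an illustration under stated assumptions: it uses exhaustive search instead of an ILP solver, and it minimizes the number of configurations rather than the latency objective of the paper. The function name `temporal_partition` and its arguments are hypothetical.

```python
from itertools import product


def temporal_partition(area, deps, capacity, max_parts):
    """Assign each task to a temporal segment (0, 1, ...).

    area:      dict mapping task name to area cost
    deps:      list of (producer, consumer) pairs; a producer must run
               in the same or an earlier segment than its consumer
    capacity:  area available in one configuration
    max_parts: largest number of segments to try

    Returns a dict task -> segment using the fewest segments, or None.
    """
    tasks = list(area)
    for n_parts in range(1, max_parts + 1):
        for assign in product(range(n_parts), repeat=len(tasks)):
            seg = dict(zip(tasks, assign))
            # Precedence: no consumer may be placed before its producer.
            if any(seg[u] > seg[v] for u, v in deps):
                continue
            used = [0] * n_parts
            for t in tasks:
                used[seg[t]] += area[t]
            # Every segment must be non-empty and fit on the device.
            if all(0 < u <= capacity for u in used):
                return seg
    return None
```

The search is exponential in the number of tasks, so it is only usable on toy inputs; an ILP formulation such as the one the abstract mentions is the practical route for real designs.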

References 

[1] M. J. Wirthlin and B. L. Hutchings, “Sequencing Run-Time Reconfigured Hardware with Software", 

ACM/SIGDA International Symposium on Field Programmable Gate Arrays, FPGA 1996, pp. 122-128. 

[2] R. D. Hudson, D. I. Lehn and P. M. Athanas, “A Run-Time Reconfigurable Engine for Image Interpolation", 

IEEE Symposium on FPGAs for Custom Computing Machines, FCCM 1998, pp. 88-95. 

[3] M. B. Gokhale and J. M. Stone, “NAPA C:Compiling for Hybrid RISC/FPGA Architectures", IEEE Symposium 

on FPGAs for Custom Computing Machines, FCCM 1998, pp. 126-135. 

[4] M. Vasiliko and D. Ait-Boudaoud, “Architectural Synthesis for Dynamically Reconfigurable Logic", 

International Workshop on Field-Programmable Logic and Applications, FPL 1996, pp. 290-296. 

[5] K. M. GajjalaPurna and D. Bhatia, “Temporal Partitioning and Scheduling for Reconfigurable Computing", 

IEEE Symposium on FPGAs for Custom Computing Machines, FCCM 1998, pp. 329-330. 

[6] J. Spillane and H. Owen, “Temporal Partitioning for Partially-Reconfigurable-Field-Programmable Gate", 

Reconfigurable Architectures Workshop in 12th International Parallel Processing Symposium and 9th Symposium 

on Parallel and Distributed Processing, IPPS/SPDP 1998, pp. 37-42. 

[7] M. Kaul and R. Vemuri, “Optimal Temporal Partitioning and Synthesis for Reconfigurable Architectures", 

Design and Test in Europe, DATE 1998, pp. 389-396. 

[8] S. Trimberger, “Scheduling designs into a Time-Multiplexed FPGA", ACM/SIGDA International Symposium on 

Field Programmable Gate Arrays, FPGA 1998, pp. 153-160. 

[9] S. Trimberger, “A Time-Multiplexed FPGA", IEEE Symposium on FPGAs for Custom Computing Machines, 

FCCM 1997, pp. 22-28. 

[10] M. Xu, F. Kurdahi, “Layout Driven High Level Synthesis for FPGA Based Architectures", Design and Test in 

Europe '98. 

[11] I. Ouaiss, S. Govindarajan, V. Srinivasan, M. Kaul and R. Vemuri, “An Integrated Partitioning and Synthesis 

System for Dynamically Reconfigurable Multi-FPGA Architectures", Reconfigurable Architectures Workshop in 

12th International Parallel Processing Symposium and 9th Symposium on Parallel and Distributed Processing, 

IPPS/SPDP 1998, pp. 31-36. 

[12] S. Govindarajan, I. Ouaiss, M. Kaul, V. Srinivasan and R. Vemuri, “An Effective Design Approach for 

Dynamically Reconfigurable Architectures", IEEE Symposium on FPGAs for Custom Computing Machines, FCCM 

1998, pp.312-313. 

[13] J. Roy, N. Kumar and R. Vemuri, “DSS: A Distributed High-Level Synthesis System for VHDL 

Specifications", IEEE Design and Test of Computers, v9, n2, June 1992, pp. 18-32. 

[14] M. Wolfe, High Performance Compilers for Parallel Computing, Addison-Wesley Publishers, 1996. 

[15] C. H. Gebotys, “An Optimal methodology of Synthesis of DSP Multichip Architectures", Journal of VLSI 

Signal Processing, v11, pp. 9-19, 1995.

[16] R. Niemann and P. Marwedel, “An Algorithm for Hardware/Software Partitioning Using Mixed Integer Linear 

Programming", Proceedings of the ED&TC, 1996. 

[17] G.K. Wallace, “The JPEG Still Picture Compression Standard", ACM Communications, 1991. 

[18] WILDFORCE Reference Manual, Document #1189 – Release Notes, Annapolis Micro Systems, Inc.

DAC'99, pages 623-628 

Dynamically Reconfigurable Architecture for Image Processor Applications 

Alexandro M. S. Adário Eduardo L. Roehe Sergio Bampi 

Institute for Informatics – Federal University at Porto Alegre 

9500 – Porto Alegre, RS – Brazil 

ABSTRACT 

This work presents an overview of the principles that underlie the speed-up achievable by 

dynamic hardware reconfiguration, proposes a more precise taxonomy for the execution models 

for reconfigurable platforms, and demonstrates the advantage of dynamic reconfiguration in the 

new implementation of a neighborhood image processor, called DRIP. It achieves a real-time 

performance, which is 3 times faster than its pipelined non-reconfigurable version. 

Keywords: Reconfigurable architecture, image processing, FPGA 

REFERENCES 

[1] Adário, A. M. S.; Côrtes, M. L.; Leite, N. J. “A FPGA Implementation of a Neighborhood Processor for Digital 

Image Applications”. In: 10th Brazilian Symposium on Integrated Circuit Design, Aug 1997. Proceedings..., 1997 p. 

125-134. 

[2] ALTERA. Data Book. Altera Corporation, San Jose, California, 1996. 

[3] Athanas, P.; Silverman, H. F. “Processor Reconfiguration Through Instruction Set Metamorphosis”. IEEE 

Computer, Mar 1993. p 11-18. 

[4] Bertin, P. et al, Introduction to Programmable Active Memories. Paris: Digital Equipment Corp., Paris Research 

Lab, June 1989. (PRL Report 3). 

[5] Fountain, T. J. Processor Arrays: Architecture and Applications. Academic Press, London, 1987. 

[6] Gokhale, M. et al. “Building and Using a Highly Parallel Programmable Logic Array.” Computer, vol. 24, no. 1, 

Jan 1991. p 81-89. 

[7] Hauser, J. R.; Wawrzynek, J. “Garp: A MIPS Processor with a Reconfigurable Coprocessor”. In: IEEE 

Symposium on FPGAs for Custom Computing Machines, 1997. Proceedings... p 24-33. 

[8] Knuth, D. E. The Art of Computer Programming, Reading, Massachusetts: Addison-Wesley, 1973. 

[9] Leite, N. J.; Barros, M. A.; “A Highly Reconfigurable Neighborhood Image Processor Based on Functional 

Programming”. In: IEEE International Conference on Image Processing, Nov 1994. Proceedings... p 659-663. 

[10] Page I. Reconfigurable Processor Architectures. Microprocessors and Microsystems, May 1996. (Special Issue 

on Codesign). 

[11] Pitas, I.; Venetsanopoulos, A. N. “A New Filter Structure for the Implementation of Certain Classes of Image Processing 

Operations”. IEEE Trans. on Circuits and Systems, vol. 35, n. 6, June 1988. p 636-647. 

[12] Shand, M.; Vuillemin, J. “Fast Implementations of RSA Cryptography”. In: 11th Symposium on Computer 

Arithmetic, 1993, Los Alamitos, California. Proceedings... p 252-259. 

[13] Wirthlin, M. J.; Hutchings, B. L. “A Dynamic Instruction Set Computer”. In: IEEE Symposium on FPGAs for 

Custom Computing Machines, Apr. 1995. Proceedings... p 92-103

DAC'99, pages 629-634 

Multi-Time Simulation of Voltage-Controlled Oscillators 

Onuttom Narayan*, Jaijeet Roychowdhury† 

*University of California, Santa Cruz. 

†Bell Laboratories, Murray Hill. 

Abstract 

We present a novel formulation, called the WaMPDE, for solving systems with forced 

autonomous components. An important feature of the WaMPDE is its ability to capture 

frequency modulation (FM) in a natural and compact manner. This is made possible by a key 

new concept: that of warped time, related to normal time through separate time scales. Using 

warped time, we obtain a completely general formulation that captures complex dynamics in 

autonomous nonlinear systems of arbitrary size or complexity. We present computationally 

efficient numerical methods for solving large practical problems using the WaMPDE. Our 

approach explicitly calculates a time-varying local frequency that matches intuitive expectations. 

Applied to VCOs, WaMPDE-based simulation results in speedups of two orders of magnitude 

over transient simulation. 
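
The warped-time idea can be illustrated on a toy frequency-modulated signal. The sketch below is not the WaMPDE itself and does not solve any circuit equations; it only shows how a signal that looks complicated in normal time becomes a plain periodic function of a warped time whose derivative is the local frequency. All constants and the names `local_frequency`, `warped_time`, `xhat` and `signal` are assumptions made for this example.

```python
import math

# Illustrative constants (assumptions, not from the paper), in rad/s.
W0 = 100.0  # nominal oscillation frequency
DW = 20.0   # peak frequency deviation
WM = 1.0    # modulation frequency


def local_frequency(t):
    """Time-varying local frequency of the toy FM signal."""
    return W0 + DW * math.cos(WM * t)


def warped_time(t):
    """Closed-form integral of local_frequency from 0 to t."""
    return W0 * t + (DW / WM) * math.sin(WM * t)


def xhat(tau1, tau2):
    """Bivariate form: periodic in the warped time tau1.

    In this toy case the amplitude does not vary, so tau2 is unused.
    """
    return math.cos(tau1)


def signal(t):
    """Recover the one-time FM signal from the bivariate form."""
    return xhat(warped_time(t), t)
```

In normal time the signal `cos(W0*t + (DW/WM)*sin(WM*t))` has a dense spectrum, whereas `xhat` is a single cosine in the warped variable; the modulation is carried entirely by `local_frequency`.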

References 

[AT72] T.J. Aprille and T.N. Trick. Steady-state analysis of nonlinear circuits with periodic inputs. Proc. IEEE, 

60(1):108–114, January 1972. 

[BL98] H.G. Brachtendorf and R. Laur. Transient Simulation of Oscillators. Technical Report ITD-98-34096K, Bell 

Laboratories, 1998. 

[BWLBG96] H.G. Brachtendorf, G. Welsch, R. Laur, and A. Bunse-Gerstner. Numerical steady state analysis of 

electronic circuits driven by multi-tone signals. Electrical Engineering (Springer-Verlag), 79:103–112, 1996. 

[CL75] L.O. Chua and P-M. Lin. Computer-aided analysis of electronic circuits : algorithms and computational 

techniques. Prentice-Hall, Englewood Cliffs, N.J., 1975. 

[Far94] M. Farkas. Periodic Motions. Springer-Verlag, 1994. 

[Got97] I.M. Gottlieb. Practical oscillator handbook. Oxford, 1997. 

[GS91] R.J. Gilmore and M.B. Steer. Nonlinear circuit analysis using the method of harmonic balance – a review of 

the art. Part I. Introductory concepts. Int. J. on Microwave and Millimeter Wave CAE, 1(1), 1991. 

[Haa88] S.A. Maas. Nonlinear Microwave Circuits. Artech House, Norwood, MA, 1988. 

[Hay64] C. Hayashi. Nonlinear Oscillations in Physical Systems. McGraw-Hill, 1964. 

[KC81] J. Kevorkian and J.D. Cole. Perturbation methods in Applied Mathematics. Springer-Verlag, 1981. 

[Lor63] E.N. Lorenz. Deterministic nonperiodic flow. J. Atmos. Sci, 20:130–141, 1963. 

[Mar92] Markus Rösch. Schnell Simulation des stationären Verhaltens nichtlinearer Schaltungen. PhD thesis, 

Technischen Universität München, 1992. 

[MFR95] R.C. Melville, P. Feldmann, and J. Roychowdhury. Efficient multi-tone distortion analysis of analog 

integrated circuits. In Proc. IEEE CICC, pages 241–244, May 1995. 

[Mur91] J.A. Murdock. Perturbations: Theory and Methods. Wiley, 1991. 

[NB95] A. Nayfeh and B. Balachandran. Applied Nonlinear Dynamics. Wiley, 1995. 

[NV76] M.S. Nakhla and J. Vlach. A Piecewise Harmonic Balance Technique for Determination of Periodic 

Responses of Nonlinear Systems. IEEE Trans. Ckts. Syst., CAS-23:85, 1976. 

[Par83] B. Parzen. Design of crystal and other harmonic oscillators. Wiley, 1983. 

[PC89] T.S. Parker and L.O. Chua. Practical Numerical Algorithms for Chaotic Systems. Springer-Verlag, 1989. 

[RN88] V. Rizzoli and A. Neri. State of the Art and Present Trends in Nonlinear Microwave CAD Techniques. 

IEEE Trans. MTT, 36(2):343–365, February 1988. 

[Roh97] U. Rohde. Microwave and wireless synthesizers: theory and design. Wiley, 1997. 

[Roy97] J. Roychowdhury. Efficient Methods for Simulating Highly Nonlinear Multi-Rate Circuits. In Proc. IEEE 

DAC, 1997. 

[Roy99] J. Roychowdhury. Analysing Circuits with Widely-Separated Time Scales using Numerical PDE Methods. 

IEEE Trans. Ckts. Syst. – I: Fund. Th. Appl., May 1999.

[Saa96] Y. Saad. Iterative methods for sparse linear systems. PWS, Boston, 1996. 

[Ske80] S. Skelboe. Computation of the periodic steady-state response of nonlinear networks by extrapolation 

methods. IEEE Trans. Ckts. Syst., CAS-27(3):161–175, March 1980. 

[TKW95] R. Telichevesky, K. Kundert, and J. White. Efficient Steady-State Analysis based on Matrix-Free Krylov 

Subspace Methods. In Proc. IEEE DAC, pages 480–484, 1995. 

[vdP22] B. van der Pol. On oscillation hysteresis in a simple triode generator. Phil. Mag., 43:700–719, 1922. 

[Ven82] G.D. Vendelin. Design of amplifiers and oscillators by the S-parameter method. Wiley, 1982.

DAC'99, pages 635-640 

Efficient computation of quasi-periodic circuit operating conditions via a mixed 

frequency/time approach 

Dan Feng*, Joel Phillips*, Keith Nabors*, Ken Kundert*, Jacob White** 

*Cadence Design Systems, San Jose, CA 95134 

**Massachusetts Institute of Technology, Cambridge, MA 02139 

Abstract 

Design of communications circuits often requires computing steady-state responses to multiple 

periodic inputs of differing frequencies. Mixed frequency-time (MFT) approaches are orders of 

magnitude more efficient than transient circuit simulation, and perform better on highly 

nonlinear problems than traditional algorithms such as harmonic balance. We present algorithms 

for solving the huge nonlinear equation systems the MFT approach generates from practical 

circuits. 

References 

[1] A. ALLGOWER AND K. GEORG, Numerical Continuation Methods, Springer-Verlag, New York, 1990. 

[2] L.O. CHUA AND A. USHIDA, Algorithms for computing almost periodic steady-state response of nonlinear 

systems to multiple input frequencies, IEEE Trans. Circuits and Systems, 28 (1981), pp. 953–971. 

[3] K. KUNDERT, J. WHITE, AND A. SANGIOVANNI-VINCENTELLI, A mixed frequency-time approach for 

distortion analysis of switching filter circuits, IEEE J. Solid State Circuits, 24 (1989), pp. 443–451. 

[4] K. S. KUNDERT, J. K. WHITE, AND A. SANGIOVANNI-VINCENTELLI, Steady-State Methods for 

Simulating Analog And Microwave Circuits, Kluwer Academic Publishers, Boston, 1990. 

[5] P. LANCASTER AND M. TISMENETSKY, The Theory of Matrices, Academic Press, second ed., 1985. 

[6] D. LONG, R. MELVILLE, K. ASHBY, AND B. HORTON, Full chip harmonic balance, in Proceedings of the 

Custom Integrated Circuits Conference, May 1997. 

[7] R. MELVILLE, P. FELDMANN, AND J. ROYCHOWDHURY, Efficient multi-tone distortion analysis of 

analog integrated circuits, in Proceedings of the Custom Integrated Circuits Conference, May 1995. 

[8] M. OKUMURA, T. SUGAWARA, AND H. TANIMOTO, An efficient small signal frequency analysis method 

for nonlinear circuits with two frequency excitations, IEEE Transactions on Computer-Aided Design of Integrated 

Circuits and Systems, 9 (1990), pp. 225–235. 

[9] J. ROYCHOWDHURY, Efficient methods for simulating highly nonlinear multirate circuits, in Proceedings of 

the 34th Design Automation Conference, Anaheim, CA, June 1997, pp. 269–274. 

[10] J. ROYCHOWDHURY, D. LONG, AND P. FELDMANN, Cyclostationary noise analysis of large RF circuits 

with multitone excitations, IEEE J. Sol. St. Circuits, 33 (1998), pp. 324–336. 

[11] Y. SAAD AND M.H. SCHULTZ, GMRES: A generalized minimal residual algorithm for solving nonsymmetric 

linear systems, SIAM J. Sci. Stat. Comput., 7 (1986), pp. 856–869. 

[12] R. TELICHEVESKY, K. S. KUNDERT, AND J. K. WHITE, Efficient steady-state analysis based on matrix-free 

Krylov-subspace methods, in Proceedings of the 1995 Design Automation Conference, June 1995. 

[13] ----, Efficient AC and noise analysis of two-tone RF circuits, in Proceedings of the 1996 Design Automation 

Conference, June 1996. 

[14] R. TELICHEVESKY, J. WHITE, AND K. KUNDERT, Receiver characterization using periodic small-signal 

analysis, in Proceedings of the Custom Integrated Circuits Conference, May 1996. 

[15] Y. THODESEN, Two stage method for efficient simulation of parametric circuits, PhD thesis, Department of 

telecommunications, the Norwegian institute of technology, 1996.

DAC'99, pages 641-646 

Time-Mapped Harmonic Balance 

Ognen J. Nastov*, Jacob K. White** 

*Motorola, Inc., Austin, TX 78730 

**Massachusetts Institute of Technology, Cambridge, MA 02139 

Abstract 

Matrix-implicit Krylov-subspace methods have made it possible to efficiently compute the 

periodic steady-state of large circuits using either the time-domain shooting-Newton method or 

the frequency-domain harmonic balance method. However, the harmonic balance methods are 

not so efficient at computing steady-state solutions with rapid transitions, and the low-order 

integration methods typically used with shooting-Newton methods are not so efficient when high 

accuracy is required. In this paper we describe a Time-Mapped Harmonic Balance method 

(TMHB), a fast Krylov-subspace spectral method that overcomes the inefficiency of standard 

harmonic balance in the case of rapid transitions. TMHB features a non-uniform grid to resolve 

the sharp features in the signals. Results on several examples demonstrate that the TMHB 

method achieves several orders of magnitude improvement in accuracy compared to the standard 

harmonic balance method. The TMHB method is also several times faster than the standard 

harmonic balance method in reaching identical solution accuracy. 
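
The role of a non-uniform grid can be sketched with a simple periodic time map. The map used here is a generic choice made for illustration and is not the one used in TMHB: the abstract does not state the map or the grid selection strategy. The function name `mapped_grid` and its parameters are hypothetical.

```python
import math


def mapped_grid(n, center, strength):
    """Map n uniform points of a pseudo-time s in [0, 1) to real time t.

    t = s - strength / (2*pi) * sin(2*pi*(s - center))

    The derivative dt/ds = 1 - strength * cos(2*pi*(s - center)) is
    smallest at s = center, so grid points cluster around t = center,
    where a rapid transition is assumed to occur. The map is periodic
    and strictly increasing for 0 <= strength < 1.
    """
    if not 0.0 <= strength < 1.0:
        raise ValueError("strength must be in [0, 1) to keep the map monotonic")
    grid = []
    for k in range(n):
        s = k / n
        t = s - strength / (2.0 * math.pi) * math.sin(2.0 * math.pi * (s - center))
        grid.append(t)
    return grid
```

With `mapped_grid(32, 0.5, 0.9)` the spacing near `t = 0.5` is roughly one tenth of the uniform spacing, while the spacing far from the transition grows to nearly twice the uniform value; a spectral method applied in the pseudo-time variable then resolves the sharp feature with the same number of unknowns.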

References 

[1] Thomas J. Aprille and Timothy N. Trick. “Steady-State Analysis of Nonlinear Circuits with Periodic Inputs.” 

Proceedings of the IEEE, Vol. 60, No. 1, pp. 108–114, January 1972. 

[2] C. Canuto, M.Y. Hussaini, A. Quarteroni, and T.A. Zang. Spectral Methods in Fluid Dynamics. Springer-Verlag, 

Berlin, New York, 1987. 

[3] Rowan Gilmore and Michael B. Steer. “Nonlinear Circuit Analysis Using the Method of Harmonic Balance - A 

Review of the Art. Part I - Introductory Concepts”. Int. J. on Microwave and Millimeter Wave Computer Aided 

Engineering, Vol. 1, No. 1, 1991. 

[4] P. Heikkilä. Object-Oriented Approach to Numerical Circuit Analysis. Ph.D. dissertation, Helsinki University of 

Technology, January 1992. 

[5] Kenneth S. Kundert, Jacob K. White, and Alberto Sangiovanni-Vincentelli. Steady-State Methods for Simulating 

Analog and Microwave Circuits. Kluwer Academic Publishers, 1990. 

[6] R. Melville, P. Feldmann, and J. Roychowdhury. “Efficient Multi-Tone Distortion Analysis of Analog Integrated 

Circuits”. Proceedings of the Custom Integrated Circuits Conference, May 1995. 

[7] Ognen J. Nastov and Jacob K. White. “Grid Selection Strategies for the Time-Mapped Harmonic Balance 

Simulation of Circuits with Rapid Transitions.” Proceedings of the IEEE Custom Integrated Circuits Conference, 

May 1999. 

[8] R. Telichevesky, K. Kundert, and J. White. “Efficient Steady-State Analysis Based on Matrix-Free Krylov- 

Subspace Methods”. Proceedings of the IEEE Design Automation Conference, pp. 480–484, 1995.

DAC'99, pages 647-652 

Test Generation for Gigahertz Processors Using an Automatic Functional Constraint 

Extractor 

Raghuram S. Tupuri 

Texas Microprocessor Division, Advanced Micro Devices, Austin Texas 78741 

Arun Krishnamachary and Jacob A. Abraham 

Computer Engineering Research Center, The University of Texas at Austin, Austin Texas 78712 

Abstract 

As the sizes of general and special purpose processors increase rapidly, generating high quality 

manufacturing tests which can be run at native speeds is becoming a serious problem. One 

solution is a novel method for functional test generation in which a transformed module is built 

manually, and which embodies functional constraints described using virtual logic. Test 

generation is then performed on the transformed module using commercial tools and the 

transformed module patterns are translated back to the processor level. However, the technique is 

useful only if the virtual logic can be generated automatically. This paper describes an automatic 

functional constraint extraction algorithm and a procedure to build the transformed module. We 

describe the tool, FALCON, used to extract the functional constraints of a given embedded 

module from a Verilog RTL model. The constraint extraction for embedded modules of 

benchmark processors using FALCON takes only a few seconds. We show that this method can 

generate functional patterns in a time several orders of magnitude less than one using a 

conventional, flat view of the circuit. 

References 

[1] P. C. Maxwell et al., “The effect of different test sets on quality level prediction: When is 80% better than 90%?," 

Proceedings of the International Test Conference, October 1991, pp. 358-364. 

[2] R. S. Tupuri and J. A. Abraham, “A Novel Functional Test Generation Method for Processors," Proceedings of 

the International Test Conference, November 1997, pp. 743-752. 

[3] J. Lee and J. H. Patel, “ARTEST: An Architectural Level Test Generator for Data Path Faults and Control Faults," 

Proceedings of the International Test Conference, October 1991, pp. 729-739. 

[4] R. S. Ramachandani and D. E. Thomas, “Behavioral Test Generation using Mixed Integer Non-linear 

Programming," Proceedings of the International Test Conference, October, 1994, pp. 221-229. 

[5] P. Vishakantaiah, J. A. Abraham and M. Abadir, “Automatic Test Knowledge Extraction From VHDL 

(ATKET)," 29th ACM/IEEE Design Automation Conference, April 1992, pp. 273-278. 

[6] T. E. Marchok, A. El-Maleh, W. Maly and J. Rajski, “Complexity of sequential ATPG," Proceedings of the 

European Design and Test Conference, March 1995, pp. 252-261. 

[7] D. Anderson and T. Shanley, “Pentium Processor System Architecture," Addison-Wesley Publishing Company, 

1995. 

[8] A. Miczo, “The sequential ATPG: A theoretical limit," Proceedings of International Test Conference, October 

1983, pp. 143-147.

DAC'99, pages 653-659 

PROPTEST: A Property Based Test Pattern Generator for Sequential 

Circuits Using Test Compaction 

Ruifeng Guo, Sudhakar M. Reddy, Irith Pomeranz 

Electrical & Computer Engineering Department, University of Iowa, Iowa City, IA 52242 

Abstract 

We describe a property based test generation procedure that uses static compaction to generate 

test sequences that achieve high fault coverages at a low computational complexity. A class of 

test compaction procedures is proposed and used in the property based test generator. 

Experimental results indicate that these compaction procedures can be used to implement the 

proposed test generator to achieve high fault coverage with relatively smaller run times. 

References 

[1] M. Abramovici, M. A. Breuer and A. D. Friedman, “Digital Systems Testing and Testable Design," IEEE Press, 

1990 

[2] W.T. Cheng, “The Back Algorithm for Sequential Test Generation," Int'l. Conf. on Computer Design, 1988, pp. 

66-69 

[3] W.-T. Cheng and S. Davidson, “Sequential Circuit Test Generator (STG) Benchmark Results," Int'l Symp. 

Circuits & Systems, May 1989, pp. 1938-1941 

[4] W.-T. Cheng and T. Chakraborty, “Gentest – An Automatic Test-Generation System for Sequential Circuits," 

IEEE Computer, Vol. 22, No.4, April, 1989, pp. 28-35 

[5] T. Niermann and J. Patel, “HITEC: A Test Generation Package for Sequential Circuits," in European Conf. on 

Design Automation, 1991, pp. 214-218 

[6] D. H. Lee and S. M. Reddy, “A New Test Generation Method for Sequential Circuits," in Proc. Int'l Conf. on 

Computer Aided Design, 1991, pp. 446-449 

[7] X. Lin, I. Pomeranz and S. M. Reddy, “MIX: A Test Generation System for Synchronous Sequential Circuits," 

in Proc. 11th Int'l conf. on VLSI Design, Jan. 1998, pp. 456-463 

[8] T. Kelsey, K. Saluja and S. Lee, “An Efficient Algorithm for Sequential Circuit Test Generation," IEEE Trans. 

on Computer, Vol. 42, Nov. 1993, pp. 1361-1371 

[9] S. Seshu, “On an Improved Diagnosis Program," IEEE Trans. on Electronic Computers, Vol. EC-12, No. 2, Feb. 

1965, pp. 76-79 

[10] T. J. Snethen, “Simulation-Oriented Fault Test Generator," in Proc. 14th Design Automation Conf., June 1977, 

pp. 88-93 

[11] D. G. Saab, Y. G. Saab, and J. A. Abraham, “Cris: A Test Cultivation Program for Sequential VLSI Circuits," 

in Proc. IEEE Int'l Conf. on Computer-Aided Design, Nov. 1992, pp. 216-219 

[12] E. M. Rudnick, J. G. Holm, D. G. Saab and J. H. Patel, “Application of Simple Genetic Algorithms to 

Sequential Circuit Test Generation," in Proc. European Design and Test Conf., March 1994, pp. 40-45 

[13] P. Prinetto, M. Rebaudengo and M. S. Reorda, “An Automatic Test Generator for Large Sequential Circuits 

Based on Genetic Algorithm," in Proc. Int'l Test Conf., 1994, pp. 240-249 

[14] M.S. Hsiao, E.M. Rudnick and J.H. Patel, “Sequential Circuit Test Generation Using Dynamic State Traversal," 

in Proc. 1996 Europ. Design & Test Conf., March 1996, pp. 22-28 

[15] I. Pomeranz and S. M. Reddy, “LOCSTEP: A Logic Simulation Based Test Generation Procedure," in Proc. 

25th Fault-Tolerant Computing Symp., June 1995, pp. 110-119 

[16] I. Pomeranz and S. M. Reddy, “ACTIVE-LOCSTEP: A Test Generation Procedure Based on Logic Simulation 

and Fault Activation," in Proc. 27th Fault-Tolerant Computing Symp., June 1997, pp. 144-151 

[17] L. Nachman, K. K. Saluja, S. Upadhyaya and R. Reuse, “Random Pattern Testing for Sequential Circuits 

Revisited," in Proc. of 26th Fault-Tolerant Computing Symp., June, 1996, pp. 44-52 

[18] K.-H. Tsai, M. Marek-Sadowska, J. Rajski, “Scan-Encoded Test Pattern Generation for BIST," in Proc. Int'l 

Test Conf. , 1997, pp. 548-556 

[19] I. Pomeranz and S. M. Reddy, “Built-in Test Generation for Synchronous Sequential Circuits," in Int'l. Conf. on 

Computer-Aided Design, Nov. 1997, pp. 421-426

[20] I. Pomeranz and S.M. Reddy “On Static Compaction of Test Sequences for Synchronous Sequential Circuits", 

in Proc. 33rd Design Automation Conf., June 1996, pp. 215-220 

[21] I. Pomeranz and S.M. Reddy “Vector Restoration Based Static Compaction of Test Sequences for Synchronous 

Sequential Circuits", in Proc. Intn'l. Conf. on Computer Design, Oct. 1997, pp. 360-365 

[22] R. Guo, I. Pomeranz and S.M. Reddy, “On Speeding-Up Vector Restoration Based Static Compaction of Test 

Sequences for Sequential Circuits", in Proc. Asian Test Symp., Dec. 1998, pp. 467-471 

[23] R. Guo, I. Pomeranz and S.M. Reddy, “A Fault Simulation Based Test Pattern Generator for Synchronous 

Sequential Circuits," Proc. VLSI Test Symp., April, 1999 

[24] S. Bommu, K. Doreswamy, S. Chakradhar, “Static Test Sequence Compaction Based on Segment Reordering 

and Accelerated Vector Restoration," Proc. International Test Conf., 1998, pp. 954-961 

[25] H.K. Lee and D.S. Ha “HOPE: An Efficient Parallel Fault Simulator for Synchronous Sequential Circuits," in 

Proc. 1992 Design Automation Conf., June 1992, pp. 336-340 

[26] H.K. Lee and D.S. Ha “New Technique for Improving Parallel Fault Simulation in Synchronous Sequential 

Circuits," In Proc. 1993 Intn'l. Conf. on Computer-Aided Design, Oct. 1993, pp. 10-17

DAC'99, pages 660-665 

Multiple Error Diagnosis Based on Xlists 

Vamsi Boppana, Rajarshi Mukherjee, Jawahar Jain, Masahiro Fujita, Pradeep Bollineni* 

Fujitsu Laboratories of America, Inc., Sunnyvale, CA 

* Department of Computer Science, Iowa State University, Ames, IA 

Abstract 

In this paper, we present multiple error diagnosis algorithms to overcome two significant 

problems associated with current error diagnosis techniques targeting large circuits: their use of 

limited error models and a lack of solutions that scale well for multiple errors. Our solution is 

based on a non-enumerative analysis technique, based on logic simulation (3-valued and 

symbolic), for simultaneously analyzing all possible errors at sets of nodes in the circuit. Error 

models are introduced in order to address the "locality" aspect of error location and to identify 

sets of nodes that are "local" with respect to each other. Theoretical results are provided to 

guarantee the diagnosis of modeled errors and robust diagnosis approaches are shown to address 

the cases when errors do not correspond to the modeled types. Experimental results on 

benchmark circuits demonstrate accurate and extremely rapid location of errors of large 

multiplicity. 
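
The idea of analyzing all possible errors at a set of nodes with 3-valued simulation can be illustrated with a small Python sketch. It forces the unknown value X onto candidate nodes of a tiny circuit and checks whether X reaches the failing output; if it does not, no change of function at those nodes can alter that output for the given vector. The netlist, the node names, and the helpers `simulate` and `can_explain` are hypothetical and are not taken from the paper. X reaching the output is only a necessary condition, so this is a sketch of X-injection under those assumptions, not the authors' algorithm.

```python
# Three-valued logic values: 0, 1, and "X" (unknown).
X = "X"

def and3(a, b):
    if a == 0 or b == 0:
        return 0
    if a == 1 and b == 1:
        return 1
    return X

def or3(a, b):
    if a == 1 or b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return X

def not3(a):
    return X if a == X else 1 - a

# Hypothetical netlist in topological order: (node, gate, fan-in names).
NETLIST = [
    ("n1", and3, ("a", "b")),
    ("n2", not3, ("c",)),
    ("out1", or3, ("n1", "n2")),
    ("out2", and3, ("n2", "d")),
]

def simulate(inputs, x_nodes=()):
    """Simulate NETLIST; every node listed in x_nodes is forced to X."""
    values = dict(inputs)
    for name, gate, fanin in NETLIST:
        values[name] = gate(*(values[f] for f in fanin))
        if name in x_nodes:
            values[name] = X  # models "any possible error" at this node
    return values

def can_explain(inputs, failing_outputs, candidate_nodes):
    """True if X injected at candidate_nodes reaches every failing output."""
    values = simulate(inputs, candidate_nodes)
    return all(values[out] == X for out in failing_outputs)

vector = {"a": 1, "b": 1, "c": 1, "d": 1}
print(can_explain(vector, ["out2"], {"n2"}))  # n2 feeds out2 directly
print(can_explain(vector, ["out2"], {"n1"}))  # n1 does not feed out2
```

Because one simulation with X covers every possible wrong value at the candidate nodes, the check does not enumerate individual error types.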

References 

[1] M. Tomita and Hong-Hai Jiang, “An algorithm for locating logic design errors”, in Proc. Intl. Conf. Computer- 

Aided Design, Nov. 1990, pp. 468–471. 

[2] S.-Y. Kuo, “Locating logic design errors via test generation and don’t-care propagation”, in Proc. European 

Design Automation Conf., 1992, pp. 466–471. 

[3] I. Pomeranz and S. M. Reddy, “On diagnosis and correction of design errors”, in Proc. Intl. Conf. Computer- 

Aided Design, Nov. 1993, pp. 500–507. 

[4] A. Srinivasan, A. Kuehlmann, D. I. Cheng and D. P. LaPotin, “Error diagnosis for transistor-level verification”, 

in Proc. Design Automation Conf., June 1994, pp. 218–224. 

[5] M. Tomita, T. Yamamoto, F. Sumikawa, and K. Hirano, “Rectification of multiple logic design errors in multiple 

output circuits”, in Proc. Design Automation Conf., June 1994, pp. 212–217. 

[6] I. Pomeranz and S. M. Reddy, “On error correction in macrobased circuits”, in Proc. Intl. Conf. Computer-Aided 

Design, Nov. 1994, pp. 568–575. 

[7] S. Y. Huang, K. T. Cheng, K. C. Chen, and D. I. Cheng, “Errortracer: A fault simulation based approach to 

design error diagnosis”, in Proc. Intl. Test Conf., Nov. 1997, pp. 974–981. 

[8] S. Y. Huang, K. T. Cheng, K. C. Chen, and J. J. Lu, “Fault-simulation based design error diagnosis for sequential 

circuits”, in Proc. Design Automation Conf., June 1998, pp. 632–637. 

[9] V. Boppana and M. Fujita, “Modeling the unknown! Towards model-independent fault and error diagnosis”, in 

Proc. Intl. Test Conf., Oct. 1998, pp. 1094–1101. 

[10] M. Abramovici, M. A. Breuer, and A. D. Friedman, Digital System Testing and Testable Design, New York, 

NY: Computer Science Press, 1990. 

[11] P-Y. Chung, Y-M. Wang, and I. N. Hajj, “Diagnosis and correction of logic design errors in digital circuits”, in 

Proc. Design Automation Conf., June 1993, pp. 503–508. 

[12] H-T. Liaw, J-H. Tsaih, and C-S. Lin, “Efficient automatic diagnosis of digital circuits”, in Proc. Intl. Conf. 

Computer-Aided Design, Nov. 1990, pp. 464–467. 

[13] P-Y. Chung and I. N. Hajj, “Accord: Automatic catching and correction of logic design errors in combinational 

circuits”, in Proc. Intl. Test Conf., Sept. 1992, pp. 742–751. 

[14] S. B. Akers, B. Krishnamurthy, S. Park, and A. Swaminathan, “Why is less information from logic simulation 

more useful in fault simulation?”, in Proc. Intl. Test Conf., Sept. 1990, pp. 786–800. 

[15] R. C. Aitken and P. C. Maxwell, “Better models or better algorithms? Techniques to improve fault diagnosis”, 

Hewlett-Packard Journal, pp. 110–116, Feb. 1995. 

[16] T. Niermann and J. H. Patel, “HITEC: A test generation package for sequential circuits”, in Proc. European 

Design Automation Conf., Feb. 1991, pp. 214–218.

DAC'99, pages 666-671 

Simulation Vector Generation from HDL Descriptions for Observability-Enhanced 

Statement Coverage 

Farzan Fallah 

Fujitsu Labs. of America, Inc., Sunnyvale, CA 

Pranav Ashar 

CCRL, NEC USA, Princeton 

Srinivas Devadas 

Laboratory for Computer Science, MIT, Cambridge 

Abstract 

Validation of RTL circuits remains the primary bottleneck in improving design turnaround time, 

and simulation remains the primary methodology for validation. Simulation-based validation has 

suffered from a disconnect between the metrics used to measure the error coverage of a set of 

simulation vectors, and the vector generation process. This disconnect has resulted in the 

simulation of virtually endless streams of vectors which achieve enhanced error coverage only 

infrequently. Another drawback has been that most error coverage metrics proposed have either 

been too simplistic or too inefficient to compute. Recently, an effective observability-based 

statement coverage metric was proposed along with a fast companion procedure for evaluating it. 

The contribution of our work is the development of a vector generation procedure targeting the 

observability-based statement coverage metric. Our method uses repeated coverage computation 

to minimize the number of vectors generated. For vector generation, we propose a novel 

technique to set up constraints based on the chosen coverage metric. Once the system of 

interacting arithmetic and Boolean constraints has been set up, it can be solved using hybrid 

linear programming and Boolean satisfiability methods. We present heuristics to control the size 

of the constraint system that needs to be solved. We present experimental results which show the 

viability of automatically generating vectors using our approach for industrial RTL circuits. We 

envision our system being used during the design process, as well as during post-design 

debugging. 

References 

[1] R. C. Ho, C. H. Yang, M. A. Horowitz, and D. L. Dill, “Architecture Validation for Processors,” in Proceedings 

of the Annual Symposium on Computer Architecture, June 1995. 

[2] K.-T. Cheng and A. S. Krishnakumar, “Automatic Functional Test Generation Using the Extended Finite State 

Machine Model,” in Proceedings of the Design Automation Conference, pp. 86–91, June 1993. 

[3] S. Devadas, A. Ghosh, and K. Keutzer, “An Observability-Based Code Coverage Metric for Functional 

Simulation,” in Proceedings of the International Conference on Computer-Aided Design, pp. 418–425, November 

1996. 

[4] F. Fallah, S. Devadas, and K. Keutzer, “OCCOM: Efficient Computation of Observability-Based Code 

Coverage Metrics for Functional Simulation,” in Proceedings of the Design Automation Conference, pp. 152–157, 

June 1998. 

[5] F. Fallah, S. Devadas, and K. Keutzer, “Functional Test Generation Using Linear Programming and 3- 

Satisfiability,” in Proceedings of the Design Automation Conference, pp. 528–533, June 1998. 

[6] T. Larrabee, “Test Pattern Generation Using Boolean Satisfiability,” IEEE Transactions on Computer-Aided 

Design, vol. 11, pp. 4–15, January 1992. 

[7] D. E. Thomas and P. R. Moorby, The Verilog Hardware Description Language. Kluwer Academic Publishers, 

Boston, MA, second ed., 1994.

[8] R. K. Brayton and others, “VIS: A System for Verification and Synthesis,” in Proc. Computer-Aided 

Verification, vol. 1102, pp. 428–432, June 1996.

DAC'99, pages 672-677 

A Two-State Methodology for RTL Logic Simulation 

Lionel Bening 

Hewlett-Packard Company, Richardson, TX 75083-3851 

ABSTRACT 

This paper describes a two-state methodology for register transfer level (RTL) logic simulation 

in which the use of the X-state is completely eliminated inside ASIC designs. Examples are 

presented to show the gross pessimism and optimism that occur with the X in RTL simulation. 

Random two-state initialization is offered as a way to detect and diagnose startup problems in 

RTL simulation. Random two-state initialization (a) is more productive than the X-state in gate-level 

simulation, and (b) provides better coverage of startup problems than the X-state in RTL 

simulation. Consistent random initialization is applied (a) as a way to duplicate a startup state 

using a slower diagnosis-oriented simulator after a faster detection-oriented simulator reports the 

problem, and (b) to verify that the problem is corrected for that startup state after the design 

change intended to fix the problem. In addition to combining the earlier ideas of two-state 

simulation, and random initialization with consistent values across simulations, an original 

technique for treatment of tri-state Z's arriving into a two-state model is introduced. 

Keywords: RTL, simulation, 2-state, X-state, pessimism, optimism, random, initialization. 
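
The pessimism and optimism mentioned in the abstract, and the role of consistent random initialization, can be illustrated with a short Python sketch. The multiplexer example, the function names, and the register names are assumptions made for illustration and do not come from the paper. The RTL model assumes the usual Verilog rule that an `if` condition evaluating to X takes the else branch.

```python
import random

X = "X"  # the unknown state of three-valued simulation

def and3(a, b):
    if a == 0 or b == 0:
        return 0
    return 1 if (a == 1 and b == 1) else X

def or3(a, b):
    if a == 1 or b == 1:
        return 1
    return 0 if (a == 0 and b == 0) else X

def not3(a):
    return X if a == X else 1 - a

def mux_gate_level(sel, a, b):
    # (sel AND a) OR (NOT sel AND b), evaluated gate by gate.
    return or3(and3(sel, a), and3(not3(sel), b))

def mux_rtl_if(sel, a, b):
    # Mimics RTL "if (sel) q = a; else q = b;": an X condition is
    # treated as false, so the else branch is taken.
    return a if sel == 1 else b

def random_two_state_init(register_names, seed):
    # Same seed gives the same startup state in every simulator run.
    rng = random.Random(seed)
    return {name: rng.randint(0, 1) for name in register_names}

# Pessimism: both data inputs are 1, so real hardware outputs 1,
# yet gate-level X propagation reports X.
print(mux_gate_level(X, 1, 1))

# Optimism: the select is unknown, yet the RTL model returns a
# definite value and hides the startup problem.
print(mux_rtl_if(X, 1, 0))

# Consistent random initialization with hypothetical register names.
print(random_two_state_init(["r0", "r1", "r2"], seed=1999))
```

Reusing the seed is what lets a slower diagnosis-oriented simulator reproduce the startup state that a faster simulator flagged.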

REFERENCES 

[1] Ashar, P. and Malik, S., "Fast functional simulation using branching programs," Proc. IEEE ICCAD-96, pp. 408- 

412, November, 1996. 

[2] Breuer, M. A., “A note on three-valued logic simulation,” IEEE Trans. Computers, vol. C-21, pp. 399-402, 

April, 1972. 

[3] Evans, A., Silburt, A., Vrckovnik, G., Brown, T., Dufresne, M., Hall, G., Ho, T. and Liu, Y., "Functional 

verification of large ASICs," Proc. Design Automation Conference, pp. 650-655, June, 1998. 

[4] Fitzpatrick, T., “Verilog modeling style guide for the Cobra cycle simulator,” Cadence Design Systems, 

Chelmsford, MA, Rev. 2, pp. 11-12, August 17, 1998. 

[5] Foster, H. "Techniques for higher performance boolean equivalence verification," Hewlett-Packard Journal, 

August, 1998, pp. 30-38. 

[6] Hoehne, H. and Piloty, R., "Design verification at the register transfer language level," IEEE Trans. Computers, 

vol. C-24, pp. 861-867, September, 1975. 

[7] McGeer, P. C., McMillan, K. L., Saldanha, A., Sangiovanni-Vincentelli, A. L. and Scaglia, P., "Fast discrete 

function evaluation using decision diagrams," Proc. IEEE ICCAD-96, pp. 402-407, November, 1996. 

[8] System HILO - DWL Reference Manual, document 2523-0103, page 9.6, VEDA Design Automation Inc., 

Campbell, CA 95008, January, 1991. 

[9] Taylor, S., Quinn, M., Brown, D., Dohm, N., Hildebrandt, S., Huggins, J. and Ramey, C., "Functional 

verification of a multiple-issue out-of-order, superscalar Alpha processor -- the DEC Alpha 21264 microprocessor," 

Proc. Design Automation Conference, pp. 638-643, June, 1998. 

[10] Thomas, D. E., Moorby, P. R., The Verilog Hardware Description Language, Kluwer Academic Publishers, 

Norwell, MA 02061, pp. 136, 4th Edition, 1998. 

[11] VCS User’s Guide, Synopsys Inc., Mountain View, CA, pp. 2-19 – 2-30, December, 1998. 

[12] Yim, J. S., Hwang, Y. H., Park, C. J., Choi, H., Yang, W. S., Oh, H. S., Park, I. C. and Kyung, C. M., "A C-based 

RTL design verification methodology for Complex Microprocessor," Proc. Design Automation Conference, 

pp. 83-88, June, 1997.

DAC'99, pages 678-683 

An Approach for Extracting RT Timing Information to Annotate Algorithmic VHDL 

Specifications 

Cordula Hansen 

Forschungszentrum Informatik (FZI) at the University of Karlsruhe 

Francisco Nascimento 

University of Tübingen 

Wolfgang Rosenstiel 

University of Tübingen and FZI 

ABSTRACT 

This paper presents a new approach for extracting timing information defined in a simulation 

vector set at the register transfer level (RTL) and reusing it in the behavioral specification. 

Using a VHDL RTL simulation vector set and a VHDL behavioral specification as input, the 

timing information is extracted and, like the specification, transformed into a Partial Order 

based Model (POM). The POM expressing the timing information is then mapped onto the 

specification POM. The result contains the behavioral specification and the RTL timing and is 

transformed back into a corresponding VHDL specification. Additionally, timing information 

contained in the specification can be checked using the RTL simulation vectors. 

References 

[1] Bryant, R.E.: "Symbolic boolean manipulation with ordered binary-decision diagrams", ACM Computing 

Surveys 24, 3 (September 1992), pp. 293-318. 

[2] Garcez, E.; Nascimento, F.: "A Model Checker for a Partial Order based Model of Concurrency", Proceedings 

of Workshop of Beschreibungssprachen und Modellierungsparadigmen, March 1998. 

[3] Gupta, V.: “Chu Spaces: A Model of Concurrency”, PhD thesis, Department of Computer Science, Stanford 

University, Stanford, CA, USA, 1994. 

[4] Gutberlet, P.; Krämer, H.; Rosenstiel, W.: „CASCH - a Scheduling Algorithm for High Level -Synthesis“, 

Proceedings of the EDAC, pp. 311-315, February 1991. 

[5] Gutberlet, P.; Rosenstiel, W.: “Timing Preserving Interface Transformations for the Synthesis of Behavioural 

VHDL”, Proceedings of EURO-DAC, September 1994. 

[6] Hansen, C.; Nascimento, F.; Rosenstiel, W.: „Verifying High-Level Synthesis Results Using a Partial Order 

based Model“, HLDVT'98, La Jolla (CA), November 1998. 

[7] Heinkel, U.; Glauert, W.: „An Approach for a Dynamic Generation/ Validation System for the Functional 

Simulation Considering Timing Constraints“, Proceedings of ED & TC, Paris 1996. 

[8] Mayer, C.; Sahm, H.; Pleickhardt, J.: “A Graphical Data Management System for HDL-Based ASIC Design 

Projects”, Proceedings of EURO-DAC, September 1996. 

[9] Nascimento, F.; Rosenstiel, W.: “Partial Order Based Modeling of Concurrency at the System Level”, 

Proceedings of CONSYSE, September 1997.

DAC'99, pages 684-690 

A Massively-Parallel Easily-Scalable Satisfiability Solver Using Reconfigurable Hardware 

Miron Abramovici, Jose T. de Sousa, Daniel Saab* 

Bell Labs - Lucent Technologies, Murray Hill, NJ 07974 

*Case Western Reserve University, Cleveland, Ohio 44106 

ABSTRACT 

Satisfiability (SAT) is a computationally expensive algorithm central to many CAD and test 

applications. In this paper, we present the architecture of a new SAT solver using reconfigurable 

logic. Our main contributions include new forms of massive fine-grain parallelism and structured 

design techniques based on iterative logic arrays that reduce compilation times from hours to a 

few minutes. Our architecture is easily scalable. Our results show several orders of magnitude 

speed-up compared with a state-of-the-art software implementation, and with a prior SAT solver 

using reconfigurable hardware. 
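
For readers unfamiliar with the problem being accelerated, the following Python sketch shows a plain software backtracking search for satisfiability with unit-clause propagation. It is not the reconfigurable-hardware architecture of the paper. The function names and the clause encoding (lists of signed integers, in the spirit of the DIMACS CNF convention) are assumptions chosen for illustration, and the input is assumed to contain no empty clause.

```python
def simplify(clauses, literal):
    """Make `literal` true: drop satisfied clauses, shrink the others."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue  # clause already satisfied
        reduced = [lit for lit in clause if lit != -literal]
        if not reduced:
            return None  # empty clause: this assignment conflicts
        result.append(reduced)
    return result

def solve(clauses, assignment=None):
    """Return a satisfying (possibly partial) assignment, or None."""
    assignment = dict(assignment or {})
    if not clauses:
        return assignment  # every clause has been satisfied
    # A unit clause forces its literal; otherwise branch on a literal.
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    if unit is not None:
        choices = [unit]
    else:
        literal = clauses[0][0]
        choices = [literal, -literal]
    for choice in choices:
        reduced = simplify(clauses, choice)
        if reduced is not None:
            extended = {**assignment, abs(choice): choice > 0}
            result = solve(reduced, extended)
            if result is not None:
                return result
    return None  # both branches failed: backtrack

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(solve([[1, 2], [-1, 3], [-2, -3]]))
# x1 and not x1 is unsatisfiable
print(solve([[1], [-1]]))
```

The worst-case run time of such a search grows exponentially with the number of variables, which is the cost that motivates hardware acceleration.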

REFERENCES 

[1] M. Abramovici and D. Saab, “Satisfiability On Reconfigurable Hardware,” Proc. Intn’l. Workshop on Field- 

Programmable Logic and Applications, Sept., 1997 

[2] Miron Abramovici, J. T. de Sousa, “A Virtual Logic System for Solving Satisfiability Problems Using 

Reconfigurable Hardware,” to appear in Proc. Symp. on Field-Programmable Custom Computing Machines, 1999 

[3] R. Brayton, G. Hachtel, C. McMullen, and A. Sangiovanni-Vincentelli, Logic Minimization Algorithms for VLSI 

Synthesis, Kluwer Academic Publishers, 1984 

[4] S. A. Cook, “The Complexity of Theorem-Proving Procedures,” Proc. 3rd Annual ACM Symp. on Theory of 

Computation, pp. 151-158, 1971 

[5] M. Davis and H. Putnam, “A Computing Procedure for Quantification Theory,” Journal of the ACM, vol. 7, pp. 

167-187, 1960 

[6] S. Devadas, “Optimal Layout Via Boolean Satisfiability,” Proc. Intn’l. Conf. on CAD, pp. 294-297, November 

1989 

[7] S. Devadas, K. Keutzer, S. Malik, and A. Wang, “Certified Timing Verification and the Transition Delay of a 

Logic Circuit,” Proc. Design Automation Conf., pp. 549-555, June, 1992 

[8] DIMACS Challenge Benchmarks, ftp://dimacs.rutgers.edu/pub/challenge/sat/benchmarks/cnf/ 

[9] H. Fujiwara and T. Shimono, “On the Acceleration of Test Generation Algorithms,” IEEE Trans. on Computers, 

vol. C-32, no 12, pp. 1137-1144, December, 1983. 

[10] P. Goel, “An Implicit Enumeration Algorithm to Generate Tests for Combinational Logic Circuits,” IEEE 

Trans. on Computers, Vol. C-30, No. 3, pp. 215-222, March, 1981. 

[11] J. Gu, “Satisfiability Problems in VLSI Engineering,” DIMACS Workshop on Satisfiability Problem: Theory 

and Applications, March 1996 

[12] J. Gu, P. W. Purdom, J. Franco, and B. W. Wah, “Algorithms for the Satisfiability (SAT) Problem: A Survey,” 

DIMACS Workshop on Satisfiability Problem: Theory and Applications, pp.19-51, March 1996 

[13] J. Gu and R. Puri, “Asynchronous Circuit Synthesis with Boolean Satisfiability”, IEEE Trans. on CAD, Vol. 

14, No. 8, pp. 961-973, August 1995 

[14] T. Larrabee, “Test Pattern Generation Using Boolean Satisfiability,” IEEE Trans. on CAD, Vol. 11, No. 1, pp. 

4-15, January, 1992 

[15] P. C. McGeer et al., “Timing Analysis and Delay-Fault Test Generation Using Path Recursive Functions,” 

Proc. Intn’l. Conf. on CAD, pp. 180-183, November 1991 

[16] M. Platzner and G. De Micheli, “Acceleration of Satisfiability Algorithms by Reconfigurable Hardware,” Proc. 

Intn’l. Workshop on Field-Programmable Logic and Applications, Sept., 1998 

[17] A. Rashid, J. Leonard, and W.H. Mangione-Smith, “Dynamic Circuit Generation for Solving Specific Problem 

Instances of Boolean Satisfiability,” Proc. IEEE Symp. on Field-Programmable Custom Computing Machines, April 

1998 

[18] J. M. Silva, “An Overview of Backtrack Search Satisfiability Algorithms,” Proc. 5th Intn’l. Symp. on Artificial 

Intelligence and Mathematics, January 1998

[19] J. M. Silva and K. A. Sakallah, “GRASP - A New Search Algorithm for Satisfiability,” Proc. Intn’l. Conf. on 

CAD, pp. 220-227, November 1996 

[20] L. G. Silva et al., “Realistic Delay Modeling in Satisfiability-Based Timing Analysis,” Proc. Intn’l. Symp. on 

Circuits and Systems (ISCAS), May 1998 

[21] P. R. Stephan, R. K. Brayton, and A. Sangiovanni-Vincentelli, “Combinational Test Generation Using 

Satisfiability,” IEEE Trans. on CAD, vol. 15, no. 9, pp. 1167-1176, Sept. 1996. 

[22] T. Suyama, M. Yokoo, and H. Sawada, “Solving Satisfiability Problems on FPGAs,” Proc. Intn’l. Workshop on 

Field-Programmable Logic and Applications, 1996 

[23] G. Nam, K. A. Sakallah, and R.A. Rutenbar, “Satisfiability-Based Layout Revisited: Routing Complex FPGAs 

Via Search-Based Boolean SAT”, Proc. Intn’l. Symp. on FPGAs, February 1999 

[24] P. Zhong, M. Martonosi, P. Ashar, and S. Malik, “Accelerating Boolean Satisfiability with Configurable 

Hardware,” Proc. IEEE Symp. on Field-Programmable Custom Computing Machines, April, 1998 

[25] P. Zhong, M. Martonosi, P. Ashar, and S. Malik, “Using Reconfigurable Computing Techniques to Accelerate 

Problems in the CAD Domain: A Case Study with Boolean Satisfiability,” Proc. Design Automation Conf., June 

1998

DAC'99, pages 691-696 

Dynamic Fault Diagnosis on Reconfigurable Hardware 

Fatih Kocan and Daniel G. Saab 

Electrical Engineering and Computer Science Department 

Case Western Reserve University, Cleveland, Ohio, 44106 

Abstract: 

In this paper, we introduce a new approach for locating and diagnosing faults in combinational 

circuits. The approach is based on automatically designing a circuit which implements a closest-match 

fault location algorithm specialized for the combinational circuit under diagnosis (CUD). 

This approach eliminates the need for large storage required by a software based fault diagnosis. 

In this paper, we show the approach's feasibility in terms of hardware resources, speed, and how 

it compares with software based techniques. 
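
As a point of comparison, the software-based technique that the abstract contrasts with can be sketched as a closest-match lookup in a stored fault dictionary. The Python sketch below ranks modeled faults by the Hamming distance between their stored response and the observed response. The fault names, the signatures, and the function names are made up for illustration and are not taken from the paper.

```python
def hamming(a, b):
    """Number of positions where two equal-length responses differ."""
    return sum(x != y for x, y in zip(a, b))

def closest_match(dictionary, observed):
    """Return the faults whose stored response is nearest to `observed`."""
    distances = {fault: hamming(signature, observed)
                 for fault, signature in dictionary.items()}
    best = min(distances.values())
    candidates = sorted(f for f, d in distances.items() if d == best)
    return candidates, best

# Hypothetical dictionary: fault -> output response over the test set.
FAULT_DICTIONARY = {
    "n1/sa0": "0110",
    "n2/sa1": "1110",
    "n3/sa0": "0001",
}

print(closest_match(FAULT_DICTIONARY, "0111"))
```

The dictionary holds one response per modeled fault, so its size grows with both the fault list and the test set; this is the storage the hardware approach is said to avoid.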

References 

[1] M. Abramovici and D. G. Saab, "Satisfiability on Reconfigurable Hardware," Seventh International Workshop on 

Field Programmable Logic and Applications, September 1997. 

[2] M. Abramovici, M. A. Breuer and A. D. Friedman, Digital Systems Testing and Testable Design, IEEE Press, 

1994. 

[3] I. Pomeranz and S. M. Reddy, "On Dictionary-Based Fault Location in Digital Logic Circuits," IEEE Trans. on 

Computers, vol. 46, no. 1, pp. 48-59, January 1997. 

[4] P. G. Ryan, Shishpal Rawat and W. K. Fuchs, "Two-Stage Fault Location," Proc. 1991 ITC, pp. 963-968. 

[5] P. G. Ryan, W. K. Fuchs and I. Pomeranz, "Fault Dictionary Compression and Equivalence Class Computation 

for Sequential Circuits," Proc. 1993 ICCAD, pp. 508-511. 

[6] K. Kubiak, S. Parkes, W. K. Fuchs and R. Saleh, "Exact Evaluation of Diagnostic Test Resolution," Proc. 1992 

DAC, pp. 347-352. 

[7] M. Abramovici, "A Maximal Resolution Guided-Probe Testing Algorithm," Proc. 1981 DAC, pp. 189-195. 

[8] I. Hartanto, V. Boppana and W. K. Fuchs, “Diagnostic Fault Equivalence Identification Using Redundancy 

Information & Structural Analysis," Proc. 1996 ITC, pp. 294-301. 

[9] J. Richman and K. R. Bowden, "The Modern Fault Dictionary," Proc. 1985 ITC, pp. 696-702. 

[10] H. Cox and J. Rajski, “A Method of Fault Analysis for Test Generation and Fault Diagnosis," IEEE Trans. 

CAD, pp. 813-833, July 1988. 

[11] L. Burgun, F. Reblewski, G. Fenelon, J. Barbier, and O. Lepape, “Serial Fault Simulation," Proc. 1996 DAC, 

pp. 801-806. 

[12] M. Butts, J. Bacheler, and J. Varghese, “An Efficient Logic Emulation System," Proc. 1992 ICCD, pp. 138-141. 

[13] K.-T Cheng, S.-Y Huang, and W.-J Dai, "Fault Emulation: A New Approach to Fault Grading," Proc. 1995 

ICCAD, pp. 681-686, Nov. 

[14] I. Pomeranz, and S.M. Reddy, "On The Generation of Small Dictionaries for Fault Location," Proc. 1992 

ICCAD, pp. 272-279, Nov. 

[15] K. Takayama, F. Hirose, and N. Kawato, "A Test Generation System Using a Logic Simulation Engine," 

FUJITSU Sci. Tech. J., 27, 3, pp. 285-289, September 1991. 

[16] M. Abramovici, and M. A. Breuer, “Fault Diagnosis Based on Effect-Cause Analysis," Proc. 1980 DAC, pp. 

69-76. 

[17] R. Murgai, N. Shenoy, R. K. Brayton, and A. Sangiovanni-Vincentelli, "Improved Logic Synthesis Algorithms 

for Table Look Up Architecture," Proc. 1991 ICCAD, pp. 564-567. 

[18] P. Zhong, M. Martonosi, P. Ashar, and S. Malik, "Using Reconfigurable Computing Techniques to Accelerate 

Problems in the CAD Domain: A Case Study with Boolean Satisfiability," Proc. 1998 DAC. 

[19] M. Abramovici and P. Menon, "Fault Simulation on Reconfigurable Hardware," Proc. IEEE Symp. On FCCM, 

April 1997. 

[20] T. Suyama, M. Yokoo, and H. Sawada, "Solving Satisfiability Problems on FPGAs," Proc. Intn'l. Workshop on 

Field-Programmable Logic and Applications, 1996. 

[21] R.G. Wood and R.A. Rutenbar, "FPGA Routing and Routability Estimation Via Boolean Satisfiability," Proc. 

Intn'l. Symp. On FPGAs, February 1997.

DAC'99, pages 697-702 

Hardware Compilation for FPGA-based Configurable Computing Machines 

Xiaohan Zhu, Bill Lin 

University of California, San Diego 

Abstract 

Configurable computing machines are an emerging class of hybrid architectures where a field 

programmable gate array (FPGA) component is tightly coupled to a general-purpose 

microprocessor core. In these architectures, the FPGA component complements the general-purpose 

microprocessor by enabling a developer to construct application-specific gate-level 

structures on-demand while retaining the flexibility and rapid reconfigurability of a fully 

programmable solution. High computational performance can be achieved on the FPGA 

component by creating custom data paths, operators, and interconnection pathways that are 

dedicated to a given problem, thus enabling similar structural optimization benefits as ASICs. In 

this paper, we present a new programming environment for the development of applications on 

this new class of configurable computing machines. This environment enables developers to 

develop hybrid hardware/software applications in a common integrated development framework. 

In particular, the focus of this paper is on the hardware compilation part of the problem starting 

from a software-like algorithmic process-based specification. 

References 

[1] A. V. Aho et al. Compilers - principles, techniques, and tools, Reading: Addison-Wesley, 1986. 

[2] J. Bhasker. A Verilog HDL Primer. Prentice-Hall, 1997. 

[3] J. Bhasker. A VHDL Primer. Prentice-Hall, 1994. 

[4] R. Bittner, P. Athanas. “Wormhole Run-Time Reconfiguration”, ACM/SIGDA International Symposium on Field 

Programmable Gate Arrays, ACM, 1997. 

[5] R. Camposano and W. Wolf (editors), Trends in High-Level Synthesis, Kluwer Academic Publishers, 1993.

[6] G. de Jong, B. Lin. “A communicating Petri net model for the design of concurrent asynchronous modules”, 

ACM/IEEE Design Automation Conference, 1994. 

[7] A. DeHon. “DPGA-Coupled Microprocessors: Commodity ICs for the Early 21st Century”, Proc. of the IEEE 

Workshop on FPGAs for Custom Computing Machines, April 1994. 

[8] H. De Man, F. Catthoor, G. Goossens, J. Vanhoof, J. Van Meerbergen, S. Note, J.A. Huisken, “Architecture-driven

synthesis techniques for VLSI implementation of DSP algorithms”, Proceedings of IEEE, vol.72, no.2, 

pp.319-335, February 1990. 

[9] C. A. R. Hoare. Communicating Sequential Processes. Prentice-Hall, 1985.

[10] J. R. Hauser, J. Wawrzynek. “A MIPS Processor with a Reconfigurable Coprocessor” Proc. Symposium on 

Field-Programmable Custom Computing Machines (FCCM), April 16-18, 1997. 

[11] B. W. Kernighan, D. M. Ritchie. The C Programming Language, Prentice-Hall, Englewood Cliffs, New Jersey, 

1978. 

[12] J. Larus, SPIM (http://www.cs.wisc.edu/~larus/spim.html), a MIPS R2000/R3000 simulator.

[13] B. Lin. “Software synthesis of process-based concurrent programs”, ACM/IEEE Design Automation 


[14] D. P. Lopresti, “P-NAC: A Systolic Array for Comparing Nucleic Acid Sequences”, Computer, vol. 20(7), pp. 

98-99, 1993. 

[15] L. Moll, J. Vuillemin, P. Boucard. “High-energy Physics on DECPeRLe-1 Programmable 

Active Memory”, ACM International Symposium on FPGAs, Monterey, February 1995. 

[16] I. Page. “Constructing Hardware-Software Systems from a Single Description”, Submitted 

to VLSI signal processing, July 1995. 

[17] J.L. Peterson. Petri net Theory and Modeling of Systems, Prentice Hall, 1981.

[18] A.W. Roscoe, C. A. R. Hoare. “Laws of occam programming”, Theoretical Computer Science, 60, 177-229, 

(1988). 

[19] R. M. Stallman, Using and porting GNU CC, Free Software Foundation, June 1993. 

[20] S. Vercauteren, D. Verkest, G. de Jong, B. Lin, “Derivation of formal representations from process-based 

specification and implementation models”, Proc. of ISSS’97, September 1997. 

[21] J. Villasenor, B. Schoner, K. N. Chia, C. Zapata, H. J. Kim, C. Jones, S. Lansing, B. Mangione-Smith. 

“Configurable Computing Solutions for Automatic Target Recognition”, IEEE Symposium on FPGAs for Custom 

Computing Machines (FCCM’96), April 1996. 

[22] J. Vuillemin, P. Bertin, D. Roncin, M. Shand, H. Touati, P. Boucard. “Programmable Active Memories: 

Reconfigurable Systems Come of Age”, IEEE Transactions on VLSI Systems, March 1996, vol. 4, (no.1):56-69.

[23] M. J. Wirthlin, B. L. Hutchings. “DISC: The dynamic instruction set computer”, Field Programmable Gate 

Arrays (FPGAs) for Fast Board Development and Reconfigurable Computing, Proc. SPIE 2607, pp. 92-103 (1995). 

[24] R. D. Wittig, P. Chow, “OneChip: An FPGA Processor with Reconfigurable Logic”, IEEE Symposium on 

FPGAs for Custom Computing Machines, April 1996. 

[25] Altera Corporation (http://www.altera.com), RIPP10 Programming Board, California. 

[26] Exemplar (http://www.exemplar.com), Leonardo Spectrum, Alameda, CA. 

[27] Frontier Design System (http://www.frontierd.com), ART and DSPStation, Leuven, Belgium. 

[28] National Semiconductor Corporation, Napa 1000 Data Sheet,

(http://www.national.com/appinfo/milaero/napa1000), 1997.

[29] Synopsys (http://www.synopsys.com), Behavioral Compiler, Mountain View, CA. 

[30] Synopsys (http://www.synopsys.com), FPGA Express, Mountain View, CA. 

[31] Synplicity (http://www.synplicity.com), Synplify, Sunnyvale, CA. 

[32] Virtual Computer Corporation (http://www.vcc.com), California. 

[33] Xilinx (http://www.xilinx.com), Foundation Series Software, California. 

[34] Xilinx Corporation (http://www.xilinx.com), California.

DAC'99, pages 703-708 

0.18 µm CMOS and beyond 

D.J. Eaglesham 

Bell Labs, Lucent Technologies, Murray Hill, NJ 07974 

ABSTRACT 

As we move to the 0.18µm node and beyond, the dominant trend in device and process 

technology is a simple continuation of several decades of scaling. However, some serious 

challenges to straightforward scaling are on the horizon. This paper will review the present status 

of process technology and examine the likely departures from scaling in the various areas. The 

0.18µm node is seeing the first major new materials introduced into the Si process for many 

years in the interconnect, and major departures from the traditional process are being actively 

considered for the transistor. However, it is probable that scaling will continue to

dominate advanced processes for several generations to come. 

References 

1) Waskiewicz et al., Proceedings SPIE 1997, 3048. 

2) G. Timp, et al., Proceedings IEDM 1998, 615. 

3) I.C. Kizilyalli et al., VLSI Tech Digest 1998. 

4) M. Hargrove, et al., Proceedings 1998 IEDM, 627. 

5) E. Leobandung, et al., Proceedings 1998 IEDM, 403. 

6) R. Chau et al., Proceedings IEDM 1997, 591.

7) D.P. Monroe and J.M. Hergenrother, Proceedings 1998 IEEE Conference on Silicon-on-Insulator. 

8) H.-S. P. Wong, K.K. Chan, and Y. Taur, Proceedings IEDM 1997, 427. 

9) D. Hasimoto et al., Proceedings IEDM 1998, 1032. 

10) H. Takato et al., Proceedings IEDM 1988, 222.

DAC'99, pages 709-714 

SOI Digital CMOS VLSI - A Design Perspective 

C. T. Chuang and R. Puri 

IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, U. S. A. 

Abstract 

This paper reviews the recent advances of SOI for digital CMOS VLSI applications with 

particular emphasis on the design issues and advantages resulting from the unique SOI device 

structure. The technology/device requirements and design issues/challenges for high-performance,

general-purpose microprocessor applications are differentiated with respect to low-power

portable applications. Particular emphasis is placed on the impact of floating-body in

partially-depleted devices on the circuit operation, stability, and functionality. Unique SOI 

design aspects such as parasitic bipolar effect and hysteretic VT variation are addressed. Circuit 

techniques to improve the noise immunity and global design issues are discussed. 

References 

[1] C. Hu, "SOI and Device Scaling," Proc. IEEE International SOI Conf., 1998, pp. 1-4. 

[2] C. T. Chuang, P. F. Lu, and C. J. Anderson, "SOI for Digital CMOS VLSI: Design Considerations and 

Advances," Proc. IEEE, vol. 86, no. 4, April 1998, pp. 689-720. 

[3] B. Yu, et al., Int'l Semicon. Device Res. Symp., 1997, p. 623.

[4] H. S. Wong, K. C. Chan, and Y. Taur, "Self-Aligned (Top and Bottom) Double-Gate MOSFET with a 25 nm 

Thick Silicon Channel," Tech. Digest, IEDM, 1997, pp. 427-430. 

[5] L. Su, et al., Proc. IEEE Int'l SOI Conf., 1993, pp. 112-113.

[6] D. J. Schepis, et al., "A 0.25 µm CMOS SOI Technology and Its Application to 4 Mb SRAM," Tech. Digest,

IEDM, 1997, pp. 587-590. 

[7] F. Assaderaghi, et al., "A 7.9/5.5 psec Room/Low Temperature SOI CMOS," Tech. Digest, IEDM, 1997, pp. 

415-418. 

[8] E. Leobandung, M. Sherony, J. Sleight, R. Bolam, F. Assaderaghi, S. Wu, D. Schepis, A. Ajmera, W. Rausch, B.

Davari, and G. Shahidi, "Scalability of SOI Technology into 0.13 µm 1.2 V CMOS Generation," Tech. Digest,

IEDM, 1998, pp. 403-406. 

[9] T. Fuse, Y. Oowaki, M. Terauchi, S. Watanabe, M. Yoshimi, K. Ohuchi, and J. Matsunaga, "0.5V SOI CMOS 

Pass-Gate Logic," Digest Tech. Papers, ISSCC, 1996, pp. 88-89. 

[10] T. Douseki, S. Shigematsu, Y. Tanabe, M. Harada, H. Inokawa, and T. Tsuchiya, "A 0.5V SIMOX-MTCMOS 

Circuit with 200ps Logic Gate," Digest Tech. Papers, ISSCC, 1996, pp. 84-85. 

[11] M. Canada, et al., "A 580MHz RISC Microprocessor in SOI," Dig. Tech. Papers, ISSCC, 1999, pp. 430-431. 

[12] D. H. Allen, et al., "A 0.20 µm 1.8 V SOI 550MHz 64b PowerPC Microprocessor with Cu Interconnects," Dig.

Tech. Papers, ISSCC, 1999, pp. 438-439. 

[13] J. Sleight and K. Mistry, "A Compact Schottky Contact Technology for SOI Transistors," Tech. Digest, IEDM, 

1997, pp. 419-422. 

[14] K. Mistry, G. Grula, J. Sleight, L. Bair, R. Stephany, R. Flatley, and P. Skerry, "A 2.0V, 0.35 µm Partially 

Depleted SOI-CMOS Technology," Tech. Digest, IEDM, 1997, pp. 583-586. 

[15] Y. W. Kim, et al., "A 0.25 µm 600 MHz 1.5V SOI 64b ALPHA Microprocessor," Dig. Tech. Papers, ISSCC,

1999, pp. 432-433. 

[16] P. F. Lu, C. T. Chuang, J. Ji, L. F. Wagner, C. M. Hsieh, J. B. Kuang, L. Hsu, M. M. Pelella, S. Chu, and C. J. 

Anderson, "Floating Body Effects in Partially-Depleted SOI CMOS Circuits," IEEE J. Solid-State Circuits, vol. 32, 

no. 8, August 1997, pp. 1241-1253. 

[17] C. T. Chuang, P. F. Lu, J. Ji, L. F. Wagner, S. Chu, and C. J. Anderson, "Dual-Mode Parasitic Bipolar Effect in 

Dynamic CVSL XOR Circuit with Floating-Body Partially-Depleted SOI Devices," Proc. Tech. Papers, Int. Symp. 

on VLSI Tech., Syst., and Applications, Taipei, Taiwan, June 3-5, 1997, pp. 288-292. 

[18] A. Wei, D. A. Antoniadis, and L. A. Bair, "Minimizing Floating-Body-Induced Threshold Voltage Variation in 

Partially Depleted SOI CMOS," IEEE Electron Device Letters, vol. 17, no. 8, August 1996, pp. 391-394.

[19] T. W. Houston and S. Unnikrishnan, "A Guide to Simulation of Hysteretic Gate Delays Based on Physical 

Understanding," Proc. IEEE International SOI Conf., 1998, pp. 121-122. 

[20] M. M. Pelella, C. T. Chuang, J. G. Fossum, C. Tretz, B. W. Curran, and M. G. Rosenfield, "Hysteresis in

Floating-Body PD/SOI Circuits," Proc. Tech. Papers, Int. Symp. on VLSI Tech., Syst., and Applications, Taipei, 

Taiwan, June 8-10, 1999. 

[21] R. Puri and C. T. Chuang, "Hysteresis Effect in Pass-Transistor Based Partially-Depleted SOI CMOS Circuits," 

Proc. IEEE International SOI Conf., 1998, pp. 103-104. 

[22] A. Wei and D. Antoniadis, "Design Methodology for Minimizing Hysteretic VT-Variation in Partially-

Depleted SOI CMOS," Tech. Digest, IEDM, 1997, pp. 411-414. 

[23] Y. Tosaka, K. Suzuki, and T. Sugii, "α-Particle-Induced Soft Errors in Submicron SOI SRAM," Dig. Tech.

Papers, Symp. VLSI Technology, 1995, pp. 39-40. 

[24] T. Wada, et al., "A 128Kb SRAM with Soft Error Immunity for 0.35 µm SOI-CMOS Embedded Cell Arrays,"

Proc. IEEE International SOI Conf., 1998, pp. 127-128. 

[25] K. Kumagai, T. Yamada, H. Iwaki, H. Nakamura, and H. Onishi, "A New SRAM Cell Design Using 0.35 µm 

CMOS/SIMOX Technology," Proc. IEEE International SOI Conf., 1997, pp. 174-175.

DAC'99, pages 715-720 

Equivalent Elmore Delay for RLC Trees 

Yehea I. Ismail, Eby G. Friedman, and Jose L. Neves 1 

Department of Electrical and Computer Engineering, University of Rochester, 

Rochester, New York 14627 

1 IBM Microelectronics, East Fishkill, New York 12533 

Abstract 

Closed form solutions for the 50% delay, rise time, overshoots, and settling time of signals in an 

RLC tree are presented. These solutions have the same accuracy characteristics as the Elmore 

delay model for RC trees and preserve the simplicity and recursive characteristics of the Elmore

delay. The solutions introduced here consider all damping conditions of an RLC circuit including 

the underdamped response, which is not considered by the classical Elmore delay model due to 

the non-monotone nature of the response. Also, the solutions have significantly improved 

accuracy as compared to the Elmore delay for an overdamped response. The solutions introduced 

here for RLC trees can be practically used for the same purposes that the Elmore delay is used 

for RC trees. 

References 

[1] J. M. Rabaey, Digital Integrated Circuits, A Design Perspective, Prentice Hall, Inc., New Jersey, 1996. 

[2] D. A. Priore, “Inductance on Silicon for Sub-Micron CMOS VLSI,” Proceedings of the IEEE Symposium on 

VLSI Circuits, pp. 17-18, May 1993. 

[3] D. B. Jarvis, “The Effects of Interconnections on High-Speed Logic Circuits,” IEEE Transactions on Electronic 

Computers, Vol. EC-10, No. 4, pp. 476 - 487, October 1963. 

[4] A. Deutsch, et al., “High-Speed Signal Propagation on Lossy Transmission Lines,” IBM Journal of Research 

and Development, Vol. 34, No. 4, pp. 601 - 615, July 1990. 

[5] A. Deutsch, et al., “Modeling and Characterization of Long Interconnections for High-Performance 

Microprocessors,” IBM Journal of Research and Development, Vol. 39, No. 5, pp. 547 - 667, September 1995. 

[6] Y. I. Ismail, E. G. Friedman, and J. L. Neves, “Figures of Merit to Characterize the Importance of On-Chip 

Inductance,” Proceedings of the IEEE/ACM Design Automation Conference, pp. 560 – 565, June 1998. 

[7] M. P. May, A. Taflove, and J. Baron, “FD-TD Modeling of Digital Signal Propagation in 3-D Circuits with 

Passive and Active Loads,” IEEE Transactions on Microwave Theory and Techniques, Vol. MTT-42, No. 8, pp. 

1514 - 1523, August 1994. 

[8] T. Sakurai, “Approximation of Wiring Delay in MOSFET LSI,” IEEE Journal of Solid-State Circuits, Vol. SC- 

18, No. 4, pp. 418 - 426, August 1983. 

[9] G. Y. Yacoub, H. Pham, and E. G. Friedman, “A System for Critical Path Analysis Based on Back Annotation 

and Distributed Interconnect Impedance Models,” Microelectronic Journal, Vol. 18, No. 3, pp. 21 - 30, June 1988. 

[10] J. Torres, “Advanced Copper Interconnections for Silicon CMOS Technologies,” Applied Surface Science, Vol. 

91, No. 1, pp. 112 - 123, October 1995. 

[11] C. F. Webb et al., “A 400MHz S/390 Microprocessor,” Proceedings of the IEEE International Solid-State 

Circuits Conference, pp. 448 – 449, February 1997. 

[12] P. J. Restle and A. Deutsch, “Designing the Best Clock Distribution Network,” Proceedings of the IEEE VLSI

Circuit Symposium, pp. 2- 5, June 1998. 

[13] W. C. Elmore, “The Transient Response of Damped Linear Networks,” Journal of Applied Physics, Vol. 19, pp. 

55 - 63, January 1948. 

[14] J. L. Wyatt, Circuit Analysis, Simulation and Design, Elsevier Science Publishers, North-Holland, 1987. 

[15] J. Cong, L. He, C-K. Koh, and P. Madden, “Performance Optimization of VLSI Interconnect,” Integration, The 

VLSI Journal, Vol. 21, pp. 1 - 94, November 1996. 

[16] J. Cong and K. S. Leung, “Optimal Wire Sizing Under the Distributed Elmore Delay Model,” Proceedings of 

the IEEE/ACM International Conference on Computer-Aided Design, pp. 634 - 639, November 1995. 

[17] S. S. Sapatnekar, “RC Interconnect Optimization Under the Elmore Delay Model,” Proceedings of the 

IEEE/ACM Design Automation Conference, pp. 387 – 391, June 1994.

[18] J. Cong and L. He, “Optimal Wire Sizing for Interconnects with Multiple Sources,” Proceedings of the IEEE 

International Conference on Computer-Aided Design, pp. 586 – 574, November 1995 

[19] L. W. Nagel, “SPICE2: A Computer Program to Simulate Semiconductor Circuits,” Technical Report ERL- 

M520, UC Berkeley, May 1975.

[20] K. D. Boese, A. B. Kahng, B. A. McCoy, and G. Robins, “Fidelity and Near-Optimality of Elmore-Based 

Routing Constructions,” Proceedings of the IEEE International Conference on Computer Design, pp. 81 – 84, 

October 1993. 

[21] J. Cong, A. B. Kahng, C.-K. Koh and C.-W. A. Tsao, “Bounded-Skew Clock and Steiner Routing Under

Elmore Delay,” Proceedings of the IEEE International Conference On Computer-Aided Design, pp. 66 - 71, January 

1995. 

[22] L. P. P. P. van Ginneken, “Buffer Placement in Distributed RC-tree Networks for Minimal Elmore Delay,” 

Proceedings of the IEEE International Symposium on Circuits and Systems, pp. 865 - 868, May 1990. 

[23] C. J. Alpert, “Wire Segmenting for Improved Buffer Insertion,” Proceedings of the IEEE/ACM Design 


[24] J. Rubinstein, P. Penfield, Jr., and M. Horowitz, “Signal Delay in RC Tree Networks,” Proceedings of the 

IEEE/ACM Design Automation Conference, pp. 202 – 211, June 1983. 

[25] L. T. Pillage and R. A. Rohrer, “Delay Evaluation with Lumped Linear RLC Interconnect Circuit Models,” 

Proceedings of the Caltech Conference on VLSI, pp. 143-158, May 1989. 

[26] M. A. Horowitz, “Timing Models for CMOS Circuits,” Ph.D. Thesis, Stanford University, January 1984. 

[27] L. T. Pillage, R. A. Rohrer, and C. Visweswariah, Electronic Circuit and System Simulation Methods, McGraw- 

Hill, Inc., 1994. 

[28] L. T. Pillage and R. A. Rohrer, “Asymptotic Waveform Evaluation for Timing Analysis,” IEEE Transactions 

on Computer-Aided Design, Vol. CAD-9, No. 4, pp. 352 - 366, April 1990. 

[29] C. L. Ratzlaff, N. Gopal, and L. T. Pillage, “RICE: Rapid Interconnect Circuit Evaluator,” Proceedings of the 

IEEE/ACM Design Automation Conference, pp. 555 – 560, June 1991. 

[30] L. T. Pillage, “Coping with RC(L) Interconnect Design Headaches,” Proceedings of the IEEE/ACM 

International Conference on Computer-Aided Design, pp. 246 – 253, September 1995. 

[31] AS/X User’s Guide, IBM Corporation, New York, 1996. 

[32] B. C. Kuo, Automatic Control Systems, A Design Perspective, Prentice Hall of India, New Delhi, India, 1989.

DAC'99, pages 721-724 

Effects of Inductance on the Propagation Delay and Repeater Insertion in VLSI Circuits 

Yehea I. Ismail and Eby G. Friedman 


University of Rochester, Rochester, New York 14627 

Abstract 

A closed form expression for the propagation delay of a CMOS gate driving a distributed RLC 

line is introduced that is within 5% of dynamic circuit simulations for a wide range of RLC 

loads. It is shown that the traditional quadratic dependence of the propagation delay on the length 

of an RC line approaches a linear dependence as inductance effects increase. The closed form 

delay model is applied to the problem of repeater insertion in RLC interconnect. Closed form 

solutions are presented for inserting repeaters into RLC lines that are highly accurate with 

respect to numerical solutions. An RC model as compared to an RLC model creates errors of up 

to 30% in the total propagation delay of a repeater system. Considering inductance in repeater 

insertion is also shown to significantly save repeater area and power consumption. The error 

between the RC and RLC models increases as the gate parasitic impedances decrease, which is

consistent with technology scaling trends. Thus, the importance of inductance in high 

performance VLSI design methodologies will increase as technologies scale. 

References 

[1] D. A. Priore, “Inductance on Silicon for Sub-Micron CMOS VLSI,” Proceedings of the IEEE Symposium on 

VLSI Circuits, pp. 17-18, May 1993. 

[2] M. P. May, A. Taflove, and J. Baron, “FD-TD Modeling of Digital Signal Propagation in 3-D Circuits with 

Passive and Active Loads,” IEEE Transactions on Microwave Theory and Techniques, Vol. MTT-42, No. 8, pp. 

1514 - 1523, August 1994. 

[3] T. Sakurai, “Approximation of Wiring Delay in MOSFET LSI,” IEEE Journal of Solid-State Circuits, Vol. SC- 

18, No. 4, pp. 418 - 426, August 1983. 

[4] G. Y. Yacoub, H. Pham, and E. G. Friedman, “A System for Critical Path Analysis Based on Back Annotation 

and Distributed Interconnect Impedance Models,” Microelectronic Journal, Vol. 18, No. 3, pp. 21 - 30, June 1988. 

[5] M. Shoji, High-Speed Digital Circuits, Addison Wesley, Massachusetts, 1996. 

[6] J. Torres, “Advanced Copper Interconnections for Silicon CMOS Technologies,” Applied Surface Science, Vol. 

91, No. 1, pp. 112 - 123, October 1995. 

[7] A. Deutsch et al., “When are Transmission-Line Effects Important for On-Chip Interconnections?,” IEEE 

Transactions on Microwave Theory and Techniques, Vol. MTT-45, No. 10, pp. 1836-1846, October 1997. 

[8] Y. I. Ismail, E. G. Friedman, and J. L. Neves, “Figures of Merit to Characterize the Importance of On-Chip 

Inductance,” Proceedings of the IEEE/ACM Design Automation Conference, pp. 560 – 565, June 1998. 

[9] H. B. Bakoglu and J. D. Meindl, “Optimal Interconnection Circuits for VLSI,” IEEE Transactions on Electron 

Devices, Vol. ED-32, No. 5, pp. 903 - 909, May 1985. 

[10] V. Adler and E. G. Friedman, “Repeater Design to Reduce Delay and Power in Resistive Interconnect,” IEEE 

Transactions on Circuits and Systems II: Analog and Digital Signal Processing, Vol. CAS-45, No. 5, pp. 607 - 616, 

May 1998. 

[11] H. B. Bakoglu, Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley Publishing Company, 

1990. 

[12] L. N. Dworsky, Modern Transmission Line Theory and Applications, John Wiley & Sons, Inc., New York, 

1979. 

[13] W. C. Elmore, “The Transient Response of Damped Linear Networks,” Journal of Applied Physics, Vol. 19, pp. 

55 - 63, January 1948. 

[14] AS/X User’s Guide, IBM Corporation, New York, 1996.

DAC'99, pages 725-730 

Retiming for DSM with Area-Delay Trade-offs and Delay Constraints 

Abdallah Tabbara, Robert K. Brayton, A. Richard Newton 

Department of Electrical Engineering and Computer Sciences, 


Abstract 

The concept of improving the timing behavior of a circuit by relocating registers is called 

retiming and was first presented by Leiserson and Saxe. They showed that the problem of 

determining an equivalent minimum area (total number of registers) circuit is polynomial-time 

solvable. In this work we show how this approach can be reapplied in the DSM domain when 

area-delay trade-offs and delay constraints are considered. The main result is that the concavity 

of the trade-off function allows for a casting of this DSM problem into a classical minimum area 

retiming problem that is polynomial-time solvable.

References 

[1] R. Alur, "Timed Automata", NATO-ASI Summer School on Verification of Digital and Hybrid Systems, 1998. 

[2] R.B. Deokar and S.S. Sapatnekar, "A Fresh Look at Retiming via Clock Skew Optimization", DAC pp. 310-315, 

1995. 

[3] F. Eory, "Systems to Silicon Design: Methodology for Deep Sub-micron ASICs", SuperCon, 1997. 

[4] C.E. Leiserson and J.B. Saxe, "Retiming Synchronous Circuitry", Algorithmica, vol. 6, pp. 5-35, 1991. 

[5] N. Maheshwari and S.S. Sapatnekar, "An Improved Algorithm for Minimum-Area Retiming", DAC, 1997. 

[6] J.B. Orlin, "A Faster Strongly Polynomial Minimum Cost Flow Algorithm", Operations Research, vol.41, no.2, 

pp. 338-50, 1993. 

[7] R.H.J.M. Otten and R.K. Brayton "Planning for Performance", DAC, 1998. 

[8] Y. Pinto, R. Shamir, "Efficient Algorithms for Minimum-Cost Flow Problems with Piecewise-Linear Convex 

Costs", Algorithmica, vol.11, pp. 256-277, 1994. 

[9] E. Sentovich, "Sequential Circuit Synthesis at the Gate Level", Ph.D. Thesis, UC Berkeley, Chap. 5, 1993. 

[10] N. Shenoy and R. Rudell, "Efficient Implementation of Retiming", ICCAD, pp. 226-233, 1994. 

[11] "National Technology Roadmap for Semiconductors", Semiconductor Industry Association, 4300 Stevens 

Creek Blvd., Suite 271, San Jose, CA 95129.

DAC'99, pages 731-736 

Functional Timing Analysis for IP Characterization 

Hakan Yalcin†, Mohammad Mortazavi†, Robert Palermo†, Cyrus Bamji†, Karem Sakallah‡ 

† Cadence Design Systems, San Jose, CA 95134 

‡ Dept. of Electrical Eng. and Computer Science, University of Michigan, Ann Arbor, MI 48109 

ABSTRACT 

A method that characterizes the timing of Intellectual Property (IP) blocks while taking into 

account IP functionality is presented. IP blocks are assumed to have multiple modes of operation 

specified by the user. For each mode, our method calculates IO path delays and timing 

constraints to generate a timing model. The method thus captures the mode-dependent variation 

in IP delays which, according to our experiments, can be as high as 90%. The special manner in 

which delay calculation is performed guarantees that IP delays are never underestimated. The 

resulting timing models are also compacted through a process whose accuracy is controlled by 

the user. 

Keywords: Timing analysis, false path, functional (mode) dependency, IP characterization. 

REFERENCES 

[1] D. Brand, V. Iyengar, “Timing Analysis Using Functional Relationships,” Proc. Int’l. Conf. on Computer Aided 

Design, 1986, pp. 126-129. 

[2] H.-C. Chen, and D.H.C. Du, “Path Sensitization in Critical Path Problem,” IEEE Trans. on CAD, vol. 12, Feb. 

1993, pp. 196-207. 

[3] V.H. Hrapcenko, “Depth and Delay in a Network,” Soviet Math Dokl., vol. 19, 1978, pp. 1006-1009. 

[4] M. Hansen, H. Yalcin, J. Hayes, “Unveiling the ISCAS-85 Benchmarks: A Case Study in Reverse Engineering,”

IEEE Design and Test, 1999, to appear. 

[5] Y. Kukimoto, R.K. Brayton, “Hierarchical Functional Timing Analysis,” Proc. Design Automation Conf., 1998, 

pp. 580-585. 

[6] Y. Kukimoto, W. Gosti, A. Saldanha, R.K. Brayton, “Approximate Timing Analysis of Combinational Circuits 

Under the XBD0 Model,” Proc. Int’l. Conf. on Computer Aided Design, 1997, pp. 176-181. 

[7] P.C. McGeer, R.K. Brayton, Integrating Functional and Temporal Domains in Logic Design: The False Path 

Problem and Its Implications, Kluwer Academic Publishers, Boston, 1991. 

[8] T.M. McWilliams, “Verification of Timing Constraints on Large Digital Systems,” Proc. Design Automation 

Conf., 1980, pp. 139-147. 

[9] J. P. M. Silva, K. Sakallah, “Efficient and Robust Test Generation-Based Timing Analysis,” Proc. Int’l Symp. on 

Circuits and Systems, 1994. 

[10] H. Yalcin, J. Hayes, “Hierarchical Timing Analysis Using Conditional Delays,” Proc. Int’l. Conf. on Computer

Aided Design, 1995, pp. 371-377. 

[11] H. Yalcin, J. Hayes, K. Sakallah, “Approximate Timing Analysis For Datapath Circuits,” Proc. Int’l. Conf. on

Computer Aided Design, 1996, pp. 114-118. 

[12] H. Yalcin, M. Mortazavi, R. Palermo, C. Bamji, K. Sakallah, “Quantization-Based Timing Model Reduction,”

in preparation.

DAC'99, pages 737-741 

Detecting False Timing Paths: Experiments on PowerPC™ Microprocessors

Richard Raimi*, Jacob Abraham** 

*Motorola Corp., Austin, TX 78730 

**Computer Engineering Research Center, The University of Texas at Austin, Austin, TX 78712 

Abstract 

We present a new algorithm for detecting both combinationally and sequentially false timing 

paths, one in which the constraints on a timing path are captured by justifying symbolic functions 

across latch boundaries. 

We have implemented the algorithm and we present, here, the results of using it to detect false 

timing paths on a recent PowerPC microprocessor design. We believe these are the first 

published results showing the extent of the false path problem in industry. Our results suggest 

that the reporting of false paths may be compromising the effectiveness of static timing analysis. 

References 

[1] T. Chakraborty, V. Agrawal, Effective Path Selection for Delay Fault Testing of Sequential Circuits.

International Test Conference, 1997. 

[2] A. Biere, A. Cimatti, E. M. Clarke, M. Fujita, Y. Zhu, Symbolic Model Checking using SAT procedures instead of

BDDs, Proceedings Design Automation Conference, 1999.

[3] R. E. Bryant, Graph-Based Algorithms for Boolean Function Manipulation. IEEE Transactions on Computers, 

Vol. C-35, No. 8, August, 1986. 

[4] M. Davis, H. Putnam, A Computing Procedure for Quantification Theory. Journal of the Association for 

Computing Machinery, vol. 7, 1960. 

[5] H. Chang, J. Abraham, An Efficient Critical Path Tracing Algorithm for Designing High Performance VLSI

Systems. Journal of Electronic Testing, Theory and Applications, 11, pp. 119-129, Kluwer Academic Publishers, 

1997. 

[6] H. Chang, Strategies for Design and Test of High Performance Systems. Ph.D. Dissertation, The University of 

Texas at Austin, August 1993. 

[7] S. Jha, Y. Lu, M. Minea, E. Clarke, Equivalence Checking Using Abstract BDDs, ICCD, 1997.

[8] Y. Kukimoto, R. Brayton, Exact Required Time Analysis via False Path Detection. Design Automation

Conference, 1997. 

[9] C.-J. Seger, Voss – A Formal Hardware Verification System User’s Guide, Tech. Rep. 93-45, Dept. of Comp. Sci.,

Univ. of British Columbia, 1993. 

[10] P. McGeer, A. Saldanha, R. Brayton, A. Sangiovanni-Vincentelli, Delay Models and Exact Timing Analysis. In

T. Sasao, editor, Logic Synthesis and Optimization, pp. 167-189, Kluwer Publishers, 1993.

DAC'99, pages 742-747 

On ILP Formulations for Built-In Self-Testable Data Path Synthesis 

Han Bin Kim, Dong Sam Ha 

Dept. of Electr. & Comput. Eng., Virginia Tech, Blacksburg, VA 24061-0111 

Takeshi Takahashi 

Advantest Lab. Ltd., 48-2 Matsubara, Kamiayashi, Aoba-ku, Sendai, Miyagi 989-31, Japan 

ABSTRACT 

In this paper, we present a new method for built-in self-testable data path synthesis based on

integer linear programming (ILP). Our method performs system register assignment, built-in self-test

(BIST) register assignment, and interconnection assignment concurrently to yield optimal

designs. Our experimental results show that our method successfully synthesizes BIST circuits

for all six circuits tested. All the BIST circuits are better in area overhead than those

generated by existing high-level BIST synthesis methods. 

Keywords: high-level BIST synthesis, built-in self-test, BIST, ILP. 

REFERENCES 

[1] C.A. Papachristou, S. Chiu, and H. Harmanani, "A data path synthesis method for self-testable designs," Proc.

28th Design Automation Conf., pp. 378-384, June 1991.

[2] H. Harmanani and C.A. Papachristou, "An improved method for RTL synthesis with testability tradeoff," Intl. 

Conf. on Computer-Aided Design, pp. 30-35, Nov. 1993. 

[3] L.J. Avra, "Allocation and assignment in high-level synthesis for self-testable data paths," Proc. Int. Test Conf., 

pp. 463-472, Oct. 1991. 

[4] I. Parulkar, S. Gupta, and M.A. Breuer, “Data path allocation for synthesizing RTL designs with low BIST area 

overhead,” Proc. 32nd Design Automation Conf., pp. 395-401, June 1995. 

[5] A. Orailoglu and I.G. Harris, “Microarchitectural synthesis for rapid BIST testing,” IEEE Trans. Computer- 

Aided Design, Vol.16, No. 6, pp. 573-586, June 1997. 

[6] H.B. Kim, T. Takahashi, and D.S. Ha, “Test session oriented built-in self-testable data path synthesis,” Proc. Int.

Test Conf., pp. 154-163, Oct. 1998. 

[7] L. Hafer and A. Parker, “A formal method for the specification, analysis, and design of register-transfer level 

digital logic,” IEEE Trans. on Computer-Aided Design, Vol. 2, pp. 4-18, Jan. 1983. 

[8] C.H. Gebotys and M.I. Elmasry, “Global optimization approach for architecture synthesis,” IEEE Trans. 

Computer-Aided Design, Vol. CAD-12, pp.1266-1278, Sept. 1993. 

[9] M. Rim, A. Mujumdar, R. Jain, and R. DeLeone, “Optimal and heuristic algorithms for solving the binding 

problem,” IEEE Trans. on VLSI Systems, Vol. 2, No. 2, pp. 211-225, June 1994. 

[10] G. De Micheli, Synthesis and Optimization of Digital Circuits, McGraw Hill, 1994.

[11] B. Koenemann, J. Mucha, and G. Zwiehoff, “Built-in logic block observation techniques,” Proc. Int. Test Conf.,

pp. 37-41, Oct. 1979. 

[12] L.-T. Wang and E.J. McCluskey, “Concurrent built-in logic block observer (CBILBO),” Proc. Int. Symp. on 

Circuits and Systems, pp. 1054-1057, May 1986. 

[13] CPLEX 6.0 Reference Manual, ILOG, 1998. 

[14] M. Potkonjak and J. Rabaey, “A scheduling and resource allocation algorithm for hierarchical signal flow 

graphs,” Proc. 26th Design Automation Conf., pp. 7-12, June 1989.

DAC'99, pages 748-753 

Improving The Test Quality for Scan-based BIST Using A General Test 

Application Scheme 

Huan-Chih Tsai*, Kwang-Ting Cheng*, Sudipta Bhawmik** 

*Department of ECE, University of California, Santa Barbara, CA 93106 

**Bell Laboratories, Lucent Technologies, Princeton, NJ 08542 

Abstract 

In this paper, we propose a general test application scheme for existing scan-based BIST 

architectures. The objective is to further improve the test quality without inserting additional 

logic to the Circuit Under Test (CUT). The proposed test scheme divides the entire test process 

into multiple test sessions. A different number of capture cycles is applied after scanning in a test 

pattern in each test session to maximize the fault detection for a distinct subset of faults. We 

present a procedure to find the optimal number of capture cycles following each scan sequence 

for every fault. Based on this information, the number of test sessions and the number of capture 

cycles after each scan sequence are determined to maximize the random testability of the CUT. 

We conduct experiments on ISCAS89 benchmark circuits to demonstrate the effectiveness of our 

approach. 

References 

[1] V.D. Agrawal, C.R. Kime, and K.K. Saluja, “A Tutorial on Built-In Self-Test, Part 2: Applications,” IEEE 

Design & Test of Computers, vol. 10, no. 2, pp. 69–77, June 1993. 

[2] B. Konemann, J. Mucha, and C. Zwiehoff, “Built-In Logic Block Observation Technique,” Digest of Papers 

1979 Test Conf., pp. 37–41, Oct. 1979. 

[3] A. Krasniewski and S. Pilarski, “Circular Self-Test Path: A Low-Cost BIST Technique for VLSI Circuits,” IEEE 

Trans. on CAD, vol. 8, no. 1, pp. 46–55, Jan. 1989. 

[4] P.H. Bardell and W.H. McAnney, “Self-Testing of Multichip Logic Modules,” Digest of Papers 1982 Int’l Test 

Conf., pp. 200–204, Nov. 1982. 

[5] C.-J. Lin, Y. Zorian, and S. Bhawmik, “Integration of Partial Scan and Built-In Self-Test,” JETTA, vol. 7, no. 1–2, 

pp. 125–137, Aug. 1995. 

[6] P.H. Bardell, “Design Considerations for Parallel Pseudo-Random Pattern Generators,” JETTA, vol. 1, 

pp. 73–87, Feb. 1990. 

[7] Y. Zorian and A. Ivanov, “Programmable Space Compaction for BIST,” Proc. of Int’l Symp. on Fault-Tolerant 

Computing, pp. 340–349, June 1993. 

[8] J.A. Waicukauski and E. Lindbloom, “Fault Detection Effectiveness of Weighted Random Patterns,” Proc. of 

ITC, pp. 245–255, 1988. 

[9] I. Pomeranz and S.M. Reddy, “3-weight Pseudo-random Test Generation Based on A Deterministic Test Set for 

Combinational and Sequential Circuits,” IEEE Trans. on CAD, vol. 12, pp. 1050–1058, July 1993. 

[10] M. Bershteyn, “Calculation of Multiple Sets of Weighted Random Testing,” Proc. of ITC, pp. 1031–1040, Oct. 

1993. 

[11] R. Kapur, S. Patil, T.J. Snethen, and T.W. Williams, “A Weighted Random Pattern Generation System,” IEEE 

Trans. on CAD, vol. 15, no. 8, pp. 1020–1025, Aug. 1996. 

[12] N. Tamarapalli and J. Rajski, “Constructive Multi-Phase Test Point Insertion for Scan-Based BIST,” Proc. of 

ITC, pp. 649–658, Oct. 1996. 

[13] H.-C. Tsai, S. Bhawmik, and K.-T. Cheng, “An Almost Fullscan BIST Solution — Higher Fault Coverage and 

Shorter Test Application Time,” Proc. of ITC, pp. 1065–1073, Oct. 1998. 

[14] F. Brglez, “On Testability of Combinational Networks,” Proc. of ISCAS, pp. 221–225, May 1984. 

[15] K.-T. Cheng and C.-J. Lin, “Timing-Driven Test Point Insertion for Full-Scan and Partial-Scan BIST,” Proc. of 

ITC, pp. 506–514, Oct. 1995. 

[16] H.-C. Tsai, K.-T. Cheng, C.-J. Lin, and S. Bhawmik, “A Hybrid Algorithm for Test Point Selection for Scan- 

Based BIST,” Proc. of DAC, pp. 478–483, June 1997.

DAC'99, pages 754-759 

Built-In Test Sequence Generation for Synchronous Sequential Circuits 

Based on Loading and Expansion of Test Subsequences 

Irith Pomeranz and Sudhakar M. Reddy 

Electrical and Computer Engineering Department, University of Iowa, Iowa City, IA 52242 

Abstract 

We describe an on-chip test generation scheme for synchronous sequential circuits that allows at-speed 

testing of such circuits. The proposed scheme is based on loading of (short) input 

sequences into an on-chip memory, and expansion of these sequences on-chip into test 

sequences. Complete coverage of modeled faults is achieved by basing the selection of the 

loaded sequences on a deterministic test sequence T0, and ensuring that every fault detected by 

T0 is detected by the expanded version of at least one loaded sequence. Experimental results 

presented for benchmark circuits show that the length of the sequence that needs to be stored at 

any time is on the average 10% of the length of T0, and that the total length of all the loaded 

sequences is on the average 46% of the length of T0. 

References 

[1] P. C. Maxwell, R. C. Aitken, K. R. Kollitz and A. C. Brown, "IDDQ and AC Scan: The War Against 

Unmodelled Defects", in Proc. 1996 Intl. Test Conf., Oct. 1996, pp. 250-258. 

[2] "Best Methods for At-Speed Testing?", Panel 3, 16th VLSI Test Symp., April 1998, p. 460. 

[3] L. Nachman, K. K. Saluja, S. Upadhyaya and R. Reuse, "Random Pattern Testing for Sequential Circuits 

Revisited", in Proc. 26th Fault-Tolerant Computing Symp., June 1996, pp. 44-52. 

[4] I. Pomeranz and S. M. Reddy, "Built-In Test Generation for Synchronous Sequential Circuits", in Proc. Intl. 

Conf. on Computer-Aided Design, Nov. 1997, pp. 421-426. 

[5] V. Iyengar, K. Chakrabarty, and B. T. Murray, "Built-in Self Testing of Sequential Circuits Using Precomputed 

Test Sets," in Proc. VLSI Test Symp., April 1998, pp. 418-422. 

[6] I. Pomeranz and S. M. Reddy, "A Learning-Based Method to Match a Test Pattern Generator to a Circuit-Under- 

Test", in Proc. 1993 Intl. Test Conf., Oct. 1993, pp. 998-1007. 

[7] S. Gupta, J. Rajski and J. Tyszer, "Arithmetic Additive Generators of Pseudo-Exhaustive Test Patterns", IEEE 

Trans. on Computer-Aided Design, Aug. 1996, pp. 939-949. 

[8] R. Dandapani, J. H. Patel and J. A. Abraham, "Design of Test Pattern Generation for Built-In Test", in Proc. Intl. 

Test Conf., 1984, pp. 315-319. 

[9] K.-H. Tsai, S. Hellebrand, J. Rajski and M. Marek-Sadowska, "STARBIST: Scan Autocorrelated Random 

Pattern Generation", in Proc. 34th Design Autom. Conf., June 1997, pp. 472-477. 

[10] K.-H. Tsai, J. Rajski and M. Marek-Sadowska, "Scan Encoded Test Pattern Generation for BIST", in Proc. Intl. 

Test Conf., Oct. 1997, pp. 548-556. 

[11] M. S. Hsiao, E. M. Rudnick, and J. H. Patel, "Sequential Circuit Test Generation Using Dynamic State 

Traversal", in Proc. 1996 Europ. Design & Test Conf., March 1996, pp. 22-28. 

[12] I. Pomeranz and S. M. Reddy, "Vector Restoration Based Static Compaction of Test Sequences for 

Synchronous Sequential Circuits", in Proc. Intl. Conf. on Computer Design, Oct. 1997, pp. 360-365.

DAC'99, pages 760-765 

Analysis of Performance Impact Caused by Power Supply 

Noise in Deep Submicron Devices 

Yi-Min Jiang, Kwang-Ting Cheng 

Department of Electrical & Computer Engineering, 

University of California, Santa Barbara, CA 93106 

Abstract 

The paper addresses the problem of analyzing the performance degradation caused by noise in 

power supply lines for deep submicron CMOS devices. We first propose a statistical modeling 

technique for the power supply noise including inductive ΔI noise and power net IR voltage 

drop. The model is then integrated with a statistical timing analysis framework to estimate the 

performance degradation caused by the power supply noise. Experimental results of our analysis 

framework, validated by HSPICE, for benchmark circuits implemented on both 0.25 µm, 2.5 V and 

0.55 µm, 3.3 V technologies are presented and discussed. The results show that on average, with 

the consideration of this noise effect, the circuit critical path delays increase by 33% and 18%, 

respectively for circuits implemented on these two technologies. 

References 

[1] R. B. Brashear, N. Menezes, C. Oh, L. T. Pillage, and M. R. Mercer, “Predicting Circuit Performance Using 

Circuit-level Statistical Timing Analysis,” Proceedings of the European Design and Test Conference, pp. 332-337, 

1994. 

[2] K.-T. Cheng, A. Krstic and H.-C. Chen, “Generation of High Quality Tests for Robustly Untestable Path Delay 

Faults,” IEEE Transactions on Computers, Vol. 45, No.12, pp. 1379-1392, December 1996. 

[3] D. E. Goldberg, R. Burch, Genetic Algorithms in Search, Optimization, and Machine Learning, Reading, MA: 

Addison-Wesley, 1989. 

[4] Y.-M. Jiang, K.-T. Cheng, and A. Krstic, “Estimation of Maximum Power and Instantaneous Current Using a 

Genetic Algorithm,” Proc. of IEEE Custom Integrated Circuits Conference, pp. 135-138, May 1997. 

[5] H.-F. Jyu, S. Malik, S. Devadas, and K.W. Keutzer, “Statistical Timing Analysis of Combinational Logic 

Circuits,” IEEE Transactions on VLSI Systems, Vol. 1, No 2, pp. 126-137, June 1993. 

[6] H.-F. Jyu and S. Malik, “Statistical Delay Modeling in Logic Design and Synthesis,” Proceedings of Design 


[7] SYNOPSYS, “PowerMill Reference Manual,” August 1998. 

[8] D. R. Tryon, F. M. Armstrong, and M. R. Reiter, “Statistical Failure Analysis of System Timing,” IBM J. Res. 

Develop., pp. 340-355, July 1984. 

[9] G. de Veciana, M. Jacome, and J.-H. Guo, “Hierarchical Algorithms for Assessing Probabilistic Constraints on 

System Performance,” Proceedings of Design Automation Conference, pp. 251-256, June 1998.

DAC'99, pages 766-771 

A Floorplan-based Planning Methodology for Power and Clock Distribution in ASICs 

Joon-Seo Yim, Seong-Ok Bae, Chong-Min Kyung* 

DSP Group, Information Technology Lab., LG Corporate Institute of Technology, 16, 

Woomyeon-Dong, Seocho-Gu, Seoul, 137-140, Korea 

*Department of Electrical Engineering, KAIST, 373-1, Kusong-Dong, Yusong-Gu, 

Taejon, 305-701, Korea 

Abstract 

In deep submicron technology, IR-drop and clock skew issues become more crucial to the 

functionality of a chip. This paper presents a floorplan-based power and clock distribution 

methodology for ASIC design. From the floorplan and the estimated power consumption, the 

power network size is determined at an early design stage. Next, without a detailed gate-level 

netlist, the clock interconnect sizing and the number and strength of clock buffers are planned for 

balanced clock distribution. This early planning methodology at the full-chip level enables us to 

fix the global interconnect issues before the detailed layout composition is started. 

References 

[1] Semiconductor Industry Association, National Technology Roadmap for Semiconductors, 1994 

[2] William E.Guthrie et al. “Noise and Signal Integrity in Deep Submicron Design", Proc. 34th DAC, pp.720-721, 

1997 

[3] David Blaauw, “IR-Drop Analysis Signal Net Noise Analysis", Proc. 34th DAC, Tutorial, 1997 

[4] Howard H. Chen and David D.Ling, “Power Supply Noise Analysis Methodology for Deep-Submicron VLSI 

Chip Design", Proc. 34th DAC, pp.638-643, June, 1997 

[5] G.Steele et al., “Full-Chip Verification Methods for DSM Power Distribution Systems", Proc. 35th DAC, 

pp.744-749, June, 1998 

[6] A.Dharchoudhury et al., “Design and Analysis of Power Distribution Networks in PowerPC Microprocessors", 

Proc. 35th DAC, pp.738-743, June, 1998 

[7] P.E.Gronowski, “High-Performance Microprocessor Design", IEEE JSSC, Vol.33, no.5, pp.676-686, May 1998 

[8] Y.Shimazu, “High Speed Clock Design", ASP-DAC Tutorial, pp.40-53, 1997 

[9] H.Fair and D.Bailey, “Clocking Design and Analysis for a 600MHz Alpha Microprocessor", ISSCC Digest of 

Technical Papers, pp.398-399, 1998 

[10] M.Edahiro, “Delay Minimization for Zero-Skew Routing", ICCAD-93, pp.563-566, 1993 

[11] J.G.Xi et al., “Useful-Skew Clock Routing with Gate Sizing for Low Power Design", Proc. 33rd DAC, 

pp.383-388, 1996 

[12] C.P.Chen et al., “Fast Performance-Driven Optimization for Buffered Clock Trees Based on Lagrangian 

Relaxation", Proc. 33rd DAC, pp.405-408, 1996 

[13] “H2SD480i HDTV all-format single chip decoder for HDTV Settop box & PC Add-on card for HDTV 

receiving", Rev.0.3, Preliminary Specification, LG CIT, 1998 

[14] J.Cong et al., “Analysis and Justification of a Simple, Practical 2 1/2-D Capacitance Extraction Methodology", 

Proc. 34th DAC, pp.627-632, June, 1997 

[15] F.Dartu and L.T.Pileggi, “Calculating Worst-Case Gate Delay Due to Dominant Capacitance Coupling", Proc. 

34th DAC, pp.46-51, June, 1997 

[16] “Raphael NES", TMA, 1997 

[17] “Apollo Fundamentals Training Guide", Avant!, 1997

DAC'99, pages 772-777 

Digital Detection of Analog Parametric Faults in SC Filters 

Ramesh Harjani and Bapiraju Vinnakota 

University of Minnesota, Minneapolis, MN 55455 

Abstract 

Many design for test techniques for analog circuits are ineffective at detecting multiple 

parametric faults because either their accuracy is poor, or the circuit is not tested in the 

configuration it is used in. We present a DFT scheme that offers the accuracy needed to test 

high-quality circuits. The DFT scheme is based on a circuit that digitally measures the ratio of a 

pair of capacitors. The circuit is used to completely characterize the transfer function of a 

switched capacitor circuit, which is usually determined by capacitor ratios. In our DFT scheme, 

capacitor ratios can be measured to within 0.01% accuracy, and filter parameters can be shown 

to be satisfied to within 0.1% accuracy. A filter can be shown to satisfy all its functional 

specifications through this characterization process. We believe the accuracy of our scheme is at 

least an order of magnitude greater than that offered by any other scheme reported in the 

literature. 

References 

[1] B. Vinnakota, Ed., Analog and Mixed-Signal Test, Prentice-Hall, 1998. 

[2] K. Arabi and B. Kaminska, “Oscillation-test strategy for analog and mixed-signal integrated circuits,” in 14th 

IEEE VLSI Test Symposium, pp. 476-482, April 1996. 

[3] C.-Y. Pang and K.-T. Cheng, and S. Gupta, ”A comprehensive fault macromodel for op amps,” in IEEE 

International Conference on Computer Aided Design, 1994. 

[4] M. Soma, ”A design-for-test methodology for active analog filters,” in Proc. IEEE International Test 

Conference, pp. 183-192, 1990. 

[5] C. Dufaza and H. Ihs, ”Test synthesis for DC test and maximal diagnosis of switched capacitor circuits,” in 15th 

IEEE VLSI Test Symposium, pp. 252-259, 1997. 

[6] R. Harjani and B. Vinnakota, ”Analog circuit observer blocks,” in IEEE Transactions on Circuits and Systems II, 

pp. 258-263, 1997. 

[7] S. Mir, V. Kolarik, M. Lubaszewski, C. Nielsen, and B. Courtois, ”Built-in self-test and fault diagnosis of fully 

differential analogue circuits,” in IEEE International Conference on Computer Aided Design, 1994. 

[8] J. L. Huertas, A. Rueda, and D. Vazquez, ”Improving the testability of switched capacitor filters,” Analog 

Integrated Circuits and Signal Processing, vol. 4, Kluwer Academic Publishers, pp. 199 -213, 1993. 

[9] R. Harjani, B. Vinnakota and W.-Y. Choi,”Pseudoduplication: an ACOB technique for single-ended circuits,” in 

Int. Conf VLSI Design, Hyderabad, India, January 1997. 

[10] A. Chatterjee, ”Concurrent error detection in linear analog and switched-capacitor state variable systems using 

continuous checkers,” in IEEE International Test Conference, pp. 582-591, 1991. 

[11] D. Vazquez, A. Rueda, and J. L. Huertas, ”A new strategy for testing analog filters,” in IEEE VLSI Test 

Symposium, pp. 36-41, 1994. 

[12] J. B. Shyu, G. C. Temes, and F. Krummenacher, ”Random errors in MOS capacitors,” IEEE Journal of Solid 

State Circuits, pp. 1070-1075, 1982. 

[13] M. J. M. Pelgrom, A. C. J. Duinmaijer, and A. P. G.Welvers, ”Matching properties of MOS transistors,” IEEE 

Journal of Solid-State Circuits, October 1989. 

[14] L. Milor and A. Sangiovanni-Vincentelli, ”Optimal test set design for analog circuits,” in IEEE International 

Conference on Computer Aided Design, pp. 294-297, November 1990. 

[15] C. W. Helstrom, Probability and Stochastic Processes for Engineers, MacMillan Publishing Company, 

1991. 

[16] G. N. Stenbakken and T. M. Souders, ”Linear error modeling of analog and mixed-signal devices,” in IEEE 

International Test Conference, pp. 573-581, 1991. 

[17] Roubik Gregorian and Gabor C. Temes, ”Analog MOS Integrated Circuits for Signal Processing”, Wiley and 

Sons, 1986

[18] Ramesh Harjani and Tom Lee, ”FRC: A Method for Extending the Resolution of Nyquist Rate Converters 

using Oversampling”, IEEE Transactions on Circuits and Systems II, pp 482-494, April 1998 

[19] B. Veillette and G. Roberts, “Spectrum based built-in self-test,” in Analog and Mixed-Signal Test, Prentice-Hall, 

1998.

DAC'99, pages 778-783 

Application of High Level Interface-based Design to Telecommunications System 

Hardware 

Dyson Wilkes 

Ericsson Components Ltd., UK 

M.M. Kamal Hashmi 

International Computers Ltd., UK. 

Abstract 

The assumption in moving system modelling to higher levels is that this improves the design 

process by allowing exploration of the architecture, providing an unambiguous specification and 

catching system errors early. We used the interface-based high level abstractions of VHDL+ in a 

real design, and in parallel with the actual project to investigate the validity of these claims. 

References 

[1] A.Jebson, C.Jones and H.Vosper: CHISLE: An Engineer’s tool for hardware system design, ICL Technical 

Journal Vol. 8 No. 3 May 1993. 

[2] IEEE Standard VHDL Language Reference Manual. IEEE Std 1076-1993, The Institute of Electrical and 

Electronic Engineers, New York, USA, 1994. 

[3] Anders Olsen, Ove Færgemand et al.: Systems Engineering Using SDL-92, Elsevier, 1994. 

[4] M.M. Kamal Hashmi and Alistair C. Bruce: Design and Use of a System-Level Specification and Verification 

Methodology, IEEE European Design Automation Conference 1995. 

[5] Dyson Wilkes 1996, SYSTEL Project Proposal, EKA/NR/W-96:136. Ericsson internal document. 

[6] J.A. Rowson and A. Sangiovanni-Vincentelli, Interface-based Design, Proceedings of the 34th Design 

Automation Conference 1997. 

[7] S. Hodgsom and M.M.K. Hashmi, SuperVISE – System Specification and Design methodology, ICL Systems 

Journal Vol. 12 Issue 2 November 1997. 

[8] A. Sangiovanni-Vincentelli, P.C. McGeer and A. Saldanha, Verification of Electronic Systems, Proceedings of 

the 33rd Design Automation Conference 1996. 

[9] M.M.Kamal Hashmi, ICL: VHDL+ Language Reference Manual, Available on-line at http://www.icl.com/da. 

[10] F. Belina, D. Hogrefe, A Sarma; "SDL with Applications from Protocol Specification” Prentice Hall, 1991 

(SDL, ITU Recommendation Z.100 ). 

[11] Kenneth J. Turner(editor); "Using Formal Description Techniques - An Introduction to ESTELLE, LOTOS and 

SDL”, Wiley, 1993 (LOTOS, ISO/IEC 8807 ). 

[12] R.B.Cooper, Introduction to Queuing Theory, Edward Arnold Ltd., 1981 

[13] J.F. Hayes, Modelling and Analysis of Computer Communication Networks, Plenum Press, New York, 1986 

[14] Project 23909: SYSTEL - Final Report, European Commission 

[15] Yossi Malka, Avi Ziv, Design Reliability – Estimation through Statistical Analysis of Bug Discovery Data, 

Proc. DAC 1998, ACM

DAC'99, pages 784-789 

Hardware Reuse at the Behavioral Level 

Patrick Schaumont, Radim Cmar, Serge Vernalde, Marc Engels, Ivo Bolsens 

IMEC vzw, B-3001 Leuven Belgium 

Abstract 

Standard interfaces for hardware reuse are currently defined at the structural level. In contrast to 

this, our contribution defines the reuse interface at the behavioral register transfer (RT) level. 

This promotes direct reuse of functionality and avoids the integration problems of structural 

reuse. We present an object oriented reuse interface in C++ and show the use of it within two 

real-life designs. 

References 

[1] P. Ashenden, P. Wilsey, and D. Martin. Reuse through genericity in suave. In Proc. VIUF 1997 Fall Conf., pages 

170-177. 

[2] B. Djafri and J. Benzakki. Oovhdl: Object oriented vhdl. In Proc. VIUF 1997 Fall Conf., pages 54-59. 

[3] E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented 

Software. Addison-Wesley, Reading, MA, 1994. 

[4] R. K. Gupta and S. Y. Liao. Using a programming language for digital system design. IEEE Design and Test of 

Computers, pages 72 - 80, April-June 1997. 

[5] Ocapi Homepage. http://www.imec.be/ocapi. 

[6] G. Lehmann, B.Wunder, and K. Muller-Glaser. A vhdl reuse workbench. In Proc. EDAC 1996, pages 412-417. 

[7] G. Martin. Design methodologies for system level ip. In Proc. DATE 1998, pages 286-302. 

[8] P. Schaumont, S. Vernalde, L. Rijnders, M. Engels, and I. Bolsens. A programming environment for the design 

of complex high speed asics. In Proceedings 35th Design Automation Conference, pages 315 - 320, San Francisco, 

CA, 1998. 

[9] C. Schneider and W. Ecker. Stepwise refinement of behavioral vhdl specifications by separation of 

synchronization and functionality. In Proc. EURODAC 1996, pages 509-514. 

[10] G. Schumacher, W. Nebel, and C. von Ossietzky. Object-oriented modeling of parallel hardware systems. In 

Proc. DATE 1998, pages 234-241. 

[11] S. Vercauteren and Bill Lin. Hardware/software Communication and System Integration for Embedded 

Architectures. Design Automation of Embedded Systems, Kluwer Academic Publishers, 2:1-24, 1997. 

[12] C. Weiler, U. Kebschull, and W. Rosenstiel. C++ base classes for specification, simulation and partitioning of a 

hardware/software system. In Proc. ASP-DAC 1995, CHDL 1995, VLSI 1995, pages 777-784.

DAC'99, pages 790-793 

Description and Simulation of Hardware/Software Systems with Java 

Tommy Kuhn, Wolfgang Rosenstiel 

University of Tübingen, Sand 13, Germany 

Udo Kebschull 

University of Leipzig, Augustusplatz, Germany 

ABSTRACT 

In this paper a newly developed object model is presented which allows the description of 

hardware/software systems in all their parts. An adaptation of the JavaBeans component model 

allows different kinds of reuse to be combined in one unified language. A model-based design flow 

and some tools are presented and applied to a JPEG example. 

Keywords: Object oriented hardware modeling, simulation, codesign. 

REFERENCES 

[1] Helaihel, R., and Olukotun, K. Java as a Specification language for Hardware-Software Systems. In Proc. 

ICCAD’97 

[2] Kuhn, T., and Rosenstiel, W. Java Based Modeling and Simulation of Digital Systems on Register Transfer 

Level. In Int. Workshop on System Design Automation, Dresden, 1998. 

[3] Liao, S., et al. An Efficient Implementation of Reactivity for Modeling Hardware in Scenic Design 

Environment. In Proc. of the 34th DAC, 1997. 

[4] Nebel, W., and Schumacher, G. Object-Oriented Hardware Modelling - Where to apply and what are the 

objects? In Proc. of the Euro-Dac ‘96 with Euro-VHDL. 

[5] Rational Software Corporation. URL: http://www.rational.com/uml 

[6] Swamy, S., and Molin, A., and Covnot, B. OO-VHDL: Object-Oriented Extensions to VHDL. IEEE Computer, 

October, 1995 

[7] Young, J.S., et al. Design and Specification of Embedded Systems in Java Using Successive, Formal 

Refinement. In Proc. of the DAC’98, 70-75

DAC'99, pages 794-797 

Java Driven Codesign and Prototyping of Networked Embedded Systems 

Josef Fleischmann*, Klaus Buchenrieder**, Rainer Kress** 

*Technical University of Munich, Inst. of Electronic Design Automation, 

D-80290 Munich, Germany 

**Siemens AG, Corporate Technology, D-81730 Munich, Germany 

Abstract 

While the number of embedded systems in consumer electronics is growing dramatically, several 

trends can be observed which challenge traditional codesign practice: An increasing share of 

functionality of such systems is implemented in software; flexibility or reconfigurability is added 

to the list of non-functional requirements. Moreover, networked embedded systems are equipped 

with communication capabilities and can be controlled over networks. In this paper, we present a 

suitable methodology and a set of tools targeting these novel requirements. JACOP is a codesign 

environment based on Java and supports specification, co-synthesis and prototyping of 

networked embedded systems. 

References 

[1] P. Bellows, B. Hutchings: JHDL - An HDL for Reconfigurable Systems. In IEEE Symposium on Field- 

Programmable Custom Computing Machines, 1998. 

[2] Peter Clarke: Tricore to get flash FPGA integration. In EE Times, No. 1000; CMP Media, 1998. 

[3] J. Fleischmann, et al.: A Hardware/Software Prototyping Environment for Dynamically Reconfigurable 

Embedded Systems. In Int. Workshop on HW/SW Codesign (CODES), 1998. 

[4] R. Helaihel, K. Olukotun: Java as a Specification Language for Hardware-Software Systems. In Int. Conf. on 

Computer-Aided Design (ICCAD), 1997. 

[5] JavaBeans API specification, Sun Microsystems, http://java.sun.com/beans, 1998. 

[6] A. Kalavade and P. Moghe: A Tool for Performance Estimation of Networked Embedded End-Systems. In 

Design Automation Conference (DAC), 1998. 

[7] T. Kuhn, W. Rosenstiel: Java Based Modeling and Simulation of Digital Systems on Register Transfer Level. In 

Workshop on System Design Automation, 1998. 

[8] D. E. Lechner and S. A. Guccione: The Java Environment for Reconfigurable Computing. In Int. Workshop on 

Field-Programmable Logic and Applications (FPL), 1997. 

[9] National Semiconductor: Napa1000 Adaptive Processor, http://www.national.com/appinfo/milaero/napa1000, 

1998. 

[10] S. Nisbet, S. A. Guccione: The XC6200DS Development System. In Int. Workshop on Field-Programmable 

Logic and Applications (FPL), 1997. 

[11] R. Passerone, et al.: Modeling Reactive Systems in Java. In Int. High Level Design Validation and Test 

Workshop, Nov. 1997. 

[12] M. Vasilko: Dynamically Reconfigurable Hardware WWW Library, Bournemouth University, 

http://dec.bournemouth.ac.uk/drhw_lib/ 

[13] J. S. Young, et al.: Design and Specification of Embedded Systems in Java Using Successive, Formal 

Refinement. In Design Automation Conference (DAC), 1998.

DAC'99, page 798 

Panel: Subwavelength Lithography: How Will it Affect Your Design Flow? 

Chair: Andrew B. Kahng – UCLA Computer Science Department, Los Angeles, CA 

Panel Members: Y. C. Pati, Warren Grobman, Robert Pack, Lance Glasser, 

Kenneth V. Rousseau 

In the sub 0.25 micron regime, IC feature sizes become smaller than the wavelength of light used 

for silicon exposure. Resultant light distortions create patterns on silicon that are substantially 

different from a GDSII layout. Although light distortions have traditionally not affected the 

design flow, the techniques used to control these distortions have a potential impact on the 

design flow that is as formidable as the recently addressed Deep Sub-Micron transition. This 

session will discuss the design implications arising from techniques used to control subwavelength 

lithography. It will begin with an embedded tutorial on subwavelength mask design 

techniques and their resultant effect on the IC design process. The panel will then debate the 

extent of the resulting impact on IC performance, design flow, and CAD tools.

DAC'99, pages 799-804 

Subwavelength Lithography and its Potential Impact on Design and EDA 

Andrew B. Kahng and Y. C. Pati† 

UCLA Department of Computer Science, Los Angeles, CA 90095-1596 USA 

†Numerical Technologies, Inc., Santa Clara, CA 95051 USA 

Abstract 

This tutorial paper surveys the potential implications of subwavelength optical lithography for 

new tools and flows in the interface between layout design and manufacturability. We review 

control of optical process effects by optical proximity correction (OPC) and phase-shifting masks 

(PSM), then focus on the implications of OPC and PSM for layout synthesis and verification 

methodologies. Our discussion addresses the necessary changes in the design-to-manufacturing 

flow, including infrastructure development in the mask and process communities, evolution of 

design methodology, and opportunities for research and development in the physical layout and 

verification areas of EDA. 

References 

[1] A. CHATTERJEE, I. ALI, K. JOYNER, D. MERCER, ET AL., Integration of Unit Processes in a Shallow 

Trench Isolation Module for a 0.25 µm Complementary Metal-Oxide Semiconductor Technology, Journal of 

Vacuum Science and Technology B, 15 (1997), pp. 1936–1942. 

[2] J. F. CHEN, T. LAIDIG, K. E. WAMPLER, AND R. CALDWELL, Practical Method for Full-Chip Optical 

Proximity Correction, in SPIE, vol. 3051, 1997, pp. 790–803. 

[3] V. K. R. CHILUVURI AND I. KOREN, Layout-Synthesis Techniques for Yield Enhancement, IEEE Trans. 

Semiconductor Manufacturing, 8 (1995), pp. 178–187. 

[4] G. GALAN, F. LALANNE, M. TISSIER, AND M. BELLEVILLE, Alternating phase shift generation for 

complex circuit designs, in SPIE 16th Annual BACUS Symposium on Photomask Technology, vol. SPIE 2884, 

1996, pp. 508–519. 

[5] P. GILBERT ET AL., A High Performance 1.5V, 0.10um Gate Length CMOS Technology with Scaled Copper 

Metalization, IEDM 1998, pp. 1013-1016. 

[6] W. B. GLENDINNING AND J. N. HELBERT, Handbook of VLSI Microlithography: Principles, Technology, 

and Applications, Noyes Publications, 1991. 

[7] F. O. HADLOCK, Finding a Maximum Cut of a Planar Graph in Polynomial Time, SIAM J. Computing, 4 

(1975), pp. 221–225. 

[8] A. B. KAHNG, S. MUDDU, E. SARTO, AND R. SHARMA, Interconnect Tuning Strategies for High- 

Performance ICs, in Proc. Conference on Design Automation and Test in Europe, February 1998. 

[9] A. B. KAHNG, G. ROBINS, A. SINGH, H. WANG, AND A. ZELIKOVSKY, Filling and Slotting: Analysis 

and Algorithms, in Proc. International Symposium on Physical Design, 1998, pp. 95–102. 

[10] A. B. KAHNG, H. WANG, AND A. ZELIKOVSKY, Automated Layout and Phase Assignment Techniques for 

Dark Field Alternating PSM, in Proc. SPIE 18th Annual BACUS Symposium on Photomask Technology, 1998. 

[11] M. D. LEVENSON, Wavefront engineering from 500 nm to 100 nm CD, in Proceedings of the SPIE - The 

International Society for Optical Engineering, vol. 3049, 1997, pp. 2–13. 

[12] M. D. LEVENSON, N. S. VISWANATHAN, AND R. A. SIMPSON, Improving Resolution in 

Photolithography with a Phase-Shifting Mask, IEEE Trans. on Electron Devices, ED-29 (1982), pp. 1828–1836. 

[13] L. LIEBMANN, A. MOLLESS, R. FERGUSON, A. WONG, AND S. MANSFIELD, Understanding Across 

Chip Line Width Variation: The First Step Toward Optical Proximity Correction, in SPIE, vol. 3051, 1997, pp. 124– 

136. 

[14] L. W. LIEBMANN, T. H. NEWMAN, R. A. FERGUSON, R. M. MARTINO, A. F. MOLLESS, M. O. 

NEISSER, AND J. T. WEED, A Comprehensive Evaluation of Major Phase Shift Mask Technologies for Isolated 

Gate Structures in Logic Designs, in SPIE, vol. 2197, 1994, pp. 612–623. 

[15] H.-Y. LIU, L. KARKLIN, Y.-T. WANG, AND Y. C. PATI, The Application of Alternating Phase-Shifting 

Masks to 140 nm Gate Patterning (II): Mask Design and Manufacturing Tolerances, in SPIE 

Optical Microlithography XI, vol. 3334, Feb. 1998, pp. 1-14.

[16] H.-Y. LIU, L. KARKLIN, Y.-T. WANG, AND Y. C. PATI, The Application of Alternating Phase-Shifting 

Masks to 140 nm Gate Patterning: Line Width Control Improvements and Design Optimization, in SPIE 17th 

Annual BACUS Symposium on Photomask Technology, vol. SPIE 3236, 1998, pp. 328–337. 

[17] Y. LIU, A. ZAKHOR, AND M. A. ZUNIGA, Computer-Aided Phase Shift Mask Design with Reduced 

Complexity, IEEE Transactions on Semiconductor Manufacturing, 9 (1996), pp. 170–181. 

[18] W. MALY, Computer-aided design for VLSI circuit manufacturability, Proceedings of IEEE, 78 (1990), pp. 

356–392. 

[19] W. MALY, Moore’s Law and Physical Design of ICs, in Proc. International Symposium on Physical Design, 

Monterey, California, April 1998. special address. 

[20] A. MISAKA, A. GODA, K. MATSUOKA, H. UMIMOTO, AND S. ODANAKA, Optical Proximity 

Correction in DRAM Cell Using a New Statistical Methodology, in SPIE, vol. 3051, 1997, pp. 763–773. 

[21] A. MONIWA, T. TERASAWA, N. HASEGAWA, AND S. OKAZAKI, Algorithm for Phase-Shift Mask 

Design with Priority on Shifter Placement, Jpn. J. Appl. Phys., 32 (1993), pp. 5874–5879. 

[22] A. MONIWA, T. TERASAWA, K. NAKAJO, J. SAKEMI, AND S. OKAZAKI, Heuristic Method for Phase- 

Conflict Minimization in Automatic Phase-Shift Mask Design, Jpn. J. Appl. Phys., 34 (1995), pp. 6584–6589. 

[23] J. NISTLER, G. HUGHES, A. MURAY, AND J. WILEY, Issues Associated with the Commercialization of 

Phase Shift Masks, in SPIE 11th Annual BACUS Symposium on Photomask Technology, vol. SPIE 1604, 1991, pp. 

236–264. 

[24] K. OOI, S. HARA, AND K. KOYAMA, Computer Aided Design Software for Designing Phase-Shifting 

Masks, Jpn. J. Appl. Phys., 32 (1993), pp. 5887–5891. 

[25] K. OOI, K. KOYAMA, AND M. KIRYU, Method of Designing Phase-Shifting Masks Utilizing a Compactor, 

Jpn. J. Appl. Phys., 33 (1993), pp. 6774–6778. 

[26] G. I. ORLOVA AND Y. G. DORFMAN, Finding the Maximum Cut in a Graph, Engr. Cybernetics, 10 (1972), 

pp. 502–506. 

[27] P. RAI-CHOUDHURY, Handbook of Microlithography, Micromachining, and Microfabrication, vol. 1: 

Microlithography, SPIE Optical Engineering Press, Bellingham, 1997. 

[28] F. M. SCHELLENBERG, H. ZHANG, AND J. MORROW, Evaluation of OPC Efficacy, in Proc. Intl. Symp. 

on Aerospace/Defense Sensing and Dual-Use Photonics, vol. 2726, 1996, pp. 680–688. 

[29] SEMATECH, Workshop Notes, in 3rd SEMATECH Litho-Design Workshop, Skamania Lodge, February 1996. 

[30] SIA, The National Technology Roadmap for Semiconductors, Semiconductor Industry Association, December 

1997. 

[31] B. E. STINE, D. S. BONING, J. E. CHUNG, AND L. CAMILLETTI, The Physical and Electrical Effects of 

Metal-fill Patterning Practices for Oxide Chemical-Mechanical Polishing Processes, IEEE Transactions on Electron 

Devices, 45 (1998), pp. 665–679. 

[32] B. E. STINE, V. MEHROTRA, D. S. BONING, J. E. CHUNG, AND D. J. CIPLICKAS, A Simulation 

Methodology for Assessing the Impact of Spatial/Pattern Dependent Interconnect Parameter Variation on Circuit 

Performance, in IEDM Technical Digest, 1997, pp. 133–136. 

[33] D. SYLVESTER AND K. KEUTZER, Getting to the Bottom of Deep-Submicron, in Proc. IEEE Intl. Conf. 

Computer-Aided Design (to appear), November 1998. 

[34] T. WAAS, H. HARTMANN, AND W. HENKE, Automatic Generation of Phase Shift Mask Layouts, 

Microelectronic Engineering, 23 (1994), pp. 139–142.

DAC'99, pages 805-810 

Synthesis of Embedded Software Using Free-Choice Petri Nets 

Marco Sgroi*, Luciano Lavagno**, Yosinori Watanabe** and Alberto Sangiovanni-Vincentelli* 

* University of California, Berkeley, CA 

** Cadence Design Systems 

Abstract 

Software synthesis from a concurrent functional specification is a key problem in the design of 

embedded systems. A concurrent specification is well-suited for medium-grained partitioning. 

However, in order to be implemented in software, concurrent tasks need to be scheduled on a 

shared resource (the processor). The choice of the scheduling policy mainly depends on the 

specification of the system. For pure dataflow specifications, it is possible to apply a fully static 

scheduling technique, while for algorithms containing data-dependent control structures, like the 

if-then-else or while-do constructs, the dynamic behaviour of the system cannot be completely 

predicted at compile time and some scheduling decisions are to be made at run-time. For such 

applications we propose a Quasi-static scheduling (QSS) algorithm that generates a schedule in 

which run-time decisions are made only for data-dependent control structures. We use Free 

Choice Petri Nets (FCPNs) as the underlying model, and define quasi-static schedulability for 

FCPNs. The proposed algorithm is complete, in that it can solve QSS for any FCPN that is quasi-statically 

schedulable. Finally, we show how to synthesize from a quasi-static schedule a C code 

implementation that consists of a set of concurrent tasks. 
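
As an illustration of the model named in the abstract, the sketch below checks the free-choice property using the standard textbook definition: every arc leaving a place is either that place's only outgoing arc or the target transition's only incoming arc. The abstract does not restate this definition, and the function name `is_free_choice` and the dictionary encoding of the net are assumptions made for this example, not the authors' code.

```python
def is_free_choice(pre):
    """Return True if the net is free-choice.

    `pre` maps each transition name to the set of its input places
    (a hypothetical encoding chosen for this sketch).
    """
    # Collect, for every place, the transitions that consume from it.
    consumers = {}
    for transition, places in pre.items():
        for place in places:
            consumers.setdefault(place, set()).add(transition)

    # An arc (place -> transition) violates free choice when the place has
    # several consumers AND the transition has several input places.
    for transition, places in pre.items():
        for place in places:
            if len(consumers[place]) > 1 and len(places) > 1:
                return False
    return True


# A data-dependent choice: p1 feeds two transitions that depend on p1 alone.
choice_net = {"t_then": {"p1"}, "t_else": {"p1"}}
# Not free-choice: t_a shares p1 with t_b but also waits on p2.
confused_net = {"t_a": {"p1", "p2"}, "t_b": {"p1"}}
```

In a net that passes this check, the outcome of a choice depends only on the token in the shared place, which is what lets an if-then-else style decision be deferred to run time while the rest of the firing order is fixed in advance.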

References 

[1] E. A. Lee and D. G. Messerschmitt. Static scheduling of synchronous dataflow programs for digital signal 

processing. IEEE Transactions on Computers, January 1987. 

[2] E. Filippi et al. Intellectual property re-use in embedded system co-design: an industrial case study. In 

International Symposium on System Synthesis, December 1998. 

[3] F. Thoen et al. Real-time multi-tasking in software synthesis for information processing systems. In Proceedings 

of the International System Synthesis Symposium, 1995. 

[4] E. Teruel. Structure theory of Weighted Place/Transition Net systems. The Equal Conflict hiatus. Ph.D 

dissertation. Universidad de Zaragoza, 1994. 

[5] J. Buck. Scheduling dynamic dataflow graphs with bounded memory using the token flow model. Ph.D 

dissertation. UC Berkeley, 1993. 

[6] B. Lin. Software synthesis of process-based concurrent programs. In Proceedings of the Design Automation 


[7] M. Hack. Analysis of Production Schemata by Petri Nets. Master's thesis. MIT, 1972. 

[8] M. Sgroi. Quasi-static scheduling of embedded software using free-choice petri nets. Technical Report Memo 

No. UCB/ERL M98/, M.S. dissertation. UC Berkeley, May 1998. 

[9] T. Murata. Petri nets: properties, analysis and applications. In Proceedings of the IEEE, April 1989.

DAC'99, pages 811-816 

Exact Memory Size Estimation for Array Computations without Loop Unrolling 

Ying Zhao and Sharad Malik 

Department of Electrical Engineering, Princeton University, Princeton, New Jersey 

Abstract 

This paper presents a new algorithm for exact estimation of the minimum memory size required 

by programs dealing with array computations. Memory size is an important factor affecting area 

and power cost of memory units. For programs dealing mostly with array computations, memory 

cost is a dominant factor in the overall system cost. Thus, exact estimation of memory size 

required by a program is necessary to provide quantitative information for making high-level 

design decisions. 

Based on a formulation of live variable analysis, our algorithm transforms the minimum memory size 

estimation into an equivalent problem: integer point counting for intersection/union of mappings 

of parameterized polytopes. A heuristic is then proposed to solve the counting problem. 

Experimental results show that the algorithm achieves the exactness traditionally associated with 

fully unrolling loops while exploiting the reduced computational complexity obtained by preserving 

the original loop structure. 
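
To make the comparison with loop unrolling concrete, the following sketch computes the peak number of simultaneously live array elements for one small loop by enumerating every iteration. This is the brute-force baseline the abstract contrasts against, not the paper's polytope-counting algorithm. The loop `a[i] = a[i-1] + a[i-2]`, the function name, and the liveness convention (an element occupies storage from the iteration that defines it through the iteration of its last read, inclusive) are all assumptions made for illustration.

```python
def max_live_elements(n):
    """Peak live-element count for: for i in range(2, n): a[i] = a[i-1] + a[i-2].

    Requires n >= 2. The inputs a[0] and a[1] are assumed to become
    available at time 1, just before the first iteration (i = 2).
    """
    define = {0: 1, 1: 1}     # time at which each element is produced
    last_use = {0: 1, 1: 1}   # time of the latest read seen so far

    # "Unroll" the loop: visit every iteration and record its accesses.
    for i in range(2, n):
        define[i] = i
        last_use[i] = i       # live at least during its defining iteration
        for k in (i - 1, i - 2):
            last_use[k] = i   # iteration i reads a[i-1] and a[i-2]

    # Count the elements whose live range covers each time step.
    return max(
        sum(1 for k in define if define[k] <= t <= last_use[k])
        for t in range(1, n)
    )
```

Under these assumptions the peak stays at three elements for any n of at least 3, even though the array itself has n entries. The enumeration cost grows with the iteration count, which is the expense a symbolic count over the loop structure is meant to avoid.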

References 

[1] A. Sudarsanam. Code optimization libraries for retargetable compilation for embedded digital signal processors. 

PhD thesis, Princeton University, May 1998. 

[2] P. Clauss. Counting solutions to linear and nonlinear constraints through Ehrhart polynomials: Applications to 

analyze and transform scientific programs. 10th ACM Int. Conf. on Supercomputing, May 1996. 

[3] P. Clauss. Handling memory cache policy with integer points countings. Euro-Par'97, pages 285-293, 1997. 

[4] H. M. E. De Greef, F. Catthoor. Array placement for storage size reduction in embedded multimedia systems. 

11th International Conference on Application-specific Systems, Architectures and Processors, July 1997. 

[5] H. D. M. F. Balasa, F. Catthoor. Background memory area estimation for multi-dimensional signal processing 

systems. IEEE Trans. on Comp-aided Design, CAD-14, 1995. 

[6] A. F. J. Kurdahi. Real: a program for register allocation. Proc. 24th DAC, pages 210-215, June 1987. 

[7] C. Lengauer. Loop parallelization in the polytope model. In E. Best, editor, CONCUR'93, Lecture Notes in Computer 

Science 715, pages 398-416, 1993. 

[8] W. Pugh. Counting solutions to Presburger formulas: How and why. Proc. of the 1994 ACM SIGPLAN 

Conference on Programming Language Design and Implementation, 1994. 

[9] A. Sudarsanam and S. Malik. Simultaneous reference allocation in code generation for dual data memory bank 

ASIPs. To be published in ACM Transactions on Design Automation for Electronic Systems, 1999.

DAC'99, pages 817-822 

Constraint Driven Code Selection for Fixed-Point DSPs 

Steven Bashford, Rainer Leupers 

Dept. of Computer Science 12, University of Dortmund, Germany 

Abstract 

Fixed-point DSPs are a class of embedded processors with highly irregular architectures. This 

irregularity makes it difficult to generate high-quality machine code from programming 

languages such as C. In this paper we present a novel constraint driven approach to code 

selection for irregular processor architectures, which provides a twofold improvement of earlier 

work. First, it handles complete data flow graphs instead of trees and thereby generates better 

code in the presence of common subexpressions. Second, the presented technique is not restricted to 

computation of a single solution, but it generates alternative solutions. This feature enables the 

tight coupling of different code generation phases, resulting in better exploitation of instruction-level 

parallelism. Experimental results indicate that our technique is capable of generating 

machine code that competes well with handwritten assembly code. 

References 

[1] G. Araujo, S. Malik, and M. Lee. Using Register-Transfer Paths in Code Generation for Heterogeneous 

Memory-Register Architectures. In 33rd Design Automation Conference (DAC). 1996. 

[2] A. Fauth, G. Hommel, A. Knoll, and C. Mueller. Global code selection for directed acyclic graphs. In Peter A. 

Fritzson, editor, Compiler Construction, volume 786 of LNCS, pages 128–141. Springer-Verlag, 

Edinburgh, U.K., April 1994. 5th International Conference, CC’94. 

[3] C. Fraser, R. Henry, and T. A. Proebsting. Engineering a Simple, Efficient Code-Generator Generator. ACM 

Letters on Programming Languages and Systems, 1(3):213–226, September 1992. 

[4] C.H. Gebotys. An Efficient Model for DSP Code Generation: Performance, Code Size, Estimated Energy. In 

10th International Symposium on System Synthesis (ISSS). 1997. 

[5] S. Hanono, G. Hadjiyiannis, and S. Devadas. Aviv: A Retargetable Code Generator Using ISDL. In Proc. 34th 

DAC’97, 1997. 

[6] D. Lanneer, M. Cornero, G. Goossens, and H. De Man. Data routing: a paradigm for efficient data-path synthesis 

and code generation. In Proc. 7th IEEE/ACM Int. Symp. on High-Level Synthesis, May 1994. 

[7] R. Leupers. Retargetable Code Generation for Digital Signal Processors. Kluwer Academic Publishers, 1997. 

[8] R. Leupers and P. Marwedel. Retargetable code generation based on structural processor descriptions. In Design 

Automation for Embedded Systems, vol. 3, no. 1, 1998. 

[9] S. Liao, S. Devadas, K. Kreuzer, and S. Tjiang. Instruction Selection Using Binate Covering for Code Size 

Optimization. International Conference on CAD (ICCAD), 1995. 

[10] K. Marriott and P.J. Stuckey. Programming with Constraints: An Introduction. The MIT Press, 1998. 

[11] P. Marwedel and G. Goossens, editors. Code Generation for Embedded Processors. Kluwer Academic 


[12] P. Paulin, C. Liem, T. May, and S. Sutarwala. Flexware: A Flexible Firmware Development Environment for 

Embedded Systems. In Marwedel and Goossens [11], chapter 4, pages 65–84. 

[13] K. Rimey and P.N. Hilfinger. Lazy Data Routing and Greedy Scheduling. In MICRO, volume 21, pages 111– 

115. 1988. 

[14] M. Wallace, S. Novello, and J. Schimpf. ECLiPSe: A Platform for Constraint Logic Programming, 1997. 

Publications at http://www.icparc.ic.ac.uk/. 

[15] T. Wilson, G. Grewal, S. Henshall, and D. Banerji. An ILP-Based Approach to Code Generation. In Marwedel 

and Goossens [11], chapter 6, pages 103–118. 

[16] V. Zivojnovic, J.M. Velarde, C. Schlaeger, and H. Meyr. DSPStone – A DSP oriented 

Benchmarking Methodology. In ICSPAT. 1994.

DAC'99, pages 823-826 

Rapid Development of Optimized DSP Code From a High 

Level Description Through Software Estimations 

Alain Pegatoquet, Emmanuel Gresset 

VLSI Technology, 06560 Valbonne FRANCE 

Michel Auguin, Luc Bianco 

Université de Nice, Laboratoire I3S, 06041 Nice, FRANCE 

ABSTRACT 

Generation of optimized DSP code from a high level language such as C is very time consuming 

since current DSP compilers are generally unable to produce efficient code. We present a 

methodology for software estimation from a C description that supports rapid development of DSP 

applications. Our tool VESTIM provides both a performance evaluation for assembly code 

generated by the compiler and an estimation of an optimized assembly code. Blocks of 

applications G.721 and G.728 have been evaluated using VESTIM. Results show that 

estimations are very accurate and allow software development time to be significantly reduced. 

Keywords: DSP, Code generation, Performance Estimation. 

REFERENCES 

[1] Edward A. LEE. Programmable DSP Architectures: Part 1. IEEE ASSP Magazine, October 1988. 

[2] Vojin Zivojnovic et al. DSP Processor/Compiler Co-Design: A Quantitative Approach. Proc. ICSPAT, pp. 679- 

683, Boston, MA, USA, October 7-10, 1996. 

[3] Guido ARAUJO and Sharad MALIK, Code Generation for Fixed-Point DSPs. ACM Transactions on Design 

Automation of Electronic Systems, Vol. 3, No 3, July 1998. 

[4] C. Liem, P. Paulin and A. Jerraya, Address Calculation for Retargetable Compilation and Exploration of 

Instruction-Set Architectures, 33rd DAC, Las Vegas, Nevada, June 3-7, 1996. 

[5] VVF3500 C Compiler. Revision 1.0. Getting Started With the OakDSPCore C Compiler. VLSI Technology, 

1996. 

[6] S. Malik et al. Static Timing Analysis Of Embedded Software, 34th DAC, pp. 147-152, Anaheim, CA, 1997. 

[7] Marc SOLER et al. An Embedded DSP Platform for multistandard ITU G.728, G.729 and G.723.1 audio 

compression. Proc. ICSPAT, Boston, MA, October 7-10, 1996. 

[8] Jie Gong et al. Software Estimation from Executable Specifications, Technical Report ICS-93-5, March 8, 1993. 

[9] Rizos Sakellariou et al. Efficient Implementation of the Row-Column 8x8 IDCT on VLIW Architectures, 

EUSIPCO, Vol. 2, pp. 869-872, Greece, Sept. 7-11, 1998. 

[10] Recommendation G.721, 32 kbit/s Adaptive Differential Pulse Code Modulation, ITU (1984). 

[11] Recommendation G.728, Coding of Speech at 16 kbit/s using Low-Delay Code Excited Linear Prediction, ITU 

(1994). 

[12] C. Liem et al., Industrial Experience using Rule-Driven Retargetable Code Generation for Multimedia 

Applications, 8th Symposium on System Level Synthesis, September 1995. 

[13] G. Goossens et al., Embedded Software in Real-Time Signal Processing Systems: Design Technologies, 

Proceedings of the IEEE, Vol. 85, No. 3, March 1997. 

[14] J-H Yang et al., MetaCore: An Application Specific DSP Development System, 35th DAC, pp. 800-803, CA, 

1998.

DAC'99, pages 827-830 

SOFTWARE ENVIRONMENT FOR A MULTIPROCESSOR DSP 

Asawaree Kalavade 

Networked Multimedia Research Dept., Bell Labs, Lucent Technologies, Murray Hill, NJ 07974 

Joe Othmer, Bryan Ackland, K. J. Singh 

DSP and VLSI Systems Research Dept., Bell Labs, Lucent Technologies, Holmdel, NJ 07733 

ABSTRACT 

In this paper, we describe the software environment for Daytona, a single-chip, bus-based, 

shared-memory, multiprocessor DSP. The software environment is designed around a layered 

architecture. Tools at the lower layer are designed to deliver maximum performance and include 

a compiler, debugger, simulator, and profiler. Tools at the higher layer focus on improving the 

programmability of the system and include a run-time kernel and parallelizing tools. The run-time 

kernel includes a low-overhead, preemptive, dynamic scheduler with multiprocessor 

support that guarantees real-time performance to admitted tasks. 

Keywords: Multiprocessor DSP, media processor, software environment, run-time kernel, 

RTOS 

REFERENCES 

[1] B. Ackland et al. “A Single-Chip 1.6 Billion 16-b MAC/s Multiprocessor DSP”, Proc. CICC’99, May 1999. 

[2] C.L. Liu, J. W. Layland, “Scheduling Algorithms for Multiprogramming in a Hard-Real-Time Environment”, 

Journal of the ACM, vol. 20, no. 1, Jan. 1973, pp. 46-61. 

[3] DSP FAQ: What DSP operating systems are available? http://www.bdti.com/faq/7.htm 

[4] Spectron Microsystems. http://www.spectron.com 

[5] Eonic Systems. http://www.eonic.com

DAC'99, pages 831-836 

Robust FPGA Intellectual Property Protection Through Multiple Small Watermarks 

John Lach, William H. Mangione-Smith 

UCLA EE Department, Los Angeles, CA 90095 

Miodrag Potkonjak 

UCLA CS Department, Los Angeles, CA 90095 

ABSTRACT 

A number of researchers have proposed using digital marks to provide ownership identification 

for intellectual property. Many of these techniques share three specific weaknesses: complexity 

of copy detection, vulnerability to mark removal after revelation for ownership verification, and 

mark integrity issues due to partial mark removal. This paper presents a method for 

watermarking field programmable gate array (FPGA) intellectual property (IP) that achieves 

robustness by responding to these three weaknesses. The key technique involves using secure 

hash functions to generate and embed multiple small marks that are more detectable, verifiable, 

and secure than existing IP protection techniques. 

Keywords: Field programmable gate array (FPGA), intellectual property protection, 

watermarking 
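
The idea of generating multiple small marks from a secure hash can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scheme: the abstract names neither the hash function nor the mark size, so SHA-256, the 4-byte mark length, the index encoding, and the function names `derive_small_marks` and `verify_mark` are all hypothetical choices. How marks are embedded in an FPGA design is outside this sketch.

```python
import hashlib


def derive_small_marks(signature, count, mark_bytes=4):
    """Derive `count` short marks from one owner signature (bytes).

    Each mark is the leading `mark_bytes` bytes of SHA-256(signature || index).
    The hash choice, the index encoding, and the mark size are assumptions.
    """
    marks = []
    for index in range(count):
        digest = hashlib.sha256(signature + index.to_bytes(4, "big")).digest()
        marks.append(digest[:mark_bytes])
    return marks


def verify_mark(signature, index, candidate, mark_bytes=4):
    """Recompute mark `index` and compare it with a mark read from a design."""
    expected = derive_small_marks(signature, index + 1, mark_bytes)[index]
    return candidate == expected
```

Because each mark can be recomputed on its own from the signature and its index, checking one mark does not require the others to be present, which is one plausible way to tolerate partial mark removal.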

REFERENCES 

[1] W. Bender et al., "Techniques for Data Hiding,” IBM Systems Journal, vol. 35, no 3-4, 1996, 313-336. 

[2] L. Boney et al., "Digital Watermarks for Audio Signals,” International Conference on Multimedia Computing 

and Systems, 1996. 

[3] E. Charbon, "Hierarchical Watermarking in IC Design,” Custom Integrated Circuits Conference, 1998. 

[4] I.J. Cox et al., "Secure Spread Spectrum Watermarking for Images, Audio, and Video,” International Conference 

on Image Processing, 1996. 

[5] S. Craver et al., "Can Invisible Watermarks Resolve Rightful Ownership?" Storage and Retrieval for Image and 

Video Databases, Proceedings of the SPIE, vol. 3022, 1997, 310-321. 

[6] W. Diffie and M. Hellman, "New Directions in Cryptography," IEEE Transactions on Information Theory, vol. 

IT-22, no. 6, Nov. 1976, 644-654. 

[7] S. Furber, ARM System Architecture, Menlo Park: Addison-Wesley, 1996, 329. 

[8] R. Goering, “IP98 Forum Exposes Struggling Industry – Undefined Business Models, Unstable Core Prices 

Cited,” EE Times, Issue 1000, March 30, 1998. 

[9] F. Hartung and B. Girod, "Copyright Protection in Video Delivery Networks by Watermarking of Pre- 

Compressed Video," ECMAST’97, Springer Lecture Notes in Computer Science, vol. 1242, 1997, 423-436. 

[10] I. Hong and M. Potkonjak, "Behavioral Synthesis Techniques for Intellectual Property Protection,” Design 

Automation Conference, 1999. 

[11] B. Hutchings et al., BYUcore: A MIPS R2000 Processor for FPGAs, 1997. 

[12] A.B. Kahng et al., "Robust IP Watermarking Methodologies for Physical Design," Design 

Automation Conference, 1998, 782-787. 

[13] A.B. Kahng et al., “Watermarking Techniques for Intellectual Property Protection,” Design Automation 

Conference, 1998, 776-781. 

[14] J. Lach, W. H. Mangione-Smith, and M. Potkonjak, “Fingerprinting Digital Circuits on Programmable 

Hardware,” International Workshop on Information Hiding, 1998, 16-31. 

[15] J. Lach, W. H. Mangione-Smith, and M. Potkonjak, “Signature Hiding Techniques for FPGA Intellectual 

Property Protection,” International Conference on Computer-Aided Design, 1998. 

[16] J. Leonard and W. H. Mangione-Smith, "A Case Study of Partially Evaluated Hardware Circuits: Key-Specific 

DES," Field Programmable Logic, 1997, 151-160. 

[17] J. Montanaro et al., “A 160MHz 32b 0.5W CMOS RISC Microprocessor,” IEEE Journal of Solid-State 

Circuits, vol. 31, no. 11, Nov. 1996, 1703-1714.

[18] B. Schneier, Applied Cryptography: Protocols, Algorithms, and Source Code in C, New York: John 

Wiley & Sons, 1996. 

[19] G.A. Spanos and T.B. Maples, "Performance Study of a Selective Encryption Scheme for the Security of 

Networked, Real-Time Video,” International Conference on Computer Communications and Networks, 1995. 

[20] M.D. Swanson et al., "Transparent Robust Image Watermarking," International Conference on Image 

Processing, 1996. 

[21] A.H. Tewfik and M. Swanson, "Data Hiding for Multimedia Personalization, Interaction, and Protection," IEEE 

Signal Processing Magazine, 1997, 41-44. 

[22] J. Turley, “ARM Grabs Embedded Speed Lead,” Microprocessor Report, vol. 10, 1996. 

[23] J. Villasenor et al., "Configurable Computing Solutions for Automatic Target Recognition," Proceedings of 

IEEE Workshop on FPGAs for Custom Computing Machines, 1996, 70-79. 

[24] R.B. Wolfgang and E.J. Delp, "A Watermark for Digital Images," Applications of Toral Automorphisms, vol. 3, 

1996, 219-222. 

[25] Xilinx, The Programmable Logic Data Book, San Jose, CA, 1996.

DAC'99, pages 837-842 

Robust Techniques For Watermarking Sequential Circuit Designs 

Arlindo L. Oliveira 

IST-INESC / CEL, 1000 Lisboa, Portugal 

Abstract 

We present a methodology for the watermarking of synchronous sequential circuits that makes it 

possible to identify the authorship of designs by imposing a digital watermark on the state 

transition graph of the circuit. The methodology is applicable to sequential designs that are made 

available as firm Intellectual Property (IP), the designation commonly used to characterize 

designs specified as structural descriptions or circuit netlists. 

The watermarking is obtained by manipulating the state transition graph of the design in such a 

way as to make it exhibit a chosen property that is extremely rare in non-watermarked circuits, 

while, at the same time, not changing the functionality of the circuit. This manipulation is 

performed without ever actually computing this graph in either implicit or explicit form. We 

present both theoretical and experimental results that show that the watermarking can be created 

and verified efficiently. 

References 

[1] H. Berghel and L. O’Gorman. Protecting ownership rights through digital watermarking. IEEE Computer, 

29(7):101–103, 1996. 

[2] R. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, 

35(8):677–691, August 1986. 

[3] E. Charbon. Hierarchical watermarking in IC design. In Proc. Custom Integrated Circuit Conference, pages 

295–298, Santa Clara, CA, May 1998. 

[4] O. Coudert, C. Berthet, and J. C. Madre. Verification of synchronous sequential machines based on symbolic 

execution. In J. Sifakis, editor, Proceedings of the Workshop on Automatic Verification Methods for Finite State 

Systems, volume 407 of Lecture Notes in Computer Science, pages 365–373. Springer-Verlag, June 1989. 

[5] E. M. Sentovich, K. J. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj, P. R. Stephan, R. K. 

Brayton and A. Sangiovanni-Vincentelli. SIS: A system for sequential circuit synthesis. Technical report, U.C. 

Berkeley, May 1992. 

[6] H. Cho, G.D. Hachtel, and F. Somenzi. Redundancy identification/removal and test generation for sequential 

circuits using implicit state enumeration. IEEE Transactions on Computer-Aided Design of Integrated Circuits and 

Systems, 12(7):935–945, July 1993. 

[7] D. Kirovski, Y. Hwang, M. Potkonjak, and J. Cong. Intellectual property protection by watermarking 

combinational logic synthesis solutions. In Proc. of the ACM/IEEE International Conference on Computer Aided 

Design, pages 194–198. IEEE Computer Society Press, 1998. 

[8] J. Lach, W. H. Mangione-Smith, and M. Potkonjak. Signature hiding techniques for FPGA intellectual property 

protection. In Proc. of the ACM/IEEE International Conference on Computer Aided Design, pages 186–189. IEEE 

Computer Society Press, 1998. 

[9] J.-K. Rho, G. Hachtel, F. Somenzi, and R. Jacoby. Exact and heuristic algorithms for the minimization of 

incompletely specified state machines. IEEE Transactions on Computer-Aided Design, 13(2):167–177, February 

1994. 

[10] I. Torunoglu and E. Charbon. Watermarking-based copyright protection of sequential functions. In Proc. 

Custom Integrated Circuit Conference, San Diego, CA, May 1999.

DAC'99, pages 843-848 

Effective Iterative Techniques for Fingerprinting Design IP 

Andrew E. Caldwell, Hyun-Jin Choi, Andrew B. Kahng, Stefanus Mantik, Miodrag Potkonjak, 

Gang Qu and Jennifer L. Wong 

UCLA Computer Science Dept., Los Angeles, CA 90095-1596 

Abstract 

While previous watermarking-based approaches to intellectual property protection (IPP) have 

asymmetrically emphasized the IP provider's rights, the true goal of IPP is to ensure the rights of 

both the IP provider and the IP buyer. Symmetric fingerprinting schemes have been widely and 

effectively used to achieve this goal; however, their application domain has been restricted only 

to static artifacts, such as images and audio. In this paper, we propose the first generic symmetric 

fingerprinting technique which can be applied to an arbitrary optimization/synthesis problem 

and, therefore, to hardware and software intellectual property. The key idea is to apply iterative 

optimization in an incremental fashion to solve a fingerprinted instance; this leverages the 

optimization effort already spent in obtaining a previous solution, yet generates a uniquely 

fingerprinted new solution. We use this approach as the basis for developing specific 

fingerprinting techniques for four important problems in VLSI CAD: partitioning, graph 

coloring, satisfiability, and standard-cell placement. We demonstrate the effectiveness of our 

fingerprinting techniques on a number of standard benchmarks for these tasks. Our approach 

provides an effective tradeoff between runtime and resilience against collusion. 
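
The incremental idea can be illustrated on graph coloring, one of the four problems the abstract lists. The sketch below is a toy under stated assumptions rather than the authors' technique: it supposes that a fingerprint is encoded as extra edges added to the instance, and it repairs a previous proper coloring with a simple greedy recoloring instead of a full iterative optimizer. The function names and the encoding are hypothetical.

```python
def fingerprint_coloring(adjacency, coloring, extra_edges):
    """Repair an existing proper coloring after adding fingerprint edges.

    adjacency: dict vertex -> set of neighbours (original instance)
    coloring: dict vertex -> int colour, proper for `adjacency`
    extra_edges: iterable of (u, v) pairs, u != v, encoding a hypothetical
        fingerprint as additional constraints
    """
    # Build the fingerprinted instance without mutating the caller's graph.
    graph = {v: set(ns) for v, ns in adjacency.items()}
    for u, v in extra_edges:
        graph[u].add(v)
        graph[v].add(u)

    # Start from the previous solution and touch only conflicting vertices.
    result = dict(coloring)
    for u, v in extra_edges:
        if result[u] == result[v]:
            used = {result[w] for w in graph[v]}
            # A vertex has fewer than len(graph) neighbours, so a colour exists.
            result[v] = next(c for c in range(len(graph)) if c not in used)
    return result


def is_proper(edges, coloring):
    """True when no edge joins two vertices of the same colour."""
    return all(coloring[u] != coloring[v] for u, v in edges)
```

Only vertices that conflict with a new constraint are recolored, so most of the earlier solution is reused while different fingerprints can still yield different colorings.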

References 

[1] C. J. Alpert, “Partitioning Benchmarks for the VLSI CAD Community”, http://vlsicad.cs.ucla.edu/~cheese/benchmarks.html 



[3] I. Biehl and B. Meyer, “Protocols for Collusion-Secure Asymmetric Fingerprinting”, Proc. 14th Annual 

Symposium on Theoretical Aspects of Computer Science, Springer-Verlag, 1997, pp. 399-412. 

[4] D. Boneh and J. Shaw, “Collusion-Secure Fingerprinting for Digital Data”, Proc. 15th annual International 

Cryptology Conference, Springer-Verlag, 1995, pp. 452-465. 

[5] S. Dutt and W. Deng, “VLSI Circuit Partitioning by Cluster-Removal Using Iterative Improvement Techniques”, 

Proc. IEEE International Conference on Computer-Aided Design, 1996, pp. 194-200. 



[7] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-completeness, New 

York, W. H. Freeman and Company, 1979. 

[8] I. Hong and M. Potkonjak, “Behavioral Synthesis Techniques for Intellectual Property Protection”, unpublished 

manuscript, 1997. 

[9] A. B. Kahng, J. Lach, W. H. Mangione-Smith, S. Mantik, I. L. Markov, M. Potkonjak, P. Tucker, H. Wang and 

G. Wolfe, “Watermarking Techniques for Intellectual Property Protection”, Proc. ACM/IEEE Design Automation 

Conference, June 1998, pp. 776-781. 

[10] A. B. Kahng, S. Mantik, I. L. Markov, M. Potkonjak, P. Tucker, H. Wang and G. Wolfe, “Robust IP 

Watermarking Methodologies for Physical Design”, Proc. ACM/IEEE Design Automation Conference, June 1998, 

pp. 782-787. 


Journal 49 (1970), pp. 291-307. 

[12] D. Kirovski, Y. Hwang, M. Potkonjak and J. Cong, “Intellectual Property Protection by Watermarking 

Combinational Logic Synthesis Solutions”, Proc. IEEE/ACM International Conference on Computer Aided Design, 

1998.

[13] J. Lach, W. H. Mangione-Smith and M. Potkonjak, “FPGA Fingerprinting Techniques for Protecting Intellectual 

Property”, Proceedings of CICC, 1998. 

[14] I. H. Osman and J. P. Kelly, eds., Meta-Heuristics: Theory and Applications, Kluwer, 1996. 

[15] B. Pfitzmann and M. Schunter, “Asymmetric Fingerprinting”, Proc. International Conference on the Theory 

and Application of Cryptographic Techniques, Springer-Verlag, 1996, pp. 84-95. 

[16] G. Qu and M. Potkonjak, “Analysis of Watermarking Techniques for Graph Coloring Problem”, Proc. 

IEEE/ACM International Conference on Computer Aided Design, 1998. 

[17] R. H. Storer, S. D. Wu and R. Vaccari, “New Search Spaces for Sequencing Problems With Application to Job 

Shop Scheduling”, Management Science 38 (1992), pp. 1495-1509. 

[18] http://dimacs.rutgers.edu/ 

[19] http://aida.intellektik.informatik.th-darmstadt.de/~hoos/SATLIB/

DAC'99, pages 849-854 

Behavioral Synthesis Techniques for Intellectual Property Protection 

Inki Hong*,** and Miodrag Potkonjak** 

* Synopsys, Inc. Mountain View, CA 94043 

** Computer Science Department, University of California, Los Angeles, CA 90095 

Abstract 

The economic viability of the reusable core-based design paradigm depends on the development 

of techniques for intellectual property protection. We introduce the first dynamic watermarking 

technique for protecting the value of intellectual property of CAD and compilation tools and 

reusable core components. The essence of the new approach is the addition of a set of design and 

timing constraints which encodes the author's signature. The constraints are selected in such a 

way that they result in minimal hardware overhead while embedding the signature which is 

unique and difficult to detect, remove and forge. We establish the first set of relevant metrics 

which forms the basis for the quantitative analysis, evaluation, and comparison of watermarking 

techniques. We develop a generic approach for signature data hiding in designs, which is 

applicable in conjunction with an arbitrary behavioral synthesis task, such as scheduling, 

assignment, allocation, and transformations. Error correcting codes are used to augment the 

protection of the signature data from tampering attempts. On a large set of design examples, 

studies indicate the effectiveness of the new approach in the sense that the signature data, which 

are highly resilient, difficult to detect and remove, and yet easy to verify, can be embedded in 

designs with very low hardware overhead. 

REFERENCES 

[1] W. Bender, D. Gruhl, N. Morimoto, and A. Lu. Techniques for data hiding. IBM Systems Journal, 35(3&4):313– 

336, 1996. 

[2] S. Craver, N. Memon, B. L. Yeo, and M. M. Yeung. Can invisible watermarks resolve rightful ownerships? 

Technical report, IBM Research Technical Report RC 20509, 1996. 

[3] R.E. Crochiere and A.V. Oppenheim. Analysis of linear digital networks. Proceedings of the IEEE, 63(4):581– 

595, 1975. 

[4] D. Fernandez. Intellectual property protection in the EDA industry. In Design Automation Conference, pages 

161–163, 1994. 

[5] M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. 

Freeman & Co., New York, NY, 1979. 

[6] E. Girczyc and S. Carlson. Increasing design quality and engineering productivity through design reuse. In 

Design Automation Conference, pages 48–53, 1993. 

[7] Virtual Socket Initiative. http://www.vsi.org. 

[8] D.S. Johnson, C.R. Aragon, L.A. McGeoch, and C. Schevon. Optimization by simulated annealing: an 

experimental evaluation; II. graph coloring and number partitioning. Operations Research, 39(3):378–406, 1991. 

[9] A. B. Kahng, et al. Robust IP Watermarking Methodologies for Physical Design. In Design Automation 

Conference, pages 782–787, 1998. 

[10] D. Kirovski, Y.-Y. Hwang, M. Potkonjak, and J. Cong. Intellectual Property Protection by Watermarking 

Combinational Logic Synthesis Solutions. In International Conference on Computer-Aided Design, pages 194–198, 

1998. 

[11] S. Lin and D.J. Costello. Error Control Coding. Prentice Hall, 1983. 

[12] G. De Micheli. Synthesis and optimization of digital circuits. McGraw-Hill, New York, NY, 1994.

DAC'99, pages 855-860 

Design and Implementation of a Scalable Encryption 

Processor with Embedded Variable DC/DC Converter 

James Goodman, Anantha Chandrakasan 

Department of EECS, Massachusetts Institute of Technology, Cambridge, MA 02139 

Abram P. Dancy 

SynQor, Hudson, MA 01749 

ABSTRACT 

This work describes the design and implementation of an energy-efficient, scalable encryption 

processor that utilizes variable voltage supply techniques and a high-efficiency embedded 

variable output DC/DC converter. The resulting implementation dissipates 134 nJ/bit at VDD = 

2.5 V when encrypting at its maximum rate of 1 Mb/s using a maximum datapath width of 512 

bits. The embedded converter achieves an efficiency of 96% at this peak load. The processor is 

2-3 orders of magnitude more energy efficient than optimized assembly code running on a low-power 

processor such as the StrongARM. 
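
The figures quoted in the abstract imply an average power at the peak rate, which the short calculation below works out. The energy per bit, bit rate, and converter efficiency come from the abstract. Whether the 134 nJ/bit figure already includes converter losses is not stated, so the second step assumes it does not. The function and variable names are illustrative only.

```python
def average_power_watts(energy_per_bit_joules, bit_rate_bps):
    """Average power = energy per bit x bits per second."""
    return energy_per_bit_joules * bit_rate_bps


def converter_input_power_watts(load_power_watts, efficiency):
    """Power drawn by a converter that delivers `load_power_watts`."""
    return load_power_watts / efficiency


# Figures from the abstract: 134 nJ/bit at 1 Mb/s.
load_w = average_power_watts(134e-9, 1e6)
# Assumption: the 134 nJ/bit figure excludes the converter's own loss.
input_w = converter_input_power_watts(load_w, 0.96)
```

Under that assumption the load is about 134 mW and the converter would draw roughly 140 mW from its source at peak load.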

REFERENCES 

[1] Blum, L., M. Blum, M. Shub, “A simple unpredictable pseudo-random number generator,” SIAM Journal on 

Computing, vol. 15, no. 2, pp. 364-383, May 1986. 

[2] Gutnik, V., A. P. Chandrakasan, “Embedded power supply for low power DSP,” IEEE Transactions on VLSI 

Systems, vol. 5, no.4, pp. 425-435, December 1997. 

[3] Takagi, N., “A radix-4 modular multiplication hardware algorithm for modular exponentiation,” IEEE 

Transactions on Computers, vol. 41, no. 8, pp. 949-956, August 1992. 

[4] Dancy, A. P., A. P. Chandrakasan, “Ultra low power control circuits for PWM converters,” IEEE Power 

Electronics Specialists Conference, pp. 21-27, 1997. 

[5] Wei, G-Y., M. Horowitz, “A low power switching power supply for self-clocked systems,” 1996 International 

Symposium on Low Power Electronics and Design, pp. 313-318, 1996.

DAC'99, pages 861-866 

Design Considerations for Battery-Powered Electronics 

Massoud Pedram, Qing Wu 

Department of Electrical Engineering-Systems 

University of Southern California, Los Angeles, CA 90089 

Abstract 

In this paper, we consider the problem of maximizing the battery life (or duration of service) in 

battery-powered CMOS circuits. We first show that the battery efficiency (or utilization factor) 

decreases as the average discharge current from the battery increases. The implication is that the 

battery life is a super-linear function of the average discharge current. Next we show that even 

when the average discharge current remains the same, different discharge current profiles 

(distributions) may result in very different battery lifetimes. In particular, the maximum battery 

life is achieved when the variance of the discharge current distribution is minimized. Analytical 

derivations and experimental results underline the importance of the correct modeling of the battery-hardware 

system as a whole and provide a more accurate basis (i.e., the battery discharge times 

delay product) for comparing various low power optimization methodologies and techniques 

targeted toward battery-powered electronics. Finally, we calculate the optimal value of Vdd for a 

battery-powered VLSI circuit so as to minimize the product of the battery discharge times the 

circuit delay. 
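The claim that a lower-variance discharge profile extends battery life can be illustrated with a toy calculation. The sketch below is not the model used in the paper: it assumes a hypothetical utilization factor that falls linearly with current, mu(I) = 1 - BETA * I, and the constants BETA and CAPACITY_AH are invented for illustration. Under that assumption the depletion rate I / mu(I) is convex in I, so for a fixed average current a constant profile drains the battery more slowly than a bursty one.

```python
# Hypothetical model, NOT from the paper: utilization falls linearly
# with discharge current, mu(I) = 1 - BETA * I.
BETA = 0.2          # assumed efficiency loss per ampere
CAPACITY_AH = 1.0   # assumed nominal capacity in ampere-hours


def utilization(current_a: float) -> float:
    """Fraction of nominal capacity usable at a given discharge current."""
    return 1.0 - BETA * current_a


def lifetime_hours(profile: list[float]) -> float:
    """Lifetime for a periodic profile made of equal-length current steps."""
    # Each step depletes nominal charge at rate I / mu(I), so
    # high-current steps cost disproportionately more.
    drain = sum(i / utilization(i) for i in profile) / len(profile)
    return CAPACITY_AH / drain


constant = [1.0, 1.0]   # average 1.0 A, zero variance
bursty = [0.2, 1.8]     # average 1.0 A, high variance

print(f"constant profile: {lifetime_hours(constant):.3f} h")
print(f"bursty profile:   {lifetime_hours(bursty):.3f} h")
```

Both profiles have the same average current, yet the constant one lasts longer in this toy model, which is the qualitative behaviour the abstract describes.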

REFERENCES 

[1] A. Chandrakasan, R. Brodersen, Low Power Digital CMOS Design, Kluwer Academic Publishers, July 1995. 

[2] M. Horowitz, T. Indermaur, and R. Gonzalez, “Low-Power Digital Design”, IEEE Symposium on Low Power 

Electronics, pp.8-11, 1994. 

[3] A. Chandrakasan, V. Gutnik, and T. Xanthopoulos, “Data Driven Signal Processing: An Approach for Energy 

Efficient Computing”, 1996 International Symposium on Low Power Electronics and Design, pp. 347-352, Aug. 

1996. 

[4] J. Rabaey and M. Pedram, Low Power Design Methodologies, Kluwer Academic Publishers, 1996 

[5] URL: http://infopad.eecs.berkeley.edu/~anthonys/quals 

[6] M. Pedram and Q. Wu, “Battery-Powered Digital CMOS Design”, Proceedings of Design Automation and Test 

in Europe Conference, pp. 72-76, Mar., 1999. 

[7] M. Pedram, “Power Minimization in IC Design: Principles and Applications”, ACM transactions on Design 

Automation of Electronic Systems, Vol. 1, No. 1, pp. 3-56, Jan., 1996. 

[8] M. Doyle, T. F. Fuller, and J. Newman, “Modeling of Galvanostatic Charge and Discharge of the 

Lithium/Polymer/Insertion Cell”, J. Electrochem. Soc., Vol. 140, No. 6, pp.1526-1533, Jun. 1993. 

[9] T. F. Fuller, M. Doyle, and J. Newman, “Simulation and Optimization of the Dual Lithium Ion Insertion Cell”, J. 

Electrochem. Soc., Vol. 141, No. 1, pp.1-9, Jan. 1994. 

[10] D. Fauteux, “Lithium Polymer Electrolyte Rechargeable Battery”, The Electrochemical Society Proceedings, 

Vol. 94-28, pp.379-388. 

[11] L. Xie, W. Ebner, D. Fouchard, and S. Megahed, “Electrochemical Studies of LiNiO2 for Lithium-Ion 

Batteries”, The Electrochemical Society Proceedings, Vol. 94-28, pp.263-276. 

[12] K. M. Abraham, D. M. Pasquariello, T. H. Nguyen, Z. Jiang, and D. Peramunage, “Lithiated Manganese Oxide 

Cathodes for Rechargeable Lithium Batteries”, The Battery Conference, pp. 317-323, 1996. 

[13] N. Cui, B. Luan, D. Bradhurst, H. K. Liu, and S. X. Dou, “Surface-Modified Mg2Ni-Type Negative Electrode 

Materials for Ni-MH Battery”, The Battery Conference, pp. 317-322, 1997. 

[14] J. K. Erbacher and S. P. Vukson, “Commercial Nickel-Metal Hydride (Ni-MH) Technology Evaluation”, The 

Battery Conference, pp. 9-15, 1997 

[15] B. Nelson, “TMP Ultra-High Rate Discharge Performance”, The Battery Conference, pp. 139-143, 1997. 

[16] S. Gold, “A PSPICE Macromodel for Lithium-Ion Batteries”, The Battery Conference, pp. 215-222, 1997 

[17] URL: http://www.valence-tech.com/products/index.htm

[18] URL: http://www.mosis.org/html/hp-gmos10qa-prm.html

DAC'99, pages 867-872 

Cycle-Accurate Simulation of Energy Consumption in Embedded Systems 

Tajana Šimunic, Luca Benini* and Giovanni De Micheli 

Computer Systems Lab, Stanford University 

*DEIS University of Bologna, Italy 

Abstract 

This paper presents a methodology for cycle-accurate simulation of energy dissipation in 

embedded systems. The ARM Ltd. [1] instruction-level cycle-accurate simulator is extended 

with energy models for the processor, the L2 cache, the memory, the interconnect and the DC-DC 

converter. A SmartBadge, which can be seen as an embedded system consisting of a 

StrongARM-1100 processor, memory and the DC-DC converter, is used to evaluate the 

methodology with the Dhrystone benchmark. We compared performance and energy computed 

by our simulator with measurements in hardware and found them in agreement within a 5% 

tolerance. The simulation methodology was applied to design exploration for enhancing a 

SmartBadge with a real-time MPEG feature. 

References 

[1] Advanced RISC Machines Ltd (ARM), ARM Software Development Toolkit Version 2.11, 1996. 

[2] G. Q. Maguire, M. Smith, H. W. Peter Beadle, “SmartBadges: a wearable computer and communication system," 

Invited talk slides url: www.it.kth.se/maguire/Talks/CODES-980313.pdf, 6 th International Workshop on 

Hardware/Software Codesign, 1998. 

[3] CoWare, CoWareN2c url:www.coware.com/n2c.html . 

[4] Mentor Graphics, www.mentor.com/codesign. 

[5] Synopsys, www.synopsys.com/products/hwsw. 

[6] Cadence, www.cadence.com/alta/products. 

[7] P. Landman, J. Rabaey, “Activity-Sensitive Architectural Power Analysis," IEEE Transactions on CAD, pp.571- 

587, June 1996. 

[8] D. Liu, C. Svensson, “Power Consumption Estimation in CMOS VLSI Chips," IEEE Journal of Solid-State 

Circuits, vol.29, no.6, pp. 663-670, June 1994. 

[9] M. Kamble, K. Ghose, “Energy-Efficiency of VLSI Caches: A Comparative Study," 10th International 

Conference on VLSI Design, pp.261-267, January 1997. 

[10] S. Wilton, N. Jouppi, “CACTI: An Enhanced Cache Access and Cycle Time Model," IEEE Journal of Solid- 

State Circuits, vol.31, no.5, pp.677-688, May 1996. 

[11] K. Itoh, K. Sasaki, Y. Nakagome, “Trends in Low-Power RAM Circuit Technologies," Proceedings of the 

IEEE, vol.83, no.4, pp.524-543, April 1995. 

[12] V. Tiwari, S. Malik, A. Wolfe, M. Lee, “Instruction Level Power Analysis," Journal of VLSI Signal Processing 

Systems, no.1, pp.223-238, 1996. 

[13] M. Wan, Y. Ichikawa, D. Lidsky, J. Rabaey, “An Energy Conscious Methodology for Early Design Exploration 

of Heterogeneous DSPs," Proceedings of the Custom Integrated Circuit Conference, 1998. 

[14] L. Benini, R. Hodgson, P. Siegel, “System-Level Power Estimation and Optimization," Proceedings of 

ISLPED, pp.173-178, 1998. 

[15] Y. Li and J. Henkel, “A Framework for Estimating and Minimizing Energy Dissipation of Embedded HW/SW 

Systems," Proceedings of DAC 1998, pp.188-193, 1998. 

[16] B. Kapoor, “Low Power Memory Architectures for Video Applications," Proceedings of the 8th Great Lakes 

Symposium on VLSI, pp. 2-7, 1998. 

[17] OZ Electronics Manufacturing, PCB Modelling Tools url: www.oem.com.au/manu/pcbmodel.html. 

[18] A. El Gamal, Z.A. Syed, “A stochastic model for interconnections in custom integrated circuits," IEEE 

Transactions on Circuits and Systems, vol.CAS-28, no.9, pp.888-894, Sept. 1981. 

[19] V. Bhaskaran, K. Konstantinides, Image and Video Compression Standards Kluwer Academic Publishers, 

1997.

DAC'99, pages 873-878 

Lowering power consumption in clock by using Globally Asynchronous 

Locally Synchronous design style. 

A.Hemani 1 , T.Meincke 1 , S.Kumar 4 , A.Postula 5 , T.Olsson 2 , P.Nilsson 2 , J.Oberg 1 , P.Ellervee 1 , 

D.Lundqvist 3 

1 ESD Lab, Department of Electronics, KTH, Sweden 

2 Lund University, Sweden 

3 Ericsson Radio Systems AB, Stockholm, Sweden 

4 Indian Institute of Technology, New Delhi, India 

5 Department of CSEE, University of Queensland, Brisbane, Australia 

ABSTRACT 

Clock power consumption in large high-performance VLSIs can be reduced by adopting a 

Globally Asynchronous, Locally Synchronous design style (GALS). GALS has small overheads 

for the global asynchronous communication and local clock generation. We propose methods to 

a) evaluate the benefits of GALS and account for its overheads, which can be used as the basis 

for partitioning the system into optimal number/size of synchronous blocks, and b) automate the 

synthesis of the global asynchronous communication. Three realistic ASICs, ranging in 

complexity from 1 to 3 million gates, were used to evaluate GALS benefits and overheads. The 

results show an average clock power saving of about 70% with negligible overheads. 

REFERENCES 

1. S. Hauck, “Asynchronous Design Methodologies: An Overview”, Proceedings of IEEE, Vol. 83, No. 1, pp 69-93, 

January 1995. 

2. W. Horn, “Modelling of an ATM Multiplexer in a Network Terminal for a Mixed Hardware/Firmware 

Implementation”, Master thesis, TRITA-ESD-1998-06, 

Department of Electronics, Royal Institute of Technology, Stockholm, Sweden, May 1998. 

3. G. M. Jacobs, R. W. Brodersen, “A Fully Asynchronous Digital Signal Processor Using Self-Timed Circuits”, 

IEEE Journal of Solid-State Circuits, Vol 25, No. 6, Dec. 1996. 

4. P. Nilsson, M. Torkelson, “A Monolithic Digital Clock-Generator for On-Chip Clocking of Custom DSP’s”, 

IEEE Journal of Solid-State Circuits, pp. 700-706, May 1996 

5. J.M.Rabaey, “Digital Integrated Circuits”, Prentice Hall, 1997 

6. J. M. Rabaey, M. Pedram, “Low Power Design Methodologies” Ch 1, Kluwer Academic Publishers, 1996, 

ISBN0-7923-9630-8 

7. J. M. Rabaey, M. Pedram, “Low Power Design Methodologies”, Ch 5, Kluwer Academic Publishers, 1996, 

ISBN0-7923-9630-8 

8. B. Svantesson, S. Kumar, A. Hemani, “A Methodology and Algorithms for Efficient Interprocess Communication 

Synthesis from System Description in SDL”, in Proc. of VLSI Design’98, pp 78-84, 7-8 Jan 1998, Chennai, India 

9. V. Tiwari et. al., “Reducing Power in High-performance Microprocessors”, 35th DAC, June 98. 

10. T. Hotta K. Kurita and N. Kitamura. PLL-based BiCMOS on-chip clock generator for very high-speed 

microprocessors. IEEE Journal of Solid-State Circuits, 26:pp. 485-589, April 1991. 

11. T. D. Burd and R. W. Brodersen, Processor Design for Portable Systems, Journal of VLSI Signal Processing, 

Kluwer Academic Publishers, Volume 13, Numbers 2/3, August/September 1996, pp. 203-222. 

12. P. Nilsson and M. Torkelson. A Custom Digital Intermediate Frequency Filter for the American Mobile 

Telephone System. IEEE Journal of Solid-State Circuits, 32:pp. 806-815, June 1997. 

13. Inki Hong et. al. Power Optimisation of Variable Voltage Core-Based Systems. 35th DAC, June 98, pp. 176- 

181. 

14. L. Benini and G. De Micheli, “Transformations and Synthesis of FSM’s for low power gated clock 

implementation”, IEEE Trans. on CAD, Vol. 15, No. 6, June 1996.

DAC'99, pages 879-884 

A CAD Tool for Optical MEMS 

Timothy P. Kurzweg*, Steven P. Levitan*, Philippe J. Marchand**, 

Jose A. Martinez*, Kurt R. Prough*, Donald M. Chiarulli*** 

*University of Pittsburgh, Dept. of Electrical Engineering, Pittsburgh, PA, USA, 

**University of California, San Diego, ECE Dept., La Jolla, CA, USA 

***University of Pittsburgh, Dept. of Computer Science, Pittsburgh, PA, USA 

ABSTRACT 

Chatoyant models free-space opto-electronic components and systems and performs simulations 

and analyses that allow designers to make informed system level trade-offs. Recently, the use of 

MEM bulk and surface micro-machining technology has enabled the fabrication of micro-optical-mechanical 

systems. This paper presents our models for diffractive optics and new 

analysis techniques which extend Chatoyant to support optical MEMS design. We show these 

features in the simulation of two optical MEM systems. 

Keywords: Optical MEMS, MEMS-CAD, MOEMS, micro-optics 

REFERENCES 

[1] Akiyama, T., et. al, “Scratch drive actuator with mechanical links for self-assembly of three-dimensional 

MEMS,” J. of Microelectromechanical Systems, Vol. 6, No. 1., Mar 1997, pp. 10-17. 

[2] Born, M., Wolf, E., Principles of Optics, (Pergamon Press, 1959) 

[3] Buck, J., et.al, “Ptolemy: a framework for simulating and prototyping heterogeneous systems,” Int. J. Computer 

Simulation, Vol. 4, pp. 155-182, (1994). 

[4] Goodman, J.W., Introduction to Fourier Optics, Second Edition (The McGraw-Hill Companies, Inc., 1996). 

[5] Karam, J.M., et. al, “CAD and foundries for microsystems”, 34th DAC, Anaheim, CA, June 9-13, 1997, pp. 674- 

679. 

[6] Kurzweg, T.P., et. al, “Modeling Optical MEMS Systems,” TR99-103, University of Pittsburgh, 1999. 

[7] Levitan, S.P., et al, “Chatoyant: a computer-aided design tool for free-space optoelectronic systems,” Applied 

Optics, Vol. 37, No. 26, Sept 1998, pp. 6078-6092. 

[8] Levitan, S.P., et. al, “Computer-Aided Design of Free-Space Opto-Electronic Systems,” 34th DAC, Anaheim, 

CA, June 9-13, 1997, pp. 768-773. 

[9] Martinez, J.A., et. al, “Piecewise Linear Large Scale Models for Optoelectronic Devices,” OSA Optics in 

Computing, Aspen, CO, Apr 1999. 

[10] Mukherjee, T., Fedder, G.K., “Structured Design Of Microelectromechanical Systems,” 34th DAC, Anaheim, 

CA, June 1997, pp. 680-685. 

[11] Piyawattanametha, W., et. al, “MEMS Technology for Optical Crosslink for Micro/Nano Satellites,” 

NANOSPACE’98, NASA/Johnson Space Center, Houston, TX, Nov 1-6, 1998. 

[12] Rubinstein, R.Y., Simulation and the Monte Carlo Method, (John Wiley & Sons, 1981). 

[13] Saleh, B.E.A., Teich, M.C., Fundamentals of Photonics (New York: Wiley-Interscience, 1991). 

[14] Senturia, S. D., “CAD for Microelectromechanical Systems,” Transducers '95, June 25-29, 1995, Stockholm, 

Sweden, Vol. 2, Paper No. 232-A7. 

[15] Wilson, N.M., et. al, “A Heterogenous Environment for Computational Prototyping and Simulation Based 

Design of MEMS Devices”, SISPAD 98, Leuven, Belgium, Sept 2-4, 1998. 

[16] Wu, M.C., “Micromachining for optical and Optoelectronic Systems,” Proc. of the IEEE, Vol. 85, No. 11, Nov 

1997, pp. 1833-1856.

DAC'99, pages 885-891 

On Thermal Effects in Deep Sub-Micron VLSI Interconnects 

Kaustav Banerjee, Amit Mehrotra, Alberto Sangiovanni-Vincentelli, Chenming Hu 

Department of Electrical Engineering and Computer Sciences 


Abstract 

This paper presents a comprehensive analysis of the thermal effects in advanced high 

performance interconnect systems arising due to self-heating under various circuit conditions, 

including electrostatic discharge. Technology (Cu, low-k etc) and scaling effects on the thermal 

characteristics of the interconnects, and on their electromigration reliability have been analyzed 

simultaneously, which will have important implications for providing robust and aggressive deep 

submicron interconnect design guidelines. Furthermore, the impact of these thermal effects on 

the design (driver sizing) and optimization of the interconnect length between repeaters at the 

upper-level signal lines is investigated. 

References 

[1] C. R. Barrett, “Microprocessor evolution and technology impact,” Symp. VLSI Technol., Dig. Tech. Papers, 

1993, pp. 7-10. 

[2] T. Makimoto, “Market and technology trends in the nomadic age,” Symp. VLSI Technol., Dig. Tech. Papers, 

1996, pp. 6-9. 

[3] R. Whittier, “Push/Pull: PC technology/end user demand,” Symp. VLSI Technol., Dig. Tech. Papers, 1996, pp. 2- 

5. 

[4] R. H. Dennard, F. H. Gaensslen, H. Yu, V. L. Rideout, E. Bassous, and A. R. LeBlanc, “Design of ion-implanted 

MOSFETs with very small physical dimensions,” IEEE J. Solid-State Circuits, Vol. SC-9, pp. 256-268, 1974. 

[5] P. K. Chatterjee, W. R. Hunter, A. Amerasekera, S. Aur, C. Duvvury, P. E. Nicollian, L. M. Yang, and P. Yang, 

“Trends for deep submicron VLSI and their implications for reliability,” Proc. IRPS, 1995, pp. 1-11. 

[6] J. R. Black, “Electromigration – A brief survey and some recent results,” IEEE Trans. Electron Devices, vol. 

ED-16, pp. 338-347, 1969. 

[7] B. K. Liew, N. W. Cheung, and C. Hu, “Projecting interconnect electromigration lifetime for arbitrary current 

waveforms,” IEEE Trans. Electron Devices, vol. 37, pp. 1343-50, 1990. 

[8] K. Banerjee, A. Amerasekera, N. Cheung and C. Hu, “High-current failure model for VLSI interconnects under 

short-pulse stress conditions,” IEEE Electron Device Lett., vol. 18, No. 9, pp. 405-407, 1997. 

[9] K. Banerjee, A. Amerasekera and C. Hu, “Characterization of VLSI circuit interconnect heating and failure 

under ESD conditions,” Proc. IRPS, 1996, pp. 237-245. 

[10] W. R. Hunter, “Self-consistent solutions for allowed interconnect current density – Part I: Implications for 

technology evolution,” IEEE Trans. Electron Devices, vol. ED-44, pp. 304-309, 1997. 

[11] S. Rzepka, K. Banerjee, E. Meusel, and C. Hu, “Characterization of self-heating in advanced VLSI interconnect 

lines based on thermal finite element simulation,” IEEE Trans. on Components, Packaging and Manufacturing 

Technology-Part A, vol. 21, No. 3, pp. 1-6, 1998. 

[12] J. Ida et al, “Reduction of wiring capacitance with new low dielectric SiOF interlayer film for high speed/low 

power sub-half micron CMOS,” Tech. Dig. VLSI Symp., pp. 59-60, 1994. 

[13] K. Banerjee, A. Amerasekera, G. Dixit and C. Hu, “The effect of interconnect scaling and low-k dielectric on 

the thermal characteristics of the IC metal,” in Tech. Dig. IEDM, 1996, pp. 65-68. 

[14] NS Nagaraj, F. Cano, H. Haznedar, and D. Young, “A practical approach to static signal electromigration 

analysis,” Proc. 35th Design Automation Conf., 1998, pp. 572-577. 

[15] National Technology Roadmap for Semiconductors (NTRS). 

[16] J. R. Black, “Electromigration failure modes in aluminum metallization for semiconductor devices," IEEE 

Trans. Electron Devices, vol. 57, no. 9, pp. 1587-1594, 1969. 

[17] A. A. Bilotti, “Static temperature distribution in IC chips with isothermal heat sources,” IEEE Trans. Electron 

Devices, vol. ED-21, pp. 217-226, 1974.

[18] W. R. Hunter, “Self-consistent solutions for allowed Interconnect current density – Part II: Application to 

design guidelines,” IEEE Trans. Electron Devices, vol. ED-44, pp. 310-316, 1997. 

[19] C. Jin, L. Ting, K. Taylor, T. Seta, and J. D. Luttmer, “Thermal conductivity measurement of low dielectric 

constant films,” in Proc. Second International Dielectrics for VLSI/ULSI Multilevel Interconnection Conference 

(DUMIC), 1996, pp. 21-28. 

[20] Private Communications, Professor Kenneth Goodson, Thermosciences Division, Mechanical Eng. 

Department, Stanford University. 

[21] H. A. Schafft, “Thermal analysis of electromigration test structures,” IEEE Trans. Electron Devices, vol. ED- 

34, pp. 664-672, 1987. 

[22] R. H.J.M. Otten and R. K. Brayton, “Planning for performance,” Proc 35th Design Automation Conf., 1998, pp. 

122-127. 

[23] J. Culetu, C. Amir, and J. McDonald, “A practical repeater insertion method in high speed VLSI circuits,” 

Proc. 35th Design Automation Conf., 1998, pp. 392-395. 

[24] “Physical design modelling and verification project (SPACE Project)”, http://cas.et.tudelft.nl/research/ 

space.html 

[25] C. Duvvury and A. Amerasekera, “State-of-the-art issues for technology and circuit design of ESD protection in 

CMOS ICs,” Semiconductor Science. and Tech., pp. 833-850, 1996. 

[26] A. Amerasekera and C. Duvvury, “The impact of technology scaling on ESD robustness and protection circuit 

design,” in EOS/ESD Symp. Proc., 1994, pp. 237-245. 

[27] S. H. Voldman, “ESD robustness and scaling implications of aluminum and copper interconnects in advanced 

semiconductor technology,” in EOS/ESD Symp. Proc., 1997, pp. 316-329.

DAC'99, pages 892-897 

Converting a 64b PowerPC Processor from CMOS Bulk to SOI Technology 

D. Allen, D. Behrends, B. Stanisic 

IBM Corporation, Rochester, MN 55901 

Abstract 

A 550MHz 64b PowerPC processor was developed for fabrication in Silicon-On-Insulator (SOI) 

technology from a processor previously designed and fabricated in bulk CMOS [1]. Both the 

design and the associated CAD methodology (point tools, flow, and models) were modified to 

handle demands specific to SOI technology. The challenge was to improve the cycle time by 

adapting the circuit design, timing, and chip integration methodologies to accommodate effects 

unique to SOI. 

References 

[1] D. Allen, et al, “A 550MHz 64b SOI Processor with Cu Interconnects”, ISSCC, 1999. 

[2] F. Assaderaghi, et al, “A 7.9/5.5 psec Room/Low Temperature SOI CMOS,” IEDM 97, pp. 415-418. 

[3] J-P. Colinge, Silicon-On-Insulator Technology: Materials to VLSI, Kluwer Academic Publishers, Boston MA, 

1991. 

[4] K. L. Shepard and V. Narayanan, “Noise in deep submicron digital design”, in Proceedings of the IEEE/ACM 

International Conference on Computer-Aided Design, pp. 524-531, November 1996. 

[5] “AS/X User’s Guide”, International Business Machines Technical Memorandum, No. 220-5233-00, March 

1994. 

[6] T. Drumm, J. Mollen, J. Earl, “Differences in Synthesis Behavior between Bulk and SOI Technologies,” 

International Business Machines Technical Memorandum, 1997 

[7] H. H. Chen and D. D. Ling, “Power supply noise analysis methodology for deep submicron VLSI chip design,” 

in Proceedings 34th Design Automation Conference, pp. 638-643, June 1997. 

[8] J. Rahmeh, “3DNoise User’s Guide”, International Business Machines Technical Memorandum, June 1996.

DAC'99, pages 898-903 

A Framework for Collaborative and Distributed Web-based Design 

Gangadhar Konduri, Anantha Chandrakasan 

Department of Electrical Engineering and Computer Science 

Massachusetts Institute of Technology, Cambridge, MA 02139 

Abstract 

The increasing complexity and geographical separation of design data, tools and teams has 

created a need for a collaborative and distributed design environment. In this paper we present a 

framework that enables collaborative and distributed Web-based CAD, in which the designers 

can collaborate on a design and efficiently utilize existing design tools on the Internet. The 

framework includes a Java-based hierarchical collaborative schematic/block editor with 

interfaces to distributed Web tools and cell libraries, infrastructure to store and manipulate 

design objects, and protocols for tool communication, message passing and collaboration. 

References 

[1] “What's ahead for design on the Web?", Panel Discussion, IEEE Spectrum, September 1998, pp. 53-63. 

[2] “IC Design on the World Wide Web", IEEE Spectrum, June 1998. 

[3] O. Bentz, D. Lidsky, J. M. Rabaey, “Information-based Design Environment", IEEE VLSI Signal Processing 

VIII, pp. 237-246, Nov 1995. 

[4] D. Lidsky, J. M. Rabaey, “Early Power Exploration - a World Wide Web Application", Proc. Design Automation 

Conf, Las Vegas, NV, June 1996. 

[5] The WELD Project, http://www-cad.EECS.Berkeley.EDU/Respep/Research/weld 

[6] A. Boglio, L. Benini, G. De Micheli and B. Ricco, “PPP: A Gate-Level Power Estimator - A World Wide Web 

Application", Stanford Technical Report No. CSL-TR-96-691, 1996. 

[7] XMX Home Page, http://www.cs.brown.edu/software 

[8] Xplexer: The Application sharing technology, http://andru.unx.com/DD/advisor/docs/jun95 

[9] XShare: Workstation conferencing, http://www.eit.com/software/xshare 

[10] XTV: A Users Guide, http://www.visc.vt.edu/succeed/xtv.html 

[11] Hemang Lavana, Amit Khetawat, Franc Brglez, Krzysztof Kozminski, “Executable Workflows: A Paradigm for 

Collaborative Design on the Internet", Proceedings of the Design Automation Conference, June 1997. 

[12] Debashis Saha, “Framework for distributed Web-based Microsystem design", Masters thesis, MIT, Jan 1998.

DAC'99, pages 904-909 

Dealing With Inductance In High-Speed Chip Design 

Phillip Restle, Albert Ruehli, Steven G. Walker 

IBM T.J. Watson Research Center, Yorktown Heights, NY 

Abstract 

Inductance effects in on-chip interconnects have become significant for specific cases such as 

clock distributions and other highly optimized networks [1,2]. Designers and CAD tool 

developers are searching for ways to deal with these effects. Unfortunately, accurate on-chip 

inductance extraction and simulation in the general case are much more difficult than 

capacitance extraction. In addition, even if ideal extraction tools existed, most chip designers 

have little experience designing with lossy transmission lines. This tutorial will attempt to 

demystify on-chip inductance through the discussion of several illustrative examples analyzed 

using full-wave extraction and simulation methods. A specialized PEEC (Partial Element 

Equivalent Circuit) method tailored for chip applications was used for most cases. Effects such 

as overshoot, reflections, frequency dependent effective resistance and inductance will be 

illustrated using animated visualizations of the full-wave simulations. Simple examples of design 

techniques to avoid, mitigate, and even take advantage of on-chip inductance effects will be 

described. 

References 

[1] P. J. Restle, K. A. Jenkins, A. Deutsch and P. W. Cook, "Measurement and Modeling of On-Chip Transmission- 

Line Effects in a 400 MHz Microprocessor," IEEE Journal of Solid-State Circuits, Vol. 33 No. 4, pp. 662-665, Apr. 

1998. 

[2] P. J. Restle, A. Deutsch , "Designing the Best Clock Distribution Network", Symposium on VLSI Circuits Digest 

of Technical Papers, June '98, pp. 2-5, Honolulu, HI 

[3] A. E. Ruehli, "Inductance Calculations in a Complex Integrated Circuit Environment", IBM J. Res. Develop., 

Vol. 16, pp. 470-481, Sept. 1972. 

[4] M. Kamon, F. Wang, J. White, "Recent Improvements for Fast Inductance Extraction and Simulation", In Digest 

of Electr. Perf. Electronic Packaging, Vol 7, pp. 281-284, Oct. 1998, West Point, NY. 

[5] A. E. Ruehli, "Equivalent Circuit Models for Three Dimensional Multi-Conductor Systems", IEEE Trans. of 

Microwave Theory and Techniques, MTT-22 (3), pp. 216-221 March, 1974. 

[6] J. N. Burghartz, A. E. Ruehli, K. A. Jenkins, M. Soyuer, D. Nguyen-Ngoc, "Novel Substrate Contact for High-Q 

Silicon-Integrated Spiral Inductors", IEDM Technical Digest, Dec. 1997, pp. 55-58. 

[7] D. Edelstein et. al., "Full Copper Wiring in a Sub-0.25 µm CMOS ULSI Technology", IEEE Inter. Electron 

Device Meeting Tech. Dig. pp. 773-6, Dec. 1997 

[8] A. Deutsch, G. V. Kopcsay, P. Restle, et al, "When are Transmission-Line Effects Important for On Chip 

Interconnections?", IEEE Trans. Microwave Theory Tech. (USA) Vol. 45, No. 10, pt. 2, pp. 1836-46, Oct. 1997. 

[9] Yehia Massoud, Steve Majors, Tareq Bustami, Jacob White, "Layout Techniques for Minimizing On-Chip 

Interconnect Self-Inductance", Proceedings of Design Automation Conf. pp 566-571, June 1998, San Francisco CA.

DAC'99, pages 910-914 

Interconnect Analysis: From 3-D Structures to Circuit Models 

M. Kamon, N. Marques, Y. Massoud, L. Silveira, J. White 

Research Laboratory of Electronics, Massachusetts Institute of Technology 

Cambridge, MA 02139 

Abstract 

In this survey paper we describe the combination of discretized integral formulations, 

sparsification techniques, and Krylov-subspace based model-order reduction that has led to robust 

tools for automatic generation of macromodels that represent the distributed RLC effects in 3-D 

interconnect. A few computational results are presented, mostly to point out the problems yet to 

be addressed. 

References 

[1] B. Gieseke, et al. "A 600MHz Superscalar RISC Microprocessor with Out-of-Order Execution" ISSCC 97, pp. 

176-177 San Francisco, 1997. 

[2] M. Kamon, M. Tsuk, C. Smithhisler, J. White, “Efficient Techniques for Inductance Extraction of Complex 3-D 

Geometries," Proc. Int. Conf. on Computer-Aided Design, Santa Clara, California, November 1992, pp. 438-442.** 

[3] S. M. Rao, D. R. Wilton, and A. W. Glisson. Electromagnetic scattering by surfaces of arbitrary shape. IEEE 

Trans. Antennas Propagat., AP-30(3):409-418, May 1982. 

[4] S. Kapur and J. Zhao, "A fast method of moments solver for efficient parameter extraction of MCMs" Design 

Automation Conference, 1997 pp. 141-146. 

[5] A. E. Ruehli, “Equivalent circuit models for three-dimensional multiconductor systems", IEEE Transactions on 

Microwave Theory and Techniques, vol. 22, no. 3, pp. 216-221, March 1974. 

[6] M. Kamon, N. Marques, L. M. Silveira and J. White, “Automatic generation of Accurate Circuit 

Models of 3-D Interconnect", IEEE Transactions on Components, Packaging, and Manufacturing Technology - 

Part B: Advanced Packaging, August, 1998, vol. 21, no. 3, pp. 225-240 

[7] J. Barnes and P. Hut. A hierarchical O(N logN) force-calculation algorithm. Nature, 324:446-449, 1986. 

[8] L. Greengard and V. Rokhlin. A fast algorithm for particle simulations. J. Comput. Phys., 73:325-348, 1987. 

[9] R. W. Hockney and J. W. Eastwood, Computer simulation using particles. New York: Adam 

Hilger, 1988. 

[10] V. Rokhlin, “Rapid solution of integral equation of classical potential theory," J. Comput. Phys., vol. 60, pp. 

187-207, 1985. 

[11] W. Hackbusch and Z. P. Nowak, “On the Fast Matrix Multiplication in the Boundary Element Method by Panel 

Clustering," Numer. Math. 54, pp. 463-491, 1989. 

[12] K. Nabors, J. White, “A Fast Multipole Algorithm for Capacitance Extraction of Complex 3-D Geometries" 

Proc. Custom Int. Circuits Conf., San Diego, California, May 1989, p21.7.1-21.7.4.** 

[13] K. Nabors and J. White, “Fastcap: A multipole accelerated 3-D capacitance extraction program," IEEE 

Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 10, pp. 1447-1459, November 

1991. 

[14] M. Kamon, M. J. Tsuk, and J. White, “FastHenry, A Multipole-Accelerated 3-D Inductance Extraction 

Program," Proceedings of the 30th Design Automation Conference, Dallas, June 1993.** 

[15] M. Bachtold, J.G. Korvink, H. Baltes, “The Adaptive, Multipole-Accelerated BEM for the Computation of 

Electrostatic Forces," Proc. CAD for MEMS, Zurich, 1997, pp. 14. 

[16] K. Nabors, F. T. Korsmeyer, F. T. Leighton, and J. White. Preconditioned, adaptive, multipole-accelerated 

iterative methods for three-dimensional first-kind integral equations of potential theory. SIAM J. Sci. Statist. 

Comput., 15(3):713-735, 1994. 

[17] L. Greengard, V. Rokhlin, “A New Version of the Fast Multipole Method for the Laplace Equation in Three 

Dimensions," Acta Numerica, 1997, pp. 229-269. 

[18] A. Brandt and A. A. Lubrecht, “Multilevel matrix multiplication and fast solution of integral equations," J. 

Comp. Phys., vol. 90, pp. 348-370, 1990.

[19] J. R. Phillips and J. K. White, “Efficient capacitance extraction of 3D structures using generalized pre-corrected 

FFT methods," in Proceedings IEEE 3rd topical meeting on electrical performance of electronic packaging, 

November 1994. 

[20] G. Beylkin, R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical algorithms. Comm. Pure Appl. 

Math., XLIV:141-183, 1991. 

[21] W. Shi, J. Liu, N. Kakani, and T. Yu, A Fast Hierarchical Algorithm for 3-D Capacitance Extraction 

Proceeding of the 29th Design Automation Conference, San Francisco, CA, June, 1997, pp. 212-217. 

[22] J. Tausch and J. White, “Preconditioning and Fast Summation Techniques for First-Kind Boundary Integral 

Equations" Third IMACS International Symposium on Iterative Methods in Scientific Computation, Jackson Hole 

WY, Jul 9-12, 1997 

[23] L. T. Pillage and R. A. Rohrer. Asymptotic Waveform Evaluation for Timing Analysis. IEEE Trans. CAD,

9(4):352-366, April 1990. 

[24] Eli Chiprout and Michael Nakhla. Generalized Moment-Matching Methods for Transient Analysis of 

Interconnect Networks. In 29th ACM/IEEE Design Automation Conference, pages 201-206, Anaheim, California, 

June 1992. 

[25] J. E. Bracken, V. Raghavan, and R. A. Rohrer. Interconnect Simulation with Asymptotic Waveform Evaluation. 

IEEE Trans. Circuits Syst., 39(11):869-878, November 1992 

[26] J. R. Phillips, E. Chiprout, and D. D. Ling, "Efficient full-wave electromagnetic analysis via model-order 

reduction of fast integral transforms," Proceedings of the 33rd Design Automation Conference, Las Vegas, NV, June 

1996. 

[27] Peter Feldmann and Roland W. Freund, “Efficient linear circuit analysis by Padé approximation via the 

Lanczos process", in EURO-DAC'94 with EURO-VHDL'94, September 1994. 

[28] K. Gallivan, E. Grimme, and P. Van Dooren. Asymptotic Waveform Evaluation via a Lanczos Method. Applied 

Mathematics Letters, 7(5):75-80, 1994 

[29] L. Miguel Silveira, M. Kamon and J. White, “Efficient Reduced-Order Modeling of Frequency-Dependent 

Coupling Inductances associated with 3-D Interconnect Structures", Proceedings of the 32nd Design Automation 

Conference, pp. 376-380, San Francisco, CA, June, 1995.** 

[30] J. E. Bracken. Passive modeling of linear interconnect networks. IEEE Trans. on Circuits and Systems, (Part I: 

Fundamental Theory and Applications), to appear 

[31] A. Odabasioglu, M. Celik, and L. Pileggi. PRIMA: Passive Reduced-Order Interconnect Macromodeling 

Algorithm. IEEE Conference on Computer-Aided Design, San Jose, CA, 1997

[32] Y. Massoud and J. White, “Simulation and Modeling of the Effect of Substrate Conductivity on Coupling 

Inductance," Proc. Int. Electron Devices Meeting, Washington D.C., December 1995.** 

[33] J. Wang, J. Tausch, and J. White, “A Wide Frequency Range Surface Integral Formulation for 3-D Inductance 

and Resistance Extraction," To appear International Conference on Modeling and Simulation of Microsystems, 

Semiconductors, Sensors and Actuators, San Juan, April 1999 

[34] K. Gallivan, E. Grimme, and P. Van Dooren, “Multi-point Padé approximants of large-scale systems via a two-sided

rational Krylov algorithm", in 33rd IEEE Conference on Decision and Control, Lake Buena Vista, FL, 

December 1994. 

[35] Ibrahim M. Elfadel and D. D. Ling, “A block rational Arnoldi algorithm for multipoint passive model-order 

reduction of multiport RLC networks", in International Conference on Computer-Aided Design, San Jose,

California, November 1997.

DAC'99, pages 915-920 

IC Analyses Including Extracted Inductance Models

Michael W. Beattie, Lawrence T. Pileggi 

Carnegie Mellon University, Dept. of ECE, Pittsburgh, PA 15213 

Abstract 

IC inductance extraction generally produces either port inductances based on simplified current 

path assumptions or a complete partial inductance matrix. Combining either of these results with 

the IC interconnect resistance and capacitance models significantly complicates most IC design 

and verification methodologies. In this tutorial paper we will review some of the analysis and 

verification problems associated with on-chip inductance, and present a subset of recent results 

for partially addressing the challenges which lie ahead. 

Keywords: Interconnect; Inductance; Model Order Reduction. 

References 

[1] K. Kerns, I. Wemple, A. Yang, Stable and Efficient Reduction of Substrate Model Network using Congruence 

Transformation, Proc. ICCAD 1995 (Nov. 1995). 

[2] A. Odabasioglu, M. Celik, L. Pileggi, PRIMA: Passive Reduced-order Interconnect Macromodeling Algorithm, 

Proc. ICCAD 1997 (Nov. 1997). 

[3] L. Pillage, R. Rohrer, Asymptotic Waveform Evaluation for Timing Analysis, IEEE Trans. Computer-Aided

Design, 9, No. 4 (Apr. 1990). 

[4] L. Silveira, M. Kamon, J. White, Efficient Reduced-Order Modeling of Frequency-Dependent Coupling

Inductance Associated with 3-D Interconnect Structure, Proc. 32nd DAC (June 1995). 

[5] A. Deutsch, et al., Modeling and characterization of long on-chip interconnections for high-performance

microprocessors, IBM J. Res. Dev., 39, No. 5, pg. 547–567 (Sept. 1995). 

[6] F. Grover, Inductance Calculations, Dover Publications, New York (1946). 

[7] M. Kamon, M. Tsuk, J. White, FASTHENRY: A Multipole Accelerated 3-D Inductance Extraction Program,

IEEE Trans. Microwave Theory and Techniques, 42, No. 9, pp. 1750–1758 (Sept. 1994). 

[8] B. Krauter, L. Pileggi, Generating Sparse Partial Inductance Matrices with Guaranteed Stability, Proc. ICCAD 

1996 (Nov. 1996). 

[9] M. Beattie, L. Alatan, L. Pileggi, Equipotential Shells for Efficient Partial Inductance Extraction, Proc. 1998 

IEDM (Dec. 1998). 

[10] E. Rosa, The Self and Mutual Inductance of Linear Conductors, Bulletin of the National Bureau of Standards, 

4, pp. 301-344 (1908). 

[11] A. Ruehli, Inductance Calculations in a Complex Integrated Circuit Environment, IBM J. Res. Dev., 16, No. 5,

pg. 470-481 (Sept. 1972). 

[12] W. Weeks, L. Wu, M. McAllister, A. Singh, Resistive and Inductive Skin Effect in Rectangular Conductors,

IBM J. Res. Dev., 23, No. 6, pg. 652-660 (Nov. 1979). 

[13] R. Arunachalam, F. Dartu and L. Pileggi, CMOS Gate Delay Models for General RLC Loading, Proc. ICCD 

1997 (Oct. 1997). 

[14] M. Kamon, N. Marques, L. Silveira, J. White, Generating Reduced Order Models via PEEC for Capturing Skin

and Proximity Effects, Proc. 6th Meeting on Electr. Perform. of Electr. Packaging, San Jose (Nov. 1997). 

[15] D. Bailey, B. Benschneider, Clocking Design and Analysis for a 600-MHz Alpha Microprocessor, IEEE J.

Solid-State Circuits, 33, No. 11 (Nov. 1998).

DAC'99, pages 921-926 

On-chip Inductance Issues in Multiconductor Systems 

Shannon V. Morton 

Alpha Development Group, Compaq Computer Corporation, Shrewsbury, MA 01545 

ABSTRACT 

As the family of Alpha microprocessors continues to scale into more advanced technologies with 

very high frequency edge rates and multiple layers of interconnect, the issue of characterizing 

inductive effects and providing a chip-wide design methodology becomes an increasingly 

complex problem. To address this issue, a test chip has been fabricated to evaluate various 

conductor configurations and verify the correctness of the simulation approach. The 

implementation of and results from this test chip are presented in this paper. Furthermore the 

analysis has been extended to the upcoming EV7 microprocessor, and important aspects of the 

derivation of its design methodology, as pertains to these inductive effects, are discussed. 

Keywords: Alpha microprocessor, semiconductor, interconnect, buses, inductance, resistance, 

capacitance, RLC, noise, cross-talk, transmission line. 

REFERENCES 

[1] H. Fair, D. Bailey, “Clocking Design and Analysis for a 600 MHz Alpha Microprocessor”, ISSCC Digest of 

Technical Papers, Feb 1998, pp. 398-399. 

[2] P. Gronowski, W. Bowhill, R. Preston, M. Gowan, R. Allmon, “High-Performance Microprocessor Design”, 

IEEE Journal of Solid-State Circuits, May 1998, pp. 676-686. 

[3] Y. I. Ismail, E. G. Friedman, J. L. Neves, “Figures of Merit to Characterize the Importance of On-Chip 

Inductance”, DAC’98, June 1998, pp. 560-565. 

[4] B. A. Gieseke et al., “A 600MHz Superscalar RISC Microprocessor with Out-Of-Order Execution”, ISSCC 

Digest of Technical Papers, Feb 1997, pp. 176-177.

DAC'99, pages 927-932 

A Methodology for Accurate Performance Evaluation in Architecture Exploration 

George Hadjiyiannis, Pietro Russo, Srinivas Devadas 

Laboratory for Computer Science, Massachusetts Institute of Technology 

Cambridge, MA 02139, USA 

Abstract 

We present a system that automatically generates a cycle-accurate and bit-true Instruction Level 

Simulator (ILS) and a hardware implementation model given a description of a target processor. 

An ILS can be used to obtain a cycle count for a given program running on the target 

architecture, while the cycle length, die size, and power consumption can be obtained from the 

hardware implementation model. These figures allow us to accurately and rapidly evaluate target 

architectures within an architecture exploration methodology for system-level synthesis. 

In an architecture exploration scheme, both the ILS and the hardware model must be generated 

automatically, else a substantial programming and hardware design effort has to be expended in 

each design iteration. Our system uses the ISDL machine description language to support the 

automatic generation of the ILS and the hardware synthesis model, as well as other related tools. 

References 

[1] G. Hadjiyiannis, S. Hanono, and S. Devadas. ISDL: An Instruction Set Description Language for Retargetability. 

In Proceedings of the Design Automation Conference, pages 299–302, June 1997. 

[2] S. Hanono and S. Devadas. Instruction Selection, Resource Allocation, and Scheduling in the AVIV 

Retargetable Code Generator. In Proceedings of the Design Automation Conference, pages 510–515, 1998. 

[3] G. Hadjiyiannis, S. Hanono, and S. Devadas. ISDL: An Instruction Set Description Language for Retargetability. 

Technical report, Massachusetts Institute of Technology, 1996. (http://www.ee.princeton.edu/spam/pubs/ISDL-TR.html).

[4] G. I. Hadjiyiannis. ISDL: Instruction Set Description Language - Version 1.0. MIT Laboratory for Computer 

Science, July 1998. (http://www.caa.lcs.mit.edu/~ghi/PostScript/isdl manual.ps).

[5] P. Marwedel. The MIMOLA Design System: Tools for the Design of Digital Processors. In Proceedings of the 

21st Design Automation Conference, pages 587–593, 1984.

[6] G. Zimmermann. The MIMOLA Design System: A Computer Aided Digital Processor Design Method. In

Proceedings of the 16th Design Automation Conference, pages 53–58, 1979. 

[7] A. Fauth, J. Van Praet, and M. Freericks. Describing Instruction Sets Using nML (Extended Version). Technical 

report, Technische Universität Berlin and IMEC, Berlin (Germany)/Leuven (Belgium), 1995. 

[8] D. Lanneer et al. CHESS: Retargetable Code Generation for Embedded DSP Processors. In Code Generation for 

Embedded Processors. Kluwer Academic Publishers, 1995. 

[9] M. A. Hartoog et al. Generation of Software Tools from Processor Descriptions for Hardware/Software 

Codesign. In Proceedings of the Design Automation Conference, pages 303–306, 1997. 

[10] V. Zivojnovic, S. Pees, and H. Meyr. LISA – Machine Description Language and Generic Machine Model for 

HW/SW Co-Design. In Proceedings of 1996 IEEE Workshop on VLSI Signal Processing, 1996. 

[11] J. C. Gyllenhaal, W. W. Hwu, and B. R. Rau. HMDES Version 2.0 Specification. Technical Report IMPACT-96-3,

University of Illinois, Urbana, 1996.

[12] V. Kathail, M. S. Schlansker, and B. R. Rau. HPL PlayDoh Architecture Specification: Version 1.0. Technical 

Report HPL-93-80, Hewlett-Packard Laboratories, 1994.

DAC'99, pages 933-938 

LISA - Machine Description Language for Cycle-Accurate Models 

of Programmable DSP Architectures 

Stefan Pees 1 , Andreas Hoffmann 1 , Vojin Zivojnovic 2 , Heinrich Meyr 1 

1 Integrated Signal Processing Systems, Aachen University of Technology, Aachen, Germany 

2 AXYS Design Automation, Inc., Irvine, CA, USA 

Abstract 

This paper presents the machine description language LISA for the generation of bit- and

cycle-accurate models of DSP processors. Based on a behavioral operation description, the

architectural details and pipeline operations of modern DSP processors can be covered. Beyond 

the behavioral model, LISA descriptions include other architecture-related information like the 

instruction set. The information provided by LISA models enables automatic generation of 

simulators and assemblers which are essential elements of DSP software development 

environments. In order to prove the applicability of our approach, a realized model of the Texas

Instruments TMS320C6201 DSP is presented and derived LISA code examples are given. 

References 

[1] V. Zivojnovic, S. Pees, and H. Meyr, "LISA - machine description language and generic machine model for 

HW/SW co-design," in Proceedings of the IEEE Workshop on VLSI Signal Processing, (San Francisco), Oct. 1996. 

[2] Texas Instruments, TMS320C62x/C67x CPU and Instruction Set Reference Guide, Mar. 1998. 

[3] J. Rowson, "Hardware/Software co-simulation," in Proc. of the ACM/IEEE Design Automation Conference

(DAC), 1994. 

[4] D. Bradlee, R. Henry, and S. Eggers, "The Marion system for retargetable instruction scheduling," in Proc. ACM 

SIGPLAN'91 Conference on Programming Language Design and Implementation, Toronto, Canada, pp. 229-240, 1991.

[5] B. Rau, "VLIW compilation driven by a machine description database," in Proc. 2nd Code Generation 

Workshop, Leuven, Belgium, 1996. 

[6] A. Fauth, J. Van Praet, and M. Freericks, "Describing instruction set processors using nML," in Proc. European 

Design and Test Conf, Paris, Mar. 1995. 

[7] M. Hartoog, J. Rowson, et al., "Generation of software tools from processor descriptions for hardware/software 

codesign," in Proc. of the ACM/IEEE Design Automation Conference (DAC), Jun. 1997. 

[8] W. Geurts, D. Lanneer, et al., "Design of DSP systems with Chess/Checkers," in 2nd Int. Workshop on Code 

Generation for Embedded Processors, (Leuven), Mar. 1996. 

[9] G. Hadjiyiannis, S. Hanono, and S. Devadas, "ISDL: An instruction set description language for retargetability," 

in Proc. of the ACM/IEEE Design Automation Conference (DAC), Jun. 1997.

[10] V. Kathail, M. Schlansker, and B. Rau, "HPL PlayDoh Architecture Specification: Version 1.0," in HP 

Laboratories Technical Report HPL-93-80, Mar. 1994. 

[11] A. Halambi, P. Grun, et al., "EXPRESSION: A language for architecture exploration through

compiler/simulator retargetability," in Proceedings of the European Conference on Design, Automation and Test (DATE),

Mar. 1999. 

[12] C. Siska, "A processor description language supporting retargetable multi-pipeline dsp program development 

tools," in Proceedings of the International Symposium on System Synthesis (ISSS), Dec. 1998. 

[13] S. Pees, V. Zivojnovic, A. Ropers, and H. Meyr, "Fast Simulation of the TI TMS 320C54x DSP," in Proc. Int. 

Conf. on Signal Processing Application and Technology (ICSPAT), (San Diego), pp. 995-999, Sep. 1997.

[14] http://www.ert.rwth-aachen.de/lisa/lisa.html.

DAC'99, pages 939-944 

Exploiting Intellectual Properties in ASIP Designs for Embedded DSP Software 

Hoon Choi, Ju Hwan Yi, Jong-Yeol Lee, In-Cheol Park, and Chong-Min Kyung 

Department of Electrical Engineering, 

Korea Advanced Institute of Science and Technology, Taejon, Korea 

Abstract 

The growing requirements on the correct design of a high-performance system in a short time 

force us to use IP's in many designs. In this paper, we propose a new approach to select the 

optimal set of IP's and interfaces to make the application program meet the performance 

constraints in ASIP designs. The proposed approach selects IP's while considering interfaces and

supports concurrent execution of parts of task in kernel as software code with others in IP's, 

while the previous state-of-the-art approaches do not consider IP's and interfaces simultaneously 

and cannot support the concurrent execution. The experimental results on real applications show 

that the proposed approach is effective in making application programs meet the performance 

constraints using IP's. 

References 

[1] M. Keating, “A Financial Model for Design Reuse,” http://www.synopsys.com/roi/, Sept. 1998. 

[2] R. Passerone, J. A. Rowson and A. Sangiovanni-Vincentelli, “Automatic Synthesis of Interfaces between 

Incompatible Protocols,” 35th Design Automation Conference, pp. 8-13, 1998. 

[3] J. Smith and G. De Micheli, “Automated Composition of Hardware Components,” 35th Design Automation 

Conference, pp. 14-19, 1998. 

[4] K. S. Chung, R. K. Gupta and C. L. Liu, “An Algorithm for Synthesis of System-Level Interface Circuits,” 

International Conference on Computer-Aided Design, pp. 442-447, 1996. 

[5] S. Narayan and D. Gajski, “Interfacing Incompatible Protocols using Interface Process Generation,” 32nd Design 

Automation Conference, pp. 468-473, 1995. 

[6] R. B. Ortega, L. Lavagno and G. Borriello, “Models and Methods for HW/SW Intellectual Property Interfacing,” 

NATO ASI Proceedings on System Synthesis, 1998. 

[7] P. Chou, R. B. Ortega, G. Borriello, “Synthesis of the Hardware/Software Interface in Microcontroller-Based 

Systems,” International Conference on Computer-Aided Design, pp. 488-495, 1992. 

[8] A. Alomary, T. Nakata, Y. Honma, M. Imai and N. Hikichi, “An ASIP Instruction Set Optimization Algorithm 

with Functional Module Sharing Constraint,” International Conference on Computer-Aided Design, pp. 526-532, 

1993. 

[9] H. Choi, I.-C. Park, S. H. Hwang and C.-M. Kyung, “Synthesis of Application Specific Instructions for 

Embedded DSP Software,” International Conference on Computer-Aided Design, pp. 665-671, 1998. 

[10] H. A. Taha, Operations Research, Prentice Hall, 1997, Chapter 9, pp. 367-373

DAC'99, pages 945-950 

MAELSTROM: Efficient Simulation-Based Synthesis for Custom Analog Cells 

Michael Krasnicki, Rodney Phelps, Rob A. Rutenbar, L. Richard Carley 


Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 

Abstract 

Analog synthesis tools have failed to migrate into mainstream use primarily because of 

difficulties in reconciling the simplified models required for synthesis with the industrial-strength

simulation environments required for validation. MAELSTROM is a new approach that 

synthesizes a circuit using the same simulation environment created to validate the circuit. We 

introduce a novel genetic/annealing optimizer, and leverage network parallelism to achieve

efficient simulator-in-the-loop analog synthesis. 

REFERENCES 

[1] E. Ochotta, R.A. Rutenbar, L.R. Carley, “Synthesis of High-Performance Analog Circuits and ASTRX/OBLX,” 

IEEE Trans. CAD, vol. 15, no. 3, March 1996. 

[2] K.S. Kundert, The Designer’s Guide to SPICE & SPECTRE, Kluwer Academic Publishers.


[3] E. Ochotta, T. Mukherjee, R.A. Rutenbar, L.R. Carley, Practical Synthesis of High-Performance Analog 

Circuits, Kluwer Academic Publishers, 1998. 

[4] M. Degrauwe et al., “Towards an analog system design environment,” IEEE JSSC, vol. sc-24, no. 3, June 1989. 

[5] H.Y. Koh, C.H. Sequin, and P.R. Gray, “OPASYN: a compiler for MOS operational amplifiers,” IEEE Trans. 

CAD, vol. 9, no. 2, Feb. 1990. 

[6] G. Gielen, et al., “Analog circuit design optimization based on symbolic simulation and simulated annealing,” 

IEEE JSSC, vol. 25, June 1990. 

[7] F. Leyn, W. Daems, G. Gielen, W. Sansen, “A Behavioral Signal Path Modeling Methodology for Qualitative 

Insight in and Efficient Sizing of CMOS Opamps,” Proc. ACM/IEEE ICCAD, 1997. 

[8] P. C. Maulik, L. R. Carley, and R. A. Rutenbar, “Integer Programming Based Topology Selection of Cell Level 

Analog Circuits,” IEEE Trans. CAD, vol. 14, no. 4, April 1995. 

[9] W. Kruiskamp and D. Leenaerts, “DARWIN: CMOS Opamp Synthesis by Means of a Genetic Algorithm,” 

Proc. 32nd ACM/IEEE DAC, 1995. 

[10] R. Harjani, R.A. Rutenbar and L.R. Carley, “OASYS: a framework for analog circuit synthesis,” IEEE Trans. 

CAD, vol. 8, no. 12, Dec. 1989. 

[11] B.J. Sheu, et al., “A Knowledge-Based Approach to Analog IC Design,” IEEE Trans. Circuits and Systems, 

CAS-35(2):256-258, 1988. 

[12] E. Berkcan, et al., “Analog Compilation Based on Successive Decompositions,” Proc. of the 25th IEEE DAC, 

pp. 369-375, 1988. 

[13] J. P. Harvey, et al., “STAIC: An Interactive Framework for Synthesizing CMOS and BiCMOS Analog 

Circuits,” IEEE Trans. CAD, Nov. 1992. 

[14] C. Makris and C. Toumazou, “Analog IC Design Automation Part II--Automated Circuit Correction by

Qualitative Reasoning,” IEEE Trans. CAD, vol. 14, no. 2, Feb. 1995. 

[15] A. Torralba, J. Chavez and L. Franquelo, “FASY: A Fuzzy-Logic Based Tool for Analog Synthesis,” IEEE

Trans. CAD, vol. 15, no. 7, July 1996.

[16] G. Gielen, P. Wambacq, and W. Sansen, “Symbolic Analysis Methods and Applications for Analog Circuits: A

Tutorial Overview,” Proc. IEEE, vol. 82, no. 2, Feb., 1990.

[17] C.J. Shi, X. Tan, “Symbolic Analysis of Large Analog Circuits with Determinant Decision Diagrams,” Proc. 

ACM/IEEE ICCAD, 1997. 

[18] Q. Yu and C. Sechen, “A Unified Approach to the Approximate Symbolic Analysis of Large Analog Integrated

Circuits,” IEEE Trans. Circuits and Sys., vol. 43, no. 8, August 1996. 

[19] F. Medeiro, F.V. Fernandez, R. Dominguez-Castro and A. Rodriguez-Vasquez, “A Statistical Optimization

Based Approach for Automated Sizing of Analog Cells,” Proc. ACM/IEEE ICCAD, 1994.

[20] S. Kirkpatrick, C.D. Gelatt, M.P. Vecchi, “Optimization by simulated annealing,” Science, vol. 220, no. 4598, 

13 May 1983.

[21] L. T. Pillage and R.A. Rohrer, “Asymptotic Waveform Evaluation for Timing Analysis,” IEEE Trans. CAD,

vol. 9, no. 4, April 1990.

[22] W. Nye, et al., “DELIGHT.SPICE: an optimization-based system for the design of integrated circuits,” IEEE

Trans. CAD, vol. 7, April 1988. 

[23] M. Krasnicki, “Generalized Analog Circuit Synthesis,” M.S. Thesis, Dept. of ECE, Carnegie Mellon, Dec. 1997.

[24] K. Nakamura and L.R. Carley, “A current-based positive-feedback technique for efficient cascode 

bootstrapping,” Proc. VLSI Circuits Symposium, June 1991. 

[25] J.H. Holland. Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor, 1975.

[26] S. W. Mahfoud and D.E. Goldberg, “Parallel Recombinative Simulated Annealing: A Genetic Algorithm,” 

Parallel Computing, vol. 21, 1995. 

[27] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, V. Sunderam. PVM: Parallel Virtual Machine A 

User’s Guide and Tutorial for Network Parallel Computing. MIT Press, 1994. 

[28] T. Mukherjee, L.R. Carley, R.A. Rutenbar, “Synthesis of Manufacturable Analog Circuits,” Proc. ACM/IEEE

ICCAD, 1994.

DAC'99, pages 951-957 

Behavioral Synthesis of Analog Systems using Two-Layered Design Space Exploration 

Alex Doboli, Adrian Nunez-Aldana, Nagu Dhanwada, Sree Ganesan, and Ranga Vemuri 

Laboratory for Digital Design Environments, Department of ECECS, 

University of Cincinnati, Cincinnati, OH 45221 

Abstract 

This paper presents a novel approach for synthesis of analog systems from behavioral VHDL- 

AMS specifications. We implemented this approach in the VASE behavioral-synthesis tool. The 

synthesis process produces a netlist of electronic components that are selected from a component 

library and sized such that the overall area is minimized and the rest of the performance 

constraints such as power, slew-rate, bandwidth, etc. are met. The gap between system level 

specifications and implementations is bridged using a hierarchically-organized, design-space 

exploration methodology. Our methodology performs a two-layered synthesis, the first being 

architecture generation, and the other component synthesis and constraint transformation. For 

architecture generation we suggest a branch-and-bound algorithm, while component synthesis 

and constraint transformation use a Genetic Algorithm based heuristic method. Crucial to the 

success of our exploration methodology is a fast and accurate performance estimation engine that 

embeds technology process parameters, SPICE models for basic circuits and performance 

composition equations. We present a telecommunication application as an example to illustrate 

our synthesis methodology, and show that constraint-satisfying designs can be synthesized in a 

short time and with a reduced designer effort. 

References 

[1] “IEEE Standard VHDL Language Reference Manual (Integrated with VHDL-AMS changes)”, IEEE Std.1076.1.

[2] B.G. Arsintescu, E. Charbon, E. Malavasi, U. Choudhury, W.H. Kao, “General AC Constraint Transformation for

Analog ICs”, Proc. of the 35th Design Automation Conference, pp.38-43, 1998. 

[3] P. Campisi, “A CMOS Analog Cell Library for Analog Synthesis Systems”, Master of Science Thesis, University

of Cincinnati, 1998. 

[4] L.R. Carley, G. Gielen, R. Rutenbar, W. Sansen, “Synthesis Tools for Mixed-Signal ICs: Progress on Frontend 

and Backend Strategies”, Proc. of the 33rd Design Automation Conference, pp.298-303, 1996.

[5] H. Chang et al, “A Top-Down Constraint Driven Methodology for Analog Integrated Circuits”, Kluwer 

Academic, 1997. 

[6] J. M. Cohn et al, “KOAN/ANAGRAM II: New Tools for Device-Level Analog Placement and Routing”, IEEE 

JSSC, Vol.26, Nr.3, March 1991. 

[7] N.R. Dhanwada, A. Nunez, R. Vemuri, “Hierarchical Constraint Transformation using Directed Interval Search 

for Analog Synthesis” , Proceedings of DATE’99, pp.328-335, 1999. 

[8] A. Doboli, R. Vemuri, “The Definition of a VHDL-AMS Subset for Behavioral Synthesis of Analog Systems”, 

IEEE/VIUF BMAS’98, 1998. 

[9] A. Doboli, A. Nunez-Aldana, N. Dhanwada, R. Vemuri, “VHIF - A Hierarchical Representation for Behavioral 

Synthesis of Analog Systems from VHDL-AMS”, Technical Report, DDEL, University of Cincinnati, April 1998. 

[10] A. Doboli, R. Vemuri, “A VHDL-AMS Compiler and Architecture Generator for Behavioral Synthesis of 

Analog Systems”, Proceedings of DATE’99, pp.338-345, 1999. 

[11] S. Donnay et al, “Using Top-Down CAD Tools for Mixed Analog/Digital ASICs: a Practical Design Case”, 

Analog Integrated Circuits and Signal Processing, pp.101-117, 1996. 

[12] S. Franco, “Design with Operational Amplifiers and Analog Integrated Circuits”, McGraw Hill, 1988. 

[13] M. Gen, R. Cheng, “Genetic Algorithms and Engineering Design”, John Wiley & Sons, 1997. 

[14] G. Gielen, H. Walscharts, W. Sansen, “Analog Circuit Design Optimization Based on Symbolic Simulation and 

Simulated Annealing”, IEEE Trans on Solid-State Circuits, Vol.25, No.3, pp.707-713, June 1990. 

[15] E. Horowitz, S. Sahni, “Fundamentals of Computer Algorithms”, Computer Science Press, 1985.

[16] D. Leenaerts, “Application of Interval Analysis for Circuit Design”, IEEE Transactions on Circuits and

Systems, Vol.37, No.6, pp.803-807, June 1990. 

[17] M. del Mar Hershenson, S. Boyd, T. Lee, “CMOS Operational Amplifier Design and Optimization via Geometric

Programming”, Proc. 1st Int’l Workshop on Design of Mixed-Mode Integrated Circuits and Applications, 1997.

[18] A. Nunez and R. Vemuri, “An Analog Performance Estimator for Improving the Effectiveness of CMOS 

Analog System Circuit Synthesis”, Proceedings of DATE’99, pp.406-411, 1999. 

[19] W. Nye, D. Riley, A. Sangiovanni-Vincentelli, A. Tits, “DELIGHT.SPICE: an optimization-based system for 

the design of integrated circuits”, IEEE Transaction on CAD, vol.7, No.4, pp.501-519, April 1988. 

[20] E. Ochotta, R. Rutenbar, R. Carley, “ASTRX/OBLX: Tools for Rapid Synthesis of High-Performance Analog 

Circuits”, Proc. of the 31st ACM/IEEE Design Automation Conference, pp.24-30, 1994.

DAC'99, pages 958-963 

Circuit Complexity Reduction for Symbolic Analysis of Analog Integrated Circuits 

Walter Daems Georges Gielen Willy Sansen 

Katholieke Universiteit Leuven, Department of Electrical Engineering, ESAT-MICAS 

B-3001 Heverlee, Belgium 

Abstract 

This paper presents a method to reduce the complexity of a linear or linearized (small-signal) 

analog circuit. The reduction technique, based on quality-error ranking, can be used as a standard 

reduction engine that ensures the validity of the resulting network model in a specific (set of) 

design point(s) within a given frequency range and a given magnitude and phase error. It can 

also be used as an analysis engine to extract symbolic expressions for poles and zeroes. The 

reduction technique is driven by analysis of the signal flow graph associated with the network 

model. Experimental results show the effectiveness of the approach. 

References 

[1] G.E. Alderson and P.M. Lin, “Computer generation of symbolic network functions: A new theory and 

implementation”, IEEE Trans. on Circuit Theory, vol. 20, no. (1), pp. 48–56, January 1973. 

[2] S.J. Seda, G.R. Degrauwe, and W. Fichtner, “A symbolic analysis tool for analog circuit design automation”, in 

Proc. IEEE/ACM ICCAD, Santa Clara, 1988, pp. 488–491. 

[3] G. Gielen, H. Walscharts, and W. Sansen, “ISAAC: a symbolic simulator for analog integrated circuits”, IEEE J. 

Solid-State Circuits, vol. 24, no. 6, pp. 1587–1597, Dec. 1989. 

[4] F.V. Fernández, A. Rodríguez-Vázquez, J.-L. Huertas, and G. Gielen, Symbolic Analysis Techniques: 

applications to analog design, IEEE Press, 1997. 

[5] C.A. Desoer and E.S. Kuh, Basic Circuit Theory, McGraw-Hill, California, 1969. 

[6] P. Wambacq, F.V. Fernández, G. Gielen, W. Sansen, and A. Rodríguez-Vázquez, “Efficient symbolic 

computation of approximated small-signal characteristics”, IEEE J. Solid-State Circuits, vol. 30, no. 3, pp. 327–330, 

Mar. 1995. 

[7] F.V. Fernández, A. Rodríguez-Vázquez, and J.-L. Huertas, “Interactive ac modeling and characterization of 

analog circuits via symbolic analysis”, Kluwer J. Analog Integrated Circuits and Signal Processing, vol. 1, pp. 183–208,

Nov. 1991.

[8] Q. Yu and C. Sechen, “Efficient approximation of symbolic network functions using matroid intersection 

algorithms”, in Proc. 3rd Workshop on Symbolic Methods and Applications to Circuit Design, Sevilla, Oct. 1994, 

pp. 261–227. 

[9] Q. Yu and C. Sechen, “Approximate symbolic analysis of large analog integrated circuits”, in Proc. IEEE/ACM 

ICCAD, 1994, pp. 664–671. 

[10] R. Sommer, E. Hennig, G. Dröge, and E.-H. Horneber, “Equation-based symbolic approximation by matrix 

reduction with quantitative error prediction”, Alta Frequenza - Rivista Di Elettronica, vol. 5, no. 6, pp. 317–325, 

Nov. 1993. 

[11] S. Mason, “Feedback theory—some properties of signal flow graphs”, in Proc. IRE, Sept. 1953, pp. 1144–1156.

[12] C.L. Coates, “Flow-graph solutions of linear algebraic equations”, in IRE Trans. Circuit Theory, June 1959, 

vol. 6, pp. 170–187.

DAC'99, pages 964-969 

Cycle and Phase Accurate DSP Modeling and Integration for HW/SW Co-Verification 

Lisa Guerra*, Joachim Fitzner**, Dipankar Talukdar*, Chris Schläger**, Bassam Tabbara+, 

Vojin Zivojnovic** 

*Conexant Systems, Newport Beach, CA 92660, USA 

**AXYS GmbH, 52134 Herzogenrath, Germany

+UC Berkeley EECS Dept., Berkeley, CA 94720, USA 

ABSTRACT 

We present our practical experience in the modeling and integration of cycle/phase-accurate 

instruction set architecture (ISA) models of digital signal processors (DSPs) with other hardware 

and software components. A common approach to the modeling of processors for HW/SW co-verification

relies on instruction-accurate ISA models combined (i.e. wrapped) with the bus 

interface models (BIM) that generate the clock/phase-accurate timing at the component's 

interface pins. However, for DSPs and new microprocessors with complex architectural features 

this approach is from our perspective not acceptable. The additional extensive modeling of the 

pipeline and other architectural details in the BIM would force us to develop two detailed 

processor models with a complex BIM API between them. We therefore propose an alternative 

approach in which the processor ISAs themselves are modeled in a full cycle/phase-accurate 

fashion. The bus interface model is then reduced to just modeling the connection to the pins. Our 

models have been integrated into a number of cycle-based and event-driven system simulation 

environments. We present one such experience in incorporating these models into a VHDL 

environment. The accuracy has been verified cycle-by-cycle against the gate/RTL level models. 

Multi-processor debugging and observability into the precise cycle-accurate processor state is 

provided. The use of co-verification models in place of the RTL resulted in system speedups up 

to 10 times, with the cycle-accurate ISA models themselves reaching performances of up to 

123K cycles/sec. 

REFERENCES 

[1] T. Albrecht, J. Notbauer, S. Rohringer, “HW/SW CoVerification Performance Estimation & Benchmark for a 24 

Embedded RISC Core Design,” DAC, pp. 808-811, 1998. 

[2] F. Balarin, M. Chiodo, P. Guisto, H. Hsieh, A. Jurecska, L. Lavagno, C. Passerone, A. Sangiovanni-Vincentelli, 

B. Tabbara, “Hardware-Software Co-Design of Embedded Systems: The POLIS Approach,” Kluwer Academic 


[3] D. Becker, R. Singh, S. Tell, "An engineering environment for hardware/software co-simulation," DAC, pp. 129- 

134, 1992. 

[4] W.T. Chang, A. Kalavade, E. Lee, "Effective heterogenous design and co-simulation," NATO Advanced Study 

Institute Workshop on Hardware/software codesign, June 1995. 

[5] S. Coumeri, D. Thomas, "A simulation environment for hardware-software codesign," ICCD, pp. 58-63, 1995. 

[6] D. Ditzel, A. Berenbaum, “Using CAD tools in the design of CRISP,” IEEE Design & Test, pp. 21-31, June 1987. 

[7] R. Earnshaw, L. Smith, K. Welton, "Challenges in cross-development," IEEE Micro, pp. 28-36, July/Aug. 1997. 

[8] R. Gupta, C. Coelho, G. De Micheli, “Synthesis and simulation of digital systems containing interacting 

hardware and software components.” DAC, pp. 225-230, 1992. 

[9] R. Klein, “Miami: A hardware software co-simulation environment,” IEEE Int’l workshop on rapid system 

Prototyping, pp. 173-77, 1996 

[10] B. Lin, K. Van Rompaey, S. Vercauteren, D. Verkest, I. Bolsens, H. De Man, "Designing single chip systems," 

ASIC , 1996. 

[11] G. Maturana, J. Ball, J. Gee, A. Iyer, J. M. O’Connor, “Incas: A cycle accurate model of UltraSPARC,” ICCD, 

pp. 130-135, 1995. 

[12] J. Rowson, "HW/SW co-simulation," DAC, pp. 439-440, 1994.

[13] B. Schnaider, E. Yogev, "Software development in a hardware simulation environment," DAC, pp. 684-689, 

1996. 

[14] Synopsys Eagle tool. http://www.synopsys.com/products/hwsw/. 

[15] V. Zivojnovic, H. Meyr, "Compiled HW/SW co-simulation," DAC, pp. 690-695, 1996. 

[16] AXYS SuperSim simulators. http://www.axys.de/products.

DAC'99, pages 970-975 

A Study in Coverage-Driven Test Generation 

Mike Benjamin, 

STMicroelectronics, Bristol, BS32 4SQ, UK 

Daniel Geist, Alan Hartman, Yaron Wolfsthal 

IBM Science and Technology, Matam - Advanced Technology Center, 31905, Haifa, Israel 

Gerard Mas, Ralph Smeets 

STMicroelectronics, 38240, Meylan, France 

ABSTRACT 

One possible solution to the verification crisis is to bridge the gap between formal verification 

and simulation by using hybrid techniques. This paper presents a study of such a functional 

verification methodology that uses coverage of formal models to specify tests. This was applied 

to a modern superscalar microprocessor and the resulting tests were compared to tests generated 

using existing methods. The results showed some 50% improvement in transition coverage with 

less than a third the number of test instructions, demonstrating that hybrid techniques can 

significantly improve functional verification. 

Keywords: Functional verification, test generation, formal models, transition coverage 
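
As a rough illustration of the transition coverage metric mentioned in the abstract, the sketch below counts which transitions of a small finite-state model are exercised by a set of tests. It is a toy example under stated assumptions, not the authors' tool: the function name `transition_coverage`, the dictionary encoding of the model, and the four-transition example machine are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical encoding: (current_state, input_symbol) -> next_state
Fsm = Dict[Tuple[str, str], str]


def transition_coverage(fsm: Fsm, start: str, tests: Iterable[Iterable[str]]) -> float:
    """Return the fraction of model transitions exercised by the tests."""
    all_transitions: Set[Tuple[str, str, str]] = {
        (state, symbol, nxt) for (state, symbol), nxt in fsm.items()
    }
    covered: Set[Tuple[str, str, str]] = set()
    for test in tests:
        state = start  # every test restarts from the initial state
        for symbol in test:
            nxt = fsm.get((state, symbol))
            if nxt is None:
                break  # input not defined in this state: stop this test
            covered.add((state, symbol, nxt))
            state = nxt
    if not all_transitions:
        return 1.0  # an empty model is trivially covered
    return len(covered) / len(all_transitions)


# Toy model with four transitions (illustrative only).
toy_fsm: Fsm = {
    ("idle", "fetch"): "busy",
    ("busy", "stall"): "busy",
    ("busy", "done"): "idle",
    ("idle", "nop"): "idle",
}

short_test = [["fetch", "done"]]
long_test = [["fetch", "stall", "done", "nop"]]
print(transition_coverage(toy_fsm, "idle", short_test))
print(transition_coverage(toy_fsm, "idle", long_test))
```

In this toy setting a single longer test reaches every transition, which mirrors the abstract's point that tests specified from model coverage can reach more transitions with fewer instructions.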

REFERENCES 

[1] A. Aharon, D. Goodman, M. Levinger, Y. Lichtenstein, Y. Malka, C. Metzger, M. Molcho, and G. Shurek. Test 

program generation for functional verification of PowerPC processors in IBM. In 32nd Design Automation 

Conference, DAC 95, pages 279–285, 1995. 

[2] F.Casaubieilh, A.McIsaac, M.Benjamin, M.Bartley, F.Pogodolla, F.Rocheteau, M.Belhadj, J.Eggleton, G.Mas, 

G.Barrett, C.Berthet, Functional Verification Methodology of Chameleon Processor. In DAC 96: 33rd Design 

Automation Conference, June 1996, Las Vegas. 

[3] D.L.Dill, What’s Between Simulation and Formal Verification? In DAC 98: 35th Design Automation 

Conference, June 1998, San Francisco 

[4] D. Geist, M. Farkas, A. Landver, Y. Lichtenstein, S. Ur, and Y. Wolfsthal, Coverage directed test generation 

using symbolic techniques. In FMCAD 96, Nov. 1996, Palo Alto. 

[5] R. C. Ho, C. H. Yang, M. A. Horowitz, and D. L. Dill. Architecture validation for processors. In International 

Symposium of Computer Architecture 1995, pages 404–413, 1995 

[6] Murφ: URL:http://sprout.stanford.edu/dill/murphi.htm 

[7] 0-In Design Automation, URL:http://www.0-In.com/

DAC'99, pages 976-981 

IC Test Using the Energy Consumption Ratio 

Wanli Jiang, Bapiraju Vinnakota 


University of Minnesota, Minneapolis, MN 55455 

Abstract 

Dynamic-current based test techniques can potentially address the drawbacks of traditional and 

Iddq test methodologies. The quality of dynamic current based test is degraded by process 

variations in IC manufacture. The energy consumption ratio (ECR) is a new metric that improves 

the effectiveness of dynamic current test by reducing the impact of process variations by an order 

of magnitude. We address several issues of significant practical importance to an ECR-based test 

methodology. We use the ECR to test a low-voltage submicron IC with a microprocessor core. 

The ECR more than doubles the effectiveness of the dynamic current test already used to test the 

IC. The fault coverage of the ECR is greater than that offered by any other test, including Iddq. We 

develop a logic-level fault simulation tool for the ECR and techniques to set the threshold for an 

ECR-based test process. Our results demonstrate that the ECR offers the potential to be a high-quality 

low-cost test methodology. To the best of our knowledge, this is the first dynamic-current 

based test technique to be validated with manufactured ICs. 

References 

[1] S. Chakravarty, P.J. Thadikaran, “Simulation and Generation of IDDQ Tests for Bridging Faults in Combinational 

Circuits,” IEEE Trans. Comput., Vol. 45, No. 10, pp. 1131–1140, Oct., 1996. 

[2] T. Chen, I.N. Hajj, et al, “An Efficient IDDQ Test Generation Scheme for Bridging Faults in CMOS Digital 

Circuits,” Digest of Papers, IEEE Int. Workshop on IDDQ Testing, pp. 74–78, 1996. 

[3] J.T. Chang, E.J. McCluskey, “Detecting Bridging Faults in Dynamic CMOS Circuits,” Digest of Papers, IEEE 

Int. Workshop on IDDQ Testing, pp. 106–109, 1997. 

[4] M. Dalpasso, M. Favalli, and P. Olivo, “IDDQ Test Invalidation by Break faults,” Electronics Letters, Vol. 32, No. 

11, pp. 994–995, 1996. 

[5] T.W. Williams, R. Kapur, et al, “IDDQ Testing for High Performance CMOS—The Next Ten Years,” Proc. 

European Design and Test Conf., pp. 578–583, 1996. 

[6] M. Sachdev, “Deep Sub-micron IDDQ Testing: Issues and Solutions,” Proc. European Design and Test Conf., pp. 

271–278, 1997. 

[7] J. Beasely, H. Ramamurthy, J. Ramirez-Angulo, and M. Deyong, “Idd Pulse Response Testing: A Unified 

Approach to Testing Digital and Analogue ICs,” Electronics Letters, pp. 2101–2103, Nov. 25 1993. 

[8] S.-T. Su, R. Makki, and T. Nagle, “Transient Power Supply Current Monitoring—A New Test Method for 

CMOS VLSI Circuits,” Journal of Electronic Testing: Theory and Applications, pp. 23–43, Feb. 1995. 

[9] J. A. Segura, M. Roca, D. Mateo, and A. Rubio, “An Approach to Dynamic Power Consumption Testing of 

CMOS ICs,” Proc. IEEE VLSI Test Sympos., pp. 95–100, April 1995. 

[10] J.F. Plusquellic, D.M. Chiarulli and S.P. Levitan, “Digital Integrated Circuit Testing Using Transient Signal 

Analysis,” IEEE ITC, pp. 481–490, 1996. 

[11] A. Walker, P.K. Lala, “An Approach for Detecting Bridging Fault-induced Delay Faults in Static CMOS 

Circuits Using Dynamic Power Supply Current Monitoring,” Digest of Papers, IEEE Int. Workshop on IDDQ 

Testing, pp. 73–77, 1997. 

[12] Y. Min, Z. Zhao, and Z. Li, “IDDT Testing,” Proc. Asian Test Sympos., pp. 378–382, 1997. 

[13] Bapiraju Vinnakota, “Monitoring Power Dissipation for Fault Detection”, Proc. IEEE VLSI Test Sympos., pp. 

483–488, Apr., 1996. 

[14] Bapiraju Vinnakota, Wanli Jiang, D. Sun, “Process-Tolerant Test with Energy Consumption Ratio”, Proc. 

IEEE ITC98, Oct., 1998. 

[15] J. Lin et al, “A Cell-based Power Estimation in CMOS Combinational Circuits,” Proc. IEEE Int. Conf. 

Computer-Aided Design, pp.304–309, 1994.

[16] H. Sarin, and A. McNelly, “A Power Modeling and Characterization Method for Logic Simulation,” Proc. 

IEEE Custom Integrated Circuits Conf., pp.363–366, 1995. 

[17] http://www.mosis.org, Mosis Web Site.

DAC'99, pages 982-987 

Design Strategy of On-Chip Inductors for Highly Integrated RF Systems 

C. Patrick Yue 

T-Span Systems Corporation, Palo Alto, CA 94301 

S. Simon Wong 

Stanford University, Center for Integrated Systems, Stanford, CA 94305 

ABSTRACT 

This paper describes a physical model for spiral inductors on silicon which is suitable for circuit 

simulation and layout optimization. Key issues related to inductor modeling such as skin effect 

and silicon substrate loss are discussed. An effective ground shield is devised to reduce substrate 

loss and noise coupling. A practical design methodology based on the trade-off between the 

series resistance and oxide capacitance of an inductor is presented. This method is applied to 

optimize inductors in state-of-the-art processes with multilevel interconnects. The impact of 

interconnect scaling, copper metallization and low-K dielectric on the achievable inductor 

quality factor is studied. 

Keywords: Spiral inductor, quality factor, skin effect, substrate loss, substrate coupling, 

patterned ground shield, interconnects 

REFERENCES 

[1] N.M. Nguyen and R.G. Meyer, “Si IC-compatible inductors and LC passive filters,” IEEE Journal of Solid-State 

Circuits, vol. 25, no. 4, pp. 1028-1030, August 1990. 

[2] D. Lovelace, N. Camilleri, and G. Kannell, “Silicon MMIC inductor modeling for high volume, low cost 

applications,” Microwave Journal, pp. 60-71, August 1994. 

[3] J. Crols, P. Kinget, J. Craninckx, and M.S.J. Steyaert, “An analytical model of planar inductors on lowly doped 

silicon substrates for high frequency analog design up to 3 GHz,” in 1996 Symposium on VLSI Circuits Digest of 

Technical Papers, pp. 28-29, June 1996. 

[4] C.P. Yue, C. Ryu, J. Lau, T.H. Lee, and S.S. Wong, “A physical model for planar spiral inductors on silicon,” in 

1996 International Electron Devices Meeting Technical Digest, pp. 155-158, December 1996. 

[5] J.R. Long and M.A. Copeland, “The modeling, characterization, and design of monolithic inductors for silicon 

RF IC’s,” Journal of Solid-State Circuits, vol. 32, no. 3, pp. 357-369, March 1997. 

[6] A.M. Niknejad and R.G. Meyer, “Analysis and optimization of monolithic inductors and transformers for RF 

ICs,” in Proceedings of the IEEE 1997 Custom Integrated Circuits Conference, pp. 375-378, May 1997. 

[7] J.Y.-C. Chang, A.A. Abidi, and M. Gaitan, “Large suspended inductors on silicon and their use in a 2-µm 

CMOS RF amplifier,” IEEE Electron Device Letters, vol. 14, no.5, pp. 246-248, May 1993. 

[8] K.B. Ashby, I.A. Koullias, W.C. Finley, J.J. Bastek, and S. Moinian, “High Q inductors for wireless applications 

in a complementary silicon bipolar process,” IEEE Journal of Solid-State Circuits, vol. 31, no. 1, pp. 4-9, January 

1996. 

[9] J.N. Burghartz, M. Soyuer, and K.A. Jenkins, “Integrated RF and microwave components in BiCMOS 

technology,” IEEE Transactions on Electron Devices, vol. 43, no. 9, pp. 1559-1570, September 1996. 

[10] C.P. Yue and S.S. Wong, “On-chip spiral inductors with patterned ground shields for Si-based RF IC’s,” IEEE 

Journal of Solid-State Circuits, vol. 33, no. 5, pp. 743-752, May 1998. 

[11] T.D. Stetzler, I.G. Post, J.H. Havens, and M. Koyama, “A 2.7-4.5 V single chip GSM transceiver RF integrated 

circuit,” IEEE Journal of Solid-State Circuits, vol. 30, no. 12, pp. 1421-1429, December 1995. 

[12] R.G. Meyer, W.D. Mack, and J.J.E.M. Hageraats, “A 2.5-GHz BiCMOS transceiver for wireless LAN’s,” IEEE 

Journal of Solid-State Circuits, vol. 32, no. 12, pp. 2097-2104, December 1997. 

[13] D.K. Shaeffer, A.R. Shahani, S.S. Mohan, H. Samavati, H. Rategh, M.M. Hershenson, M. Xu, C.P. Yue, D. 

Eddleman, and T.H. Lee, "A 115-mW, 0.5-µm CMOS GPS receiver with wide dynamic-range active filters," IEEE 

Journal of Solid-State Circuits, vol. 33, no. 12, pp. 2219-2231, December 1998. 

[14] F.W. Grover, Inductance Calculations, Princeton, New Jersey: Van Nostrand, 1946. Reprinted by New York, 

New York: Dover Publications, 1962.

[15] H.M. Greenhouse, “Design of planar rectangular microelectronic inductors,” IEEE Transactions on Parts, 

Hybrids, and Packaging, vol. PHP-10, no. 2, pp. 101-109, June 1974. 

[16] Maxwell 2D Parameter Extractor User’s Reference, Ansoft Corporation, 1997. 

[17] C.P. Yue and S.S. Wong, “A study on substrate effects of silicon-based RF passive components,” in 1999 MTT- 

S International Microwave Symposium Digest, June 1999. 

[18] National Technology Roadmap for Semiconductors, SIA, 1997.

DAC'99, pages 988-993 

The Simulation and Design of Integrated Inductors 

N.R. Belk¹, M.R. Frei², M. Tsai², A.J. Becker², K.L. Tokuda² 

¹Bell Laboratories, Lucent Technologies, Holmdel, NJ 07733 

²Bell Laboratories, Lucent Technologies, Murray Hill, NJ 07974 

ABSTRACT 

At present there are two common types of integrated circuit inductor simulation tools. The first 

type is based on the Greenhouse methods[1], and obtains a solution in a fraction of a second; 

however, because it does not use solutions of the inductor charge and current distributions, it has 

limited accuracy. The second type, method of moments (MoM) solvers, determines the charge 

and current variations by decomposing the inductor into thousands of sub elements and solving a 

matrix. However, this process takes between minutes and hours to obtain a reasonably accurate 

solution. In this paper, we present a series of algorithms for solving inductors whose radius is small 

compared to the wavelength of the electrical signal. These algorithms equal or exceed the accuracy of MoM 

solvers, but obtain their solutions in roughly 1 second. 

REFERENCES 

[1] E. Pettenpaul, et al., IEEE Trans. on Microwave Theory and Techniques, 36(2):294-304, Feb 1988. 

[2] Sommerfeld, A. 1949. Partial Differential Equations in Physics. New York: Academic Press. 

[3] Born, M. and E. Wolf. 1980. Principles of Optics. New York: Pergamon Press. 

[4] Bellman, R., and G. M. Wing. 1975. An Introduction to Invariant Imbedding. New York: John Wiley & Sons. 

[5] Chew, W.C. 1995. Waves and Fields in Inhomogeneous Media. Piscataway, NJ: IEEE Press. 

[6] R.L. Remke and G.A. Burdick, Spiral Inductors for Hybrid and MicroWave Applications, in Proc. 24th Electron 

Components Conf. (Washington, DC) May 1974, pp. 152-161. 

[7] U.A. Shivastava, Fast and Accurate Algorithms of Self and Mutual Inductances of Rectangular Conductors, 8th 

Ann. Int. elect. Packag. Conf., IEPS-8, pp. 488-507, Nov. 1988.

DAC'99, pages 994-998 

Optimization of Inductor Circuits via Geometric Programming 

Maria del Mar Hershenson, Sunderarajan S. Mohan, Stephen P. Boyd, Thomas H. Lee 

Electrical Engineering Department, Stanford University, Stanford CA 94305 

Abstract 

We present an efficient method for optimal design and synthesis of CMOS inductors for use in 

RF circuits. This method uses the physical dimensions of the inductor as the design 

parameters and handles a variety of specifications including fixed value of inductance, minimum 

self-resonant frequency, minimum quality factor, etc. Geometric constraints that can be handled 

include maximum and minimum values for every design parameter and a limit on total area. 

Our method is based on formulating the design problem as a special type of optimization 

problem called geometric programming, for which powerful efficient interior-point methods 

have recently been developed. This allows us to solve the inductor synthesis problem globally 

and extremely efficiently. Also, we can rapidly compute globally optimal trade-off curves 

between competing objectives such as quality factor and total inductor area. 

We have fabricated a number of inductors designed by the method, and found good agreement 

between the experimental data and the specifications predicted by our method. 
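
For readers unfamiliar with the term, the textbook standard form of a geometric program is sketched below. This is the generic definition only, not the inductor formulation used in the paper; the symbols $x$, $f_i$, $g_j$, $c_{it}$, $a_{itk}$, $d_j$, $b_{jk}$, $m$, $p$ and $n$ are introduced here purely for illustration.

```latex
\begin{aligned}
\text{minimize}\quad   & f_0(x) \\
\text{subject to}\quad & f_i(x) \le 1, \quad i = 1, \dots, m, \\
                       & g_j(x) = 1,   \quad j = 1, \dots, p, \\
                       & x_k > 0,      \quad k = 1, \dots, n,
\end{aligned}
\qquad\text{where}\qquad
\begin{aligned}
f_i(x) &= \sum_{t} c_{it} \prod_{k=1}^{n} x_k^{a_{itk}}, \quad c_{it} > 0
          \quad \text{(posynomial)}, \\
g_j(x) &= d_j \prod_{k=1}^{n} x_k^{b_{jk}}, \quad d_j > 0
          \quad \text{(monomial)}.
\end{aligned}
```

With the change of variables $y_k = \log x_k$ and a logarithm applied to the objective and each constraint, this becomes a convex problem. Any local optimum is then a global one, which is consistent with the abstract's claim that the synthesis problem can be solved globally.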

References 

[1] C. P. Yue et al. A physical model for planar spiral inductors on silicon. In Proceedings IEEE IEDM’96, 1996. 

[2] R. J. Duffin, E. L. Peterson, and C. Zener. Geometric Programming — Theory and Applications. Wiley, 1967. 

[3] C. P. Yue and S. S. Wong. On-chip spiral inductors with patterned ground shields for Si-based RF IC’s. IEEE 

Journal of solid-state circuits, 33(5):743–752, May 1998. 

[4] J. R. Long and M. A. Copeland. The modeling, characterization, and design of monolithic inductors for silicon 

RF IC’s. IEEE Journal of Solid-State Circuits, 32(3):357–369, March 1997. 

[5] S. S. Mohan et al. Simple accurate expressions for planar spiral inductances. Submitted to IEEE Journal of 

Solid-State Circuits, http://smirc.stanford.edu/, 1999. 

[6] Thomas H. Lee. The design of CMOS radio-frequency integrated circuits. Cambridge University Press, 1998. 

[7] M. Hershenson, S. Boyd, and T. H. Lee. GPCAD: A tool for CMOS op-amp synthesis. In Proceedings of the 

IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, November 1998.

DAC'99, pages 999-1000 

Panel: What is the Proper System on Chip Design Methodology? 

Chair: Richard Goering - EE Times, Felton, CA 

Panel Members: Pierre Bricaud, James G. Dougherty, Steve Glaser, Michael Keating, 

Robert Payne, Davoud Samani 

Over the past year two distinct answers have emerged regarding SoC design methodologies. On 

the one hand, it is posited in the Reuse Methodology Manual, that a logic synthesis-based design 

methodology can be used effectively to develop system chips. An alternative methodology 

focuses on integration (or "reference") platforms and the customization of the basic application-specific 

platform through the addition of selected SW and/or HW IP blocks. This panel session 

will debate the merits of these seemingly incompatible proposed SoC methodologies.
