# HiPRIME: Hierarchical and Passivity Reserved Interconnect Macromodeling Engine for RLKC Power Delivery

Yahong Cao, Yu-Min Lee, Tsung-Hao Chen and Charlie Chung-Ping Chen ( ycao, yu-min, tchen )@cae.wisc.edu, chen@engr.wisc.edu Department of Electrical and Computer Engineering University of Wisconsin at Madison Madison, WI 53706

# ABSTRACT

This paper proposes a general hierarchical analysis methodology, HiPRIME, to efficiently analyze RLKC power delivery systems. After partitioning the circuits into blocks, we develop and apply the IEKS (Improved Extended Krylov Subspace) method to build the Multi-port Norton Equivalent circuits which transform all the internal sources to Norton current sources at ports. Since there is no active elements inside the Norton circuits, passive or realizable model order reduction techniques such as PRIMA can be applied. To further reduce the top-level hierarchy runtime, we develop a second-level model reduction algorithm and prove its passivity. Experimental results show 400-700X runtime improvement with less than 0.2% error.

# 1. CATEGORIES & SUBJECT DESCRIPTORS

B. HardwareB.8 Performance and ReliabilityB.8.2 Performance Analysis and Design Aids

# 2. GENERAL TERMS

Algorithms, Design, Measurement, Performance, Theory, Reliability, Verification

# 3. INTRODUCTION

With the UDSM (Ultra Deep Sub-Micron) technology, several features about today's chips (faster operating frequencies, larger amount of transistors, smaller feature size and lower power voltage) have pushed the power delivery noise analysis onto the list of the designer's high-priority concerns. Basically, power delivery noise consists of IR-drop, Ldi/dt drop and resonance fluctuations. IR-drop has been widely discussed and extensively studied in the literature [7], [8] and [9]. Due to the roaring clock frequency, increasing current consumption, and even the clock gating feature, Ldi/dt noise is quickly emerging into another power fluctuation concern

DAC 2002, June 10-14, 2002, New Orleans, Louisiana, USA.

Copyright 2002 ACM 1-58113-461-4/02/0006 ...\$5.00.

[8]. Power delivery noise, which causes the power voltage greater or lower than the ideal value, can degrade the performance severely and even make the gate function wrong. Therefore, extensive analysis of RLKC power delivery systems are required to ensure they meet the targeted performance and reliability goals.

Generally speaking, one of the major difficulties for power delivery analysis is size explosion. Tens of millions of devices and parasitics are required to be modeled and simulated for over a long time period. The traditional circuit simulation engine such as SPICE [11] can not fulfill the demanding task in a timely manner. For this reason, hierarchical simulation algorithms have been proposed by David, Sachin, et al [8].

Model order reduction techniques have also been extensively studied in the literature [6]. Starting from AWE (Asymptotic Waveform Evaluation) [5] to PRIMA (Passive Reduction Interconnect Macromodeling Algorithm), model order reduction techniques have been successfully extended to consider inductance effects in a reasonable accuracy. Later, [7] develops an EKS (Extended Krylov Subspace) method to simulate large scale power delivery circuits with many current sources. To resolve the source waveform modeling issues, EKS has to perform moment shifting procedures to recover the proper moments.

In this paper, we propose a novel hierarchical power delivery analysis methodology. The contributions of our method are as follows: First, we establish a novel hierarchical power delivery macromodeling methodology, which integrates multiple-port Norton equivalent theorem with model order reduction algorithm to generate compact and accurate model and achieve significant runtime improvement. Secondly, we enhance the EKS method such that it no longer needs to perform moment shifting for source waveform modeling. Therefore, highly accurate simulation results are observed. Finally, to further reduce runtime, we develop a multiple level passive model reduction algorithm and prove its passivity.

The outline of this paper is as follows. In section 2, we introduce the basic power delivery network modeling, circuit formulations and model order reduction. In section 3, we present our hierarchical and passive order-reduced macromodeling methodology. In the last section, we present the experimental results.

# 4. PRELIMINARY

RLKC elements are applied to model power delivery system as shown in Figure 1. To reduce the simulation runtime, we decouple the linear simulation from the nonlinear

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

simulation. Once the nonlinear simulation has been done, we use current sources and capacitors to model gate current consumption and the diffusion and gate capacitance, respectively. Therefore, the power grid analysis problem is reduced to simulate a linear RLKC network with linear time-varying current sources and measure the voltage drop at each grid. A linear RLKC circuit can be represented as a set of Modi-



Figure 1: Modeling of the power grid network

fied Nodal Analysis (MNA) circuit equations as following:

$$Gx + C\frac{d}{dt}x = Bu,\tag{1}$$

where x is MNA variable vector, u denotes a vector of the port voltage sources and internal current sources.

 $G = \begin{bmatrix} N & E \\ -E^T & 0 \end{bmatrix}$ ,  $C = \begin{bmatrix} Q & 0 \\ 0 & H \end{bmatrix}$ ,  $X = \begin{bmatrix} v \\ i \end{bmatrix}$  where v and i corresponds to the voltages at nodes and branch currents flowing through inductors and voltage sources. G and C matrices represent the conductance and susceptance matrices respectively. N, Q and H contain the stamping of the resistors, conductors and inductors. Note that H contains both self and mutual inductors (K elements). E corresponds to MNA current variables' contribution to the KCL equations.

Model order-reduction generates an analytic model which is a compact description of original circuits by matching their moments or poles. Circuit equations as shown in Equation 1 can be transformed to Laplace domain.

$$GX + CXs = BU \tag{2}$$

To illustrate the idea of moment matching, we expand both sides of the Equation (2) in a Taylor series around zero frequency

$$(G + sC)(m_0 + m_1s + m_2s^2 + \cdots) = B(u_0 + u_1s + u_2s^2 + \cdots)$$

where  $m_i$  and  $u_i$ , the coefficients of the  $i_{th}$  term in the Taylor series, are known as the  $i_{th}$  moment of x and u respectively. The basic idea of moment matching is to represent the finite unknown moments of the left hand side of the above equation in terms of the known moments of the right hand side. In PRIMA, the sources are assumed to be impulse sources in the moment matching process in order to preserve the input-output transfer characteristics. The impulse sources are constant in the frequency domain. So the Taylor expansion can be modified as

$$(G+sC)(m_0+m_1s+m_2s^2+\cdots) = Bu_0$$

This produces an iterative relationship between the moments of the x(s) and u(s):  $Gm_0 = Bu_0, Gm_i + Cm_{i-1} =$ 0. This explicit moment matching method is seldom used for the reason that it has numerical stability problem, especially in the higher order iterations. To avoid the numerical errors, a set of orthogonal bases is built to span the same subspace as spanned by the finite moments of x(s). The orthogonal bases V, or Krylov subspace, of matrix  $A = -G^{-1}C$  and matrix  $R = G^{-1}B$  is defined as  $K_q(A,R) = colsp(R,AR,A^2R...A^{q-1}R)$ . The dimension of the full description circuit(G,C,B) is reduced because the rank of V is much smaller than the original matrix A. This order-reduced model can be obtained by projecting the original model (G,C,B) onto this Krylov subspace using congruent transformation. The reduced MNA matrices are denoted as  $\widetilde{G} = V^T G V$ ,  $\widetilde{C} = V^T C V$  and  $\widetilde{B} = V^T B$ . Thus the compact model in the time domain are represented in MNA equations as:

$$\widetilde{G}\widetilde{x} + \widetilde{C}\frac{d}{dt}\widetilde{x} = \widetilde{B}u$$

# 5. HIERARCHICAL AND PASSIVE ORDER-REDUCED MACROMODELING

Our hierarchical and passive model order reduction analysis consists of three steps. First we partition power grid networks into multiple blocks. Each block may contain RLC interconnect network and multiple internal switching currents. The second step is to generate Norton equivalent order-reduced model for each block. This step is divided into three phases: The first phrase is to find the passive order-reduced model for the RLC interconnect network of this block. The second phase is to find the Norton equivalent current of the internal current sources. In the last phase of the second step, we attach the Norton equivalent currents at the ports of the order-reduced RLC model. At the third step, we apply integration algorithm to the macromodels generated by the first two steps and apply top level simulation.

The outline and flowchart of our algorithm are shown in Figure 2 and Figure 3, respectively. In the following subsections, we will discuss step A2.1, step A2.2 and step A3 in Figure 2.

- Algorithm: HiPRIME (Hierarchical and Passivity Reserved Interconnect Macromodeling Engine)
  A1. Partition the given circuit into multiple blocks
  A2. For each block, we generate multiport Norton equivalent order reduced circuits by the following procedure:
  - A2.1 Zero all the active sources and perform passive model order reduction for the linear circuit using any passivity guaranteed model reduction algorithm such as PRIMA.
  - **A2.2** Activate all sources and short all the ports nodes to ground and find out the Norton equivalent sources at each port by IEKS or SPICE simulation.
  - **A2.3** Form the Norton Equivalent circuit by attaching the the Norton equivalent sources at each port to the reduced circuit generated by the model reduction algorithms in Step A2.1.
- A3. Form the integrated circuit by combining all the reduced modules. Perform higher level of model order reduction such as IEKS or PRIMA when necessary.

Figure 2: HiPRIME Algorithm



Figure 3: Flowchart for the hierarchical and passivity reserved interconnect macromodeling engine

# 5.1 Passive Reduced-Order Macromodeling of RLC Networks

After a power grid network is partitioned into multiple blocks, each block may contain passive RLC interconnect and internal switching current sources. In order to obtain a passive order-reduced model, we simply disconnect all the internal current sources to make this block a passive RLC network. The effect of these current sources on grid voltages will be considered later. We may apply a conventional passive model order reduction algorithm, PRIMA, to each simplified block. The MNA equations for the RLC interconnect network of the *i*th block is as:

$$G_i X_i + C_i \frac{d}{dt} X_i = B_i u_i \tag{3}$$

where  $u_i$  is the port voltage vector of the  $i_{th}$  block. The result of PRIMA is that, by the transfer matrix  $V_i$ ,  $G_i$ ,  $C_i$  and  $B_i$  are transformed into  $\tilde{G}_i, \tilde{C}_i$  and  $\tilde{B}_i$ , whose dimension are reduced. The compact MNA equation to describe this block is as:

$$\tilde{G}_i \tilde{X}_i + \tilde{C}_i \frac{d}{dt} \tilde{X}_i = \tilde{B}_i u_i$$

The benefits of model order reduction after partition include: 1) Model order reduction algorithms handle much smaller input circuit size, which means the limit of memory might be eased; 2) It also makes parallel order reduction for different blocks possible, by which the speed is improved.

# 5.2 Efficient Way of Finding the Norton Equivalent Current

In this section, we consider the effects of the internal current sources on grid voltages which are ignored in the previous procedure. Norton equivalent theory is used to find out the equivalent current sources at the ports and replace all the internal current sources so that the port responses of each block are preserved. To distinguish the port voltage sources from the internal current sources, we modify Equation (3) as

$$G_i X_i + C_i \frac{d}{dt} X_i = [B_i B'_i] \begin{bmatrix} v_i \\ i_{gi} \end{bmatrix}$$
(4)

where  $v_i$  and  $i_{g_i}$  denote the independent voltage sources and the internal switching current sources in the *i*th block respectively.  $B_i$  and  $B'_i$  denote the position of the voltage sources and the current sources relative to the whole network. The procedure of calculating the equivalent at the ports is illustrated in Figure 4. As shown, the port currents



# Figure 4: Finding the equivalent current of internal sources

with the port voltages set to zeros are the Norton equivalent current sources. Also as we know, port currents can be obtained by  $i_{Ni} = B_i^T X_i$ . To solve Equation (4) with the voltage sources  $u_i$  set to zeros, different algorithms can be applied. In our algorithm, we applied IEKS, an improved version of EKS such that no moment shifting is necessary which will be described in next session.

### 5.2.1 Improved Extended Krylov Subspace

Developed by Janet, et al. at [7], EKS directly calculates the orthogonalized moments of the response when multiple sources are turned on at the same time. Therefore, unlike PRIMA whose runtime is heavily dependent on the port number, the runtime of EKS is independent of that. EKS models a Piece-Wise-Linear(PWL) independent source as a sum of delayed ramps in the Laplace domain:

$$u(s) = \frac{1}{s^2} \sum_{i=1}^{K} r_i exp(-\beta_i s)$$

This expression contains 1/s and  $1/s^2$  terms. Unfortunately the traditional Krylov subspace methods start the moment matching from the  $0_{th}$  moment. EKS extends the Krylov subspace by shifting the moments toward right in the frequency spectrum.

This moment shifting in EKS is tedious and error-prone. Fortunately, we develop an improved moment calculation method which ensure the  $-1_{st}$  and  $-2_{nd}$  order moments are all zero for arbitrary finite time PWL waveform and hence the moment shifting process can be removed. Since for simulation purpose we are only interested for a specific time period, the finite-time assumption is quite general. We believe this procedure is numerically more sound than the original method.



Figure 5: Waveform of the source

LEMMA 1. Given a finite-time PWL source, IEKS calculates a moment representation with  $-1_{st}$  and  $-2_{nd}$  order moments to be zero.

PROOF. Given an finite-time PWL source  $u_i(t)$  as described in Figure 5, we have

$$u_{i}(t) = \sum_{j=0}^{K_{i}} \left\{ [a_{i,j} + \gamma_{i,j}(t - \tau_{i,j})] \operatorname{E}_{(t - \tau_{i,j})} - [a_{i,j+1} + \gamma_{i,j}(t - \tau_{i,j+1})] \operatorname{E}_{(t - \tau_{i,j+1})} \right\}$$
(5)

where  $\gamma_{i,j} = (a_{i,j+1} - a_{i,j})/(\tau_{i,j+1} - \tau_{i,j})$ , and  $E_{(t-\tau_{i,j})}$  is the unit-step function with  $\tau_{i,j}$  delay. By taking the Laplace transform of Equation (5) and Taylor expansion, we now have

$$\mathcal{L}(u_{i}(t))$$

$$= \frac{1}{s^{2}} \sum_{j=0}^{K_{i}} \left\{ a_{i,j} s \sum_{l=0}^{\infty} (-1)^{l} \frac{\tau_{i,j}^{l}}{l!} s^{l} + \gamma_{i,j} \sum_{l=0}^{\infty} (-1)^{l} \frac{\tau_{i,j}^{l}}{l!} s^{l} - a_{i,j+1} s \sum_{l=0}^{\infty} (-1)^{l} \frac{\tau_{i,j+1}^{l}}{l!} s^{l} - \gamma_{i} \sum_{l=0}^{\infty} (-1)^{l} \frac{\tau_{i,j+1}^{l}}{l!} s^{l} \right\}$$

Let  $\tilde{u}_{i,j}$  denote the coefficient of the  $s^j$  term. The moment representation of  $\mathcal{L}(u_i(t))$  can be simplified to

$$\mathcal{L}(u_i(t)) = \left\{ \tilde{u}_{i,-2}s^{-2} + \tilde{u}_{i,-1}s^{-1} + \tilde{u}_{i,0} + \tilde{u}_{i,1}s + \tilde{u}_{i,2}s^2 + \dots + \tilde{u}_{i,m}s^m + \dots \right\}$$
(6)

When calculating the first two coefficients, we conclude

$$\begin{split} \tilde{u}_{i,-2} &= \sum_{j=0}^{K_i} (\gamma_{i,j} - \gamma_{i,j}) = 0\\ \tilde{u}_{i,-1} &= \sum_{j=0}^{K_i} (a_{i,j} - \gamma_{i,j}\tau_{i,j} - a_{i,j+1} + \gamma_{i,j}\tau_{i,j+1})\\ &= \sum_{j=0}^{K_i} (a_{i,j} - a_{i,j+1} - \gamma_{i,j}(\tau_{i,j} - \tau_{i,j+1})) = 0 \end{split}$$

The first two terms are eliminated and Equation (6) can be rewritten as a moment representation starting from the  $0_{th}$  moment:

$$\mathcal{L}(u_i(t)) = \{ \tilde{u}_{i,0} + \tilde{u}_{i,1}s + \tilde{u}_{i,2}s^2 + \dots + \tilde{u}_{i,m}s^m + \dots \}$$

Table 1 summarizes the process of IEKS to calculate the first  ${\cal M}$  moments.

**Algorithm:** IEKS Moments Calculating Algorithm **Input:** N PWL sources,  $\{a_{i,0...(K_i+1)}, \tau_{i,0...(K_i+1)}\}_{i=1}^N$ M, the number of moments to calculate

**Output:**  $\mathbf{u}_{i,m} = \{ \begin{bmatrix} u_{i,1} & u_{i,2} & \cdots & u_{i,M} \end{bmatrix} \}_{i=1}^{N},$ the first M moments of the N PWL sources

$$\begin{aligned} \text{for } i &= 1: N \\ \text{for } i &= 1: N \\ \text{for } j &= 0: K_i \\ \gamma_{i,j} &= \frac{(a_{i,j+1} - a_{i,j})}{(\tau_{i,j+1} - \tau_{i,j})} \\ \text{end} \\ \text{for } m &= 1: M \\ \text{for } j &= 0: K_i + 1 \\ \beta_{i,j}^{(m)} &= \frac{(-\tau_{i,j})^m}{m!} \\ \text{end} \\ u_{i,m-1} &= (a_{i,0} - \gamma_{i,0} \frac{\tau_{i,0}}{m}) \beta_{i,0}^{(m-1)} - \sum_{j=0}^{K_i - 1} (\gamma_{i,j} - \gamma_{i,j+1}) \beta_{i,j+1}^{(m)} \\ &- (a_{i,K_i+1} - \gamma_{i,K_i} \frac{\tau_{i,K_i+1}}{m}) \beta_{i,K_i+1}^{(m-1)} \\ \text{end} \\ \text{end} \\ \text{end} \\ \text{End} \end{aligned}$$

#### Table 1: IEKS Moments Calculating Algorithm

#### 5.2.2 System solution by IEKS

IEKS generates a system transform matrix  $V_i$ , by which the original system is transformed into a compact description.

$$\tilde{G}_{i}\tilde{X}_{i} + \tilde{C}_{i}\frac{d}{dt}\tilde{X}_{i} = \begin{bmatrix}\tilde{B}_{i}\tilde{B}'_{i}\end{bmatrix}\begin{bmatrix}0\\i_{gi}\end{bmatrix}$$
(7)

The compact form can be solved quickly in the time domain by standard integration algorithms. The solution of the original system can be recovered by  $X_i = V\tilde{X}_i$ . But we are only interested about the port currents in the solution. The port currents are directly obtained by  $i_{Ni} = B^T X_i = B_i^T \tilde{X}_i$ .

### 5.3 Macromodel Integration and Top Level Reduced-Model Simulation

After the A2 step of HiPRIME (Figure 2), we transform a block consisting of RLC segments and internal PWL currents into a passive order reduced block with current sources attached only at the ports. The new macromodel of each block is illustrated in Figure 6. The port response of the original block is preserved. We generate such macromodel



#### Figure 6: Equivalent circuit for each block

for each block and the entire network is integrated by combining these macromodels together. Considering each block as a node of the integrated network, we stamp them into the MNA equation of the entire network. The combination of *i*th block and *j*th are given in Equation (8).

- ~

$$\begin{bmatrix} \tilde{G}_i & 0 & -\tilde{B}_{ij} \\ 0 & \tilde{G}_j & -\tilde{B}_{ji} \\ \tilde{B}_{ij}^T & \tilde{B}_{ji}^T & 0 \end{bmatrix} \begin{bmatrix} \tilde{X}_i \\ \tilde{X}_j \\ u_{ij} \end{bmatrix} + \begin{bmatrix} \tilde{C}_i & 0 & 0 \\ 0 & \tilde{C}_j & 0 \\ 0 & 0 & 0 \end{bmatrix} \frac{d}{dt}$$
$$\begin{bmatrix} \tilde{X}_i \\ \tilde{X}_j \\ u_{ij} \end{bmatrix} = \begin{bmatrix} \tilde{B}_{ii} & 0 & 0 & 0 \\ 0 & \tilde{B}_{jj} & 0 & 0 \\ 0 & 0 & E_i^T & E_j^T \end{bmatrix} \begin{bmatrix} u_i \\ u_j \\ i_{Ni} \\ i_{Nj} \end{bmatrix}$$
(8)

In this formulation,  $u_{ij}$  denotes the voltages at the common ports belonging to both the  $i_{th}$  and  $j_{th}$  blocks.  $u_i$  and  $u_j$ stand for the voltages at the ports of the  $i_{th}$  and  $j_{th}$  blocks, which are not connected with each other.  $B_{ii}$  and  $B_{jj}$  denote the connection of the internal nodes to the ports which are exclusive of each other.  $B_{ij}$  denotes the connection of internal nodes of *i*th block to *j*th block and  $B_{ii}$  denotes the connection of internal nodes of *j*th block to *i*th block.  $i_{Ni}$ and  $i_{Ni}$  stand for the equivalent port currents, which are extracted from the inside of the  $i_{th}$  and  $j_{th}$  block respectively.  $E_i$  and  $E_j$  record the connection of the internal nodes to the equivalent port currents for ith block and jth block.

Given this glued macromodels in Equation (8), we can further apply model order reduction and simulation techniques such as PRIMA or IEKS to the top level to save runtime. There may be more than 2 hierarchical levels and the higher level model order reduction is introduced as the following. Let us define some block system matrices for Г

ith and *j*th blocks. Define 
$$G' = \begin{bmatrix} G_i & 0 & -B_{ij} \\ 0 & \tilde{G}_j & -\tilde{B}_{ji} \\ \tilde{B}_{ij}^T & \tilde{B}_{ji}^T & 0 \end{bmatrix}$$
,  $C' =$ 

$$\begin{bmatrix} C_i & 0 & 0\\ 0 & C_j & 0\\ 0 & 0 & 0 \end{bmatrix} \text{ and } B' = \begin{bmatrix} \widetilde{B}_{ii} & 0\\ 0 & \widetilde{B}_{jj}\\ 0 & 0 \end{bmatrix}, \text{ with the internal}$$

current sources disconnected, the system is described as

$$G'\begin{bmatrix} \tilde{X}_i\\ \tilde{X}_j\\ u_{ij} \end{bmatrix} + C'\frac{d}{dt}\begin{bmatrix} \tilde{X}_i\\ \tilde{X}_i\\ u_{ij} \end{bmatrix} = B'\begin{bmatrix} u_i\\ u_j \end{bmatrix}$$
(9)

Use  $\begin{bmatrix} V_1 V_2 V_3 \end{bmatrix}^T$  represent the orthogonal basis of the subspace spanned by the moments of  $\begin{bmatrix} \widetilde{X}_i \widetilde{X}_j u_{ij} \end{bmatrix}^T$  and denote  $\widetilde{G}' = \begin{bmatrix} V_1 \\ V_2 \\ V_3 \end{bmatrix}^T G' \begin{bmatrix} V_1 \\ V_2 \\ V_3 \end{bmatrix}, \quad \widetilde{C}' = \begin{bmatrix} V_1 \\ V_2 \\ V_3 \end{bmatrix}^T C' \begin{bmatrix} V_1 \\ V_2 \\ V_3 \end{bmatrix}$ and  $\widetilde{B}' = \begin{bmatrix} V_1 \\ V_2 \\ V_3 \end{bmatrix}^T B'$ , the *MNA* equation generated by the

higher level order-reduction is noted as

$$\widetilde{G}'\widetilde{Z} + \widetilde{C}' \frac{d}{dt}\widetilde{Z} = \widetilde{B}' \left[ \begin{array}{c} u_i \\ u_j \end{array} \right]$$

#### 5.4 **Preservation of Passivity**

In order to apply PRIMA or EKS to the already reduced model as described in previous session, we need to be sure that the higher level order-reduced models are also passive. The following theorem gives us the warrant.

THEOREM 1. During the hierarchical order reduction, the passivity of the higher level order-reduced macromodel is also preserved. That is to say, the transfer function of the higher level order-reduced system Y(s) satisfies

- 1.  $Y(s^*) = Y^*(s)$  for all complex s.
- 2. Y(s) is a positive matrix, that means,  $Z^{\star T}(Y(s) +$  $Y^{T}(s^{\star}))Z \succeq 0$  for any complex s satisfying  $Re(s) \succ 0$ and for any complex vector z.

PROOF. With the impulse voltages active at ports and applying Laplace transform to Equation (10), the transfer function can be obtained as

$$Y(s) = \widetilde{B'}^T (\widetilde{G'} + s \cdot \widetilde{C'})^{-1} \widetilde{B'}$$

For the system matrices are all real, the first condition is met naturally. To prove the second condition is also met, we start from  $Z^{\star T}(Y(s) + Y^T(s^{\star}))Z$ 

$$= Z^{\star T} (\widetilde{B'}^{T} (\widetilde{G'} + s \cdot \widetilde{C'})^{-1} \widetilde{B'} + \widetilde{B'}^{T} (\widetilde{G'} + s^{\star} \cdot \widetilde{C'})^{-T} \widetilde{B'}) Z$$
  
Setting  $w = (\widetilde{G'} + s^{\star} \cdot \widetilde{C'})^{-T} \widetilde{B'} z$  and  $s = j \varpi + \sigma$  yields  
 $Z^{\star T} (Y(s) + Y^{T}(s^{\star})) Z$   
 $= w^{\star T} \left[ (\widetilde{G'} + (j \varpi + \sigma) \cdot \widetilde{C'}) + (\widetilde{G'} + (j \varpi - \sigma) \cdot \widetilde{C'})^{T} \right] w$   
 $= w^{\star T} \left( \widetilde{G'} + \widetilde{G'}^{T} \right) w + w^{\star T} \cdot \sigma \cdot \left( \widetilde{C'} + \widetilde{C'}^{T} \right) w$   
Since  $C$ , and  $C$ , are summatic  $C^{T} + C = 2C$ , and  $C^{T} + C$ 

Since  $C_i$  and  $C_j$  are symmetric,  $C_i^T + C_i = 2C_i$  and  $C_j^T + C_j$  $C_j = 2C_j$ . Along with the fact that  $C_i$  and  $C_j$  are nonnegative definite, it yields  $w^{\star T} \cdot \sigma \cdot \left(\widetilde{C'} + \widetilde{C'}^T\right) w$ 

 $= \sigma \cdot w^{\star T} V_1^T V_i^T 2C_i V_i V_1 w + \sigma \cdot w^{\star T} V_2^T V_j^T 2C_j V_j V_2 w \succeq 0$ for any complex vector z and positive  $\sigma$ . Since  $N_i$  and  $N_j$ are symmetric nonnegative definite matrices, we have  $w^{\star T} \left( \widetilde{G'} + \widetilde{G'}^T \right) w$ 

$$= w^{\star T} V_1^T V_i^T (G_i^T + G_i) V_i V_1 w + w^{\star T} V_2^T V_j^T (G_j^T + G_j) V_j V_2 w$$
  
$$= w^{\star T} V_1^T V_i^T \begin{bmatrix} 2N_i & 0\\ 0 & 0 \end{bmatrix} V_i V_1 w$$
  
$$+ w^{\star T} V_2^T V_j^T \begin{bmatrix} 2N_j & 0\\ 0 & 0 \end{bmatrix} V_j V_2 w \succeq 0$$

for any complex vector z. From the above, we can conclude that the second passivity condition is satisfied.  $\Box$ 

#### 6. **EXPERIMENTAL RESULTS AND FUTURE** WORK

In this section, the speed and accuracy of HiPRIME and IEKS are demonstrated and compared with other methods. We use mesh networks to model the power delivery network, which consist of lumped RC/RLC segments with many current sources attached inside. The typical values for each lumped RC/RLC segment are  $R = 0.2\Omega$ , L = 1.0pH and C = 0.024 fF.



Figure 7: Accuracy analysis of RC circuit case: (a) waveform result of HiPRIME, IEKS and Back Euler (b) error spectrum of HiPRIME and IEKS



Figure 8: Accuracy analysis of RLC circuit case: (a) waveform result of HiPRIME, IEKS and Back Euler (b) error spectrum of HiPRIME and IEKS

The voltage waveforms of HiPRIME are compared with those of IEKS and Back Euler. The cases of RC circuit and RLC circuit are tested and the results are shown in Figure 7 and Figure 8 respectively. In the RC case, the waveforms are indistinguishable and the errors for 80% time intervals are within 0.001%. In the RLC case, the waveforms by HiPRIME and IEKS approximate the one by Back Euler method very well and the errors for 50% time intervals are within 0.001%.

We implement IEKS in C code and test it on a PIII 933MHZ machine. The result is compared with our time domain solver InductWise [10] and Spice. Table 2 summarizes the runtime results and the runtime comparisons are shown in Figure 9. Significant speed improvement, 700X faster than Spice, is observed and the same tendency that the speed up increases with larger circuit size is shown.

| Circuit<br>Size | IEKS<br>(s) | InductWise<br>(s) | Speedup<br>(X) | Spice<br>(s) | Speedup<br>(X) |
|-----------------|-------------|-------------------|----------------|--------------|----------------|
| 7861            | 1.46        | 14.76             | 10.1           | 697.13       | 477.48         |
| 14081           | 3.88        | 29.77             | 7.67           | 2728.18      | 703.14         |
| 43541           | 13.49       | 107.05            | 7.93           | _            | -              |
| 89201           | 35.33       | 244.95            | 6.93           | _            | _              |

Table 2: Runtime of IEKS, InductWise and Spice



Figure 9: Runtime Analysis of (a) IEKS, Induct-Wise and Spice (b) IEKS and InductWise

We also implement HiPRIME, flat IEKS and Back Euler in Matlab and test them on Sun Ultrasparc V. Table 3 and 4 summarize the runtime results and the runtime comparisons are shown in Figures 10 and 11 for RC and RLC circuits respectively. From the figures, we can see the tendency that the speed up is more impressive when the circuit size increases.

| Circuit<br>Size | HiPRIME<br>(s) | IEKS<br>(s) | Speedup<br>(X) | Back Euler<br>(s) | Speedup<br>(X) |
|-----------------|----------------|-------------|----------------|-------------------|----------------|
| 203             | 2.52           | 0.89        | 0.36           | 17.1              | 6.79           |
| 803             | 2.98           | 1.97        | 0.66           | 78.3              | 26.3           |
| 2403            | 4.39           | 6.17        | 1.40           | 288.9             | 65.8           |
| 5003            | 7.93           | 13.18       | 1.66           | 760.1             | 95.8           |

Table 3: Runtime of RC circuit case

| Circuit<br>Size | HiPRIME<br>(s) | IEKS<br>(s) | Speedup<br>(X) | Back Euler<br>(s) | Speedup<br>(X) |
|-----------------|----------------|-------------|----------------|-------------------|----------------|
| 443             | 2.76           | 1.29        | 0.47           | 29.5              | 10.7           |
| 1883            | 3.87           | 5.08        | 1.31           | 129.8             | 33.5           |
| 3843            | 6.72           | 12.94       | 1.92           | 276.9             | 41.2           |
| 5803            | 11.81          | 27.15       | 2.29           | 427.2             | 36.6           |

#### Table 4: Runtime of RLC circuit case

Since the runtime of PRIMA is proportional to number of ports, we plan to investigate realizable model order reduction algorithm whose runtime is ports size independent. Also we would like to point out that the focus of this paper is not on the partition algorithms. However, a good



Figure 10: Runtime analysis of RC circuit case: (a) runtime of HiPRIME, IEKS and Back Euler (b) runtime of HiPRIME and IEKS



Figure 11: Runtime analysis of RLC circuit case: (a) runtime of HiPRIME, IEKS and Back Euler (b) runtime of HiPRIME and IEKS

partition algorithm is important for the performance of the hierarchical and passive model order reduction algorithm.

### 7. ACKNOWLEDGEMENTS

This work is partially supported by NSF Grant CCR-0093309, Intel Corp., and Faraday Inc.

## 8. **REFERENCES**

- L. O. Chua, C. Desoer and E. Kuh, *Linear and Nonlinear Circuits*, McGRAW-HILL Book Co., 1987.
- [2] J. Cong and C. K. Koh, Interconnect Layout Optimization Under Higher-Order RLC Model, IEEE/ACM International Conference on Computer-Aided Design, pp. 713-720, CA, Nov. 1997.
- [3] P. Feldman and R. W. Freund, Efficient linear circuit analysis by Pade approximation via the Lanczos process, IEEE Trans. Computer-Aided Design, vol. 14, no. 5, pp 639-649, May 1995.
- [4] Q. J. Yu and E. S. Kuh, Exact moment model of transmission lines and application to interconnect delay estimation, IEEE Trans. on VLSI Systems. vol.3, pp. 311-322, June 1995.
- [5] L.T. Pillage, R.A. Rohrer and C. Visweswariah, *Electronic Circuit and System Simulation Methods*, McGRAW-HILL Book Co.
- [6] A. Odabasioglu, Mustafa Celik and L.T. Pillage, PRIMA: Passive Reduced-order Interconnect Macromodeling Algorithm, ICCAD, pp. 58-65, 1997.
- [7] Janet M. Wang and Tuyen V. Nguyen, Extended Krylov Method for Reduced Order Analysis of linear Circuits with Multiple Sources, DAC, 2000.
- [8] Min Zhao, Rajendran V. Panda, Sachin S. Sapatnekar, Tim Edwardsand, Rajat Chaudhry and David Blaauw, *Hierarchical* Analysis of Power Distribution Networks, DAC, 2000.
- [9] Sani R. Nassif and Joseph N. Kozhaya, Fast Power Grid Analysis, DAC, 2000.
- [10] Tsung-hao Chen and Charlei Chung-ping Chen, Efficient Large-Scale Power Grid Analysis Based on Preconditioned Krylov-Subspace Iterative Methods, DAC, 2001. http:// vlsi.ece.wisc.edu/Tools.htm
- [11] Nagel, L.S., SPICE2: A Computer Program to Simulate Semiconductor Circuits, ERL Memo ERL-M520, University of CAliforniaa, Berkeley, 1975.