DC IR-Drop Analysis of Multilayered Power Distribution Network by Discontinuous Galerkin Method With Thermal Effects Incorporated

Due to the temperature-dependent resistivity of power distribution network (PDN) interconnects, a wiser and necessary strategy is to proceed the electrical–thermal cosimulation in order to include the thermal effects caused by Joule heating. As a natural domain decomposition method (DDM), in this article, a discontinuous Galerkin (DG) method is proposed to facilitate the steady-state electrical and thermal coanalysis. With the intention to avoid solving a globally coupled steady-state matrix system equations resulted from the implicit numerical flux in DG, the block Thomas method is deployed to solve the entire domain in a subdomain-by-subdomain scheme. As a direct solver, the block Thomas method is free of convergence problem frequently occurring in iterative methods, such as block Gauss–Seidel method. The capability of the proposed DG method in handling multiscale and complex 3-D PDNs is validated by several representative examples.

(PDN) [1], while the ever-rising integration capacity increases the current density, which brought significant challenges to the design of PDNs in order to achieve the intended power integrity (PI) as well as the signal integrity (SI) performance. Moreover, the nonideal resistance of PDN interconnects further deteriorates the PI performance due to the Joule heating, where the thermal effects not only increase the dc voltage drop but also possibly put the high-speed signals under risk. Thereby, it is excessively necessary and important to implement the electrical-thermal cosimulation [2] in order to maintain the performance of the whole integrated circuits (ICs). With this objective in mind, various approaches have been proposed in the past few years.
The first kind of approaches for the dc IR-drop analysis is based on equivalent circuit models [3], [4] in terms of finitedifference discretization, in which the discretized Laplace equation governing the potential is first transformed into a 2-D one, and then, it is further approximated by a resistive network. However, the IR-drop is lack of consideration of the thermal effect.
To capture the thermal effects, soon after, numerical algorithms, such as finite volume method (FVM) [5] and finiteelement method (FEM) [6]- [8], are proposed to solve the electrical and the thermal equations together. Due to the temperature-dependent material properties, the two equations are solved in an iterative scheme.
In this article, a discontinuous Galerkin (DG) method [9]- [11] is proposed to conduct the rigorous dc-IR drop analysis with thermal effects included. As the combination of FVM and FEM, the DG method inherits the advantages of FVM and FEM. Like FVM [12], solutions in the adjacent subdomains communicate with each other by the numerical flux, and it is thus a natural domain decomposition method (DDM); on the other hand, it is capable of flexibly making use of high-order basis functions to approximate the solution, and thereby such as FEM [13], it can achieve high-order accuracy. However, for the dc-IR drop analysis, both the electrical and the thermal equations are in a steady state, and the numerical flux used for information exchange is in an implicit form, consequently resulting in a globally coupled matrix equation, which dramatically complicates the problem. To beat this deficiency, the whole computational domain is first tessellated into a number of nonoverlapping subdomains in a nested This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ sequence, and then, the system matrix equation can be formulated into a block tridiagonal matrix system where the upper and the lower blocks resulted from the incoming numerical flux represent couplings between the neighboring subdomains, and finally, the block Thomas method [14] is applied to solve the implicitly coupled matrix system in a subdomain-bysubdomain scheme. Thereby, the advantage possessing by the explicit DG algorithm is kept, namely, it only needs to solve a number of smaller matrix equations instead of a global one.

A. Electrical Equation
Based on the vector form of Ohm's law, for the dc-IR drop analysis, the current density J in a conductor is defined as where σ is the electrical conductivity of the conductor, and the electric field E can be expressed as a function of electrical potential φ, namely With (2), (1) can be reformulated into Further resorting to the equation of continuity of electric charge, the current density J should satisfies the divergencefree condition given by Based on the governing equations in (3) and (4), the current density as well as the potential distribution over the whole PDN can be uniquely determined in terms of the following boundary conditions: where dc is the Dirichlet boundary surface with voltage already given, R is the impedance boundary surface connecting with an external load such as decoupling capacitors, match network, chips, and memories, R L is the resistance value of the external equivalent load, and S is the area of the cross section of the impedance boundary surface. In order to solve (3) and (4) and y the proposed DG method, the computational domain is first decomposed into a number of nonoverlapping and conformal subdomains k (k = 1, . . . , K ), and each subdomain is further teared into a series of tetrahedrons i k (i = 1, . . . , K k ); then, the Galerkin test is applied to (3) and (4) pertinent to the kth subdomain, which leads to the following two new formulations: where ϕ k p and ψ k q are the pth and the qth nodal basis functions for the potential and the current density, respectively;n is an unit normal vector pointing from the present subdomain k to its neighboring subdomain l, andJ k andφ k are called numerical flux defined by [15] n ·J k = C 10 (n · J k +n · J l ) + C 11 (n · J k −n · J l ) In this article, the upwind flux is employed, where the parametric coefficients are defined as: C 10 = 0.5, C 11 = 0, C 12 = −4, C 20 = 0.5, C 21 = 0, and C 22 = 0.
To obtain a fully discrete matrix system, next, the potential φ k and the current density J k are further approximated by nodal basis functions where P k and Q k ξ denote the total number of basis functions for the expansion of potential φ k and current density J k ξ (ξ = x, y, z) in the kth subdomain, respectively; γ k p and ν k ξ,q are the corresponding expansion coefficients to be determined. Substituting (7)-(10) into (5) and (6), the following matrix equations can be reached: with f k dc rising from the Dirichlet boundary condition, and where the general definitions of matrix entries mentioned earlier are given by As the dc current flowing through metal conductors, it will be accompanied by Joule heating that is evaluated by Thus, the temperature of conductors will rise, which consequently alters the resistivity of conductors in terms of the relation given by with ρ 0 denoting the resistivity at temperature T 0 and α representing the conductor's temperature coefficient. As a result, to get the dc IR drop with thermal effects included, the thermal equation must be solved simultaneously.

B. Thermal Equation
Though the temperature is a function of time, only the temperature at the steady state is more of interest, and thus, the steady-state thermal equation is going to be solved, which is defined as where q is the thermal flux, κ is the thermal conductivity, Q is the thermal source given in (31), and q and T are subjected to the boundary conditions given by with T a denoting the ambient temperature and h representing the convection coefficient.
Similar to the DG formulation process for the electrical equation, the Galerkin test is applied to (33) and (34) whereφ k p andψ k ξ,q are the pth and qth nodal basis functions for T and q ξ (ξ = x, y, z) in the kth subdomain, respectively; the other two termsq k andT k are termed numerical flux defined by [15]- [17] n ·q k =Č 10 n · q k +n · q l +Č 11 n · q k −n · q l +Č 12 T k − T l (39) In this article, the upwind flux is employed [15], i.e., C 10 = 0.5,Č 11 = 0,Č 12 = −4, andČ 20 = 0.5,Č 21 = 0, andČ 22 = 0. Next, substituting (39) and (40) into (37) and (38) and further expanding the unknowns by the nodal basis functions: representing the unknown expansion coefficients, the following matrix equations can be reached after a few mathematical manipulations: where the definitions of matrices as well as the corresponding entries involved in (41) and (42) can be conveniently derived by permuting the basis functions and related material parameters given in the electrical equations from (11) to (30).

C. Electrical-Thermal Coanalysis
Due to the temperature-dependent electric conductivity, the electrical matrix equations (11) and (12) couple with the thermal equations (41) and (42), resulting in a nonlinear matrix system. To solve it accurately and efficiently, the electrical and the thermal equations will be iteratively solved using the fixed point method. The pesudocode of the solving scheme is shown in the following.
Step 1 : Solve the electrical equation Step 2 : Calculate Joule Heating Step 3 : Solve the thermal equation Step 4 : Update the resistivity Step 5 : Calculate L 2 relative norm error δρ If δρ<ζ 0 then Output solutions of interests Exit End If Step 6 : i = i + 1 End For

D. Block Thomas Method
Due to the implicit form of the numerical flux, the established matrix equations in (11), (12), (41), and (42) will couple with matrix equations in the adjacent subdomains, which results in a huge matrix system, e.g., the dimension of the electrical matrix equation is around K k=1 P k + Q k To solve the whole matrix system in a subdomain-bysubdomain scheme, either the block Gauss-Seidel method or the block Thomas method can be referred. The block Gauss-Seidel method is an iterative method, and thus, its convergence cannot be guaranteed if the formulated matrix equation is not symmetrically positive definite or diagonally dominant. Instead, the block Thomas method is a direct solver [18], and thereby, it is free of convergence problem. To implement the block Thomas method, the teared subdomains are arranged in a nested sequence, which results in a serial coupling scheme. As a result, a block tridiagonal matrix pattern is formed.
To have an intuitive insight into the tridiagonal matrix system formulation, the matrix equations (11) and (12) for the electrical potential are rewritten as a compact form Because of the nested subdomains, the whole matrix system can be written as a block tridiagonal one, that is To proceed the block Thomas method, the matrix on the lefthand side should be written as the production between a lower and a upper block tridiagonal matrix in terms of LU decomposition, namely ⎡ where Instead of solving (48) straightforwardly, the solution υ can be obtained in two steps: 1) solving L Bυ = f first and 2) then solving U B υ =υ. The pseudocode of the block Thomas method is provided as follows.

Solving Scheme:
Block Thomas Method Solve F 12 : It is noted that the whole domain is solved by a subdomainby-subdomain strategy, thus dramatically decreasing the dimension of the matrix equation, i.e., the dimension of the matrix equation to be solved now is P k + Q k x + Q k y + Q k z by P k + Q k x + Q k y + Q k z , which no doubt significantly lowers the computational request.
The abovementioned formulations are also valid for the thermal matrix equations in (41) and (42), and the details will not be repeated here due to the limited space.  For all examples, the first-order nodal basis functions are used, i.e., the basis functions are located at four vertices of each tetrahedron. In addition, the ground planes in the following examples are virtual ones, and the electric potential on them is assumed to be zero.

A. Thin Power Plane
With the aim to verify the accuracy of the proposed algorithm, a thin metal plane shown in Fig. 1 is investigated. A voltage source V DD is placed at the left-end surface of the plane, while the right-end surface of the plane is attached to a resistive load with R L = 0.01458 . In addition, the convection boundary condition is applied at the top and the bottom surfaces with convection coefficient h = 500 W/(m 2 · K). The resistivity ρ 0 , the thermal conductivity κ, the temperature coefficient α, and the ambient temperature T a are set as 1.8 × 10 −8 m, 400 W/(m · K), 0.0039 K −1 , and 300 K, respectively. The plane is split into 20 subdomains along the length direction, which results in 92 410 and 92 500 unknowns for the electrical and the thermal matrix system, respectively.
The accuracy of the proposed DG algorithm is validated by comparing it with those obtained from FEM simulation. As shown in Fig. 2, the results are consistent with each other excellently. Moreover, the results without thermal effects are also given and compared with the analytical results calculated by V drop = V DD (R/R + R L ) with R = ρ 0 l/S and S denoting the area of the cross section. The relative error between the   proposed algorithm and the reference solution is also plotted in Fig. 3.
To have a basic insight into the convergence rate of the iterative scheme, the number of iterations to reach the intended tolerance is shown in Fig. 4. As can be seen, it requires more iterations with the increase of V DD , which is due to that more Joule heating is generated and thus alters the resistivity of the conductor more dramatically. The corresponding CPU time and the memory consumption (V DD = 10 V case) for the proposed DG method are around 3 min and 600 M, while FEM takes around 10 mi with memory cost less than 100 M. As can be seen, although the proposed algorithm has better CPU efficiency, its memory cost is also higher than FEM, which is due to that the matrixF in (50) is no longer a sparse one.

B. Swiss Cheese Effect
A large number of holes are formed around vias passing through the power planes, and this "Swiss cheese" effect, as shown in Fig. 5, can seriously deteriorate the PI performance of the power plane. The narrower space between the  adjacent voids significantly increases the current density, thus resulting in a large amount of Joule heating. Consequently, the impedance of the power plane becomes larger as well, which no doubt is detrimental to the dc IR-drop performance.
To verify the abovementioned statement, a voltage source with V DD = 1 V is attached to the left-end surface, while the right end is connected to a resistive load with R L = 0.01458 ; the convection boundary condition is applied simultaneously to the top and the bottom surface with convection coefficient h = 500 W/(m 2 · K). The whole plane is teared into 20 subdomains along the length direction, resulting in 1 35 094 unknowns for the electrical equation and 1 35 216 unknowns for the thermal equation. First, the voltage drop is characterized based on the proposed electrical-thermal cosimulation algorithm, as shown in Fig. 6. For accuracy validation, the reference result calculated by FEM is also given. Excellent consistency is obviously noted. To have a better insight into the impact of thermal effect on the resistance, the voltage drop excluding thermal effect is also calculated by the proposed DG method with the result shown in Fig. 7. The maximum voltage drop considering thermal effect is increased from 0.1528 to  0.1876 V, i.e., 23% more voltage drop is sacrificed due to Joule heating. Second, the temperature profile of the power plane obtained from the proposed DG algorithm is also presented in Fig. 8. As can been seen, the temperature in the proximity of holes is obviously higher. The maximum temperature is around 388 K, while the maximum temperature without "Swiss cheese" effect obtained in the first example is around 343 K, which further reveals the negative impact of "Swiss cheese" effect. With regard to the accuracy, the simulation results by FEM are also provided for verification. Again, very good agreements are achieved. For this example, the CPU time and the memory cost of the proposed algorithm are around 4 min and 800 M, respectively, whereas FEM takes about 12 min with memory cost less than 100 M.

C. PDN Composed of 12 Vertically Stacked Power Planes
In the third example, a 3-D PDN with 12 layer power planes is investigated. As shown in Fig. 9, the vertically stacked power planes are chained by vias in gray color.  The length, the width, and the thickness of each power plane are 14.5, 5.08, and 0.0508, respectively; the radius and the height of vias are 0.127 and 0.508 mm, respectively. The material properties are the same as those in the first example. The voltage source V DD = 1 V is placed over the top surface of the black via, the green vias are connected with resistive loads with R L = 0.5 , and the air convection boundary with h = 500 W/(m 2 · K) is placed over the bottom surface of power plane in the first layer. The entire domain is tesselated into 24 subdomains along the vertical direction in a nested sequence, which produces 7 21 783 unknowns for the electrical equation and 7 21 796 unknowns for the thermal equation. The electrical-thermal cosimulation takes five iterations to reach the predefined stop criterion δρ = 10 −3 with convergence scheme shown in Fig. 10. The corresponding CPU time is around 4 h and 20 min, while the memory cost is about 12 G. The 3-D potential distribution and the 3-D temperature profile are plotted in Figs. 11 and 12, respectively. It is noted that the temperature of the via attached to the voltage source is obviously higher than others due to the higher current density, and the maximum IR-drop is 45.6 mV, while the IR-drop without thermal effect is 33.5 mV, i.e., resulting in 36.1% more voltage loss.

D. PDN Composed of 15 Layers of Vertically Stacked Power Planes
As the last example, a global PDN for a 3-D IC is studied, in which 15 layers of power planes are vertically stacked along the thickness direction, and a number of resistive loads are placed at each power plane, as shown in Fig. 13, in which the red vias are power vias used to chain the power planes together, whereas the green vias are signal vias connecting to   the CPU or the memory dies. The length, the width, and the thickness of each power plane are 13.05, 7.08, and 0.0508 mm, respectively; the radius and the height of power vias are 0.127 and 0.508 mm, whereas the radius and the height of the signal vias are 0.127 and 0.381 mm, respectively. To consider the load effects of dies, all signal vias are terminated by five parallel connected resistive loads with R L = 50 . To analyze the dc-IR drop, a 1-V voltage source used as the dc supply is attached to the top surface of the black via, and the air convection cooling with convection coefficient = 100 W/(m 2 ·K) is placed over the bottom surface of the power  plane in the first layer (bottom layer). The whole structure is decomposed into 30 subdomains along the thickness direction, which introduces 8 62 326 unknowns for the electrical equation and 8 62 336 unknowns for the thermal equation, but the maximum dimension of the subdomain matrix system is 44 612 × 44 612. In Figs. 14 and 15, the calculated voltage drop and the temperature distribution are shown, where the maximum IR-drop is 18.1 mV (the IR-drop without thermal effect is 14.8 mV and 22.3% more loss is introduced due to Joule heating) and the lowest and the highest temperature are increased to 354 and 379 K, respectively. The required iteration number with the desired accuracy σ ρ = 10 −4 is 5, and the convergence scheme is shown in Fig. 16. The corresponding CPU time for the proposed DG method is about 6 h and 10 min, and the memory cost is around 15 GB.

IV. CONCLUSION
A DG method is developed in this article to investigate the dc IR-drop of ICs, where the impact of thermal effects due to Joule heating is included. The temperature-dependent material property results in a nonlinear matrix system, which is iteratively solved by the fixed point method. Due to the implicit numerical flux, the unknowns couple with each other in different subdomains, which causes a huge matrix system. To keep the advantage of DG in solving the entire domain in a subdomain-by-subdomain scheme, the block Thomas method is resorted, which is a direct solver, and thus, it is free of convergence issues. The proposed approach is benchmarked by four examples by way of comparing with FEM reference and/or analytical solutions.